fix optimizer reset for relora sft (#1414)

* fix optimizer reset

* set states to reset for 8bit optimizers and handle quantile runtime error for embeddings

* fix relora test to check grad_norm

* use flash attn for relora and tweak hyperparams for test

* fix messages field for test dataset
Wing Lian
2024-12-03 08:58:23 -05:00
committed by GitHub
parent 81ef3e45f7
commit 1ef70312ba
4 changed files with 64 additions and 30 deletions

@@ -2,4 +2,3 @@ pre-commit
black
mypy
types-requests
tbparse