support for mamba (#915)

* support for mamba

* more mamba fixes

* use fork for mamba kwargs fix

* grad checkpointing doesn't work

* fix extras for mamba

* mamba loss fix

* use fp32 and remove verbose logging

* mamba fixes

* fix collator for mamba

* set model_type on training_args

* don't save safetensors for mamba

* update mamba config to disable safetensors checkpoints, install for tests

* no evals for mamba tests

* handle save_pretrained

* handle unused safetensors arg
Wing Lian
2023-12-09 12:10:41 -05:00
committed by GitHub
parent d339beb9d9
commit 40a6362c92
12 changed files with 447 additions and 24 deletions


@@ -51,5 +51,8 @@ setup(
         "deepspeed": [
             "deepspeed",
         ],
+        "mamba-ssm": [
+            "mamba-ssm==1.0.1",
+        ],
     },
 )
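
Because the hunk above registers `mamba-ssm` as an optional extra rather than a hard requirement, code that uses it may want to check that the package is present before importing it. The following is a minimal sketch of such a guard, not code from this commit: `has_optional_dependency` and `require_mamba` are hypothetical helper names, and the suggested install command assumes an editable install from a checkout of the repository.

```python
import importlib.util


def has_optional_dependency(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_mamba() -> None:
    """Raise a descriptive error when the optional mamba-ssm extra is missing."""
    # "mamba_ssm" is the import name of the "mamba-ssm" distribution.
    if not has_optional_dependency("mamba_ssm"):
        raise ImportError(
            "Mamba support needs the optional extra; "
            "try: pip install -e '.[mamba-ssm]'"  # assumed install command
        )
```

Calling `require_mamba()` at the start of a Mamba-specific code path would surface a clear message instead of a bare `ModuleNotFoundError` deeper in the stack.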