FSDP + QLoRA (#1378)

* wip qlora + fsdp fixes

* more fixes

* make sure to load the lora 🤦

* only set up quantized meta on non-zero rank

* only run setup_quantized_peft_meta_for_training for qlora+fsdp

* more fixes for qlora+fsdp

* chore: lint

* add example yml

* support mistral too

* fix for model_type and add mixtral support too

* set cpu_offload: false to reduce vram, constrain new accelerator logic to qlora + fsdp

* refactor for duplicate code
Wing Lian
2024-03-08 14:31:01 -05:00
committed by GitHub
parent 638c2dafb5
commit 9b6ee83a73
8 changed files with 502 additions and 9 deletions


@@ -3,7 +3,7 @@ packaging==23.2
peft==0.9.0
transformers==4.38.2
tokenizers==0.15.0
-bitsandbytes>=0.41.1
+bitsandbytes>=0.43.0
accelerate==0.26.1
deepspeed==0.13.1
pydantic==2.6.3
@@ -40,3 +40,4 @@ gcsfs
# adlfs
trl>=0.7.9
fastcore>=1.5.29