FSDP + QLoRA (#1378)
* wip qlora + fsdp fixes
* more fixes
* make sure to load the lora 🤦
* only set up quantized meta on non-zero rank
* only run setup_quantized_peft_meta_for_training for qlora+fsdp
* more fixes for qlora+fsdp
* chore: lint
* add example yml
* support mistral too
* fix for model_type and add mixtral support too
* set cpu_offload: false to reduce vram, constrain new accelerator logic to qlora + fsdp
* refactor for duplicate code
@@ -3,7 +3,7 @@ packaging==23.2
 peft==0.9.0
 transformers==4.38.2
 tokenizers==0.15.0
-bitsandbytes>=0.41.1
+bitsandbytes>=0.43.0
 accelerate==0.26.1
 deepspeed==0.13.1
 pydantic==2.6.3
@@ -40,3 +40,4 @@ gcsfs
 # adlfs
 
 trl>=0.7.9
+fastcore>=1.5.29