FSDP + QLoRA (#1378)

* wip qlora + fsdp fixes
* more fixes
* make sure to load the lora 🤦
* only setup quantized meta on non-zero rank
* only run setup_quantized_peft_meta_for_training for qlora+fsdp
* more fixes for qlora+fsdp
* chore: lint
* add example yml
* support mistral too
* fix for model_type and add mixtral support too
* set cpu_offload: false to reduce vram, constrain new accelerator logic to qlora + fsdp
* refactor for duplicate code
2024-03-08 14:31:01 -05:00
parent 638c2dafb5
commit 9b6ee83a73
8 changed files with 502 additions and 9 deletions
--- a/requirements.txt
+++ b/requirements.txt
@@ -3,7 +3,7 @@ packaging==23.2
 peft==0.9.0
 transformers==4.38.2
 tokenizers==0.15.0
-bitsandbytes>=0.41.1
+bitsandbytes>=0.43.0
 accelerate==0.26.1
 deepspeed==0.13.1
 pydantic==2.6.3
@@ -40,3 +40,4 @@ gcsfs
 # adlfs

 trl>=0.7.9
+fastcore>=1.5.29