ScatterMoE LoRA support (#3410)
* scattermoe lora support
* fsdp, bf16, dim fixes
* expert weights aren't needed in save for bwd since they are frozen
* use sonicmoe optim options
* update save model from upstream
* fixes per code review feedback and add tests
* revert removal of CP fix
* misc fixes
@@ -18,7 +18,7 @@ datasets==4.5.0
 deepspeed>=0.18.3
 trl==0.28.0
 hf_xet==1.2.0
-kernels==0.11.5
+kernels==0.12.1
 
 trackio>=0.16.1
 typing-extensions>=4.15.0