ScatterMoE LoRA support (#3410)

* scattermoe lora support

* fsdp, bf16, dim fixes

* expert weights aren't needed in save for bwd since they are frozen

* use sonicmoe optim options

* update save model from upstream

* fixes per code review feedback and add tests

* revert removal of CP fix

* misc fixes

Author: Wing Lian
Date: 2026-02-24 14:59:55 -05:00
Committed by GitHub
parent 08441fed17
commit 68f1b7004c
17 changed files with 4146 additions and 29 deletions

@@ -18,7 +18,7 @@ datasets==4.5.0
deepspeed>=0.18.3
trl==0.28.0
hf_xet==1.2.0
-kernels==0.11.5
+kernels==0.12.1
trackio>=0.16.1
typing-extensions>=4.15.0