FSDP2 support (#2469)

* fsdp2 support

* use accelerate release 1.6.0

* allow 8bit optims with fsdp2

* liger + torch compile fix

* add fsdp2 e2e tests

* use transformers commit with fsdp2 support

* skip zero3 tests for this PR for now

* fix fsdp2 config for ci

* make sure both flex and flash attn work with fsdp2, skip fix untrained tokens

* okay, actually use fsdp2...

* more fixes to flex for fsdp2

* make sure to patch all the loaded models

* additional validation for fsdp2, bump dep versions
Wing Lian
2025-04-06 17:08:01 -04:00
committed by GitHub
parent a8f38c367c
commit 5f4af3665d
9 changed files with 316 additions and 39 deletions

@@ -12,12 +12,12 @@ liger-kernel==0.5.5
 packaging==23.2
 peft==0.15.0
-transformers==4.50.3
+transformers==4.51.0
 tokenizers>=0.21.1
-accelerate==1.5.2
+accelerate==1.6.0
 datasets==3.5.0
-deepspeed==0.15.4
-trl==0.16.0
+deepspeed>=0.15.4
+trl==0.16.1
 optimum==1.16.2
 hf_transfer