Files
axolotl/examples/alst/llama3-8b-fsdp2-alst.yaml
Wing Lian f7ea140838 TiledMLP support for FSDP2 (#2950)
* make TiledMLP work with FSDP

* cleanup/gc at start of train to prevent large VRAM spike

* chore: lint

* generic function for non-deepspeed training

* unify patch to fix imports

* update readme for ALST and add examples

* make deepspeed attribute on params check more robust

* update with new info from PR review
2025-07-25 07:15:03 -04:00

1.4 KiB