axolotl/tests/utils/test_train.py at 79ddaebe9a6af7efefebdbb54772d11d09561786

Wing Lian ecbe8b2b61 [GPT-OSS] improve FSDP shard merging and documentation for GPT-OSS (#3073)

* improve fsdp shard merging

* improve logging

* update information on merging and inferencing GPT-OSS

* cleanup readme

* automate cleanup of FSDP prefix

* import GRPO only if necessary

* only modify config.json on rank0

* merge final checkpoint at end of training

* prevent circular import

* Fix saving for sharded state dict

* devx, move merged to output dir

* move import back to top

* Fix stuck merge

* fix conditionals from pr feedback and add test

2025-08-15 21:25:01 -04:00

766 B
