NanoCode012
631268a0ca
revert renaming of deepspeed stage3 args that use auto (#2964) [skip ci]
* Revert "fix deprecate deepspeed stage3_gather_16bit_weights_on_model_save arg…"
This reverts commit e207762928.
* don't revert the values that don't use 'auto'
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-07-22 09:59:47 -04:00
Wing Lian
e207762928
fix deprecate deepspeed stage3_gather_16bit_weights_on_model_save arg (#2956) [skip ci]
* fix deprecate deepspeed stage3_gather_16bit_weights_on_model_save arg
* replace the rest of the migrated deepspeed params
2025-07-21 11:41:31 -04:00
Wing Lian
d3c45d27b5
fix zero3 (#1994)
2024-10-28 07:32:49 -04:00
Seungduk Kim
b0ee9ec734
Set gradient_clipping to auto in DeepSpeed configs (#1382) [skip ci]
2024-03-10 20:50:12 -04:00
Wing Lian
e923e62d24
more checks and fixes for deepspeed and fsdp (#1208) [skip ci]
2024-01-25 20:01:45 -05:00
Wing Lian
54d2ac155b
Mixtral fixes 20240124 (#1192) [skip ci]
* mixtral nccl fixes
* make sure to patch for z3
2024-01-24 14:59:57 -05:00