4 Commits

Author SHA1 Message Date
NanoCode012
946b497c3f feat: add deepspeed 3 with cpuoffload (#1466)
* feat: add deepspeed 3 with cpuoffload

* make bf16 explicit, add param only offload variant

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>
2024-04-01 21:42:52 +09:00
Seungduk Kim
b0ee9ec734 Set gradient_clipping to auto in DeepSpeed configs (#1382) [skip ci] 2024-03-10 20:50:12 -04:00
Wing Lian
e923e62d24 more checks and fixes for deepspeed and fsdp (#1208) [skip ci] 2024-01-25 20:01:45 -05:00
Wing Lian
54d2ac155b Mixtral fixes 20240124 (#1192) [skip ci]
* mixtral nccl fixes

* make sure to patch for z3
2024-01-24 14:59:57 -05:00