Commit Graph

6 Commits

Author SHA1 Message Date
Haoxiang Wang
476a205cea Remove learning rate scheduler in deepspeed config to avoid conflict (#909) 2023-12-04 05:17:38 -05:00
Teknium
d3193beac3 Fix Deepspeed Zero3 Config (#791)
* Update zero3.json

Take away CPU Offload by default (Slows things down horribly, better off reducing batchsize), and changes LR Scheduler to a properly decaying one

* Update zero3.json

fix something
2023-10-27 21:57:02 -04:00
Wing Lian
c25ba7939b update README w deepspeed info (#605) 2023-09-22 00:15:52 -04:00
Wing Lian
3b18c963cc set auto for other params that hf trainer sets for ds. include zero1 json (#570) 2023-09-14 11:04:37 -04:00
Aman Gupta Karmani
1e07c162f1 set zero3 optimizer betas to auto so they inherit from HF trainer config (#507) 2023-08-30 08:10:33 -04:00
Wing Lian
bb53a165f5 add a basic ds zero3 config (#347)
better defaults for ds
2023-08-06 17:19:51 -04:00