Commit Graph

8 Commits

Author SHA1 Message Date
Wing Lian
5ea3aa31f0 Fix Deepspeed loading (#950)
* add check for zero3

* freeze parameters

* fixes for deepspeed loading

* fix model parameter check

* unfrozen parameters in example mixtral and logging when unfreezing
2023-12-13 16:03:23 -05:00
Haoxiang Wang
476a205cea Remove learning rate scheduler in deepspeed config to avoid conflict (#909) 2023-12-04 05:17:38 -05:00
Teknium
d3193beac3 Fix Deepspeed Zero3 Config (#791)
* Update zero3.json

Take away CPU Offload by default (Slows things down horribly, better off reducing batchsize), and changes LR Scheduler to a properly decaying one

* Update zero3.json

fix something
2023-10-27 21:57:02 -04:00
Wing Lian
c25ba7939b update README w deepspeed info (#605) 2023-09-22 00:15:52 -04:00
Wing Lian
3b18c963cc set auto for other params that hf trainer sets for ds. include zero1 json (#570) 2023-09-14 11:04:37 -04:00
Aman Gupta Karmani
1e07c162f1 set zero3 optimizer betas to auto so they inherit from HF trainer config (#507) 2023-08-30 08:10:33 -04:00
mhenrichsen
3fc9006298 Feat(deepspeed): Add zero2 config (#476)
* zero2 config

* config added

* linting

---------

Co-authored-by: mhenrichsen <some_email@hey.com>
2023-08-27 10:10:33 +09:00
Wing Lian
bb53a165f5 add a basic ds zero3 config (#347)
better defaults for ds
2023-08-06 17:19:51 -04:00