Commit Graph

6 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Wing Lian | e65aeedce7 | fix relative path for fixtures | 2023-05-30 10:38:20 -04:00 |
| Wing Lian | 9190ada23a | 8bit and deepspeed changes | 2023-04-30 06:50:35 -04:00 |
| Wing Lian | 4dbef0941f | update ds_config | 2023-04-30 04:24:58 -04:00 |
| Wing Lian | 4f2584f2dc | shuffle and split dataset after save/load | 2023-04-24 09:41:35 -04:00 |
| Wing Lian | d1aed4c8e5 | deepspeed doesn't work with flash-attn, and the gpu savings w flash attn are better than the deepspeed headaches | 2023-04-16 06:59:47 -04:00 |
| Wing Lian | 05fffb53b4 | more logging, wandb fixes | 2023-04-15 13:37:17 -04:00 |