Wing Lian | e65aeedce7 | fix relative path for fixtures | 2023-05-30 10:38:20 -04:00
Wing Lian | 9190ada23a | 8bit and deepspeed changes | 2023-04-30 06:50:35 -04:00
Wing Lian | 4dbef0941f | update ds_config | 2023-04-30 04:24:58 -04:00
Wing Lian | 4f2584f2dc | shuffle and split dataset after save/load | 2023-04-24 09:41:35 -04:00
Wing Lian | d1aed4c8e5 | deepspeed doesn't work with flash-attn, and the gpu savings w flash attn are better than the deepspeed headaches | 2023-04-16 06:59:47 -04:00
Wing Lian | 05fffb53b4 | more logging, wandb fixes | 2023-04-15 13:37:17 -04:00