Commit Graph

7 Commits

Author SHA1 Message Date
Wing Lian
77fca25f1b 4bit quantized support (wip) 2023-04-17 11:37:39 -04:00
Wing Lian
d1aed4c8e5 deepspeed doesn't work with flash-attn, and the gpu savings w flash attn are better than the deepspeed headaches 2023-04-16 06:59:47 -04:00
Wing Lian
05fffb53b4 more logging, wandb fixes 2023-04-15 13:37:17 -04:00
Wing Lian
b164725417 improve prepared dataset loading, fix inference 2023-04-15 12:14:52 -04:00
Wing Lian
949a27be21 more fixes and prep for llama training 2023-04-14 18:30:09 -04:00
Wing Lian
8d959a7e26 make it work with pythia in the cloud 2023-04-14 07:24:55 -04:00
Wing Lian
ce24f5e246 WIP for axolotl trainer 2023-04-14 00:20:05 -04:00