Commit Graph

11 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Wing Lian | 77fca25f1b | 4bit quantized support (wip) | 2023-04-17 11:37:39 -04:00 |
| Wing Lian | d1aed4c8e5 | deepspeed doesn't work with flash-attn, and the gpu savings w flash attn are better than the deepspeed headaches | 2023-04-16 06:59:47 -04:00 |
| Wing Lian | d060c803ce | add llama 7b config and fiz lora_fan_in_fan_out for llama (copy pasta bug) | 2023-04-15 14:26:52 -04:00 |
| Wing Lian | 05fffb53b4 | more logging, wandb fixes | 2023-04-15 13:37:17 -04:00 |
| Wing Lian | b164725417 | improve prepared dataset loading, fix inference | 2023-04-15 12:14:52 -04:00 |
| Wing Lian | 937f44f021 | helpful info output | 2023-04-15 00:03:43 -04:00 |
| Wing Lian | 80b2ed29d8 | various bugfixes | 2023-04-14 21:37:07 -04:00 |
| Wing Lian | 949a27be21 | more fixes and prep for llama training | 2023-04-14 18:30:09 -04:00 |
| Wing Lian | f2a2029d0d | config chooser, update readme instructions, device config, llama flash attention, debug out the labels, fix config key checks, other bugfixes | 2023-04-14 12:18:56 -04:00 |
| Wing Lian | 8d959a7e26 | make it work with pythia in the cloud | 2023-04-14 07:24:55 -04:00 |
| Wing Lian | ce24f5e246 | WIP for axolotl trainer | 2023-04-14 00:20:05 -04:00 |