tocmo0nlord/axolotl
Commit Graph at e1076430ff2fe886bac49c0191bd1b29cddb5421

12 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Wing Lian | 87e073d0de | fix lora target module, require explicit flash attention, fix min logging steps, don't use adam8bit for int4, hash prepared datasets, support hf hub datasets | 2023-04-17 18:01:12 -04:00 |
| Wing Lian | 77fca25f1b | 4bit quantized support (wip) | 2023-04-17 11:37:39 -04:00 |
| Wing Lian | d1aed4c8e5 | deepspeed doesn't work with flash-attn, and the gpu savings w flash attn are better than the deepspeed headaches | 2023-04-16 06:59:47 -04:00 |
| Wing Lian | d060c803ce | add llama 7b config and fiz lora_fan_in_fan_out for llama (copy pasta bug) | 2023-04-15 14:26:52 -04:00 |
| Wing Lian | 05fffb53b4 | more logging, wandb fixes | 2023-04-15 13:37:17 -04:00 |
| Wing Lian | b164725417 | improve prepared dataset loading, fix inference | 2023-04-15 12:14:52 -04:00 |
| Wing Lian | 937f44f021 | helpful info output | 2023-04-15 00:03:43 -04:00 |
| Wing Lian | 80b2ed29d8 | various bugfixes | 2023-04-14 21:37:07 -04:00 |
| Wing Lian | 949a27be21 | more fixes and prep for llama training | 2023-04-14 18:30:09 -04:00 |
| Wing Lian | f2a2029d0d | config chooser, update readme instructions, device config, llama flash attention, debug out the labels, fix config key checks, other bugfixes | 2023-04-14 12:18:56 -04:00 |
| Wing Lian | 8d959a7e26 | make it work with pythia in the cloud | 2023-04-14 07:24:55 -04:00 |
| Wing Lian | ce24f5e246 | WIP for axolotl trainer | 2023-04-14 00:20:05 -04:00 |
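
Commit 87e073d0de mentions hashing prepared datasets so a tokenized dataset can be reused across runs instead of being rebuilt every time. Below is a minimal sketch of that caching pattern, assuming the Hugging Face `datasets` library for on-disk storage; the cache root, config schema, and `prepare_fn` hook are hypothetical illustrations, not axolotl's actual code.

```python
import hashlib
import json
from pathlib import Path

from datasets import Dataset, load_from_disk

# Hypothetical cache root; axolotl's real layout may differ.
CACHE_ROOT = Path("last_run_prepared")


def dataset_hash(dataset_cfg: dict) -> str:
    """Map a dataset config to a stable hash so identical configs
    reuse the same prepared-dataset cache entry."""
    canonical = json.dumps(dataset_cfg, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:16]


def load_or_prepare(dataset_cfg: dict, prepare_fn) -> Dataset:
    """Load the prepared dataset from cache if present; otherwise
    run prepare_fn (tokenize, pack, etc.) and cache the result."""
    cache_dir = CACHE_ROOT / dataset_hash(dataset_cfg)
    if cache_dir.exists():
        return load_from_disk(str(cache_dir))
    prepared = prepare_fn(dataset_cfg)
    prepared.save_to_disk(str(cache_dir))
    return prepared
```

Hashing the canonical JSON of the config means any change to dataset paths or prompt format invalidates the cache automatically, while repeat runs with the same config load straight from disk.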
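
Commit d060c803ce fixes `lora_fan_in_fan_out` for llama. In peft's `LoraConfig`, `fan_in_fan_out` should be True only for layers that store weights transposed (e.g. GPT-2-style `transformers.Conv1D`); llama uses `nn.Linear`, so the flag must be False. A minimal sketch of a llama LoRA config with that flag set correctly; the rank, alpha, and dropout values are illustrative, not axolotl defaults.

```python
from peft import LoraConfig

# fan_in_fan_out must match how the base model stores its weights:
# GPT-2-style Conv1D layers are stored transposed and need True;
# llama's nn.Linear layers need False (the "copy pasta bug" carried
# the wrong value over from another model's config).
lora_config = LoraConfig(
    r=8,                                  # illustrative rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # llama attention projections
    lora_dropout=0.05,
    fan_in_fan_out=False,                 # False for llama's nn.Linear
    task_type="CAUSAL_LM",
)
```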
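
Commit f2a2029d0d adds a "config chooser". A minimal sketch of that idea, assuming YAML configs live in a `configs/` directory; the directory name and prompt are hypothetical, and axolotl's actual CLI may differ.

```python
from pathlib import Path


def choose_config(config_dir: str = "configs") -> Path:
    """List YAML configs in a directory and let the user pick one;
    skip the prompt when there is only a single candidate."""
    candidates = sorted(Path(config_dir).glob("*.yml"))
    if not candidates:
        raise FileNotFoundError(f"no *.yml configs found in {config_dir}")
    if len(candidates) == 1:
        return candidates[0]
    for i, path in enumerate(candidates):
        print(f"[{i}] {path.name}")
    choice = int(input("choose a config: "))
    return candidates[choice]
```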