Commit Graph

69 Commits

SHA1  Message  (Author, Date)
87e073d0de  fix lora target module, require explicit flash attention, fix min logging steps, don't use adam8bit for int4, hash prepared datasets, support hf hub datasets  (Wing Lian, 2023-04-17 18:01:12 -04:00)
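The "hash prepared datasets" change above points at caching tokenized data keyed by its configuration, so a prepared dataset is rebuilt only when its config changes. A minimal sketch of that idea, assuming a hypothetical prepare_fn callback and cache directory (these names are illustrative, not the repo's actual API):

```python
import hashlib
import json
from pathlib import Path

from datasets import Dataset, load_from_disk


def prepared_path(cache_dir: str, dataset_cfg: dict) -> Path:
    # Stable hash of the dataset config: any config change invalidates the cache.
    digest = hashlib.sha256(
        json.dumps(dataset_cfg, sort_keys=True).encode("utf-8")
    ).hexdigest()[:16]
    return Path(cache_dir) / digest


def load_or_prepare(cache_dir: str, dataset_cfg: dict, prepare_fn) -> Dataset:
    path = prepared_path(cache_dir, dataset_cfg)
    if path.exists():
        return load_from_disk(str(path))  # reuse the cached, pre-tokenized copy
    ds = prepare_fn(dataset_cfg)          # tokenize/pack from scratch
    ds.save_to_disk(str(path))
    return ds
```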
77fca25f1b  4bit quantized support (wip)  (Wing Lian, 2023-04-17 11:37:39 -04:00)
12de7b7cf7  cleanup, prep for 4bit quant support  (Wing Lian, 2023-04-16 11:06:41 -04:00)
d1aed4c8e5  deepspeed doesn't work with flash-attn, and the GPU savings with flash-attn outweigh the deepspeed headaches  (Wing Lian, 2023-04-16 06:59:47 -04:00)
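The d1aed4c8e5 message records a tradeoff: deepspeed and flash attention could not be combined, and flash attention's GPU memory savings won out. A config guard enforcing that exclusivity could look like this sketch (the cfg fields are assumptions, not the repo's actual schema):

```python
def validate_config(cfg):
    # deepspeed and flash attention were found to be incompatible here;
    # prefer flash attention, whose GPU savings outweighed deepspeed's.
    if getattr(cfg, "flash_attention", False) and getattr(cfg, "deepspeed", None):
        raise ValueError(
            "flash_attention and deepspeed are mutually exclusive; "
            "remove the deepspeed config to train with flash attention"
        )
```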
a4593832a9  fix logging  (Wing Lian, 2023-04-15 23:12:48 -04:00)
23938015c8  prepare datasets only flag  (Wing Lian, 2023-04-15 16:30:55 -04:00)
d33a975747  configure log level, add llama 7b config  (Wing Lian, 2023-04-15 14:24:37 -04:00)
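"configure log level" (d33a975747) suggests making logger verbosity configurable at runtime; a common pattern for that, with an illustrative environment variable name:

```python
import logging
import os

# Let the operator choose verbosity, e.g. LOG_LEVEL=DEBUG python train.py
logging.basicConfig(level=os.getenv("LOG_LEVEL", "INFO").upper())
log = logging.getLogger(__name__)
```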
05fffb53b4  more logging, wandb fixes  (Wing Lian, 2023-04-15 13:37:17 -04:00)
2df63ef815  refactor trainer setup to account for deepspeed integration  (Wing Lian, 2023-04-15 12:16:42 -04:00)
b164725417  improve prepared dataset loading, fix inference  (Wing Lian, 2023-04-15 12:14:52 -04:00)
937f44f021  helpful info output  (Wing Lian, 2023-04-15 00:03:43 -04:00)
902dd0ab47  fix issue with completed model being empty (see https://github.com/huggingface/peft/issues/286#issuecomment-1501617281)  (Wing Lian, 2023-04-14 23:57:55 -04:00)
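The linked PEFT issue (#286) concerns LoRA adapter checkpoints that save as nearly empty files. The workaround circulating in that thread swaps the model's state_dict for a PEFT-aware one before saving, roughly as in this sketch:

```python
from peft import PeftModel, get_peft_model_state_dict


def patch_state_dict_for_saving(model: PeftModel) -> PeftModel:
    # Make model.state_dict() return only the adapter weights, so that
    # checkpointing writes a non-empty adapter_model.bin.
    old_state_dict = model.state_dict
    model.state_dict = (
        lambda self, *_, **__: get_peft_model_state_dict(self, old_state_dict())
    ).__get__(model, type(model))
    return model
```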
80b2ed29d8  various bugfixes  (Wing Lian, 2023-04-14 21:37:07 -04:00)
45f77dd51e  better handling of llama model import  (Wing Lian, 2023-04-14 19:30:41 -04:00)
949a27be21  more fixes and prep for llama training  (Wing Lian, 2023-04-14 18:30:09 -04:00)
f2a2029d0d  config chooser, update readme instructions, device config, llama flash attention, debug out the labels, fix config key checks, other bugfixes  (Wing Lian, 2023-04-14 12:18:56 -04:00)
a6028d302e  black formatting  (Wing Lian, 2023-04-14 07:25:52 -04:00)
8d959a7e26  make it work with pythia in the cloud  (Wing Lian, 2023-04-14 07:24:55 -04:00)
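"make it work with pythia in the cloud" refers to EleutherAI's Pythia models (GPT-NeoX architecture), which load through the standard transformers auto classes; for example, with an illustrative checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-1.4b-deduped"  # illustrative Pythia checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```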
ce24f5e246  WIP for axolotl trainer  (Wing Lian, 2023-04-14 00:20:05 -04:00)