Wing Lian
8002ffb41f
Merge pull request #177 from NanoCode012/fix/landmark-patch
...
Fix landmark attention patch
2023-06-12 08:27:12 -04:00
Wing Lian
74ef5cc083
Merge pull request #192 from OpenAccess-AI-Collective/sharegpt-custom-prompt
...
misc fixes
2023-06-12 08:26:38 -04:00
Wing Lian
5e616d91c0
Merge branch 'main' into strip-peft-device-map
2023-06-12 08:25:54 -04:00
NanoCode012
8e568bbdae
Merge pull request #159 from AngainorDev/patch-1
...
Fix training over existing lora
2023-06-12 20:27:11 +09:00
Wing Lian
c7dee56b87
add typehints
2023-06-11 19:52:34 -04:00
Wing Lian
aac4b7691e
add new sharegpt, refactor prompt so it can be customized later, add exception if no data is processed
2023-06-11 19:42:25 -04:00
Wing Lian
c9a149f9e8
add check for attr
2023-06-11 10:11:17 -04:00
Wing Lian
14668fa54e
new validation for mpt w grad checkpoints
2023-06-11 09:26:10 -04:00
AngainorDev
b565ecf0a1
Fix strict and Lint
2023-06-11 15:23:38 +02:00
Wing Lian
fe0b76854e
match up gradient checkpointing when using lora w config
2023-06-11 09:20:40 -04:00
NanoCode012
974dc00a7d
Fix set mem_id for inference and refactor
2023-06-11 14:00:54 +09:00
NanoCode012
a6190c8094
Clean up landmark patching
2023-06-11 11:59:03 +09:00
NanoCode012
563b6d89e6
Fix undefined LlamaForCausalLM and del try except
2023-06-11 11:58:31 +09:00
Wing Lian
cd0a6f6027
peft no longer needs device_map
2023-06-10 22:50:09 -04:00
NanoCode012
e285e24f7f
Address PR suggestion
...
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-06-11 10:52:12 +09:00
NanoCode012
919727b4d7
Refactor landmark attention patch
2023-06-11 10:51:05 +09:00
Wing Lian
958da70376
fix formatting
2023-06-10 15:28:08 -04:00
Angainor Development
a808bf913f
Fix missing cfg.
2023-06-10 20:28:49 +02:00
Wing Lian
01248253a3
Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref
...
fix for local variable 'LlamaForCausalLM' referenced before assignment
2023-06-10 14:25:51 -04:00
Wing Lian
0c6f928601
address PR feedback
2023-06-10 14:23:56 -04:00
Wing Lian
eea2731a5e
add streaming dataset support for pretraining datasets
2023-06-10 14:23:56 -04:00
Wing Lian
ab5cd28acf
more gpt-neox long ctx fixes
2023-06-10 14:23:55 -04:00
Wing Lian
1a82082e91
fix bettertransformers save, force it to skip after saving correctly in callback
2023-06-10 14:23:55 -04:00
Wing Lian
1210dc8fd5
more tweaks to do pre-training with bettertransformers
2023-06-10 14:23:55 -04:00
Wing Lian
488a67d75a
experimental expansion of ctx len
2023-06-10 14:23:53 -04:00
Wing Lian
71a43f8479
add validation/warning for bettertransformers and torch version
2023-06-10 14:22:31 -04:00
Wing Lian
1edc30c786
add support for optimum bettertransformers
2023-06-10 14:22:30 -04:00
Wing Lian
14163c15d9
fix for local variable 'LlamaForCausalLM' referenced before assignment
2023-06-10 14:11:13 -04:00
Angainor Development
79e2a6f140
Merge branch 'main' into patch-1
2023-06-10 19:07:54 +02:00
Wing Lian
a03a7d7d8b
add support to extend context with xpos rope
2023-06-10 10:29:46 -04:00
Wing Lian
7f09106437
fix for max sequence len across different model types
2023-06-09 20:42:33 -04:00
NanoCode012
aefb2fc681
Fix backward compat for peft
2023-06-10 07:46:36 +09:00
Angainor Development
813cfa4c14
WIP: Rely on cfg.inference
2023-06-09 08:49:32 +02:00
NanoCode012
2a801b001a
Fix grad checkpoint and outputs param
2023-06-09 14:28:44 +09:00
NanoCode012
e44c9e0b3e
Fix patching via import instead of hijacking
2023-06-09 14:27:24 +09:00
NanoCode012
55b8542de8
Feat: Add landmark attention
2023-06-09 12:54:08 +09:00
Bruno Cabral
f4df266842
Disable Wandb
2023-06-08 21:02:02 -03:00
NanoCode012
2ef4634d45
Refactor out unmodified save_steps and eval_steps
2023-06-09 01:23:13 +09:00
NanoCode012
2cfe9e9b16
Set to use cfg.seed or 42 for backward compat
2023-06-09 01:02:36 +09:00
NanoCode012
bfd27ba55e
Fix failing test
2023-06-09 00:35:03 +09:00
NanoCode012
babf0fdb71
Validate falcon with fsdp
2023-06-09 00:29:04 +09:00
NanoCode012
df9528f865
Fix future deprecation of prepare_model_for_int8_training
2023-06-08 21:42:10 +09:00
Angainor Development
193c73bce0
Fix training over existing lora
...
When training with LoRA and starting from existing LoRA weights, the current code produces a model with 0 trainable params, so training can't work.
Adding the "is_trainable" param allows the loaded PEFT model to be trained and fixes the bug.
2023-06-08 09:18:58 +02:00
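A minimal sketch of the fix described in this commit, assuming peft's PeftModel API; base_model and the adapter directory path are hypothetical placeholders, not values from the commit:

    # Load existing LoRA adapter weights for continued training.
    # Without is_trainable=True, peft loads the adapter in inference mode
    # and the resulting model reports 0 trainable parameters.
    from peft import PeftModel

    model = PeftModel.from_pretrained(
        base_model,                  # assumed: an already-loaded base transformers model
        "path/to/existing/lora",     # hypothetical adapter directory
        is_trainable=True,           # the fix: keep adapter params trainable
    )
    model.print_trainable_parameters()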
Wing Lian
59bb2197ed
fix camel ai, add guanaco/oasst mapping for sharegpt
2023-06-07 09:51:29 -04:00
Wing Lian
4ac9e251b7
new prompters, misc fixes for output dir missing using fsdp, and changing max seq len
2023-06-05 22:41:00 -04:00
NanoCode012
3c71c8debe
Update doc for grad_accu and add validation tests for batch size
2023-06-01 06:13:47 +09:00
Wing Lian
5a631b305b
fix batch size calculation
2023-05-31 14:11:32 -04:00
Wing Lian
9b8585dc70
fix packing so that concatenated sequences reset the attention
2023-05-31 11:38:52 -04:00
Wing Lian
2d0ba3b818
Merge pull request #124 from OpenAccess-AI-Collective/xformers-fix
...
copy xformers attn from ooba since we removed dep on alpaca_lora_4bit
2023-05-31 00:11:40 -04:00
Wing Lian
c7021e191f
Merge pull request #120 from OpenAccess-AI-Collective/model-from-path
...
split up llama model loading so config can be loaded from base config and models can be loaded from a path
2023-05-31 00:08:38 -04:00