Wing Lian | 8d20e0a3d3 | initial wip to get sys prompt from dataset | 2023-06-25 22:28:07 -04:00
Wing Lian | 47d601fa23 | optionally define whether to use_fast tokenizer | 2023-06-25 10:19:49 -04:00
Wing Lian | cb9d3af5c0 | add validation and tests for adamw hyperparam | 2023-06-15 09:39:42 -04:00
Wing Lian | 6d0ee4ba34 | support adamw and grad norm hyperparams | 2023-06-15 08:40:41 -04:00
Wing Lian | a81f52d575 | Merge pull request #212 from OpenAccess-AI-Collective/doc-20230615-v1 (add float16 docs and tweak typehints) | 2023-06-15 08:28:57 -04:00
Wing Lian | 1925eaf1e6 | Merge pull request #214 from OpenAccess-AI-Collective/fix-tokenizing-labels (Fix tokenizing labels) | 2023-06-15 08:13:43 -04:00
Wing Lian | 88e17ffc50 | add float16 docs and tweak typehints | 2023-06-15 02:05:31 -04:00
Wing Lian | 7925ddce86 | bugfix for potential off by one | 2023-06-15 01:59:33 -04:00
maciej.karasek | 136522f9c9 | style correction | 2023-06-14 20:02:09 +02:00
maciej.karasek | 556fe408b3 | issue #205 bugfix | 2023-06-14 16:59:57 +02:00
Wing Lian | 4b43a66a0b | update alpaca_chat prompts for instructions to explain the conversation | 2023-06-12 18:38:38 -04:00
Wing Lian | fd2c9814c9 | Merge branch 'main' into flash-optimum | 2023-06-12 13:12:15 -04:00
Wing Lian | 93dacba228 | Merge pull request #187 from OpenAccess-AI-Collective/strip-peft-device-map (peft no longer needs device_map) | 2023-06-12 09:10:49 -04:00
Wing Lian | 8002ffb41f | Merge pull request #177 from NanoCode012/fix/landmark-patch (Fix landmark attention patch) | 2023-06-12 08:27:12 -04:00
Wing Lian | 74ef5cc083 | Merge pull request #192 from OpenAccess-AI-Collective/sharegpt-custom-prompt (misc fixes) | 2023-06-12 08:26:38 -04:00
Wing Lian | 5e616d91c0 | Merge branch 'main' into strip-peft-device-map | 2023-06-12 08:25:54 -04:00
NanoCode012 | 8e568bbdae | Merge pull request #159 from AngainorDev/patch-1 (Fix training over existing lora) | 2023-06-12 20:27:11 +09:00
Wing Lian | c7dee56b87 | add typehints | 2023-06-11 19:52:34 -04:00
Wing Lian | aac4b7691e | add new sharegpt, refactor prompt so it can be customized later, add exception if no data is processed | 2023-06-11 19:42:25 -04:00
Wing Lian | c9a149f9e8 | add check for attr | 2023-06-11 10:11:17 -04:00
Wing Lian | 14668fa54e | new validation for mpt w grad checkpoints | 2023-06-11 09:26:10 -04:00
AngainorDev | b565ecf0a1 | Fix strict and Lint | 2023-06-11 15:23:38 +02:00
Wing Lian | fe0b76854e | match up gradient checkpointing when using lora w config | 2023-06-11 09:20:40 -04:00
NanoCode012 | 974dc00a7d | Fix set mem_id for inference and refactor | 2023-06-11 14:00:54 +09:00
NanoCode012 | a6190c8094 | Clean up landmark patching | 2023-06-11 11:59:03 +09:00
NanoCode012 | 563b6d89e6 | Fix undefined LlamaForCausalLM and del try except | 2023-06-11 11:58:31 +09:00
Wing Lian | cd0a6f6027 | peft no longer needs device_map | 2023-06-10 22:50:09 -04:00
NanoCode012 | e285e24f7f | Address PR suggestion (Co-authored-by: Wing Lian <wing.lian@gmail.com>) | 2023-06-11 10:52:12 +09:00
NanoCode012 | 919727b4d7 | Refactor landmark attention patch | 2023-06-11 10:51:05 +09:00
Wing Lian | 958da70376 | fix formatting | 2023-06-10 15:28:08 -04:00
Angainor Development | a808bf913f | Fix missing cfg. | 2023-06-10 20:28:49 +02:00
Wing Lian | 01248253a3 | Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref (fix for local variable 'LlamaForCausalLM' referenced before assignment) | 2023-06-10 14:25:51 -04:00
Wing Lian | 0c6f928601 | address PR feedback | 2023-06-10 14:23:56 -04:00
Wing Lian | eea2731a5e | add streaming dataset support for pretraining datasets | 2023-06-10 14:23:56 -04:00
Wing Lian | ab5cd28acf | more gpt-neox long ctx fixes | 2023-06-10 14:23:55 -04:00
Wing Lian | 1a82082e91 | fix bettertransformers save, force it to skip after saving correctly in callback | 2023-06-10 14:23:55 -04:00
Wing Lian | 1210dc8fd5 | more tweaks to do pre-training with bettertransformers | 2023-06-10 14:23:55 -04:00
Wing Lian | 488a67d75a | experimental expansion of ctx len | 2023-06-10 14:23:53 -04:00
Wing Lian | 71a43f8479 | add validation/warning for bettertransformers and torch version | 2023-06-10 14:22:31 -04:00
Wing Lian | 1edc30c786 | add support for optimum bettertransformers | 2023-06-10 14:22:30 -04:00
Wing Lian | 14163c15d9 | fix for local variable 'LlamaForCausalLM' referenced before assignment | 2023-06-10 14:11:13 -04:00
Angainor Development | 79e2a6f140 | Merge branch 'main' into patch-1 | 2023-06-10 19:07:54 +02:00
Wing Lian | a03a7d7d8b | add support to extend context with xpos rope | 2023-06-10 10:29:46 -04:00
Wing Lian | 7f09106437 | fix for max sequence len across different model types | 2023-06-09 20:42:33 -04:00
NanoCode012 | aefb2fc681 | Fix backward compat for peft | 2023-06-10 07:46:36 +09:00
Angainor Development | 813cfa4c14 | WIP: Rely on cfg.inference | 2023-06-09 08:49:32 +02:00
NanoCode012 | 2a801b001a | Fix grad checkpoint and outputs param | 2023-06-09 14:28:44 +09:00
NanoCode012 | e44c9e0b3e | Fix patching via import instead of hijacking | 2023-06-09 14:27:24 +09:00
NanoCode012 | 55b8542de8 | Feat: Add landmark attention | 2023-06-09 12:54:08 +09:00
Bruno Cabral | f4df266842 | Disable Wandb | 2023-06-08 21:02:02 -03:00