Wing Lian
01248253a3
Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref
...
fix for local variable 'LlamaForCausalLM' referenced before assignment
2023-06-10 14:25:51 -04:00
Wing Lian
0c6f928601
address PR feedback
2023-06-10 14:23:56 -04:00
Wing Lian
eea2731a5e
add streaming dataset support for pretraining datasets
2023-06-10 14:23:56 -04:00
Wing Lian
ab5cd28acf
more gpt-neox long ctx fixes
2023-06-10 14:23:55 -04:00
Wing Lian
1a82082e91
fix bettertransformers save, force it to skip after saving correctly in callback
2023-06-10 14:23:55 -04:00
Wing Lian
1210dc8fd5
more tweaks to do pre-training with bettertransformers
2023-06-10 14:23:55 -04:00
Wing Lian
488a67d75a
experimental expansion of ctx len
2023-06-10 14:23:53 -04:00
Wing Lian
71a43f8479
add validation/warning for bettertransformers and torch version
2023-06-10 14:22:31 -04:00
Wing Lian
1edc30c786
add support for optimum bettertransformers
2023-06-10 14:22:30 -04:00
Wing Lian
14163c15d9
fix for local variable 'LlamaForCausalLM' referenced before assignment
2023-06-10 14:11:13 -04:00
Angainor Development
79e2a6f140
Merge branch 'main' into patch-1
2023-06-10 19:07:54 +02:00
Wing Lian
a03a7d7d8b
add support to extend context with xpos rope
2023-06-10 10:29:46 -04:00
Wing Lian
7f09106437
fix for max sequence len across different model types
2023-06-09 20:42:33 -04:00
NanoCode012
aefb2fc681
Fix backward compat for peft
2023-06-10 07:46:36 +09:00
Angainor Development
813cfa4c14
WIP: Rely on cfg.inference
2023-06-09 08:49:32 +02:00
NanoCode012
2a801b001a
Fix grad checkpoint and outputs param
2023-06-09 14:28:44 +09:00
NanoCode012
e44c9e0b3e
Fix patching via import instead of hijacking
2023-06-09 14:27:24 +09:00
NanoCode012
55b8542de8
Feat: Add landmark attention
2023-06-09 12:54:08 +09:00
Bruno Cabral
f4df266842
Disable Wandb
2023-06-08 21:02:02 -03:00
NanoCode012
2ef4634d45
Refactor out unmodified save_steps and eval_steps
2023-06-09 01:23:13 +09:00
NanoCode012
2cfe9e9b16
Set to use cfg.seed or 42 for backward compat
2023-06-09 01:02:36 +09:00
NanoCode012
bfd27ba55e
Fix failing test
2023-06-09 00:35:03 +09:00
NanoCode012
babf0fdb71
Validate falcon with fsdp
2023-06-09 00:29:04 +09:00
NanoCode012
df9528f865
Fix future deprecation of prepare_model_for_int8_training
2023-06-08 21:42:10 +09:00
Angainor Development
193c73bce0
Fix training over existing lora
...
When training with LoRA and starting from existing LoRA weights, the current code produces a model with 0 trainable params, so training can't work.
Adding the "is_trainable" param allows the loaded PEFT model to be trained and fixes the bug.
2023-06-08 09:18:58 +02:00
Wing Lian
59bb2197ed
fix camel ai, add guanaco/oasst mapping for sharegpt
2023-06-07 09:51:29 -04:00
Wing Lian
4ac9e251b7
new prompters, misc fixes for output dir missing using fsdp, and changing max seq len
2023-06-05 22:41:00 -04:00
NanoCode012
3c71c8debe
Update doc for grad_accu and add validation tests for batch size
2023-06-01 06:13:47 +09:00
Wing Lian
5a631b305b
fix batch size calculation
2023-05-31 14:11:32 -04:00
Wing Lian
9b8585dc70
fix packing so that concatenated sequences reset the attention
2023-05-31 11:38:52 -04:00
Wing Lian
2d0ba3b818
Merge pull request #124 from OpenAccess-AI-Collective/xformers-fix
...
copy xformers attn from ooba since we removed dep on alpaca_lora_4bit
2023-05-31 00:11:40 -04:00
Wing Lian
c7021e191f
Merge pull request #120 from OpenAccess-AI-Collective/model-from-path
...
split up llama model loading so config can be loaded from base config and models can be loaded from a path
2023-05-31 00:08:38 -04:00
Wing Lian
c56818b119
don't worry about dupes
2023-05-31 00:06:47 -04:00
Wing Lian
1076bcbbca
Update src/axolotl/monkeypatch/llama_attn_hijack_xformers.py
...
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-05-31 00:00:19 -04:00
Wing Lian
2daa6835f0
Update src/axolotl/monkeypatch/llama_attn_hijack_xformers.py
...
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-05-30 23:59:05 -04:00
Wing Lian
e3c494ca7b
remove unused import and update readme
2023-05-30 23:55:45 -04:00
Wing Lian
ad0ea6aaab
black formatting
...
ignore copied file
fix linting
2023-05-30 23:50:29 -04:00
Wing Lian
6cb2310592
copy xformers attn from ooba since we removed dep on alpaca_lora_4bit
2023-05-30 23:34:36 -04:00
Wing Lian
3aad5f3b3e
add support for gradient accumulation steps
2023-05-30 23:24:37 -04:00
Wing Lian
39a208c2bc
fix up tokenizer config, isort fix
2023-05-30 23:00:02 -04:00
Wing Lian
2520ecd6df
split up llama model loading so config can be loaded from base config and models can be loaded from a path
2023-05-30 22:32:44 -04:00
NanoCode012
594e72b6e8
Fix incorrect rebase
2023-05-31 02:58:50 +09:00
NanoCode012
25eeeeba0b
Fix sharegpt prompt
2023-05-31 02:55:21 +09:00
Wing Lian
cfcc549f6b
fix relative path for fixtures
2023-05-31 02:55:21 +09:00
NanoCode012
a1f9850b91
Fix security issue or ignore false positives
2023-05-31 02:53:53 +09:00
NanoCode012
c17dae6d07
Update src/axolotl/prompt_strategies/alpaca_instruct.py
...
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-05-31 02:53:53 +09:00
NanoCode012
37293dce07
Apply isort then black
2023-05-31 02:53:53 +09:00
NanoCode012
e9650d3ae4
Fix mypy typing
2023-05-31 02:53:53 +09:00
NanoCode012
be22551435
Fix unsupported operand type(s) for |
2023-05-31 02:53:53 +09:00
NanoCode012
b832a0ac62
Black formatting
2023-05-31 02:53:53 +09:00