Commit Graph

68 Commits

Author SHA1 Message Date
mhenrichsen
4fecbfe5e1 default model changed 2023-09-24 18:52:53 +02:00
Wing Lian
faecff9798 support to disable exllama for gptq (#604)
* support to disable exllama for gptq

* update property instead of item

* fix config key
2023-09-19 17:51:08 -04:00
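A minimal YAML sketch of what this toggle could look like in an axolotl config. The `gptq_disable_exllama` key name is an assumption inferred from the PR title (and the "fix config key" bullet), so check the merged schema before relying on it.

```yaml
# Sketch only: key name inferred from PR #604, not verified against the merged schema.
gptq: true
gptq_disable_exllama: true   # skip the exllama kernels and fall back to the stock CUDA path
```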
Wing Lian
674c57692d more sane defaults for openllama 3b used for quickstarts (#602)
* more sane defaults for openllama 3b used for quickstarts

* don't use bf16 for quickstart to simplify gpu compatibility

* use the updated openlm-research/open_llama_3b_v2 models
2023-09-19 09:15:10 -04:00
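Illustrative quickstart settings reflecting the two changes named above (the v2 OpenLLaMA checkpoint, and avoiding bf16 for broader GPU compatibility). These values are a sketch, not the exact contents of the committed example.

```yaml
base_model: openlm-research/open_llama_3b_v2
bf16: false   # bf16 needs Ampere or newer; fp16 keeps the quickstart usable on older GPUs
fp16: true
```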
Wing Lian
6b9b229356 btlm and falcon monkey patches for flash attn (#566) 2023-09-17 13:49:18 -04:00
Wing Lian
62eaee7649 make phi training work with Loras (#588)
* validation for phi loras

* fix model config class check

* update readme for phi training
2023-09-15 20:51:55 -04:00
Wing Lian
12a2dbbc2c Support Sample packing for phi arch (#586)
* phi sequence packing

* sample packing fixes

* fix linting

* fix inference and phi e2e tests

* update phi example now that sample packing works

* wandb import keeps getting moved around
2023-09-15 15:46:54 -04:00
Doan Minh Phuong
1aa400721e Fix Codellama examples (#582)
* Fix seq_len

* Update lora.yml

* Update qlora.yml

* Update lora.yml

* Update lora.yml

* Update qlora.yml
2023-09-15 04:19:13 -04:00
Wing Lian
228420972e Phi examples (#569)
* add phi full ft example

* Add readme to point out that deepspeed should be used

* zero1 is better than zero2 for phi
2023-09-14 11:17:47 -04:00
Glavin Wiechert
5b67ea98a6 Add training callback to send predictions to WandB table (#521)
* WIP Add training callback to send predictions to WandB table

* WIP improve wandb table reporting callback

* WIP improve wandb table reporting callback (cont)

* Add VSCode launching for debugging

* Add tiny llama example

* WIP attempt to improve post-eval prediction generation for table

* WIP attempt to improve post-eval prediction generation for table - part 2

* WIP batch generation

* WIP attempt to handle sample_packing using position_ids for wandb prediction table

* WIP add code for debugging

* Fix sample_packing support for wandb prediction table

* Clean up code for PR review

* Add eval_table_size, eval_table_max_new_tokens configs & clean up code

* Clean up PR, delete VSCode config, add tiny-llama example

* Add eval_table_size, eval_table_max_new_tokens documentation. Fix linting/formatting
2023-09-13 09:51:08 -04:00
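The two options added by this PR, shown with illustrative values (the specific numbers here are assumptions, not the defaults documented in the PR):

```yaml
eval_table_size: 5              # how many eval-set predictions to log to the WandB table (0 disables it)
eval_table_max_new_tokens: 128  # generation budget for each logged prediction
```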
Wing Lian
343714972b recommend padding when using sample packing (#531) 2023-09-06 17:00:21 -04:00
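In config terms, the recommendation pairs two existing options; a minimal sketch with a placeholder sequence length:

```yaml
sequence_len: 2048
sample_packing: true
pad_to_sequence_len: true   # pad packed batches out to sequence_len, as recommended here
```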
Wing Lian
3355706e22 Add support for GPTQ using native transformers/peft (#468)
* auto gptq support

* more tweaks and add yml

* remove old gptq docker

* don't need explicit peft install for tests

* fix setup.py to use extra index url

install torch for tests
fix cuda version for autogptq index
set torch in requirements so that it installs properly
move gptq install around to work with GitHub CI/CD

* gptq doesn't play well with sample packing

* address pr feedback

* remove torch install for now

* set quantization_config from model config

* Fix the implementation for getting quant config from model config
2023-09-05 12:43:22 -04:00
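A hedged sketch of a GPTQ training config under this PR. The checkpoint name is a placeholder, and `sample_packing` is disabled because one of the bullets notes GPTQ does not play well with it.

```yaml
base_model: TheBloke/Llama-2-7B-GPTQ   # placeholder: any GPTQ-quantized checkpoint
gptq: true
adapter: lora
sample_packing: false   # per the PR, GPTQ and sample packing do not mix
```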
Birch-san
8e197f6fb4 pad_to_worst_case_seq_len boolean, for testing memory limits (#498)
* pad_to_worst_case_seq_len boolean, for testing memory limits

* remove collator_pad_to_longest option since it does nothing

see docs: https://huggingface.co/docs/transformers/main_classes/data_collator#transformers.DataCollatorWithPadding.padding

True and "longest" mean the same thing

* rename to `pad_to_sequence_len`, and ensure 64 alignment

---------

Co-authored-by: Aman Karmani <aman@tmm1.net>
2023-08-28 18:47:16 -04:00
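After the rename, the memory-limit test described above reduces to a single flag (values illustrative):

```yaml
sequence_len: 4096
pad_to_sequence_len: true   # pad every batch to sequence_len so worst-case memory shows up immediately
```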
mhenrichsen
35130711d6 Feat(cfg): Add code-llama configs for all sizes (#479)
* configs for all sizes

* update tokenizer type

---------

Co-authored-by: mhenrichsen <some_email@hey.com>
2023-08-27 10:20:17 +09:00
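The tokenizer-type tweak in the second bullet likely looks like this in the size-specific configs (model id shown for the 7B variant; treat both keys' values as illustrative):

```yaml
base_model: codellama/CodeLlama-7b-hf
tokenizer_type: CodeLlamaTokenizer
```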
Charles O. Goddard
fe4d6baf92 Add example Llama 2 ReLoRA config (#471)
* Add example Llama 2 ReLoRA config

* Use adamw_bnb_8bit in example relora config
2023-08-27 10:08:34 +09:00
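A sketch of the ReLoRA-relevant keys: the step counts are placeholders rather than the values in the committed example, while the optimizer comes straight from the second bullet.

```yaml
adapter: lora
relora_steps: 150          # placeholder: how often to merge-and-reset the LoRA weights
relora_warmup_steps: 10    # placeholder
optimizer: adamw_bnb_8bit  # from the follow-up commit in this PR
```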
Wing Lian
cb9797ef5a improve llama pad token handling (#475)
* improve llama pad token handling

* tweak logic to not clobber
2023-08-24 13:20:35 -04:00
Wing Lian
1687be6a35 don't use mask expansion for inference (#392) 2023-08-14 20:52:54 -04:00
mhenrichsen
fdffef5940 new llama-2 default settings (#370)
* new default settings

* fix whitespace

* rm max packed sequence length

---------

Co-authored-by: Mads Henrichsen <mads@BrbartiendeMads.lan>
2023-08-14 17:39:09 +09:00
Morgan McGuire
7019509daa Add wandb_entity to wandb options, update example configs, update README (#361)
* Update wandb_entity and add wandb descriptions

* add wandb to config section

* remove trailing whitespace for pre-commit hook

* remove trailing whitespace for pre-commit hook

---------

Co-authored-by: Morgan McGuire <morganmcguire@Morgans-MacBook-Pro.local>
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-08-12 12:17:11 -04:00
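The option added here slots in next to the existing WandB keys; the names below are placeholders.

```yaml
wandb_project: my-project   # placeholder
wandb_entity: my-team       # new in this PR: the WandB team or username that owns the project
wandb_watch:
wandb_log_model:
```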
Aman Karmani
36fefcf94b set group_by_length to false in examples 2023-08-06 23:59:09 -07:00
mhenrichsen
dc71d8872a feat/llama-2 examples (#319)
* qlora llama-2

* qlora llama-2

* linting

* readme

* lora added

* linting

* change group_by_length

* 13b fitting on 24gb

* grouped lengths true

* add pad token

* change out dir

---------

Co-authored-by: Mads Henrichsen <mads@Brbar-tilhrende-Mads.local>
2023-08-03 19:22:48 +09:00
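Putting the bullets together, the QLoRA example presumably centers on keys like these (a sketch, not the committed file):

```yaml
base_model: meta-llama/Llama-2-7b-hf
adapter: qlora
load_in_4bit: true
group_by_length: true     # "grouped lengths true" per the bullets
special_tokens:
  pad_token: "<pad>"      # "add pad token"
```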
Ethan Smith
38811434e6 Add XGen info to README and example config 2023-07-21 00:44:50 -07:00
Steffen Röcker
945c4191a3 Use AutoTokenizer for redpajama example 2023-06-14 20:09:26 +02:00
Wing Lian
16bb6276a5 Merge pull request #92 from OpenAccess-AI-Collective/flash-optimum
add support for Optimum BetterTransformer
2023-06-14 07:50:15 -04:00
Wing Lian
fd2c9814c9 Merge branch 'main' into flash-optimum 2023-06-12 13:12:15 -04:00
Wing Lian
2ba4ae8f46 tweak config to work 2023-06-12 10:07:18 -04:00
Wing Lian
94f310c7a6 Merge pull request #193 from OpenAccess-AI-Collective/config-fixes-20230612
config fixes
2023-06-12 08:24:52 -04:00
NanoCode012
52cde69288 Fix config path after config moved 2023-06-12 17:06:15 +09:00
Wing Lian
9a58e99e81 config fixes 2023-06-12 01:52:58 -04:00
Wing Lian
6b3f509d9e forgot to add this file 2023-06-11 11:50:12 -04:00
Wing Lian
d0d7eaa4f3 update openllama and clean up paths 2023-06-11 11:03:31 -04:00
Wing Lian
effbbf6dd1 more pruning 2023-06-11 10:38:24 -04:00
Wing Lian
c530e4b9c8 more config pruning and migrating 2023-06-11 10:09:05 -04:00
Wing Lian
77762a5d6b get rid of some configs, formalize pythia lora config 2023-06-11 09:41:41 -04:00
Wing Lian
0c6f928601 address PR feedback 2023-06-10 14:23:56 -04:00
Wing Lian
1db46a9c72 linting fix 2023-06-10 14:23:56 -04:00
Wing Lian
39619028a3 use pythia-12b, neox-20b is flaky 2023-06-10 14:22:30 -04:00
NanoCode012
c8242de725 Merge pull request #132 from utensil/falcon-7b-qlora
Axolotl supports falcon + qlora
2023-06-09 01:14:03 +09:00
Utensil
79a8f52181 Trim trailing whitespace 2023-06-08 23:48:57 +08:00
Utensil
a52f4816b0 Default wandb_project to empty as suggested
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
2023-06-08 23:04:19 +08:00
Utensil
c9c050316f Default micro_batch_size to 1 for a safer start 2023-06-03 17:26:33 +08:00
Utensil
ca11ae9689 Add comments/alternatives for falcon-qlora configs 2023-06-03 15:04:02 +08:00
Utensil
fb3d40f197 falcon + qlora + xformers, micro_batch_size 40, gradient_accumulation_steps 2, on an A6000 2023-06-01 18:29:20 +08:00
Utensil
72bf8aafb6 Create config-7b-qlora.yml 2023-06-01 00:00:37 +08:00
Wing Lian
c2a0792680 swap batch size for gradient accumulation steps to decouple from the number of GPUs 2023-05-31 09:38:12 -04:00
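The decoupling works because the effective batch size is the product of the per-GPU micro batch, the accumulation steps, and the GPU count, so the examples pin the first two and let world size vary. Illustrative values:

```yaml
micro_batch_size: 2              # per-GPU batch
gradient_accumulation_steps: 4   # effective batch = 2 * 4 * num_gpus
```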
Wing Lian
4df9da74e3 Merge pull request #105 from viktoriussuwandi/viktoriussuwandi-patch
Viktoriussuwandi patch
2023-05-30 15:05:23 -04:00
Wing Lian
2531ea24c1 Merge pull request #106 from fearnworks/qlora-openllama-3b-example
Qlora openllama 3b example
2023-05-30 15:05:05 -04:00
NanoCode012
392dfd9b07 Lint and format 2023-05-31 02:53:22 +09:00
jphillips
6cee881d64 Update examples/qlora-openllama-3b/README.md
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2023-05-30 09:33:33 -05:00
jphillips
ac85c0ed36 Add Readme, Clean up comments 2023-05-29 14:35:58 -05:00
jphillips
370d057096 Add qlora-openllama-3b example 2023-05-29 09:07:46 -05:00