axolotl

Author	SHA1	Message	Date
Wing Lian	5f79b8242f	new evals_per_epoch and saves_per_epoch to make things cleaner (#944) * new evals_per_epoch and saves_per_epoch to make things cleaner * update per PR feedback	2023-12-12 15:35:23 -05:00
NanoCode012	a1da39cd48	Feat(wandb): Refactor to be more flexible (#767) * Feat: Update to handle wandb env better * chore: rename wandb_run_id to wandb_name * feat: add new recommendation and update config * fix: indent and pop disabled env if project passed * feat: test env set for wandb and recommendation * feat: update to use wandb_name and allow id * chore: add info to readme	2023-12-04 22:17:25 +09:00
Wing Lian	f544ab2bed	don't compile deepspeed or bitsandbytes from source (#837)	2023-11-08 19:49:55 -05:00
Wing Lian	8b79ff0e94	fix eval_steps to be a sane default (#797) * fix eval_steps to be a sane default * update docs for fractional eval_steps	2023-10-27 22:36:30 -04:00
Wing Lian	2d8def68dc	simplify by removing duplicate base_model_config (#772)	2023-10-23 01:42:38 -04:00
Wing Lian	e50a64e85e	prepared dataset caching, other misc fixes (#665) * prepared dataset caching, other misc fixes * also don't load from disk cache unless explicit	2023-10-02 21:07:24 -04:00
Wing Lian	d887ad86c3	eval_table isn't quite stable enough to be in default llama configs (#637)	2023-09-26 10:13:20 -04:00
mhenrichsen	4fecbfe5e1	default model changed	2023-09-24 18:52:53 +02:00
Glavin Wiechert	5b67ea98a6	Add training callback to send predictions to WandB table (#521) * WIP Add training callback to send predictions to WandB table * WIP improve wandb table reporting callback * WIP improve wandb table reporting callback (cont) * Add VSCode launching for debugging * Add tiny llama example * WIP attempt to improve post-eval prediction generation for table * WIP attempt to improve post-eval prediction generation for table - part 2 * WIP batch generation * WIP attempt to handle sample_packing using position_ids for wandb prediction table * WIP add code for debugging * Fix sample_packing support for wandb prediction table * Clean up code for PR review * Add eval_table_size, eval_table_max_new_tokens configs & clean up code * Clean up PR, delete VSCode config, add tiny-llama example * Add eval_table_size, eval_table_max_new_tokens documentation. Fix linting/formatting	2023-09-13 09:51:08 -04:00
Wing Lian	343714972b	recommend padding when using sample packing (#531)	2023-09-06 17:00:21 -04:00
Wing Lian	1687be6a35	don't use mask expansion for inference (#392)	2023-08-14 20:52:54 -04:00
mhenrichsen	fdffef5940	new llama-2 default settings (#370) * new default settings * fix whitespace * rm max packed sequence length --------- Co-authored-by: Mads Henrichsen <mads@BrbartiendeMads.lan>	2023-08-14 17:39:09 +09:00
Morgan McGuire	7019509daa	Add wandb_entity to wandb options, update example configs, update README (#361) * Update wandb_entity and add wandb descriptions * add wandb to config section * remove trailing whitespace for pre-commit hook * remove trailing whitespace for pre-commit hook --------- Co-authored-by: Morgan McGuire <morganmcguire@Morgans-MacBook-Pro.local> Co-authored-by: Wing Lian <wing.lian@gmail.com>	2023-08-12 12:17:11 -04:00
Aman Karmani	36fefcf94b	set group_by_length to false in examples	2023-08-06 23:59:09 -07:00
mhenrichsen	dc71d8872a	feat/llama-2 examples (#319) * qlora llama-2 * qlora llama-2 * linting * readme * lora added * linting * change group_by_length * 13b fitting on 24gb * grouped lengths true * add pad token * change out dir --------- Co-authored-by: Mads Henrichsen <mads@Brbar-tilhrende-Mads.local>	2023-08-03 19:22:48 +09:00

15 Commits