axolotl

Author	SHA1	Message	Date
Wing Lian	75e4fc2825	wip more tp fixes	2023-11-01 01:45:36 -04:00
Wing Lian	e13c2fd6b1	getting better	2023-10-31 22:23:40 -04:00
Wing Lian	8a21e14a21	load to cpu first	2023-10-31 22:23:15 -04:00
Wing Lian	9c52a83403	load model faster w low_cpu_mem_usage	2023-10-31 22:23:15 -04:00
Wing Lian	fb8ee37ca6	wip tp	2023-10-31 22:23:14 -04:00
Wing Lian	65f3a4f703	tensor-parallel support	2023-10-31 22:21:40 -04:00
NanoCode012	10388a8daf	fix(tokenizer): update log order after update (#806 )	2023-10-31 13:21:20 +09:00
NanoCode012	637ed095a0	fix(config): Set eos/bos to tokenizer if different (#801 ) * fix(config): Set eos/bos to tokenizer if different * chore: fix lint	2023-10-29 21:32:37 +09:00
Wing Lian	827ec3d274	refactor neft patch to be more re-usable similar to trl's impl (#796 )	2023-10-29 04:33:13 -04:00
NanoCode012	11d1d607db	chore: refactor truthy check and fix mypy (#780 )	2023-10-24 12:28:40 +09:00
Casper	15d3a654bf	Implement fused modules (#747 ) * MLP: Memory saving * Remove RMSNorm restrictions * Map packed weights to original * FusedAttention module * Simplify code * Move fused modules * Fix critical typo * Split inplace * Add FFT config * Add validation of fused arguments * Add fused arguments to config * Update docs * Fix validation logic * Add fused modules to flash attn * Only fuse during training * Remove timing * Formatting * Formatting * Formatting * chore: lint * chore: lint * add e2e tests for fused llama * no lora for tests --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2023-10-21 16:08:25 -04:00
NanoCode012	440c3ab527	Fix(model): Linear detected and added to target module with rope linear (#738 ) * Fix(model): Linear detected and added to target module with rope linear * fix: exclude layer instead	2023-10-18 22:13:20 -04:00
Maxime	3bd9528390	add noisy embedding (#721 ) * add noisy embedding * fix format * Update README.md * Update README.md * linter issues * caseus fixes --------- Co-authored-by: Maxime <maxime@nope.no>	2023-10-13 10:00:42 -04:00
NanoCode012	669f1d052c	Fix: Higher vram usage for mistral and sample_packing (#691 ) * Fix: Higher vram usage for mistral and sample_packing * chore: update comment * chore: lint	2023-10-06 12:33:43 -04:00
Wing Lian	2d60ba3a6e	flash_attention + sample packing for stablelm 3b (#671 ) * stablelm epoch fa patch * is causal for fa * working stablelm fa w packing * chore: pre-commit linting	2023-10-05 16:03:43 -04:00
NanoCode012	eb480dfd68	Fix: ValueError when FA + Mistral when padding_side=right (#681 ) * Fix: ValueError when FA + Mistral when padding_side=right * fix: remove tokenizer class check	2023-10-06 04:12:54 +09:00
NanoCode012	e0b7eeabfd	Fix(tokenizer): Set rstrip,lstrip,norm to False (#678 )	2023-10-06 03:50:49 +09:00
NanoCode012	e62d5901b5	chore: Clean up repetitive model kwargs (#670 )	2023-10-04 20:41:26 +09:00
NanoCode012	697c50d408	Feat: Allow usage of native Mistral FA when no sample_packing (#669 ) * Allow usage of native Mistral FA when no sample_packing * fix: do not apply custom patch when sample_pack off * chore: lint * chore: pin transformer to v4.35.0.dev0 * fix: split sample_packing to separate test	2023-10-04 20:40:47 +09:00
Wing Lian	f34648c8b9	remove patch fix for phi (#664 )	2023-10-02 21:07:41 -04:00
Wing Lian	b6ab8aad62	Mistral flash attn packing (#646 ) * add mistral monkeypatch * add arg for decoder attention masl * fix lint for duplicate code * make sure to update transformers too * tweak install for e2e * move mistral patch to conditional	2023-09-27 18:41:00 -04:00
Wing Lian	895f0a0723	skip some flash attn patches unless explicitly enabled (#643 ) * skip some flash attn patches if explicitly disabled * make the other patches optional	2023-09-27 12:11:07 -04:00
NanoCode012	19a600a8b8	Feat: Add support for upstream FA2 (#626 ) * Feat: Add support for upstream FA2 * chore: add is_falcon_derived_model: true to examples * chore: add config to readme for documentation * feat: add extra model types * fix: remove old falcon flash patch * chore: pin transformers and accelerate	2023-09-26 09:53:28 -04:00
Wing Lian	03e59077a0	misc fixes to add gptq tests (#621 ) * misc fixes to add gptq tests * set bf16 needed for fa2	2023-09-21 21:52:12 -04:00
Wing Lian	faecff9798	support to disable exllama for gptq (#604 ) * support to disable exllama for gptq * update property instead of item * fix config key	2023-09-19 17:51:08 -04:00
bofeng huang	aa656e04bd	Delete duplicate lines (#606 )	2023-09-19 16:40:05 -04:00
Wing Lian	6b9b229356	btlm and falcon monkey patches for flash attn (#566 )	2023-09-17 13:49:18 -04:00
Wing Lian	62eaee7649	make phi training work with Loras (#588 ) * valdiation for phi loras * fix model config class check * update readme for phi traiing	2023-09-15 20:51:55 -04:00
Wing Lian	360788296a	don't resize embeddings if it's already large enough (#577 ) * don't resize embeddings if it's already large enough * make sure to tie weights, even if we aren't resizing	2023-09-15 15:47:09 -04:00
Wing Lian	12a2dbbc2c	Support Sample packing for phi arch (#586 ) * phi sequence packing * sample packing fixes * fix linting * fix inference and phi e2e tests * update phi example now that sample packing works * wandb import keeps getting moved around	2023-09-15 15:46:54 -04:00
Glavin Wiechert	5b67ea98a6	Add training callback to send predictions to WandB table (#521 ) * WIP Add training callback to send predictions to WandB table * WIP improve wandb table reporting callback * WIP improve wandb table reporting callback (cont) * Add VSCode launching for debugging * Add tiny llama example * WIP attempt to improve post-eval prediction generation for table * WIP attempt to improve post-eval prediction generation for table - part 2 * WIP batch generation * WIP attempt to handle sample_packing using position_ids for wandb prediction table * WIP add code for debugging * Fix sample_packing support for wandb prediction table * Clean up code for PR review * Add eval_table_size, eval_table_max_new_tokens configs & clean up code * Clean up PR, delete VSCode config, add tiny-llama example * Add eval_table_size, eval_table_max_new_tokens documentation. Fix linting/formatting	2023-09-13 09:51:08 -04:00
Wing Lian	a94f9cb99e	fix for quant config from model (#540 )	2023-09-10 12:40:52 -04:00
Wing Lian	3355706e22	Add support for GPTQ using native transformers/peft (#468 ) * auto gptq support * more tweaks and add yml * remove old gptq docker * don't need explicit peft install for tests * fix setup.py to use extra index url install torch for tests fix cuda version for autogptq index set torch in requirements so that it installs properly move gptq install around to work with github cicd * gptq doesn't play well with sample packing * address pr feedback * remove torch install for now * set quantization_config from model config * Fix the implementation for getting quant config from model config	2023-09-05 12:43:22 -04:00
Maxime	1991946c5a	fix: bad dtype for full finetune (#504 ) * fix: bad dtype for full finetune * Update src/axolotl/utils/models.py Co-authored-by: Wing Lian <wing.lian@gmail.com> * Update models.py --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2023-09-01 07:11:45 -07:00
Wing Lian	125cccb786	Refactor train cfg cli (#499 ) * wip to cleanup cfg cli options * fix launcher * fix cli args	2023-08-29 05:37:53 -07:00
Aman Karmani	267b7b24e5	simplify linear layer locator	2023-08-28 09:45:16 -04:00
Wing Lian	98bf76e236	fsdp requires params be the same type too (#493 )	2023-08-28 04:33:50 -04:00
NanoCode012	4c37bd0b54	Fix(tokenizer): Make sure to add pad for CodeLlamaTokenizer (#489 )	2023-08-28 09:39:10 +09:00
Aman Karmani	3a011ea1ef	fix condition and add logging	2023-08-27 20:09:26 +00:00
Aman Karmani	f319b0bc67	rename var and reformat	2023-08-27 19:55:11 +00:00
Maxime	7fd662dd89	Update src/axolotl/utils/models.py Co-authored-by: Aman Gupta Karmani <aman@tmm1.net>	2023-08-27 21:01:43 +02:00
Maxime	9e699683d7	Update src/axolotl/utils/models.py Co-authored-by: Aman Gupta Karmani <aman@tmm1.net>	2023-08-27 21:01:37 +02:00
Maxime	d03887fad5	ignore: address pr review	2023-08-26 22:45:45 +02:00
Maxime	a184549e4c	ignore: linter	2023-08-26 22:36:14 +02:00
Maxime	f311df9462	fix: finetune model inference needs the dtype fix to work with flash-attn	2023-08-26 22:34:11 +02:00
Wing Lian	0b7ba57ec4	fix types w lora (#478 )	2023-08-25 02:03:24 -04:00
NanoCode012	71bd06243c	Fix(tokenizer): Fix condition to add pad token (#477 ) * Fix(tokenizer): Fix condition to add pad token * chore: fix lint	2023-08-25 14:30:50 +09:00
Wing Lian	cb9797ef5a	improve llama pad token handling (#475 ) * improve llama pad token handling * tweak logic to not clobber	2023-08-24 13:20:35 -04:00
Wing Lian	96deb6bd67	recast loralayer, norm, lmhead + embed token weights per original qlora (#393 ) * recast loralayer, norm, lmhead + embed token weights per original qlora * try again for the fix * refactor torch dtype picking * linter fixes * missing import for LoraLayer * fix install for tests now that peft is involved	2023-08-21 18:41:12 -04:00
Wing Lian	ee262818ef	fix evals (#447 )	2023-08-20 23:39:42 -04:00

176 Commits