axolotl

Author	SHA1	Message	Date
Wing Lian	65f3a4f703	tensor-parallel support	2023-10-31 22:21:40 -04:00
NanoCode012	8966a6f566	chore: bump transformers to v4.34.1 to fix tokenizer issue (#745)	2023-10-19 20:18:22 -04:00
Wing Lian	bfbdba8614	pin xformers >= 0.0.22 (#724)	2023-10-13 10:27:56 -04:00
NanoCode012	43856c0a39	Fix(version): Update FA to work with Mistral SWA (#673)	2023-10-04 21:32:19 +09:00
NanoCode012	697c50d408	Feat: Allow usage of native Mistral FA when no sample_packing (#669) * Allow usage of native Mistral FA when no sample_packing * fix: do not apply custom patch when sample_pack off * chore: lint * chore: pin transformer to v4.35.0.dev0 * fix: split sample_packing to separate test	2023-10-04 20:40:47 +09:00
Napuh	a7e56d83c2	removed duplicate on requirements.txt (#661)	2023-10-02 08:40:05 -04:00
Wing Lian	5b0bc48fbc	add mistral e2e tests (#649) * mistral e2e tests * make sure to enable flash attention for the e2e tests * use latest transformers full sha * uninstall first	2023-09-29 00:22:40 -04:00
Wing Lian	b6ab8aad62	Mistral flash attn packing (#646) * add mistral monkeypatch * add arg for decoder attention masl * fix lint for duplicate code * make sure to update transformers too * tweak install for e2e * move mistral patch to conditional	2023-09-27 18:41:00 -04:00
Wing Lian	e7d3e2dbb6	use fastchat conversations template (#578) * use fastchat conversations template * require fastchat (fschat) pip install * handle roles dynamically from conversation * tweak fastchat conversation with a monkeypatch to get individual turns * fix up so it works with multiple conversation styles, and don't strip the turns * fix sharegpt fixture now that we're using a more correct tokenization * use a new prompter and support fastchat conversation type * use sharegpt from prompt strategies now * update docs, add chatml template * add a newline after im_end token * ensure we correctly set system message * update per PR feedback to handle deprecated sharegpt types * don't add duplicate wandb req * make sharegpt fields configurable from yml * llama2 fixes * don't fail fatally when turns are improper	2023-09-27 12:10:45 -04:00
NanoCode012	19a600a8b8	Feat: Add support for upstream FA2 (#626) * Feat: Add support for upstream FA2 * chore: add is_falcon_derived_model: true to examples * chore: add config to readme for documentation * feat: add extra model types * fix: remove old falcon flash patch * chore: pin transformers and accelerate	2023-09-26 09:53:28 -04:00
Wing Lian	c25ba7939b	update README w deepspeed info (#605)	2023-09-22 00:15:52 -04:00
Javier	ec0958f4f8	Update requirements.txt (#610)	2023-09-20 08:40:49 -04:00
Wing Lian	bf0804447c	fix wandb so mypy doesn't complain (#562) * fix wandb so mypy doesn't complain * fix wandb so mypy doesn't complain * no need for mypy override anymore	2023-09-13 10:36:16 -04:00
dongxiaolong	c1921c9acb	Update requirements.txt (#543) fix fsdp	2023-09-08 16:07:11 -04:00
Wing Lian	34c0a86a11	update readme to point to direct link to runpod template, cleanup install instrucitons (#532) * update readme to point to direct link to runpod template, cleanup install instrucitons * default install flash-attn and auto-gptq now too * update readme w flash-attn extra * fix version in setup	2023-09-08 11:58:54 -04:00
Wing Lian	3355706e22	Add support for GPTQ using native transformers/peft (#468) * auto gptq support * more tweaks and add yml * remove old gptq docker * don't need explicit peft install for tests * fix setup.py to use extra index url install torch for tests fix cuda version for autogptq index set torch in requirements so that it installs properly move gptq install around to work with github cicd * gptq doesn't play well with sample packing * address pr feedback * remove torch install for now * set quantization_config from model config * Fix the implementation for getting quant config from model config	2023-09-05 12:43:22 -04:00
Wing Lian	76576323df	add eval benchmark callback (#441) * add mmlu callback * use hf dataset for mmlu evals * default to mmlu-zs * make sure to define all the explicit positional args * include metrics in callback * another callback fix for collator max len attribute * fix mmlu evals * sample benchmarks, ensure we drop long samples * fix the data file * fix elif and add better messaging * more fixes * rename mmlu to bench * more fixes * dataset handling and aggregate across benchmark * better handling when no subjects * benchmark callback has its own dataloader and collator * fixes * updated dataset * more fixes * missing transformers import * improve support for customized dataset for bench evals * gather benchmarks from all ranks * fix for gather across multiple gpus	2023-08-29 13:24:19 -07:00
Wing Lian	548787daae	customizable ascii art (#506)	2023-08-29 10:13:42 -07:00
Maxime	c500d02517	Fix missing 'packaging' wheel (#482)	2023-08-26 12:02:15 -04:00
Aman Karmani	c29117a0d7	allow newer deps	2023-08-26 15:06:05 +00:00
mhenrichsen	cf6654769a	flash attn pip install (#426) * flash attn pip * add packaging * add packaging to apt get * install flash attn in dockerfile * remove unused whls * add wheel * clean up pr fix packaging requirement for ci upgrade pip for ci skip build isolation for requiremnents to get flash-attn working install flash-attn seperately * install wheel for ci * no flash-attn for basic cicd * install flash-attn as pip extras --------- Co-authored-by: Ubuntu <mgh@mgh-vm.wsyvwcia0jxedeyrchqg425tpb.ax.internal.cloudapp.net> Co-authored-by: mhenrichsen <some_email@hey.com> Co-authored-by: Mads Henrichsen <mads@BrbartiendeMads.lan> Co-authored-by: Wing Lian <wing.lian@gmail.com>	2023-08-18 19:00:27 -04:00
mhenrichsen	0a228479b3	adds color (#425) * adds color * chore: lint * fix for colorama --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2023-08-18 10:59:43 -04:00
Wing Lian	82e111aba9	remove extra accelearate in requirements (#430)	2023-08-18 10:56:14 -04:00
Wing Lian	2bb0b78975	Attention mask and position id fixes for packing (#285) * fix attetion mask with packing * set position ids and use block diagonal attn mask * fix expand mask for multiple batch items, make sure we pad position_ids * don't move masks to cpu * use multi pack dataloader w random sampler * add position_ids back * more fixes for dataloader integration * est total tokens, fix field loop * more fixes, position_ids seems broken * more fixes for sample packing * use distributed sampler, avoid accelerate prepare * use accelerator prepare for dataloader * fix for position_ids w packing * Update src/axolotl/utils/dataloader.py * validation for sample packing and doc * more fixes for 4k and optimizations * optimized expand mask fn * better handling of variance in multipack dataloader length and trainer hanging when it runs out of data * fix rounding of len of batches to int * better handling so that all devices have the same dataloader len * fix step calc for packing * pass sample packing efficiency to training args * add a test for the mask expansion for sequence packing * only process eval dataset for packing if not None * don't split batches when packing * weighted CE losses * weighted CEL fixes * limit packing to sequences of max seq len * seq_len_multiple for packing * make sure the chunk size is an int * sample_packing_seq_len_multiplier config * use cumulative seq len with var len flash attn v2 w packing * properly calculate max len * fix flash-attn, xformers, packing, support chatml * fix chatml system prompt for openorca, legacy tokenizer opts * add chatml * add unit tests for cum seq lens, add ability to build cu_seq_lens from positional ids, fix prompt test * fix test and pylint checks * more packing and dataset optimizations and fixes * filter w multiple cpus * more fixes and optimizations * fixes and go back to distributed sampler since batch sampler won't work * fix counts by accounting for num devices * fix steps calculation * previous accelerate is still most performant * add numba to requirements. * use custom distributed checks * fix sampler to prevent overfit w new epochs * let's not cleanup the cached datasets * calculate cum seq lens with pos_ids instead of mask, simplify packing params, fix distributed barrier * speed optimizations and set accelerate fsdp env vars * optimize dataset concatenation? * more optimizations for dataset handling * fix import for annotation * manual pre-commit fixes * another sum optimization and bug fix for calc steps * fix packing estimations * fix formatting * pylint problems * add back flash attention branch for handling unpacked sequences seperately * Address PR feedback * add optional sample packing config params to readme	2023-08-12 15:14:56 -04:00
Aman Gupta Karmani	35c8b90306	Merge pull request #355 from tmm1/bitsandbytes-fixes bump to latest bitsandbytes release with major bug fixes	2023-08-11 15:15:38 -07:00
Aman Karmani	fce40aab23	bump to latest bitsandbytes release with major bug fixes	2023-08-09 21:47:11 +00:00
Aman Karmani	9c314101d5	use newer pynvml package	2023-08-09 21:06:28 +00:00
Aman Karmani	e303d64728	log GPU memory usage	2023-08-09 18:26:28 +00:00
Wing Lian	6c9a87c8ee	pin accelerate so it works with llama2 (#330)	2023-07-30 22:20:06 -04:00
Wing Lian	9f69c4d8c1	latest HEAD of accelerate causes 0 loss immediately w FSDP (#321)	2023-07-24 11:23:56 -04:00
Wing Lian	6dd2e7d671	add hf_transfer to requirements for faster hf upload	2023-07-17 14:44:48 -04:00
Teknium	273b3a3aa7	Update requirements.txt Require latest git accelerate to fix saving checkpoint issue	2023-07-16 10:24:24 -07:00
Wing Lian	1edc30c786	add support for opimum bettertransformers	2023-06-10 14:22:30 -04:00
Wing Lian	36ec6e1a0e	Add accelerate dep	2023-05-30 16:36:13 -04:00
NanoCode012	1bf1f59a41	Move black to dev requirements	2023-05-31 02:53:53 +09:00
NanoCode012	bdfe7c9201	Convert attrdict to addict	2023-05-28 23:06:10 +09:00
Wing Lian	312b8d51d6	update docker to compile latest bnb to properly support qlora	2023-05-27 12:36:53 -04:00
Wing Lian	7e81ca720b	Update requirements.txt Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>	2023-05-24 15:44:48 -04:00
Wing Lian	3b4d055edd	integrate qlora? maybe?	2023-05-24 14:32:39 -04:00
Wing Lian	fa8bd14be4	update entrypoint and force min accelerate	2023-05-18 06:25:34 -04:00
NanoCode012	fe582df7d3	Fix BNB OOM by pinning version	2023-05-09 02:10:31 +09:00
Wing Lian	990bec63e6	docker layer caching, build w axolotl from base build	2023-05-07 17:16:05 -04:00
Wing Lian	7753cdee57	cleanup empty lines, tweak env for runpod setup	2023-04-19 08:24:58 -04:00
Wing Lian	0a472e1e08	quickstart instructions for starting from runpod (#5)	2023-04-18 19:22:25 -04:00
Wing Lian	4131183115	fix install to work with latest alpaca lora 4bit	2023-04-17 12:45:12 -04:00
Wing Lian	77fca25f1b	4bit quantized support (wip)	2023-04-17 11:37:39 -04:00
Wing Lian	937f44f021	helpful info output	2023-04-15 00:03:43 -04:00
Wing Lian	80b2ed29d8	various bugfixes	2023-04-14 21:37:07 -04:00
Wing Lian	f2a2029d0d	config chooser, update readme instructions, device config, llama flash attention, debug out the labels, fix config key checks, other bugfixes	2023-04-14 12:18:56 -04:00
Wing Lian	ce24f5e246	WIP for axolotl trainer	2023-04-14 00:20:05 -04:00

50 Commits