axolotl

Author	SHA1	Message	Date
Wing Lian	afb8218c67	fix the monkeypatch	2024-11-19 02:12:33 -05:00
Wing Lian	1ff78d6347	remove temp_dir decorator as we're using fixtures now	2024-11-19 01:28:27 -05:00
Wing Lian	613a217142	monkeypatch for zero3 w 8bit lora	2024-11-19 00:45:20 -05:00
Wing Lian	127953af4e	zero3 can't use 8bit optimizer	2024-11-19 00:45:20 -05:00
Wing Lian	920ea77bdf	reduce number of steps	2024-11-19 00:45:20 -05:00
Wing Lian	ef60e3e851	bi-weekly 8bit lora zero3 check	2024-11-19 00:45:20 -05:00
Wing Lian	c07bd2fa65	Readme updates v2 (#2078 ) * update readme logos * use full logo * Fix svgs * add srcset * resize svgs to match * Rename file * align badges center	2024-11-18 14:58:03 -05:00
Wing Lian	ed079d434a	static assets, readme, and badges update v1 (#2077 )	2024-11-18 13:59:32 -05:00
Wing Lian	8403c67156	don't build bdist (#2076 ) [skip ci]	2024-11-18 12:36:03 -05:00
Wing Lian	9871fa060b	optim e2e tests to run a bit faster (#2069 ) [skip ci] * optim e2e tests to run a bit faster * run prequant w/o lora_modules_to_save * use smollm2	2024-11-18 12:35:31 -05:00
Wing Lian	70cf79ef52	upgrade autoawq==0.2.7.post2 for transformers fix (#2070 ) * point to upstream autoawq for transformers fix * use autoawq 0.2.7 release * test wheel for awq * try different format for wheel def * autoawq re-release * Add intel_extension_for_pytorch dep * ipex gte version * forcefully remove intel-extension-for-pytorch * add -y option to pip uninstall for ipex * use post2 release for autoawq and remove uninstall of ipex	2024-11-18 11:53:37 -05:00
Wing Lian	c06b8f0243	increase worker count to 8 for basic pytests (#2075 ) [skip ci]	2024-11-18 11:52:35 -05:00
Chirag Jain	0c8b1d824a	Update `get_unpad_data` patching for multipack (#2013 ) * Update `get_unpad_data` patching for multipack * Update src/axolotl/utils/models.py * Update src/axolotl/utils/models.py * Add test case --------- Co-authored-by: Wing Lian <wing.lian@gmail.com> Co-authored-by: Wing Lian <wing@axolotl.ai>	2024-11-15 20:35:50 -05:00
NanoCode012	fd70eec577	fix: loading locally downloaded dataset (#2056 ) [skip ci]	2024-11-15 20:35:26 -05:00
Wing Lian	d42f202046	Fsdp grad accum monkeypatch (#2064 )	2024-11-15 19:11:04 -05:00
Wing Lian	0dabde1962	support for schedule free and e2e ci smoke test (#2066 ) [skip ci] * support for schedule free and e2e ci smoke test * set default lr scheduler to constant in test * ignore duplicate code * fix quotes for config/dict	2024-11-15 19:10:14 -05:00
Wing Lian	15f1462ccd	support passing trust_remote_code to dataset loading (#2050 ) [skip ci] * support passing trust_remote_code to dataset loading * add doc for trust_remote_code in dataset config	2024-11-15 19:09:48 -05:00
Wing Lian	521e62daf1	remove the bos token from dpo outputs (#1733 ) [skip ci] * remove the bos token from dpo outputs * don't forget to fix prompt_input_ids too * use processing_class instead of tokenizer * fix for processing class	2024-11-15 19:09:20 -05:00
Wing Lian	c16ec398d7	update to be deprecated evaluation_strategy (#1682 ) [skip ci] * update to be deprecated evaluation_strategy and c4 dataset * chore: lint * remap eval strategy to new config and add tests	2024-11-15 19:09:00 -05:00
Wing Lian	2f20cb7ebf	upgrade datasets==3.1.0 and add upstream check (#2067 ) [skip ci]	2024-11-15 19:08:38 -05:00
Wing Lian	71d4030b79	gradient accumulation tests, embeddings w pad_token fix, smaller models (#2059 ) * add more test cases for gradient accumulation and fix zero3 * swap out for smaller model * fix missing return * fix missing pad_token in config * support concurrency for multigpu testing * cast empty deepspeed to empty string for zero3 check * fix temp_dir as fixture so parametrize works properly * fix test file for multigpu evals * don't use default * don't use default for fsdp_state_dict_type * don't use llama tokenizer w smollm * also automatically cancel multigpu for concurrency	2024-11-14 12:59:00 -05:00
Wing Lian	f3a5d119af	fix env var extraction (#2043 ) [skip ci]	2024-11-14 12:58:06 -05:00
Wing Lian	ba219b51a5	fix duplicate base build (#2061 ) [skip ci]	2024-11-14 10:31:19 -05:00
Wing Lian	5be8e13d35	make sure to add tags for versioned tag on cloud docker images (#2060 )	2024-11-14 10:24:49 -05:00
Wing Lian	2d7830fda6	upgrade to flash-attn 2.7.0 (#2048 )	2024-11-14 06:59:25 -05:00
Wing Lian	5e98cdddac	Grokfast support (#1917 )	2024-11-13 17:10:36 -05:00
Sunny Liu	1d7aee0ad2	ADOPT optimizer integration (#2032 ) [skip ci] * adopt integration * stuff * doc and test for ADOPT * rearrangement * fixed formatting * hacking pre-commit * chore: lint * update module doc for adopt optimizer * remove un-necessary example yaml for adopt optimizer * skip test adopt if torch<2.5.1 * formatting * use version.parse * specifies required torch version for adopt_adamw --------- Co-authored-by: sunny <sunnyliu19981005@gmail.com> Co-authored-by: Wing Lian <wing@axolotl.ai>	2024-11-13 17:10:17 -05:00
Wing Lian	659ee5d723	don't cancel the tests on main automatically for concurrency (#2055 ) [skip ci]	2024-11-13 17:07:41 -05:00
Sunny Liu	342935cff3	Update unsloth for torch.cuda.amp deprecation (#2042 ) * update deprecated unsloth torch cuda amp decorator * WIP fix torch.cuda.amp deprecation * lint * laxing torch version requirement * remove use of partial * remove use of partial * lint --------- Co-authored-by: sunny <sunnyliu19981005@gmail.com>	2024-11-13 15:17:34 -05:00
Wing Lian	c5eb9ea2c2	fix push to main and tag semver build for docker ci (#2054 )	2024-11-13 14:04:28 -05:00
Wing Lian	f2145a3ccb	add default torch version if not installed, and support for xformers new wheels (#2049 )	2024-11-13 13:16:47 -05:00
Wing Lian	010d0e7ff3	retry flaky test_packing_stream_dataset test that times out on read (#2052 ) [skip ci]	2024-11-13 13:16:16 -05:00
Wing Lian	01881c3113	make sure to tag images in docker for tagged releases (#2051 ) [skip ci] * make sure to tag images in docker for tagged releases * fix tag event	2024-11-13 13:15:49 -05:00
Wing Lian	0e8eb96e07	run pypi release action on tag create w version (#2047 )	2024-11-13 10:21:48 -05:00
NanoCode012	4e1891b12b	feat: upgrade to liger 0.4.1 (#2045 )	2024-11-13 10:07:24 -05:00
NanoCode012	28924fc791	feat: cancel ongoing tests if new CI is triggered (#2046 ) [skip ci]	2024-11-13 10:06:59 -05:00
NanoCode012	8c480b2804	fix: inference not using chat_template (#2019 ) [skip ci]	2024-11-13 10:06:41 -05:00
Oliver Molenschot	a4b1cc6df0	Add example YAML file for training Mistral using DPO (#2029 ) [skip ci] * Add example YAML file for training Mistral using DPO * chore: lint * Apply suggestions from code review Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * Update mistral-dpo.yml Adding qlora and removing role-related data (unnecessary) * Rename mistral-dpo.yml to mistral-dpo-qlora.yml * Apply suggestions from code review Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> --------- Co-authored-by: Wing Lian <wing.lian@gmail.com> Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>	2024-11-13 10:06:25 -05:00
NanoCode012	7b78a31593	feat: print out dataset length even if not preprocess (#2034 ) [skip ci]	2024-11-13 10:06:00 -05:00
Wing Lian	810ebc2c0e	invert the string in string check for p2p device check (#2044 )	2024-11-12 23:20:47 -05:00
Wing Lian	ad435a3b09	add P2P env when multi-gpu but not the full node (#2041 ) Co-authored-by: Wing Lian <wing@axolotl.ai>	2024-11-12 17:58:26 -05:00
NanoCode012	9f1cf9b17c	fix: handle sharegpt dataset missing (#2035 ) * fix: handle sharegpt dataset missing * fix: explanation * feat: add test	2024-11-12 12:51:37 +07:00
Wing Lian	3931a42763	change deprecated modal Stub to App (#2038 )	2024-11-11 15:10:34 -05:00
NanoCode012	dc8f9059f7	feat: add metharme chat_template (#2033 ) [skip ci] * feat: add metharme chat_template * fix: add eos token	2024-11-11 15:09:58 -05:00
Wing Lian	234e94e9dd	replace references to personal docker hub to org docker hub (#2036 ) [skip ci]	2024-11-11 15:09:29 -05:00
Wing Lian	f68fb71005	update actions version for node16 deprecation (#2037 ) [skip ci] * update actions version for node16 deprecation * update pre-commit/action to use 3.0.1 for actions/cache@v4 dep * update docker/setup-buildx-action too to v3	2024-11-11 15:09:11 -05:00
Wing Lian	9bc3ee6c75	add axolotlai docker hub org to publish list (#2031 ) * add axolotlai docker hub org to publish list * fix to use latest actions docker metadata version * fix list in yaml for expected format for action * missed a change	2024-11-11 09:48:19 -05:00
Wing Lian	d356740ffa	move deprecated kwargs from trainer to trainingargs (#2028 )	2024-11-10 12:45:47 -05:00
Wing Lian	e4af51eb66	remove direct dependency on fused dense lib (#2027 ) v0.5.0	2024-11-08 14:48:04 -05:00
Wing Lian	e20b15bee3	make publish to pypi manually dispatchable as a workflow (#2026 ) [skip ci]	2024-11-08 14:18:16 -05:00

1707 Commits