axolotl

Author	SHA1	Message	Date
Wing Lian	fedbcc0254	remove torch 2.4.1 CI as part of support deprecation (#2582)	2025-04-29 08:28:32 -04:00
Dan Saunders	c4053481ff	Codecov fixes / improvements (#2549) * adding codecov reporting * random change * codecov fixes * adding missing dependency * fix --------- Co-authored-by: Dan Saunders <dan@axolotl.ai>	2025-04-23 10:33:30 -04:00
Wing Lian	e0aba74dd0	Release update 20250331 (#2460) [skip ci] * make torch 2.6.0 the default image * fix tests against upstream main * fix attribute access * use fixture dataset * fix dataset load * correct the fixtures + tests * more fixtures * add accidentally removed shakespeare fixture * fix conversion from unittest to pytest class * nightly main ci caches * build 12.6.3 cuda base image * override for fix from huggingface/transformers#37162 * address PR feedback	2025-04-01 08:47:50 -04:00
NanoCode012	cf0c79d52e	fix: minor patches for multimodal (#2441) * fix: update chat_template * fix: handle gemma3 showing a lot of no content for turn 0 * fix: remove unknown config from examples * fix: test * fix: temporary disable gemma2 test * fix: stop overwriting config.text_config unnecessarily * fix: handling of set cache to the text_config section * feat: add liger gemma support and bump liger to 0.5.5 * fix: add double use_cache setting * fix: add support for final_logit_softcap in CCE for gemma2/3 * fix: set use_cache before model load * feat: add missing layernorm override * fix: handle gemma3 rmsnorm * fix: use wrapper to pass dim as hidden_size * fix: change dim to positional * fix: patch with wrong mlp * chore: refactor use_cache handling * fix import issues * fix tests.e2e.utils import --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-03-31 13:40:12 +07:00
Wing Lian	aae4337f40	add 12.8.1 cuda to the base matrix (#2426) * add 12.8.1 cuda to the base matrix * use nightly * bump deepspeed and set no binary * deepspeed binary fixes hopefully * install deepspeed by itself * multiline fix * make sure ninja is installed * try with reversion of packaging/setuptools/wheel install * use license instead of license-file * try rolling back packaging and setuptools versions * comment out license for validation for now * make sure packaging version is consistent * more parity across tests and docker images for packaging/setuptools	2025-03-21 10:17:25 -04:00
NanoCode012	fd8cb32547	chore: remove redundant py310 from tests (#2316)	2025-02-07 21:34:16 -05:00
NanoCode012	5bbad5ef93	feat: add torch2.6 to ci (#2311)	2025-02-07 07:28:54 -05:00
salman	c071a530f7	removing 2.3.1 (#2294)	2025-01-28 23:23:44 -05:00
Wing Lian	5e0124e2ab	update modal version for ci (#2242)	2025-01-09 21:01:02 +00:00
Wing Lian	02629c7cdf	parity for nightly ci - make sure to install setuptools (#2176) [skip ci]	2024-12-11 20:14:55 -05:00
Wing Lian	d009ead101	fix build w pyproject to respect insalled torch version (#2168) * fix build w pyproject to respect insalled torch version * include in manifest * disable duplicate code check for now * move parser so it can be found * add checks for correct pytorch version so this doesn't slip by again	2024-12-10 16:25:25 -05:00
Wing Lian	5e9fa33f3d	reduce test concurrency to avoid HF rate limiting, test suite parity (#2128) * reduce test concurrency to avoid HF rate limiting, test suite parity * make val_set_size smaller to speed up e2e tests * more retries for pytest fixture downloads * val_set_size was too small * move retry_on_request_exceptions to data utils and add retry strategy * pre-download ultrafeedback as a test fixture * refactor download retry into it's own fn * don't import from data utils * use retry mechanism now for fixtures	2024-12-06 10:20:20 -05:00
Dan Saunders	08fa133177	Fix broken CLI; remove duplicate metadata from setup.py (#2136) * Fix broken CLI; remove duplicate metadata from setup.py * Adding tests.yml CLI check * updating * remove test with requests to github due to rate limiting --------- Co-authored-by: Dan Saunders <dan@axolotl.ai>	2024-12-06 10:19:54 -05:00
Dan Saunders	fc973f4322	CLI Implementation with Click (#2107) * Initial CLI implementation with click package * Adding fetch command for pulling examples and deepspeed configs * Automating default options for CliArgs classes * Mimicking existing no config behavior * bugfix in choose_config * Updating fetch to sync instead of re-download * bugfix * isort fix * fixing yaml isort order * pre-commit fixes * simplifying argument parsing -- pass through kwargs to do_cli * make accelerate launch default for non-preprocess commands * fixing arg handling * testing None placeholder approach * removing hacky --use-gpu argument to preprocess command * Adding brief README documentation for CLI * remove (New) * Initial CLI pytest tests * progress on CLI pytest * adding inference CLI tests; cleanup * Refactor train CLI tests to remove various mocking * Major CLI test refator; adding remaining CLI codepath test coverage * pytest fixes * remove integration markers * parallelizing examples, deepspeed config downloads; rename test to match other CLI test naming * moving cli pytest due to isolation issues; cleanup * testing fixes; various minor improvements * fix * tests fix * Update tests/cli/conftest.py Co-authored-by: Wing Lian <wing.lian@gmail.com> --------- Co-authored-by: Dan Saunders <dan@axolotl.ai> Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-12-05 22:11:48 -05:00
NanoCode012	bd8436bc6e	feat: add cut_cross_entropy (#2091) * feat: add cut_cross_entropy * fix: add to input * fix: remove from setup.py * feat: refactor into an integration * chore: ignore lint * feat: add test for cce * fix: set max_steps for liger test * chore: Update base model following suggestion Co-authored-by: Wing Lian <wing.lian@gmail.com> * chore: update special_tokens following suggestion Co-authored-by: Wing Lian <wing.lian@gmail.com> * chore: remove with_temp_dir following comments * fix: plugins aren't loaded * chore: update quotes in error message * chore: lint * chore: lint * feat: enable FA on test * chore: refactor get_pytorch_version * fix: lock cce commit version * fix: remove subclassing UT * fix: downcast even if not using FA and config check * feat: add test to check different attentions * feat: add install to CI * chore: refactor to use parametrize for attention * fix: pytest not detecting test * feat: handle torch lower than 2.4 * fix args/kwargs to match docs * use release version cut-cross-entropy==24.11.4 * fix quotes * fix: use named params for clarity for modal builder * fix: handle install from pip * fix: test check only top level module install * fix: re-add import check * uninstall existing version if no transformers submodule in cce * more dataset fixtures into the cache --------- Co-authored-by: Wing Lian <wing.lian@gmail.com> Co-authored-by: Wing Lian <wing@axolotl.ai>	2024-12-03 08:22:22 -05:00
Wing Lian	2f20cb7ebf	upgrade datasets==3.1.0 and add upstream check (#2067) [skip ci]	2024-11-15 19:08:38 -05:00
Wing Lian	f68fb71005	update actions version for node16 deprecation (#2037) [skip ci] * update actions version for node16 deprecation * update pre-commit/action to use 3.0.1 for actions/cache@v4 dep * update docker/setup-buildx-action too to v3	2024-11-11 15:09:11 -05:00
Wing Lian	3cb2d75de1	upgrade pytorch to 2.5.1 (#2024)	2024-11-08 10:46:24 -05:00
Wing Lian	052a9a79b4	only run the remainder of the gpu test suite if one case passes first (#2009) [skip ci] * only run the remainder of the gpu test suite if one case passes first * also reduce the test matrix	2024-10-31 13:45:01 -04:00
NanoCode012	2501c1a6a3	Fix: Gradient Accumulation issue (#1980) * feat: support new arg num_items_in_batch * use kwargs to manage extra unknown kwargs for now * upgrade against upstream transformers main * make sure trl is on latest too * fix for upgraded trl * fix: handle trl and transformer signature change * feat: update trl to handle transformer signature * RewardDataCollatorWithPadding no longer has max_length * handle updated signature for tokenizer vs processor class * invert logic for tokenizer vs processor class * processing_class, not processor class * also handle processing class in dpo * handle model name w model card creation * upgrade transformers and add a loss check test * fix install of tbparse requirements * make sure to add tbparse to req * feat: revert kwarg to positional kwarg to be explicit --------- Co-authored-by: Wing Lian <wing.lian@gmail.com>	2024-10-25 11:28:23 -04:00
Wing Lian	e12a2130e9	first pass at pytorch 2.5.0 support (#1982) * first pass at pytorch 2.5.0 support * attempt to install causal_conv1d with mamba * gracefully handle missing xformers * fix import * fix incorrect version, add 2.5.0 * increase tests timeout	2024-10-21 11:00:45 -04:00
Wing Lian	e8d3da0081	upgrade pytorch from 2.4.0 => 2.4.1 (#1950) * upgrade pytorch from 2.4.0 => 2.4.1 * update xformers for updated pytorch version * handle xformers version case for torch==2.3.1	2024-10-09 11:53:56 -04:00
Wing Lian	3c6b9eda2e	run pytests with varied pytorch versions too (#1883)	2024-08-31 22:49:35 -04:00
Wing Lian	e8ff5d5738	don't mess with bnb since it needs compiled wheels (#1859)	2024-08-23 12:18:47 -04:00
Wing Lian	b33dc07a77	rename nightly test and add badge (#1853)	2024-08-22 13:13:33 -04:00
Wing Lian	dcbff16983	run nightly ci builds against upstream main (#1851) * run nightly ci builds against upstream main * add test badges * run the multigpu tests against nightly main builds too	2024-08-22 13:10:54 -04:00

26 Commits