axolotl

Author	SHA1	Message	Date
Wing Lian	3c1921e400	add hf cache caching for GHA (#2247 ) * add hf cache caching for GHA * use modal volume to cache hf data * make sure to update the cache as we add new fixtures in conftest	2025-01-09 20:59:54 +00:00
NanoCode012	bd8436bc6e	feat: add cut_cross_entropy (#2091 ) * feat: add cut_cross_entropy * fix: add to input * fix: remove from setup.py * feat: refactor into an integration * chore: ignore lint * feat: add test for cce * fix: set max_steps for liger test * chore: Update base model following suggestion Co-authored-by: Wing Lian <wing.lian@gmail.com> * chore: update special_tokens following suggestion Co-authored-by: Wing Lian <wing.lian@gmail.com> * chore: remove with_temp_dir following comments * fix: plugins aren't loaded * chore: update quotes in error message * chore: lint * chore: lint * feat: enable FA on test * chore: refactor get_pytorch_version * fix: lock cce commit version * fix: remove subclassing UT * fix: downcast even if not using FA and config check * feat: add test to check different attentions * feat: add install to CI * chore: refactor to use parametrize for attention * fix: pytest not detecting test * feat: handle torch lower than 2.4 * fix args/kwargs to match docs * use release version cut-cross-entropy==24.11.4 * fix quotes * fix: use named params for clarity for modal builder * fix: handle install from pip * fix: test check only top level module install * fix: re-add import check * uninstall existing version if no transformers submodule in cce * more dataset fixtures into the cache --------- Co-authored-by: Wing Lian <wing.lian@gmail.com> Co-authored-by: Wing Lian <wing@axolotl.ai>	2024-12-03 08:22:22 -05:00
Wing Lian	3931a42763	change deprecated modal Stub to App (#2038 )	2024-11-11 15:10:34 -05:00
Wing Lian	e12a2130e9	first pass at pytorch 2.5.0 support (#1982 ) * first pass at pytorch 2.5.0 support * attempt to install causal_conv1d with mamba * gracefully handle missing xformers * fix import * fix incorrect version, add 2.5.0 * increase tests timeout	2024-10-21 11:00:45 -04:00
Wing Lian	dcbff16983	run nightly ci builds against upstream main (#1851 ) * run nightly ci builds against upstream main * add test badges * run the multigpu tests against nightly main builds too	2024-08-22 13:10:54 -04:00
Wing Lian	54392ac8a6	Attempt to run multigpu in PR CI for now to ensure it works (#1815 ) [skip ci] * Attempt to run multigpu in PR CI for now to ensure it works * fix yaml file * forgot to include multigpu tests * fix call to cicd.multigpu * dump dictdefault to dict for yaml conversion * use to_dict instead of casting * 16bit-lora w flash attention, 8bit lora seems problematic * add llama fsdp test * more tests * Add test for qlora + fsdp with prequant * limit accelerate to 2 processes and disable broken qlora+fsdp+bnb test * move multigpu tests to biweekly	2024-08-09 11:50:13 -04:00
Wing Lian	00018629e7	run tests again on Modal (#1289 ) [skip ci] * run tests again on Modal * make sure to run the full suite of tests on modal * run cicd steps via shell script * run tests in different runs * increase timeout * split tests into steps on modal * increase workflow timeout * retry doing this with only a single script * fix yml launch for modal ci * reorder tests to run on modal * skip dpo tests on modal * run on L4s, A10G takes too long * increase CPU and RAM for modal test * run modal tests on A100s * skip phi test on modal * env not arg in modal dockerfile * upgrade pydantic and fastapi for modal tests * cleanup stray character * use A10s instead of A100 for modal	2024-02-29 14:26:26 -05:00
Wing Lian	8da1633124	Revert "run PR e2e docker CI tests in Modal" (#1220 ) [skip ci]	2024-01-26 16:50:44 -05:00
Wing Lian	36d053f6f0	run PR e2e docker CI tests in Modal (#1217 ) [skip ci] * wip modal for ci * handle falcon layernorms better * update * rebuild the template each time with the pseudo-ARGS * fix ref * update tests to use modal * cleanup ci script * make sure to install jinja2 also * kickoff the gh action on gh hosted runners and specify num gpus	2024-01-26 16:13:27 -05:00

9 Commits