axolotl

Author	SHA1	Message	Date
Wing Lian	17b441248c	use latest hf-xet and don't install vllm for torch 2.7.0 (#2603 ) * use latest hf-xet and don't install vllm for torch 2.7.0 * fix runpod hub tests	2025-05-07 16:10:12 -04:00
Wing Lian	ff106ace44	ensure we pass axolotl extras to the Dockerfile so vllm is included in shipped images (#2599 )	2025-05-07 16:10:12 -04:00
Wing Lian	e85cbb8645	remove torch 2.4.1 CI as part of support deprecation (#2582 )	2025-05-07 16:10:10 -04:00
Wing Lian	dc4da4a7e2	update trl to 0.17.0 (#2560 ) * update trl to 0.17.0 * grpo + vllm no longer supported with 2.5.1 due to vllm constraints * disable VLLM_USE_V1 for ci * imporve handle killing off of multiprocessing vllm service * debug why this doesn't run in CI * increase vllm wait time * increase timeout to 5min * upgrade to vllm 0.8.4 * dump out the vllm log for debugging * use debug logging * increase vllm start timeout * use NVL instead * disable torch compile cache * revert some commented checks now that grpo tests are fixed * increase vllm timeoout back to 5min	2025-04-27 19:19:53 -04:00
Wing Lian	a4d5112ae1	builds for torch 2.7.0 (#2552 ) * builds for torch==2.7.0 * use xformers==0.0.29.post3 * no vllm support with torch 2.7 * update default, fix conditional * no xformers for 270 * no vllm on 2.7.0 for multigpu test too * remove deprecated verbose arg from scheduler * 2.7.0 tests on cpu	2025-04-24 00:39:31 -04:00
NanoCode012	682a9cf79b	Fix: add delinearization and make qlora work with fsdp2 (#2515 ) * fixes for delinearization, and make qlora work with fsdp2 * Add back mistakenly removed lm_eval * typo [skip ci] * patch evals for torch.compile + fsdp2 * also check torch_compile w fsdp2 * lots of fixes for flex attn with llama4 * fix patch check and patch llama4 too * attempt to make the patches stick * use transformers 4.51.2 * update configs and README for llama4 * remove torch.compile for CI test * cleanup any existing singletons * set singleton cache to None instead of deleting * use importlib reload with monkeypatch * don't worry about transformers version, mark inputs with grads, fix regex * make sure embeds aren't on cpu * logging and mem improvements * vllm version and add to docker, make sure to save processor on conversion * fix ambiguous tensor bool check * fix vllm to not use v1, upgrade hf transformers * fix tests * make flex_attn_compile_kwargs configurable, since this depends on model params --------- Co-authored-by: Wing Lian <wing@axolotl.ai> Co-authored-by: Salman Mohammadi <salman.mohammadi@outlook.com>	2025-04-15 23:31:39 -07:00
Wing Lian	e0aba74dd0	Release update 20250331 (#2460 ) [skip ci] * make torch 2.6.0 the default image * fix tests against upstream main * fix attribute access * use fixture dataset * fix dataset load * correct the fixtures + tests * more fixtures * add accidentally removed shakespeare fixture * fix conversion from unittest to pytest class * nightly main ci caches * build 12.6.3 cuda base image * override for fix from huggingface/transformers#37162 * address PR feedback	2025-04-01 08:47:50 -04:00
Wing Lian	04f6324833	build cloud images with torch 2.6.0 (#2413 ) * build cloud images with torch 2.6.0 * nightlies too	2025-03-13 23:28:51 -04:00
Wing Lian	ffae8d6a95	GRPO (#2307 )	2025-02-13 16:01:01 -05:00
NanoCode012	5bbad5ef93	feat: add torch2.6 to ci (#2311 )	2025-02-07 07:28:54 -05:00
Wing Lian	1063d82b51	match the cuda version for 2.4.1 build w/o tmux (#2299 )	2025-01-30 11:46:09 -05:00
salman	c071a530f7	removing 2.3.1 (#2294 )	2025-01-28 23:23:44 -05:00
Wing Lian	d8b4027200	use 2.5.1 docker images as latest tag as it seems stable (#2198 )	2025-01-10 08:35:25 -05:00
Wing Lian	db51a9e4cb	use pep440 instead of semver (#2088 ) [skip ci]	2024-11-19 15:02:10 -05:00
Wing Lian	a77c8a71cf	fix brackets on docker ci builds, add option to skip e2e builds [skip e2e] (#2080 ) [skip ci]	2024-11-19 10:29:31 -05:00
Wing Lian	5be8e13d35	make sure to add tags for versioned tag on cloud docker images (#2060 )	2024-11-14 10:24:49 -05:00
Wing Lian	c5eb9ea2c2	fix push to main and tag semver build for docker ci (#2054 )	2024-11-13 14:04:28 -05:00
Wing Lian	01881c3113	make sure to tag images in docker for tagged releases (#2051 ) [skip ci] * make sure to tag images in docker for tagged releases * fix tag event	2024-11-13 13:15:49 -05:00
Wing Lian	f68fb71005	update actions version for node16 deprecation (#2037 ) [skip ci] * update actions version for node16 deprecation * update pre-commit/action to use 3.0.1 for actions/cache@v4 dep * update docker/setup-buildx-action too to v3	2024-11-11 15:09:11 -05:00
Wing Lian	9bc3ee6c75	add axolotlai docker hub org to publish list (#2031 ) * add axolotlai docker hub org to publish list * fix to use latest actions docker metadata version * fix list in yaml for expected format for action * missed a change	2024-11-11 09:48:19 -05:00
Wing Lian	3cb2d75de1	upgrade pytorch to 2.5.1 (#2024 )	2024-11-08 10:46:24 -05:00
Wing Lian	718cfb2dd1	revert image tagged as main-latest (#1990 )	2024-10-22 13:54:24 -04:00
Wing Lian	5c629ee444	use torch 2.4.1 images as latest now that torch 2.5.0 is out (#1987 )	2024-10-21 19:51:06 -04:00
Wing Lian	e12a2130e9	first pass at pytorch 2.5.0 support (#1982 ) * first pass at pytorch 2.5.0 support * attempt to install causal_conv1d with mamba * gracefully handle missing xformers * fix import * fix incorrect version, add 2.5.0 * increase tests timeout	2024-10-21 11:00:45 -04:00
Wing Lian	e8d3da0081	upgrade pytorch from 2.4.0 => 2.4.1 (#1950 ) * upgrade pytorch from 2.4.0 => 2.4.1 * update xformers for updated pytorch version * handle xformers version case for torch==2.3.1	2024-10-09 11:53:56 -04:00
Wing Lian	dbf8fb549e	publish axolotl images without extras in the tag name (#1798 )	2024-07-30 13:36:19 -04:00
Wing Lian	9a63884597	update test and main/nightly builds (#1797 ) * update test and main/nightly builds * don't install mamba-ssm on 2.4.0 since it has no wheels yet	2024-07-30 12:37:40 -04:00
Wing Lian	1e57b4c562	update to pytorch 2.3.1 (#1746 ) [skip ci]	2024-07-13 13:28:17 -04:00
Wing Lian	a159724e44	bump trl and accelerate for latest releases (#1730 ) * bump trl and accelerate for latest releases * ensure that the CI runs on new gh org * drop kto_pair support since removed upstream	2024-07-10 11:15:44 -04:00
Wing Lian	60113437e4	cloud image w/o tmux (#1628 )	2024-05-15 22:27:40 -04:00
Wing Lian	3319780300	update torch 2.2.1 -> 2.2.2 (#1622 )	2024-05-15 09:45:27 -04:00
Wing Lian	70185763f6	add torch 2.3.0 to builds (#1593 )	2024-05-05 18:45:45 -04:00
Wing Lian	8cb127abeb	configure nightly docker builds (#1454 ) [skip ci] * configure nightly docker builds * also test update pytorch in modal ci	2024-03-29 08:25:45 -04:00
Wing Lian	5894f0e57e	make mlflow optional (#1317 ) * make mlflow optional * fix xformers don't patch swiglu if xformers not working fix the check for xformers swiglu * fix install of xformers with extra index url for docker builds * fix docker build arg quoting	2024-02-26 11:41:33 -05:00
NanoCode012	a359579371	deprecate: pytorch 2.0.1 image (#1315 ) [skip ci] * deprecate: pytorch 2.0.1 image * deprecate from main image * Update main.yml * Update tests.yml	2024-02-22 11:39:47 +09:00
Wing Lian	ea00dd0852	don't use load and push together (#1284 )	2024-02-09 14:54:31 -05:00
Wing Lian	aaf54dc730	run the docker image builds and push on gh action gpu runners (#1218 )	2024-02-09 10:32:54 -05:00
Wing Lian	74c72ca5eb	drop py39 docker images, add py311, upgrade pytorch to 2.1.2 (#1205 ) * drop py39 docker images, add py311, upgrade pytorch to 2.1.2 * also allow the main build to be manually triggered * fix workflow_dispatch in yaml	2024-01-26 00:38:49 -05:00
Wing Lian	0f77b8d798	add commit message option to skip docker image builds in ci (#1168 ) [skip ci]	2024-01-22 19:55:36 -05:00
Wing Lian	ece0211996	Agnostic cloud gpu docker image and Jupyter lab (#1097 )	2024-01-15 22:37:54 -05:00
Wing Lian	37820f6540	support for cuda 12.1 (#989 )	2023-12-22 11:08:22 -05:00
Hamel Husain	2e61dc3180	Add tests to Docker (#993 )	2023-12-22 06:37:20 -08:00
Hamel Husain	62ba1609b6	bump actions versions	2023-12-21 08:54:08 -08:00
Wing Lian	161bcb6517	Dockerfile torch fix (#987 ) * add torch to requirements.txt at build time to force version to stick * fix xformers check * better handling of xformers based on installed torch version * fix for ci w/o torch	2023-12-21 09:38:20 -05:00
Wing Lian	70157ccb8f	add a latest tag for regular axolotl image, cleanup extraneous print statement (#746 )	2023-10-19 12:28:29 -04:00
Wing Lian	2aa1f71464	fix pytorch 2.1.0 build, add multipack docs (#722 )	2023-10-13 08:57:28 -04:00
Wing Lian	7f2618b5f4	add docker images for pytorch 2.10 (#697 )	2023-10-07 12:23:31 -04:00
Wing Lian	9218ebecd2	e2e testing (#574 )	2023-09-14 21:56:11 -04:00
Wing Lian	3355706e22	Add support for GPTQ using native transformers/peft (#468 ) * auto gptq support * more tweaks and add yml * remove old gptq docker * don't need explicit peft install for tests * fix setup.py to use extra index url install torch for tests fix cuda version for autogptq index set torch in requirements so that it installs properly move gptq install around to work with github cicd * gptq doesn't play well with sample packing * address pr feedback * remove torch install for now * set quantization_config from model config * Fix the implementation for getting quant config from model config	2023-09-05 12:43:22 -04:00
mhenrichsen	cf6654769a	flash attn pip install (#426 ) * flash attn pip * add packaging * add packaging to apt get * install flash attn in dockerfile * remove unused whls * add wheel * clean up pr fix packaging requirement for ci upgrade pip for ci skip build isolation for requiremnents to get flash-attn working install flash-attn seperately * install wheel for ci * no flash-attn for basic cicd * install flash-attn as pip extras --------- Co-authored-by: Ubuntu <mgh@mgh-vm.wsyvwcia0jxedeyrchqg425tpb.ax.internal.cloudapp.net> Co-authored-by: mhenrichsen <some_email@hey.com> Co-authored-by: Mads Henrichsen <mads@BrbartiendeMads.lan> Co-authored-by: Wing Lian <wing.lian@gmail.com>	2023-08-18 19:00:27 -04:00

1 2

77 Commits