axolotl

Author	SHA1	Message	Date
Dan Saunders	125e7b5fe6	fast path	2025-09-17 13:44:26 -04:00
Dan Saunders	43ada1278a	moe kernels init scaffold	2025-09-17 13:44:26 -04:00
salman	e5c427f6de	qat doc updates (#3162 ) [skip-ci]	2025-09-17 10:38:15 +01:00
salman	58d67bf98d	Migrate QAT API; fix `axolotl quantize` for QAT-ed models; add NVFP4 (#3107 )	2025-09-12 10:55:50 +01:00
yardenhoch	efa1da52d5	Center rewards coefficient (#3124 ) * feat: add center_rewards_coefficient for reward modeling - Add center_rewards_coefficient parameter to Pydantic schema with paper reference - Pass parameter through base builder and causal builder to training args - Add documentation section with usage examples and theoretical background - Enable parameter in reward modeling example configs with recommended value - Enables reward centering for improved training stability in RLHF workflows Implements auxiliary loss from Eisenstein et al. 2023 (https://huggingface.co/papers/2312.09244) to incentivize mean-zero reward outputs without post-training normalization. * Update description * test: add unit tests for center_rewards_coefficient integration * Update src/axolotl/core/builders/base.py Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * Update docs/reward_modelling.qmd Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * Update docs/reward_modelling.qmd Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * reference to TRL documentation. * add new reward model configuration for qwen3 with comprehensive parameters * Verified center_rewards_coefficient is correctly passed through the trainer builder to training arguments. * Refactor reward modeling documentation to consolidate information on center_rewards_coefficient * Remove unit tests for center_rewards_coefficient integration as part of codebase cleanup. * linting * nit * Apply suggestions from code review Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> * lint --------- Co-authored-by: NanoCode012 <kevinvong@rocketmail.com> Co-authored-by: Salman Mohammadi <salman.mohammadi@outlook.com>	2025-09-03 16:22:37 -04:00
NanoCode012	e48aa8a5b1	feat(doc): improve visibility for colab notebooks (#3110 ) [skip ci] * feat: improve visibility for colab notebooks * fix: link to GH colab * feat: change to badge and move higher	2025-09-03 01:40:53 -04:00
Dan Saunders	231a67e70b	Streaming SFT support (#3101 ) * working * fixes * deprecate --iterable; cleanup * pretrain_multipack_buffer_size -> streaming_multipack_buffer_size * improvements * tests * remove unused * docs, examples * nit * nit * add val_set_size validation * val * nit * min * coderabbito * cleanup * nit * add depr warning, cleanup * nit * fix test, fix quarto * fix * review comments * review comments * fix	2025-09-02 12:08:44 -04:00
Wing Lian	7ed40f1d70	automatically set env vars for single gpu deepspeed zero3 (#3118 ) [skip ci] * automatically set env vars for single gpu deepspeed zero3 * use setdefault	2025-08-29 13:36:47 -04:00
Dan Saunders	79ddaebe9a	Add ruff, remove black, isort, flake8, pylint (#3092 ) * black, isort, flake8 -> ruff * remove unused * add back needed import * fix	2025-08-23 23:37:33 -04:00
Wing Lian	130ef7c51a	Various fixes for VLMs (#3063 ) * fix to not use batch feature indexing * more vlm fixes * use AutoModelForImageTextToText * add example yaml and need num2words for chat template * improve handling of adding image tokens to conversation * add lfm2-vl support * update the lfm readme * fix markdown and add rtol for loss checks * feat: add smolvlm2 processing strat * fix: check for causal-conv1d in lfm models * feat: add docs for lfm2 * feat: add new models and tips to docs * feat: add smolvlm2 docs and remove extra dep * chore: update docs * feat: add video instructions * chore: cleanup * chore: comments * fix: typo * feat: add usage stats * chore: refactor --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2025-08-15 10:52:57 -04:00
NanoCode012	4273d5cf7e	feat: update nd parallelism readme (#3039 ) Co-authored-by: salman <salman.mohammadi@outlook.com>	2025-08-08 12:45:36 +01:00
NanoCode012	ca796fb56e	feat(doc): update gpt-oss readme (#3029 ) [skip ci] * feat(doc): update gpt-oss readme * fix: caps * feat: add toolcalling section * feat: add example tool dataset to docs * chore: update	2025-08-07 09:26:42 -04:00
salman	46dfacf255	ND Parallel Doc Nits (#3032 )	2025-08-07 10:34:26 +01:00
NanoCode012	e3177c3210	feat: add complete optimizer docs (#3017 ) [skip ci] * feat: add complete optimizer docs * fix: deprecate old torchao adamw low bit	2025-08-06 08:01:51 -04:00
Wing Lian	ab49d16e34	Dion optimizer support (#3014 ) * Add support for Dion optimizer * dion training kwargs * fix var names * no dion 8bit for now * use updated axolotl-contribs-mit for dion optimizer * add smoke test for dion optimizer * add docs * fix typo during edits * fix test to not remove load in 8bit	2025-08-04 16:33:30 -04:00
NanoCode012	a54c1be972	Fix: shorten mem logs to 2 decimal places and renamed nd docs (#3011 ) [skip ci] * fix: shorten memory logs * fix: title name	2025-08-04 10:23:36 -04:00
NanoCode012	7026cd5e9e	Feat: Add N-D parallelism docs (#2989 ) * fix: remove non-existent file * feat: add n-d parallel docs * fix: comments --------- Co-authored-by: salman <salman.mohammadi@outlook.com>	2025-08-01 13:18:31 +07:00
salman	294c7fe7a6	Distributed/ND-Parallel (#2977 )	2025-07-31 15:25:02 -04:00
Dan Saunders	bb1cae1a20	CLI: add --launcher option, support launcher args, cleanup, refactor (#2924 ) * add --launcher option; explicit True/False bool args; small cleanup * refactor * add torchrun, accelerate cli args * add rdzv arg default + tests * update _quarto * coderabbit * fix * we can't set rdvz_id independently across nodes * coderabbit * fix tests	2025-07-30 15:46:56 -04:00
NanoCode012	41434f0c28	feat(doc): add all providers to readme (#2972 ) [skip ci] * feat(doc): add vastai link * feat: add cloud providers to readme for more visibility * add prime intellect, remove Modal as sponsor --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-07-27 17:03:50 -04:00
Wing Lian	0ff2f172ef	Act offload lora fix (#2928 ) [skip ci] * fix activation offloading with lora * update w e2e test * add docs for error	2025-07-24 16:10:04 -04:00
Dan Saunders	208fb7b8e7	basic torchao fp8 mixed precision training (#2926 ) * debug * debug * debug * revert unneeded change * add accelerator config to base trainer builder * add back accumulated_cache_size_limit setting * lint * accelerator constructor patch for single-GPU torch fp8 * lint * re-using existing fp8 code * lint * remove accelerate patch now fix in latest release * fix * docs * add fp8 + fsdp2 example * remove unused config * update config * smoke tests * add validator * add 2.7.0 guard for fsdp2 * fix * add config descriptions * add FSDP doc link * nit * set force_recompute_fp8_weight_in_bwd with enable_fsdp_float8_all_gather * better cfg for smoke tests * add test for accelerate patching * update fp8 validator	2025-07-22 16:27:47 -04:00
NanoCode012	dfba881e99	Feat: add gemma3n support (#2852 ) * feat: add gemma3n cce * feat: add sample config * feat: add gemma3n multimodal mode * feat: add audio example * feat: support audio and return pixel values in collator * feat: support unmask only assistant region (gemma3n for now) * feat(doc): add notes for audio loading * feat: add audio support for gemma3n * feat: update examples * feat: add gemma3n to the docs * fix: add link at top * feat(doc): clarify additional requirements * fix: mllama missing aspect ratio * fix: mllama need attention fixes for fa2 * Partially Revert "fix: mllama need attention fixes for fa2" This reverts commit `a0bfdd1777`. * fix: disable FA2 for mllama in vision mode * feat: update configs to use proper attention * fix: support other vision features * feat(doc): clarify requirements for gemma3n	2025-07-22 16:52:15 +07:00
salman	e5734e5cf0	adding torchtitan link (#2945 ) [skip ci]	2025-07-19 13:54:14 -04:00
Wing Lian	99187cd208	Activation Offloading w CUDA Streams (#2900 ) [skip ci] * use cuda streams for activation offloading * use torch native ops * update cfg schema for streams * fix literal constructor for set * use context for training step so it doesn't affect evals * disable streams * auto gc on eval steps * use activation_offloading config arg * add docs for gradient checkpointing * handle validation for gc/ao * use cuda streams for act offloading * add more validation for AC w/o GC * fix docs * move activation_offloading lower in definition so it doesn't break args/kwargs * fix kd due to import order	2025-07-14 20:10:20 -04:00
Jiawei Liu	7fb8441e0e	fix: customized dataset with simpo (#2894 ) [skip ci]	2025-07-12 11:40:30 -04:00
NanoCode012	4dc5910e1c	feat(doc): re-add docker 2.7.0 tag back (#2902 ) [skip ci]	2025-07-12 11:40:01 -04:00
salman	d6e4a611e5	FSDP1 -> FSDP2 (#2760 ) * FSDP2 args migration implementation This commit implements the migration to FSDP2 arguments including: - FSDP2 support with LoRA training - DPO integration with FSDP2 - Model loading fixes and refactoring - CPU offloading and PEFT handling - Test updates and CI improvements - Bug fixes for dtype errors and various edge cases	2025-07-12 15:18:01 +01:00
NanoCode012	9b95a625ab	feat: add devstral small 2507 (#2896 ) * feat: add devstral small 2507 * chore: update blog doc	2025-07-11 09:34:19 +07:00
Wing Lian	76aeb16156	tiled_mlp supports single gpu (#2891 ) * tiled_mlp supports single gpu * use checkpoint offloading for arctic training * patch torch checkpoint too * support for single gpu zero3 * add linkback to where it was copied from	2025-07-09 12:48:22 -04:00
Wing Lian	c6d69d5c1b	release v0.11.0 (#2875 ) * release v0.11.0 * don't build vllm into release for now * remove 2.5.1 references * smollm3 multipack support * fix ordering of e2e tests	2025-07-09 09:22:35 -04:00
float-trip	1032e22650	Fix link in FSDP + QLoRA docs. (#2879 ) [skip ci]	2025-07-08 09:19:09 -04:00
Wing Lian	d68cc1e8ab	densemixer plugin integration (#2868 ) * densemixer plugin integration * update readme with usage docs * automatically find new integrations that aren't explicitly defined * make sure to import os	2025-07-07 17:05:19 -04:00
NanoCode012	22d4a838dc	feat(doc): add vllm and fa2 incompat error to faq (#2877 )	2025-07-07 14:13:37 -04:00
NanoCode012	5a961ecadf	Fix: do not call preprocess in multimodal or pretraining case (#2861 ) * fix: let users know to not call preprocess for vision mode * fix: improve ux for pretraining dataset and skip prepare ds * feat: add info to doc * Update src/axolotl/cli/preprocess.py following comment Co-authored-by: salman <salman.mohammadi@outlook.com> --------- Co-authored-by: salman <salman.mohammadi@outlook.com>	2025-07-06 21:55:33 -04:00
NanoCode012	bf5928d0ee	feat(doc): update docker tag examples (#2851 ) [skip ci] * feat(doc): update docker tag examples * chore: comment	2025-07-02 08:05:01 -04:00
NanoCode012	927bf530bc	fix(doc): default messages example used wrong key (#2832 ) * fix(doc): default messages example used wrong key * feat: add links to SP, multi-gpu, multi-node on readme	2025-06-26 10:47:31 -04:00
NanoCode012	26c39e1ca7	fix(doc): address exitcode formatting to help search (#2809 ) [skip ci]	2025-06-19 11:19:52 -04:00
NanoCode012	0bb9077553	Fix: logging on py310 (#2802 ) * feat: encourage py311 * fix: logging import on py310 * fix: do upper and simplify handling	2025-06-18 15:46:27 -04:00
Dan Saunders	9d5bfc127e	Config doc autogen (#2718 ) * config reference doc autogen * improvements * cleanup; still ugly but working * reformat * remove autogen config ref from git * factor out validations * rewrite * rewrite * cleanup * progress * progress * progress * lint and minifying somewhat * remove unneeded * coderabbit * coderabbit * update preview-docs workflow triggers * installing with deps * coderabbit * update refs * overwrote file accidentally	2025-06-18 15:36:53 -04:00
NanoCode012	a3c82e8cbb	fix: grpo doc link (#2788 ) [skip ci]	2025-06-13 12:03:47 -07:00
NanoCode012	eac4a61f55	Feat: Add Magistral and mistral-common tokenizer support (#2780 )	2025-06-12 19:18:33 -04:00
salman	3634d8ff9d	QAT docfix (#2778 ) [skip ci] * nits * Update docs/qat.qmd Co-authored-by: NanoCode012 <nano@axolotl.ai> --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2025-06-12 13:22:40 -04:00
Wing Lian	581dd324cc	build base images for torch 2.7.1 (#2764 ) * build base images for torch 2.7.1 * fix: update base docker to use torch 2.7.1 * fix: update doc for main base to use 2.7.1 * make sure to install fa2 in base uv too * use no build isolation for uv+flashattn * install psutil also for fa2 * longer timeout for flash attn build --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2025-06-11 17:11:06 -04:00
NanoCode012	83632f71d8	Feat: add tool calling support via tools column (#2774 ) * feat: add tool_calling field support * fix: add tests	2025-06-09 21:42:05 -07:00
Wing Lian	c67910fa6f	bump hf deps (#2735 ) [skip ci] * bump hf deps * upgrade liger-kernel too * install cce from fork for transformers fix * fix reference to vocab size in gemma3 patch * use padding_idx instead of pad_token_id * remove fixed gemma3 patch * use updated cce fork * fix local mllama cce patches w docstring * add test for multipack with trainer setup and fix trainer for trainer refactor upstream * bump modal version * guard for iterable datasetS * mllama model arch layout changed in latest transformers * fix batch sampler with drop_last * fix: address upstream vlm changes for lora * fix: update references to old lora target path * fix: remove mllama fa2 patch due to upstream fix * fix: lora kernel patch path for multimodal models * fix: removed mllama from quarto * run test for came optim on 2.6.0+ * fix fsdp2 patch and remove deprecated patch * make sure to set sequence_parallel_degree for grpo * Add SP test for GRPO * add sp to grpo config for trainer * use reward_funcs as kwarg to grpo trainer * fix the comprehension for reward funcs * reward funcs already passed in as args * init sp_group right before training * fix check for adding models to SP context * make sure to pass args to super * upgrade deepspeed * use updated trl and add reasoning flags for vllm * patch the worker --------- Co-authored-by: NanoCode012 <nano@axolotl.ai>	2025-06-05 07:20:33 -07:00
mhenrhcsen	2bf61d8e25	fix abbriviatation spelling error	2025-06-03 21:30:40 +02:00
mhenrhcsen	68788e419e	feat: add Group Relative Policy Optimization (GPRO) to RLHF documentation	2025-06-03 21:30:40 +02:00
Wing Lian	ecc719f5c7	add support for base image with uv (#2691 )	2025-06-02 12:48:55 -07:00
NanoCode012	6778856804	Fix: RL base feature parity (#2133 ) * feat: add num_proc and load from cache for rl mapping * fix: refactor sft and rl trainer to set same base args * feat: add report_to to set run name * fix: consolidate handling of fp16, bf16, tf32 kwarg * chore: consolidate eval_strat, loraplus, lr sched, max_length * fix: deprecate old types * fix: adding missing Any * fix: max_steps incorrectly set * fix: remove unnecessary datacollator kwarg insert and pop * fix: update default max_steps * fix: add missing weight_decay handling * fix: ignore max_length for grpo * feat: update CI on trainer_builder * fix: comments * improve handling of warmup/logging steps * use transformers default for logging steps, not None * fix: remove redundant override * fix: lint * feat: allow custom optim for rl methods * fix: duplicate optim setting * fix(test): set sequence_parallel_degree default in base cfg * feat: add handling for seed and SP/ring-attn config * chore: add back return typing from rebase * fix(test): use RLType directly to skip needing to validate * feat: split training builder into sub modules * fix: remove deprecated clause * chore: add missing config to doc * fix: update quarto autodoc * fix: import path for trainer builder and submodules * fix: remove redundant configs from rebase mistake * chore: simplify dynamo check * fix: optimizer_cls_and_kwargs to be passed into trainer_kwargs * fix: add missing rex from rebase * fix: move pop optimizer_cls_and_kwargs * fix: pop optimizer cls in rl too * fix: leftover bug from rebase * fix: update handling of trainer_cls in RL * fix: address pr feedback * feat: call hook_pre_create_trainer for rl * chore: lint * fix: return notimplemented for ppo * feat: moved torch compile to base and refactor collator setting * chore: remove unused importlib.util import * fix: optimizer cls not being popped * feat: move epoch setting to base * fix: catch unhandled custom optimizer * fix: remove duplicate lora plus setting * chore: refactor if condition * chore: refactor set_base_training_args into smaller modules * fix: address TrainerBuilderBase class variables to instance var * fix: add handling for beta3 and episilon2 * fix: change to pass dict via arg instead of updating dict * chore: simplify if condition * fix: force access to lr & weight decay in case not provided to early error * fix: remove log sweep * chore: refactor if condition * fix: address renamed cfg * fix: improve handling of cosine hyp * fix: remove unused params * chore: refactor * chore: clarify doc safetensors * fix: update import path to be unified following comments * fix: duplicate kwargs passed * feat: return separate trainer_kwargs * chore: refactor * chore: refactor based on comments * chore: refactor based on comments * fix: move gpustats callback to base * chore: create trainer_cls_args first based on comments * fix: ipo label smoothing passed incorrectly * feat: add optimizer parity for RL methods with test * feat: add parity for optimizer in RM/PRM and add test * fix: remove redundant function override for orpo/cpo batch metrics * fix: improve handling of dpo_label_smoothing and merge issue * fix: test fixture returning wrong field * fix: address avoid direct modify fixture * chore: minor refactor * Revert "chore: refactor" This reverts commit `99c8859eb0`. * feat: rename trainer_builder to builders --------- Co-authored-by: Wing Lian <wing@axolotl.ai>	2025-05-30 11:21:47 +07:00

199 Commits