Commit Graph

200 Commits

Author SHA1 Message Date
Dan Saunders
7d572b58d1 just grouped_mm for now 2025-09-17 13:44:26 -04:00
Dan Saunders
125e7b5fe6 fast path 2025-09-17 13:44:26 -04:00
Dan Saunders
43ada1278a moe kernels init scaffold 2025-09-17 13:44:26 -04:00
salman
e5c427f6de qat doc updates (#3162) [skip-ci] 2025-09-17 10:38:15 +01:00
salman
58d67bf98d Migrate QAT API; fix axolotl quantize for QAT-ed models; add NVFP4 (#3107) 2025-09-12 10:55:50 +01:00
yardenhoch
efa1da52d5 Center rewards coefficient (#3124)
* feat: add center_rewards_coefficient for reward modeling

- Add center_rewards_coefficient parameter to Pydantic schema with paper reference
- Pass parameter through base builder and causal builder to training args
- Add documentation section with usage examples and theoretical background
- Enable parameter in reward modeling example configs with recommended value
- Enables reward centering for improved training stability in RLHF workflows

Implements auxiliary loss from Eisenstein et al. 2023 (https://huggingface.co/papers/2312.09244)
to incentivize mean-zero reward outputs without post-training normalization.

* Update description

* test: add unit tests for center_rewards_coefficient integration

* Update src/axolotl/core/builders/base.py

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* Update docs/reward_modelling.qmd

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* Update docs/reward_modelling.qmd

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* reference to TRL documentation.

* add new reward model configuration for qwen3 with comprehensive parameters

* Verified center_rewards_coefficient is correctly passed through the trainer builder to training arguments.

* Refactor reward modeling documentation to consolidate information on center_rewards_coefficient

* Remove unit tests for center_rewards_coefficient integration as part of codebase cleanup.

* linting

* nit

* Apply suggestions from code review

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>

* lint

---------

Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
Co-authored-by: Salman Mohammadi <salman.mohammadi@outlook.com>
2025-09-03 16:22:37 -04:00
NanoCode012
e48aa8a5b1 feat(doc): improve visibility for colab notebooks (#3110) [skip ci]
* feat: improve visibility for colab notebooks

* fix: link to GH colab

* feat: change to badge and move higher
2025-09-03 01:40:53 -04:00
Dan Saunders
231a67e70b Streaming SFT support (#3101)
* working

* fixes

* deprecate --iterable; cleanup

* pretrain_multipack_buffer_size -> streaming_multipack_buffer_size

* improvements

* tests

* remove unused

* docs, examples

* nit

* nit

* add val_set_size validation

* val

* nit

* min

* coderabbito

* cleanup

* nit

* add depr warning, cleanup

* nit

* fix test, fix quarto

* fix

* review comments

* review comments

* fix
2025-09-02 12:08:44 -04:00
Wing Lian
7ed40f1d70 automatically set env vars for single gpu deepspeed zero3 (#3118) [skip ci]
* automatically set env vars for single gpu deepspeed zero3

* use setdefault
2025-08-29 13:36:47 -04:00
Dan Saunders
79ddaebe9a Add ruff, remove black, isort, flake8, pylint (#3092)
* black, isort, flake8 -> ruff

* remove unused

* add back needed import

* fix
2025-08-23 23:37:33 -04:00
Wing Lian
130ef7c51a Various fixes for VLMs (#3063)
* fix to not use batch feature indexing

* more vlm fixes

* use AutoModelForImageTextToText

* add example yaml and need num2words for chat template

* improve handling of adding image tokens to conversation

* add lfm2-vl support

* update the lfm readme

* fix markdown and add rtol for loss checks

* feat: add smolvlm2 processing strat

* fix: check for causal-conv1d in lfm models

* feat: add docs for lfm2

* feat: add new models and tips to docs

* feat: add smolvlm2 docs and remove extra dep

* chore: update docs

* feat: add video instructions

* chore: cleanup

* chore: comments

* fix: typo

* feat: add usage stats

* chore: refactor

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-08-15 10:52:57 -04:00
NanoCode012
4273d5cf7e feat: update nd parallelism readme (#3039)
Co-authored-by: salman <salman.mohammadi@outlook.com>
2025-08-08 12:45:36 +01:00
NanoCode012
ca796fb56e feat(doc): update gpt-oss readme (#3029) [skip ci]
* feat(doc): update gpt-oss readme

* fix: caps

* feat: add toolcalling section

* feat: add example tool dataset to docs

* chore: update
2025-08-07 09:26:42 -04:00
salman
46dfacf255 ND Parallel Doc Nits (#3032) 2025-08-07 10:34:26 +01:00
NanoCode012
e3177c3210 feat: add complete optimizer docs (#3017) [skip ci]
* feat: add complete optimizer docs

* fix: deprecate old torchao adamw low bit
2025-08-06 08:01:51 -04:00
Wing Lian
ab49d16e34 Dion optimizer support (#3014)
* Add support for Dion optimizer

* dion training kwargs

* fix var names

* no dion 8bit for now

* use updated axolotl-contribs-mit for dion optimizer

* add smoke test for dion optimizer

* add docs

* fix typo during edits

* fix test to not remove load in 8bit
2025-08-04 16:33:30 -04:00
NanoCode012
a54c1be972 Fix: shorten mem logs to 2 decimal places and renamed nd docs (#3011) [skip ci]
* fix: shorten memory logs

* fix: title name
2025-08-04 10:23:36 -04:00
NanoCode012
7026cd5e9e Feat: Add N-D parallelism docs (#2989)
* fix: remove non-existent file

* feat: add n-d parallel docs

* fix: comments

---------

Co-authored-by: salman <salman.mohammadi@outlook.com>
2025-08-01 13:18:31 +07:00
salman
294c7fe7a6 Distributed/ND-Parallel (#2977) 2025-07-31 15:25:02 -04:00
Dan Saunders
bb1cae1a20 CLI: add --launcher option, support launcher args, cleanup, refactor (#2924)
* add --launcher option; explicit True/False bool args; small cleanup

* refactor

* add torchrun, accelerate cli args

* add rdzv arg default + tests

* update _quarto

* coderabbit

* fix

* we can't set rdvz_id independently across nodes

* coderabbit

* fix tests
2025-07-30 15:46:56 -04:00
NanoCode012
41434f0c28 feat(doc): add all providers to readme (#2972) [skip ci]
* feat(doc): add vastai link

* feat: add cloud providers to readme for more visibility

* add prime intellect, remove Modal as sponsor

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
2025-07-27 17:03:50 -04:00
Wing Lian
0ff2f172ef Act offload lora fix (#2928) [skip ci]
* fix activation offloading with lora

* update w e2e test

* add docs for error
2025-07-24 16:10:04 -04:00
Dan Saunders
208fb7b8e7 basic torchao fp8 mixed precision training (#2926)
* debug

* debug

* debug

* revert unneeded change

* add accelerator config to base trainer builder

* add back accumulated_cache_size_limit setting

* lint

* accelerator constructor patch for single-GPU torch fp8

* lint

* re-using existing fp8 code

* lint

* remove accelerate patch now fix in latest release

* fix

* docs

* add fp8 + fsdp2 example

* remove unused config

* update config

* smoke tests

* add validator

* add 2.7.0 guard for fsdp2

* fix

* add config descriptions

* add FSDP doc link

* nit

* set force_recompute_fp8_weight_in_bwd with enable_fsdp_float8_all_gather

* better cfg for smoke tests

* add test for accelerate patching

* update fp8 validator
2025-07-22 16:27:47 -04:00
NanoCode012
dfba881e99 Feat: add gemma3n support (#2852)
* feat: add gemma3n cce

* feat: add sample config

* feat: add gemma3n multimodal mode

* feat: add audio example

* feat: support audio and return pixel values in collator

* feat: support unmask only assistant region (gemma3n for now)

* feat(doc): add notes for audio loading

* feat: add audio support for gemma3n

* feat: update examples

* feat: add gemma3n to the docs

* fix: add link at top

* feat(doc): clarify additional requirements

* fix: mllama missing aspect ratio

* fix: mllama need attention fixes for fa2

* Partially Revert "fix: mllama need attention fixes for fa2"

This reverts commit a0bfdd1777.

* fix: disable FA2 for mllama in vision mode

* feat: update configs to use proper attention

* fix: support other vision features

* feat(doc): clarify requirements for gemma3n
2025-07-22 16:52:15 +07:00
salman
e5734e5cf0 adding torchtitan link (#2945) [skip ci] 2025-07-19 13:54:14 -04:00
Wing Lian
99187cd208 Activation Offloading w CUDA Streams (#2900) [skip ci]
* use cuda streams for activation offloading

* use torch native ops

* update cfg schema for streams

* fix literal constructor for set

* use context for training step so it doesn't affect evals

* disable streams

* auto gc on eval steps

* use activation_offloading config arg

* add docs for gradient checkpointing

* handle validation for gc/ao

* use cuda streams for act offloading

* add more validation for AC w/o GC

* fix docs

* move activation_offloading lower in definition so it doesn't break args/kwargs

* fix kd due to import order
2025-07-14 20:10:20 -04:00
Jiawei Liu
7fb8441e0e fix: customized dataset with simpo (#2894) [skip ci] 2025-07-12 11:40:30 -04:00
NanoCode012
4dc5910e1c feat(doc): re-add docker 2.7.0 tag back (#2902) [skip ci] 2025-07-12 11:40:01 -04:00
salman
d6e4a611e5 FSDP1 -> FSDP2 (#2760)
* FSDP2 args migration implementation

This commit implements the migration to FSDP2 arguments including:
- FSDP2 support with LoRA training
- DPO integration with FSDP2
- Model loading fixes and refactoring
- CPU offloading and PEFT handling
- Test updates and CI improvements
- Bug fixes for dtype errors and various edge cases
2025-07-12 15:18:01 +01:00
NanoCode012
9b95a625ab feat: add devstral small 2507 (#2896)
* feat: add devstral small 2507

* chore: update blog doc
2025-07-11 09:34:19 +07:00
Wing Lian
76aeb16156 tiled_mlp supports single gpu (#2891)
* tiled_mlp supports single gpu

* use checkpoint offloading for arctic training

* patch torch checkpoint too

* support for single gpu zero3

* add linkback to where it was copied from
2025-07-09 12:48:22 -04:00
Wing Lian
c6d69d5c1b release v0.11.0 (#2875)
Some checks failed
ci-cd / build-axolotl (<nil>, 126, 12.6.3, 3.11, 2.6.0) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 126, 12.6.3, 3.11, 2.7.1) (push) Has been cancelled
ci-cd / build-axolotl (<nil>, 128, 12.8.1, 3.11, 2.7.1) (push) Has been cancelled
ci-cd / build-axolotl (vllm, 126, 12.6.3, 3.11, 2.7.0) (push) Has been cancelled
publish pypi / Create Release (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 126, 12.6.3, 3.11, 2.7.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 126, 12.6.3, 3.11, 2.7.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 126, 12.6.3, true, 3.11, 2.6.0) (push) Has been cancelled
ci-cd / build-axolotl-cloud (<nil>, 128, 12.8.1, 3.11, 2.7.1) (push) Has been cancelled
ci-cd / build-axolotl-cloud-no-tmux (<nil>, 126, 12.6.3, 3.11, 2.6.0) (push) Has been cancelled
publish pypi / Upload release to PyPI (push) Has been cancelled
* release v0.11.0

* don't build vllm into release for now

* remove 2.5.1 references

* smollm3 multipack support

* fix ordering of e2e tests
2025-07-09 09:22:35 -04:00
float-trip
1032e22650 Fix link in FSDP + QLoRA docs. (#2879) [skip ci] 2025-07-08 09:19:09 -04:00
Wing Lian
d68cc1e8ab densemixer plugin integration (#2868)
* densemixer plugin integration

* update readme with usage docs

* automatically find new integrations that aren't explicitly defined

* make sure to import os
2025-07-07 17:05:19 -04:00
NanoCode012
22d4a838dc feat(doc): add vllm and fa2 incompat error to faq (#2877) 2025-07-07 14:13:37 -04:00
NanoCode012
5a961ecadf Fix: do not call preprocess in multimodal or pretraining case (#2861)
* fix: let users know to not call preprocess for vision mode

* fix: improve ux for pretraining dataset and skip prepare ds

* feat: add info to doc

* Update src/axolotl/cli/preprocess.py following comment

Co-authored-by: salman <salman.mohammadi@outlook.com>

---------

Co-authored-by: salman <salman.mohammadi@outlook.com>
2025-07-06 21:55:33 -04:00
NanoCode012
bf5928d0ee feat(doc): update docker tag examples (#2851) [skip ci]
* feat(doc): update docker tag examples

* chore: comment
2025-07-02 08:05:01 -04:00
NanoCode012
927bf530bc fix(doc): default messages example used wrong key (#2832)
* fix(doc): default messages example used wrong key

* feat: add links to SP, multi-gpu, multi-node on readme
2025-06-26 10:47:31 -04:00
NanoCode012
26c39e1ca7 fix(doc): address exitcode formatting to help search (#2809) [skip ci] 2025-06-19 11:19:52 -04:00
NanoCode012
0bb9077553 Fix: logging on py310 (#2802)
* feat: encourage py311

* fix: logging import on py310

* fix: do upper and simplify handling
2025-06-18 15:46:27 -04:00
Dan Saunders
9d5bfc127e Config doc autogen (#2718)
* config reference doc autogen

* improvements

* cleanup; still ugly but working

* reformat

* remove autogen config ref from git

* factor out validations

* rewrite

* rewrite

* cleanup

* progress

* progress

* progress

* lint and minifying somewhat

* remove unneeded

* coderabbit

* coderabbit

* update preview-docs workflow triggers

* installing with deps

* coderabbit

* update refs

* overwrote file accidentally
2025-06-18 15:36:53 -04:00
NanoCode012
a3c82e8cbb fix: grpo doc link (#2788) [skip ci] 2025-06-13 12:03:47 -07:00
NanoCode012
eac4a61f55 Feat: Add Magistral and mistral-common tokenizer support (#2780) 2025-06-12 19:18:33 -04:00
salman
3634d8ff9d QAT docfix (#2778) [skip ci]
* nits

* Update docs/qat.qmd

Co-authored-by: NanoCode012 <nano@axolotl.ai>

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-06-12 13:22:40 -04:00
Wing Lian
581dd324cc build base images for torch 2.7.1 (#2764)
* build base images for torch 2.7.1

* fix: update base docker to use torch 2.7.1

* fix: update doc for main base to use 2.7.1

* make sure to install fa2 in base uv too

* use no build isolation for uv+flashattn

* install psutil also for fa2

* longer timeout for flash attn build

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-06-11 17:11:06 -04:00
NanoCode012
83632f71d8 Feat: add tool calling support via tools column (#2774)
* feat: add tool_calling field support

* fix: add tests
2025-06-09 21:42:05 -07:00
Wing Lian
c67910fa6f bump hf deps (#2735) [skip ci]
* bump hf deps

* upgrade liger-kernel too

* install cce from fork for transformers fix

* fix reference to vocab size in gemma3 patch

* use padding_idx instead of pad_token_id

* remove fixed gemma3 patch

* use updated cce fork

* fix local mllama cce patches w docstring

* add test for multipack with trainer setup and fix trainer for trainer refactor upstream

* bump modal version

* guard for iterable datasetS

* mllama model arch layout changed in latest transformers

* fix batch sampler with drop_last

* fix: address upstream vlm changes for lora

* fix: update references to old lora target path

* fix: remove mllama fa2 patch due to upstream fix

* fix: lora kernel patch path for multimodal models

* fix: removed mllama from quarto

* run test for came optim on 2.6.0+

* fix fsdp2 patch and remove deprecated patch

* make sure to set sequence_parallel_degree for grpo

* Add SP test for GRPO

* add sp to grpo config for trainer

* use reward_funcs as kwarg to grpo trainer

* fix the comprehension for reward funcs

* reward funcs already passed in as args

* init sp_group right before training

* fix check for adding models to SP context

* make sure to pass args to super

* upgrade deepspeed

* use updated trl and add reasoning flags for vllm

* patch the worker

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-06-05 07:20:33 -07:00
mhenrhcsen
2bf61d8e25 fix abbriviatation spelling error 2025-06-03 21:30:40 +02:00
mhenrhcsen
68788e419e feat: add Group Relative Policy Optimization (GPRO) to RLHF documentation 2025-06-03 21:30:40 +02:00
Wing Lian
ecc719f5c7 add support for base image with uv (#2691) 2025-06-02 12:48:55 -07:00