* feat: move to uv first
* fix: update doc to uv first
* fix: merge dev/tests into uv pyproject
* fix: update docker docs to match current config
* fix: migrate examples to readme
* fix: add llmcompressor to conflict
* feat: rec uv sync with lockfile for dev/ci
* fix: update docker docs to clarify how to use uv images
* chore: docs
* fix: use system python, no venv
* fix: set backend cpu
* fix: only set for installing pytorch step
* fix: remove unsloth kernel and installs
* fix: remove U in tests
* fix: set backend in deps too
* chore: test
* chore: comments
* fix: attempt to lock torch
* fix: workaround torch cuda and not upgraded
* fix: forgot to push
* fix: missed source
* fix: nightly upstream loralinear config
* fix: nightly phi3 long rope not work
* fix: forgot commit
* fix: test phi3 template change
* fix: no more requirements
* fix: carry over changes from new requirements to pyproject
* chore: remove lockfile per discussion
* fix: set match-runtime
* fix: remove unneeded hf hub buildtime
* fix: duplicate cache delete on nightly
* fix: torchvision being overridden
* fix: migrate to uv images
* fix: leftover from merge
* fix: simplify base readme
* fix: update assertion message to be clearer
* chore: docs
* fix: change fallback for cicd script
* fix: match against main exactly
* fix: peft 0.19.1 change
* fix: e2e test
* fix: ci
* fix: e2e test
* fix: remove unneeded debug log
* fix: cleanup
* feat: add dense gemma config and cleanup
* feat: add cce support
* update notes and set torch compile
* fix patch for new number of return vals
* fixes for gemma4
* fix packing bug
* use updated cce for mm
* fix: pass in kv cache func when avail for transformers 5.5
* feat: update examples with flex variant and readme
* gemma4 lora attention kernels
---------
Co-authored-by: Wing Lian <wing.lian@gmail.com>
Co-authored-by: Wing Lian <wing@axolotl.ai>
* nemo gym integration with grpo wip
* mostly working
* cleanup
* simplify
* update docs
* nemo gym support wip
* cleanup
* chore: lint
* address PR review and add more tests
* chore: lint
* post merge lora fixes for CI (#3536) [skip ci]
* post merge lora fixes for CI
* handle lora kernel auto-enable for moe without grouped_mm
* prefer not to import torch in schema validation
* address pr comments, add timeout, add tests
* roundup_power2_divisions not needed with newer pytorch versions (#3540)
* roundup_power2_divisions not needed with newer pytorch versions
* remove typo
* update qwen3.5 moe 35b-a3b yaml for 5090
* more bug fixes
* fix tests to match updated trainer
* don't use fa2 for hooks test
* reset plugins on the instance
* retry download
* fix references to renamed axolotl_cfg property on trainer
* Fix ref to trainer cfg
* fix: robust handling of race condition on patching check (#3543) [skip ci]
* EBFT: Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models (#3527) [skip ci]
* EBFT wip
* fixes
* more fixeS
* add missing strided module
* ebft fixes for multi-turn
* make ebft work with async
* add example for ebft w qwen3.5
* fix for split thinking and update yaml for lora over linear attention only
* enforce_eager for vllm arg in schema
* fix sync weights
* fix multi-gpu
* handle updated sig for mm
* ddp fixes
* improve multi-gpu handling, don't calculate logits, adaptive completion length
* chore: lint
* chore: lint
* support completion_mean
* Address corereview feedback
* clamp min IS ratio
* Address PR code review
* more fixes identified
* address code review
* Fix property from rebase conflict
* fix for ebft sync and update docs
* make trainer loss patch check a solo test
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* EBFT wip
* fixes
* more fixeS
* add missing strided module
* ebft fixes for multi-turn
* make ebft work with async
* add example for ebft w qwen3.5
* fix for split thinking and update yaml for lora over linear attention only
* enforce_eager for vllm arg in schema
* fix sync weights
* fix multi-gpu
* handle updated sig for mm
* ddp fixes
* improve multi-gpu handling, don't calculate logits, adaptive completion length
* chore: lint
* chore: lint
* support completion_mean
* Address corereview feedback
* clamp min IS ratio
* Address PR code review
* more fixes identified
* address code review
* Fix property from rebase conflict
* roundup_power2_divisions not needed with newer pytorch versions
* remove typo
* update qwen3.5 moe 35b-a3b yaml for 5090
* more bug fixes
* fix tests to match updated trainer
* don't use fa2 for hooks test
* reset plugins on the instance
* retry download
* fix references to renamed axolotl_cfg property on trainer
* Fix ref to trainer cfg
* optimize moe + lora
* more scattermoe optims
* selective dequant
* add correctness unit tests and benchmarks for scattermoe + lora
* handle base+lora split kernel for older moe models
* chore: lint
* fix casting for H200 and B200
* register pressure estimation and pruning for h200/b200
* use soft limit for pruning
* qkv patch for qwen3.5moe
* support text_model for qwen3.5 moe
* nesting of qwen3
* use udpated cce with zero3 support
* Fix decomposed backward for QKV and O projections
eliminates B @ A materialization in LoRA attention backward, replacing full [out, in] matmuls with two small [T, R] matmuls.
* mxfp4 axo
* import lint
* test for qat mxfp4
* config for mxfp4
* add qat:
* pass base config
* MXFakeQuantizeConfig
* lint
* tune config so it fits in 32GB VRAM
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
* chore: rename without period
* feat: add glm45 air
* feat: add doc on expert quantization
* feat: update base readme with new changes
* chore: cleanup
* chore: cleanup
* chore: cleanup
* fix: disable quantize_moe_expert on merge per comment
* chore: add kernel info to optimizations doc
* Prepare for transformers v5 upgrade
* fix hf cli
* update for hf hub changes
* fix tokenizer apply_chat_template args
* remap include_tokens_per_second
* fix tps
* handle migration for warmup
* use latest hf hub
* Fix scan -> ls
* fix import
* fix for renaming of mistral common tokenizer -> backend
* update for fixed tokenziation for llama
* Skip phi35 tests for now
* remove mistral patch fixed upstream in huggingface/transformers#41439
* use namespacing for patch
* don't rely on sdist for e2e tests for now
* run modal ci without waiting too
* Fix dep for ci
* fix imports
* Fix fp8 check
* fsdp2 fixes
* fix version handling
* update fsdp version tests for new v5 behavior
* Fail multigpu tests after 3 failures
* skip known v5 broken tests for now and cleanup
* bump deps
* unmark skipped test
* re-enable test_fsdp_qlora_prequant_packed test
* increase multigpu ci timeout
* skip broken gemma3 test
* reduce timout back to original 120min now that the hanging test is skipped
* fix for un-necessary collator for pretraining with bsz=1
* fix: safe_serialization deprecated in transformers v5 rc01 (#3318)
* torch_dtype deprecated
* load model in float32 for consistency with tests
* revert some test fixtures back
* use hf cache ls instead of scan
* don't strip fsdp_version
more fdsp_Version fixes for v5
fix version in fsdp_config
fix aliasing
fix fsdp_version check
check fsdp_version is 2 in both places
* Transformers v5 rc2 (#3347)
* bump dep
* use latest fbgemm, grab model config as part of fixture, un-skip test
* import AutoConfig
* don't need more problematic autoconfig when specifying config.json manually
* add fixtures for argilla ultrafeedback datasets
* download phi4-reasoning
* fix arg
* update tests for phi fast tokenizer changes
* use explicit model types for gemma3
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
* fix: AutoModelForVision2Seq -> AutoModelForImageTextToText
* chore: remove duplicate
* fix: attempt fix gemma3 text mode
* chore: lint
* ga release of v5
* need property setter for name_or_path for mistral tokenizer
* vllm not compatible with transformers v5
* setter for chat_template w mistral too
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
Co-authored-by: salman <salman.mohammadi@outlook.com>
* build examples readmes with quarto
* chore: formatting
* feat: dynamic build docs
* feat: add more model guides
* chore: format
* fix: collapse sidebar completely to have space for model guides
* fix: security protection for generated qmd
* fix: adjust collapse level, add new models, update links
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>