Wing Lian fc4e37920b transformers v5 upgrade (#3272)
* Prepare for transformers v5 upgrade

* fix hf cli

* update for hf hub changes

* fix tokenizer apply_chat_template args

* remap include_tokens_per_second

* fix tps

* handle migration for warmup

* use latest hf hub

* Fix scan -> ls

* fix import

* fix for renaming of mistral common tokenizer -> backend

* update for fixed tokenization for llama

* Skip phi35 tests for now

* remove mistral patch fixed upstream in huggingface/transformers#41439

* use namespacing for patch

* don't rely on sdist for e2e tests for now

* run modal ci without waiting too

* Fix dep for ci

* fix imports

* Fix fp8 check

* fsdp2 fixes

* fix version handling

* update fsdp version tests for new v5 behavior

* Fail multigpu tests after 3 failures

* skip known v5 broken tests for now and cleanup

* bump deps

* unmark skipped test

* re-enable test_fsdp_qlora_prequant_packed test

* increase multigpu ci timeout

* skip broken gemma3 test

* reduce timeout back to the original 120min now that the hanging test is skipped

* fix unnecessary collator for pretraining with bsz=1

* fix: safe_serialization deprecated in transformers v5 rc01 (#3318)

* torch_dtype deprecated

* load model in float32 for consistency with tests

* revert some test fixtures back

* use hf cache ls instead of scan
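
(Context for the bullet above: huggingface_hub 1.x retired the old scan-cache subcommand in favor of the new hf CLI. A before/after sketch, assuming the hub CLI is on PATH:)

# huggingface_hub < 1.0
huggingface-cli scan-cache

# huggingface_hub >= 1.0
hf cache ls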

* don't strip fsdp_version

more fsdp_version fixes for v5
fix version in fsdp_config
fix aliasing
fix fsdp_version check
check fsdp_version is 2 in both places
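
(Context: these commits adjust how the FSDP version is declared and validated in Axolotl configs under transformers v5. A minimal sketch of the relevant YAML keys, appended via a shell heredoc; fsdp_version and fsdp_config are taken from the commits above, everything else stays in your existing config:)

cat >> config.yml <<'EOF'
# transformers v5 expects the FSDP version declared explicitly; Axolotl checks it is 2
fsdp_version: 2
fsdp_config:
  # ...your existing FSDP options...
EOF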

* Transformers v5 rc2 (#3347)

* bump dep

* use latest fbgemm, grab model config as part of fixture, un-skip test

* import AutoConfig

* don't need more problematic autoconfig when specifying config.json manually

* add fixtures for argilla ultrafeedback datasets

* download phi4-reasoning

* fix arg

* update tests for phi fast tokenizer changes

* use explicit model types for gemma3

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>

* fix: AutoModelForVision2Seq -> AutoModelForImageTextToText

* chore: remove duplicate

* fix: attempt fix gemma3 text mode

* chore: lint

* ga release of v5

* need property setter for name_or_path for mistral tokenizer

* vllm not compatible with transformers v5

* setter for chat_template w mistral too

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
Co-authored-by: salman <salman.mohammadi@outlook.com>

Axolotl

A Free and Open Source LLM Fine-tuning Framework


🎉 Latest Updates

  • 2025/06: Magistral with mistral-common tokenizer support has been added to Axolotl. See docs to start training your own Magistral models with Axolotl!
  • 2025/04: Llama 4 support has been added in Axolotl. See docs to start training your own Llama 4 models with Axolotl's linearized version!
  • 2025/03: Axolotl has implemented Sequence Parallelism (SP) support. Read the blog and docs to learn how to scale your context length when fine-tuning.
  • 2025/03: (Beta) Fine-tuning Multimodal models is now supported in Axolotl. Check out the docs to fine-tune your own!
  • 2025/02: Axolotl has added LoRA optimizations to reduce memory usage and improve training speed for LoRA and QLoRA in single GPU and multi-GPU training (DDP and DeepSpeed). Jump into the docs to give it a try.
  • 2025/02: Axolotl has added GRPO support. Dive into our blog and GRPO example and have some fun!
  • 2025/01: Axolotl has added Reward Modelling / Process Reward Modelling fine-tuning support. See docs.

Overview

Axolotl is a free and open-source tool designed to streamline post-training and fine-tuning for the latest large language models (LLMs).

Features:

  • Multiple Model Support: Train various models like GPT-OSS, LLaMA, Mistral, Mixtral, Pythia, and many more from the Hugging Face Hub.
  • Multimodal Training: Fine-tune vision-language models (VLMs) including LLaMA-Vision, Qwen2-VL, Pixtral, LLaVA, SmolVLM2, and audio models like Voxtral with image, video, and audio support.
  • Training Methods: Full fine-tuning, LoRA, QLoRA, GPTQ, QAT, Preference Tuning (DPO, IPO, KTO, ORPO), RL (GRPO), and Reward Modelling (RM) / Process Reward Modelling (PRM).
  • Easy Configuration: Re-use a single YAML configuration file across the full fine-tuning pipeline: dataset preprocessing, training, evaluation, quantization, and inference (a minimal config sketch follows this list).
  • Performance Optimizations: Multipacking, Flash Attention, Xformers, Flex Attention, Liger Kernel, Cut Cross Entropy, Sequence Parallelism (SP), LoRA optimizations, Multi-GPU training (FSDP1, FSDP2, DeepSpeed), Multi-node training (Torchrun, Ray), and many more!
  • Flexible Dataset Handling: Load from local, HuggingFace, and cloud (S3, Azure, GCP, OCI) datasets.
  • Cloud Ready: We ship Docker images and also PyPI packages for use on cloud platforms and local hardware.
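
As referenced in the configuration bullet above, here is a minimal sketch of such a YAML, written as a shell heredoc so it can be pasted into a terminal. The model, dataset, and hyperparameters are placeholder choices; the fetched examples remain the authoritative reference for field names.

cat > lora-sketch.yml <<'EOF'
# Placeholder model and dataset -- swap in your own
base_model: NousResearch/Llama-3.2-1B
datasets:
  - path: teknium/GPT4-LLM-Cleaned
    type: alpaca
sequence_len: 2048
adapter: lora
lora_r: 16
lora_alpha: 32
lora_target_linear: true
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 1
learning_rate: 0.0002
output_dir: ./outputs/lora-sketch
EOF

axolotl train lora-sketch.yml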

🚀 Quick Start - LLM Fine-tuning in Minutes

Requirements:

  • NVIDIA GPU (Ampere or newer for bf16 and Flash Attention) or AMD GPU
  • Python 3.11
  • PyTorch ≥2.8.0
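
A quick sanity check that an environment meets these requirements (assumes the NVIDIA driver and a Python interpreter are already installed):

# Is a GPU visible to the driver?
nvidia-smi

# Python version (needs 3.11) and PyTorch version (needs >= 2.8.0)
python3 --version
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"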

Google Colab

Open In Colab

Installation

Using pip

pip3 install -U packaging==26.0 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation axolotl[flash-attn,deepspeed]

# Download example axolotl configs, deepspeed configs
axolotl fetch examples
axolotl fetch deepspeed_configs  # OPTIONAL

Using Docker

Installing with Docker can be less error-prone than installing in your own environment.

docker run --gpus '"all"' --rm -it axolotlai/axolotl:main-latest
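
If you want configs and training outputs to persist outside the container, a common pattern is to mount the current directory and work from it (the mount point below is an arbitrary choice, not a path the image requires):

docker run --gpus '"all"' --rm -it \
  -v "${PWD}":/workspace -w /workspace \
  axolotlai/axolotl:main-latest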

Other installation approaches are described here.

Cloud Providers

Your First Fine-tune

# Fetch axolotl examples
axolotl fetch examples

# Or, specify a custom path
axolotl fetch examples --dest path/to/folder

# Train a model using LoRA
axolotl train examples/llama-3/lora-1b.yml

That's it! Check out our Getting Started Guide for a more detailed walkthrough.
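
The same YAML drives the rest of the pipeline. A sketch of a typical flow; verify flag names against axolotl --help, and note that the inference path below assumes the example config's output directory:

# Optional: tokenize and pack the dataset ahead of training
axolotl preprocess examples/llama-3/lora-1b.yml

# Train, then chat with the resulting adapter
axolotl train examples/llama-3/lora-1b.yml
axolotl inference examples/llama-3/lora-1b.yml --lora-model-dir="./outputs/lora-out"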

📚 Documentation

🤝 Getting Help

🌟 Contributing

Contributions are welcome! Please see our Contributing Guide for details.

📈 Telemetry

Axolotl has opt-out telemetry that helps us understand how the project is being used and prioritize improvements. We collect basic system information, model types, and error rates—never personal data or file paths. Telemetry is enabled by default. To disable it, set AXOLOTL_DO_NOT_TRACK=1. For more details, see our telemetry documentation.
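
For example, to opt out for the current shell session before launching a run:

export AXOLOTL_DO_NOT_TRACK=1
axolotl train examples/llama-3/lora-1b.yml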

❤️ Sponsors

Interested in sponsoring? Contact us at wing@axolotl.ai

📝 Citing Axolotl

If you use Axolotl in your research or projects, please cite it as follows:

@software{axolotl,
  title = {Axolotl: Open Source LLM Post-Training},
  author = {{Axolotl maintainers and contributors}},
  url = {https://github.com/axolotl-ai-cloud/axolotl},
  license = {Apache-2.0},
  year = {2023}
}

📜 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Description
Fork of axolotl-ai-cloud/axolotl @ v0.16.1 + activeblue patches for RTX 5080 / CUDA 12.8