A powerful, flexible tool designed to streamline post-training for various AI models with enterprise-grade features and optimizations.
• Magistral with mistral-common tokenizer support has been added to Axolotl. See examples →
• Quantization-Aware Training (QAT) support has been added to Axolotl. Explore the docs →
• Llama 4 support has been added to Axolotl. See examples →
• Sequence Parallelism (SP) for scaling context length - Blog | Docs
• (Beta) Multimodal model fine-tuning - Check docs →
• Reward Modelling / Process Reward Modelling fine-tuning support has been added. See docs →
Train LLaMA, Mistral, Mixtral, Pythia, and more. Full compatibility with HuggingFace transformers causal language models.
Full fine-tuning, LoRA, QLoRA, GPTQ, QAT, Preference Tuning (DPO, IPO, KTO, ORPO), RL (GRPO), Multimodal, and Reward Modelling.
Reuse a single YAML file across dataset preprocessing, training, evaluation, quantization, and inference.
Multipacking, Flash Attention, Xformers, Flex Attention, Liger Kernel, Sequence Parallelism, and Multi-GPU training.
Load from local files, HuggingFace datasets, and cloud storage (S3, Azure, GCP, OCI).
Pre-built Docker images and PyPI packages for seamless deployment on cloud platforms and local hardware.
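As a rough illustration of the single-config workflow described above, the sketch below writes a small LoRA config and prints the commands that would reuse it at each stage. The file name `my-lora.yml`, the model and dataset identifiers, and all hyperparameter values are illustrative assumptions rather than recommended settings; the configuration docs remain the authoritative source for available keys.

```shell
#!/usr/bin/env sh
set -eu

# Write a minimal LoRA config. The file name and every value below are
# assumptions chosen for illustration only.
cat > my-lora.yml <<'EOF'
base_model: NousResearch/Llama-3.2-1B   # assumed example model
adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
datasets:
  - path: teknium/GPT4-LLM-Cleaned      # assumed example dataset
    type: alpaca
sequence_len: 2048
sample_packing: true
micro_batch_size: 2
gradient_accumulation_steps: 2
num_epochs: 1
learning_rate: 0.0002
output_dir: ./outputs/lora-out
EOF

# The same file is passed to each stage. The commands are only printed here,
# not executed, so the sketch runs without Axolotl or a GPU.
for stage in preprocess train inference; do
  echo "axolotl ${stage} my-lora.yml"
done
```

Keeping every stage pointed at one file means a change such as a new `sequence_len` only has to be made once.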
# Install dependencies
pip3 install -U packaging==23.2 setuptools==75.8.0 wheel ninja
# Install Axolotl with Flash Attention and DeepSpeed
pip3 install --no-build-isolation "axolotl[flash-attn,deepspeed]"
# Download examples and configs
axolotl fetch examples
axolotl fetch deepspeed_configs # OPTIONAL
Other installation methods are available in our documentation.
# Fetch examples
axolotl fetch examples
# Or specify custom path
axolotl fetch examples --dest path/to/folder
# Start training with LoRA
axolotl train examples/llama-3/lora-1b.yml
That's it! Check our Getting Started Guide for a detailed walkthrough.
Detailed setup instructions for different environments
Full configuration options and examples
Loading datasets from various sources
Supported formats and usage instructions
Scale your training across multiple GPUs
Distributed training across multiple machines
Efficient batch packing for training
Auto-generated code documentation
Frequently asked questions
We welcome contributions from the community! Whether it's bug fixes,