axolotl

NanoCode012 6778856804 Fix: RL base feature parity (#2133)

* feat: add num_proc and load from cache for rl mapping

* fix: refactor sft and rl trainer to set same base args

* feat: add report_to to set run name

* fix: consolidate handling of fp16, bf16, tf32 kwarg

* chore: consolidate eval_strat, loraplus, lr sched, max_length

* fix: deprecate old types

* fix: adding missing Any

* fix: max_steps incorrectly set

* fix: remove unnecessary datacollator kwarg insert and pop

* fix: update default max_steps

* fix: add missing weight_decay handling

* fix: ignore max_length for grpo

* feat: update CI on trainer_builder

* fix: comments

* improve handling of warmup/logging steps

* use transformers default for logging steps, not None

* fix: remove redundant override

* fix: lint

* feat: allow custom optim for rl methods

* fix: duplicate optim setting

* fix(test): set sequence_parallel_degree default in base cfg

* feat: add handling for seed and SP/ring-attn config

* chore: add back return typing from rebase

* fix(test): use RLType directly to skip needing to validate

* feat: split training builder into sub modules

* fix: remove deprecated clause

* chore: add missing config to doc

* fix: update quarto autodoc

* fix: import path for trainer builder and submodules

* fix: remove redundant configs from rebase mistake

* chore: simplify dynamo check

* fix: optimizer_cls_and_kwargs to be passed into trainer_kwargs

* fix: add missing rex from rebase

* fix: move pop optimizer_cls_and_kwargs

* fix: pop optimizer cls in rl too

* fix: leftover bug from rebase

* fix: update handling of trainer_cls in RL

* fix: address pr feedback

* feat: call hook_pre_create_trainer for rl

* chore: lint

* fix: return notimplemented for ppo

* feat: moved torch compile to base and refactor collator setting

* chore: remove unused importlib.util import

* fix: optimizer cls not being popped

* feat: move epoch setting to base

* fix: catch unhandled custom optimizer

* fix: remove duplicate lora plus setting

* chore: refactor if condition

* chore: refactor set_base_training_args into smaller modules

* fix: address TrainerBuilderBase class variables to instance var

* fix: add handling for beta3 and episilon2

* fix: change to pass dict via arg instead of updating dict

* chore: simplify if condition

* fix: force access to lr & weight decay in case not provided to early error

* fix: remove log sweep

* chore: refactor if condition

* fix: address renamed cfg

* fix: improve handling of cosine hyp

* fix: remove unused params

* chore: refactor

* chore: clarify doc safetensors

* fix: update import path to be unified following comments

* fix: duplicate kwargs passed

* feat: return separate trainer_kwargs

* chore: refactor

* chore: refactor based on comments

* chore: refactor based on comments

* fix: move gpustats callback to base

* chore: create trainer_cls_args first based on comments

* fix: ipo label smoothing passed incorrectly

* feat: add optimizer parity for RL methods with test

* feat: add parity for optimizer in RM/PRM and add test

* fix: remove redundant function override for orpo/cpo batch metrics

* fix: improve handling of dpo_label_smoothing and merge issue

* fix: test fixture returning wrong field

* fix: address avoid direct modify fixture

* chore: minor refactor

* Revert "chore: refactor"

This reverts commit 99c8859eb0.

* feat: rename trainer_builder to builders

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>

2025-05-30 11:21:47 +07:00

Name	Last commit	Date
.github	no need to generate diff file (#2728)	2025-05-27 11:44:06 -04:00
.runpod	feat(doc): add info on how to use dapo / dr grpo and misc doc fixes (#2673) [skip ci]	2025-05-28 15:51:04 +07:00
.vscode	feat: enable trl's autounwrap (#1060)	2024-01-11 08:43:41 -05:00
cicd	GRPO fixes (peft) (#2676)	2025-05-16 15:47:03 -04:00
deepspeed_configs	add deepspeed example with torch compile enabled (#2212) [skip ci]	2024-12-22 12:11:39 -05:00
devtools	remove fastchat and sharegpt (#2021)	2024-11-08 13:45:49 -05:00
docker	add base docker image with pytorch 2.7.0 and variant for cuda 12.8 (#2551)	2025-04-23 14:59:03 -04:00
docs	Fix: RL base feature parity (#2133)	2025-05-30 11:21:47 +07:00
examples	Rank 0-only logging (#2608)	2025-05-28 14:57:30 +01:00
image	Readme updates v2 (#2078)	2024-11-18 14:58:03 -05:00
scripts	feat: update cce to latest (#2521)	2025-04-15 22:17:10 -07:00
src	Fix: RL base feature parity (#2133)	2025-05-30 11:21:47 +07:00
tests	Fix: RL base feature parity (#2133)	2025-05-30 11:21:47 +07:00
_quarto.yml	Fix: RL base feature parity (#2133)	2025-05-30 11:21:47 +07:00
.bandit	Add bandit	2023-05-31 02:53:53 +09:00
.coveragerc	adding codecov reporting (#2372) [skip ci]	2025-04-16 15:02:17 -07:00
.editorconfig	WIP for axolotl trainer	2023-04-14 00:20:05 -04:00
.flake8	Update ignores	2023-05-31 02:53:22 +09:00
.gitattributes	make it work with pythia in the cloud	2023-04-14 07:24:55 -04:00
.gitignore	Autodoc generation with quartodoc (#2419)	2025-03-21 12:26:47 -04:00
.isort.cfg	fix: minor patches for multimodal (#2441)	2025-03-31 13:40:12 +07:00
.mypy.ini	Liger Kernel integration (#1861)	2024-08-23 12:21:51 -04:00
.pre-commit-config.yaml	chore: update pre-commit hooks (#2729)	2025-05-27 11:45:31 -04:00
.pylintrc	Fixing OSX installation (#2231)	2025-01-07 13:42:01 +00:00
CNAME	feat: add CNAME (#2513)	2025-04-10 12:34:25 +07:00
codecov.yml	update doc and use P2P=LOC for brittle grpo test (#2649)	2025-05-12 14:17:25 -04:00
docker-compose.yaml	add git environment variables to compose: avoid checkout failure error 128 on build (#534)	2023-09-08 15:59:49 -04:00
FAQS.md	Update FAQS.md	2023-06-10 23:36:14 -07:00
favicon.jpg	Bootstrap Hosted Axolotl Docs w/Quarto (#1429)	2024-03-21 22:28:36 -07:00
index.qmd	Feat(doc): Reorganize documentation, fix broken syntax, update notes (#2348)	2025-02-25 16:09:37 +07:00
LICENSE	add apache 2.0 license	2023-07-21 09:49:29 -04:00
MANIFEST.in	fix build w pyproject to respect insalled torch version (#2168)	2024-12-10 16:25:25 -05:00
pyproject.toml	chore: update doc links (#2509)	2025-04-11 09:53:18 -04:00
README.md	adding codecov reporting (#2372) [skip ci]	2025-04-16 15:02:17 -07:00
requirements-dev.txt	adding codecov reporting (#2372) [skip ci]	2025-04-16 15:02:17 -07:00
requirements-tests.txt	Codecov fixes / improvements (#2549)	2025-04-23 10:33:30 -04:00
requirements.txt	QAT (#2590)	2025-05-28 12:35:47 +01:00
setup.py	Add CAME Optimizer (#2385)	2025-05-07 10:31:46 -04:00
styles.css	Autodoc generation with quartodoc (#2419)	2025-03-21 12:26:47 -04:00
TODO.md	fdsp config dict fix, todo list, add torchdistx support	2023-04-30 13:32:07 -04:00

README.md

Axolotl is a tool designed to streamline post-training for various AI models. Post-training refers to any modifications or additional training performed on pre-trained models - including full model fine-tuning, parameter-efficient tuning (like LoRA and QLoRA), supervised fine-tuning (SFT), instruction tuning, and alignment techniques. With support for multiple model architectures and training configurations, Axolotl makes it easy to get started with these techniques.

Axolotl is designed to work with YAML config files that contain everything you need to preprocess a dataset, train or fine-tune a model, run model inference or evaluation, and much more.
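
To make the idea of a single YAML config concrete, here is a minimal sketch that writes a small LoRA config to disk. The file name lora-sketch.yml, the base model, the dataset, and every hyperparameter value are illustrative assumptions, not recommendations from this README; the key names follow the style of the bundled example configs and should be checked against the Configuration Guide.

```shell
#!/usr/bin/env bash
# Write a small LoRA fine-tuning config (all values are placeholders).
cat > lora-sketch.yml <<'EOF'
# Model and dataset (assumed examples; substitute your own)
base_model: NousResearch/Llama-3.2-1B
datasets:
  - path: teknium/GPT4-LLM-Cleaned
    type: alpaca
val_set_size: 0.1
output_dir: ./outputs/lora-out

# Adapter settings
adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true

# Training settings
sequence_len: 2048
sample_packing: true
micro_batch_size: 2
gradient_accumulation_steps: 2
num_epochs: 1
optimizer: adamw_8bit
lr_scheduler: cosine
learning_rate: 0.0002
bf16: auto
EOF

# Training would then be started with:
#   axolotl train lora-sketch.yml
```

Keeping the model, dataset, adapter, and optimizer settings in one file means a run can be reproduced by sharing that file alone.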

Features:

Train various Hugging Face models such as Llama, Pythia, Falcon, and MPT
Supports full fine-tuning, LoRA, QLoRA, ReLoRA, and GPTQ
Customize configurations using a simple YAML file or CLI overrides
Load different dataset formats, use custom formats, or bring your own tokenized datasets
Integrated with xformers, Flash Attention, Liger Kernel, RoPE scaling, and multipacking
Works with a single GPU or multiple GPUs via FSDP or DeepSpeed
Easily run with Docker locally or in the cloud
Log results and, optionally, checkpoints to wandb, MLflow, or Comet
And more!
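
As a rough illustration of how the logging and multi-GPU features above are switched on from the config, the sketch below writes a fragment with a few extra keys. The file name extras-sketch.yml and the project and run names are hypothetical, and the DeepSpeed path assumes the configs were downloaded with axolotl fetch deepspeed_configs and include a zero1.json; verify the key names against the Configuration Guide before relying on them.

```shell
#!/usr/bin/env bash
# Write a config fragment; these keys would be merged into a full training config.
cat > extras-sketch.yml <<'EOF'
# Experiment tracking with Weights & Biases (names are placeholders)
wandb_project: my-axolotl-project
wandb_name: lora-run-1

# Multi-GPU training through DeepSpeed (path assumes fetched configs)
deepspeed: deepspeed_configs/zero1.json
EOF
```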

🚀 Quick Start

Requirements:

NVIDIA GPU (Ampere or newer for bf16 and Flash Attention) or AMD GPU
Python 3.11
PyTorch ≥2.4.1
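
Before installing, it can help to confirm that the local environment meets these requirements. The following is a minimal sketch, not part of Axolotl: the helper names version_ge and check_axolotl_env are hypothetical, and the comparison relies on GNU sort -V being available.

```shell
#!/usr/bin/env bash
# Succeeds when version $1 is at least version $2.
# Local build suffixes such as "+cu124" are stripped before comparing.
version_ge() {
  local have="${1%%+*}" want="${2%%+*}"
  [ "$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)" = "$want" ]
}

# Prints the detected Python and PyTorch versions and flags an old PyTorch.
check_axolotl_env() {
  local torch_version
  python3 --version
  torch_version="$(python3 -c 'import torch; print(torch.__version__)' 2>/dev/null)" || {
    echo "PyTorch is not installed"
    return 1
  }
  if version_ge "$torch_version" "2.4.1"; then
    echo "PyTorch $torch_version meets the >=2.4.1 requirement"
  else
    echo "PyTorch $torch_version is older than 2.4.1"
    return 1
  fi
}

# Usage: call check_axolotl_env in the environment you plan to train in.
```

A version-aware sort is used because plain string comparison would rank 2.10.0 below 2.4.1.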

Installation

pip3 install -U packaging==23.2 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation "axolotl[flash-attn,deepspeed]"

# Download example axolotl configs, deepspeed configs
axolotl fetch examples
axolotl fetch deepspeed_configs  # OPTIONAL

Other installation approaches are described in the Installation Options documentation.

Your First Fine-tune

# Fetch axolotl examples
axolotl fetch examples

# Or, specify a custom path
axolotl fetch examples --dest path/to/folder

# Train a model using LoRA
axolotl train examples/llama-3/lora-1b.yml

That's it! Check out our Getting Started Guide for a more detailed walkthrough.

✨ Key Features

Multiple Model Support: Train various models like LLaMA, Mistral, Mixtral, Pythia, and more
Training Methods: Full fine-tuning, LoRA, QLoRA, and more
Easy Configuration: Simple YAML files to control your training setup
Performance Optimizations: Flash Attention, xformers, multi-GPU training
Flexible Dataset Handling: Use various formats and custom datasets
Cloud Ready: Run on cloud platforms or local hardware

📚 Documentation

Installation Options - Detailed setup instructions for different environments
Configuration Guide - Full configuration options and examples
Dataset Guide - Supported formats and how to use them
Multi-GPU Training
Multi-Node Training
Multipacking
API Reference - Auto-generated code documentation
FAQ - Frequently asked questions

🤝 Getting Help

Join our Discord community for support
Check out our Examples directory
Read our Debugging Guide
Need dedicated support? Please contact ✉️wing@axolotl.ai for options

🌟 Contributing

Contributions are welcome! Please see our Contributing Guide for details.

Supported Models

Model	fp16/fp32	lora	qlora	gptq	gptq w/flash attn	flash attn	xformers attn
llama	✅	✅	✅	✅	✅	✅	✅
Mistral	✅	✅	✅	✅	✅	✅	✅
Mixtral-MoE	✅	✅	✅	❓	❓	❓	❓
Mixtral8X22	✅	✅	✅	❓	❓	❓	❓
Pythia	✅	✅	✅	❌	❌	❌	❓
cerebras	✅	✅	✅	❌	❌	❌	❓
btlm	✅	✅	✅	❌	❌	❌	❓
mpt	✅	❌	❓	❌	❌	❌	❓
falcon	✅	✅	✅	❌	❌	❌	❓
gpt-j	✅	✅	✅	❌	❌	❓	❓
XGen	✅	❓	✅	❓	❓	❓	✅
phi	✅	✅	✅	❓	❓	❓	❓
RWKV	✅	❓	❓	❓	❓	❓	❓
Qwen	✅	✅	✅	❓	❓	❓	❓
Gemma	✅	✅	✅	❓	❓	✅	❓
Jamba	✅	✅	✅	❓	❓	✅	❓

✅: supported ❌: not supported ❓: untested

❤️ Sponsors

Thank you to our sponsors who help make Axolotl possible:

Modal - Modal lets you run jobs in the cloud, by just writing a few lines of Python. Customers use Modal to deploy Gen AI models at large scale, fine-tune large language models, run protein folding simulations, and much more.

Interested in sponsoring? Contact us at wing@axolotl.ai

📜 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.