axolotl

Go to file

Wing Lian 78ce268848 KD Trainer w logprobs (#2303 )

* refactor trainer to prevent circular dependencies later

fix loader default
KD dataset loading and KD with logprobs
filter bad rows
make batch smaller
handle padding/collation for KD datasets
make it work
flipped the slice
cross entropy loss coefficient during KD
make sure to multiply against the correct loss
chore: lint
triton wip
no where support
v2 trial
no torch.exp inside triton kernel
no log etc
no torch.tensor
v3
fix kwarg
don't use triton for now
better rescaling for temperatures
hash for temperature too
use kd_alpha in the correct loss method
fix kd loss so it's causal (fixes repeating tokens)
var naming and add todo
chore: lint
refactor so we can easily add new loss functions
add license block
remove references to triton kd for now
handle token/logprob shifting
support for custom trainer classes from plugins
refactor kd chat template loader
move more things to kd plugin
remove moved class from import
make plugin setup concise
increase logging around loading plugins
add copyrights
remove duplicate code
more info on preprocess for kd and fix import
be a bit pickier about loading dynamic prompt strategies
kd sample packing
make loss torch script compat
support streaming for processing sft datasts?
improve iterable support
ensure that batch vs single is done properly
tweak check for batched prompt data
reward can use same batch check
fix reward trainer calls for tokenization
improve check for batched
reward model doesn't work well with batched
add kd trainer e2e test
linting
rename test files so it gets picked up
make the kd e2e fit in vram for ci and add lora version
set lora_dropout explicitly
lower lr
make sure to set tokenizer from l3 70b and save safetensors
make sure to use the correct tokenizer
fix adapter model check
make sure to use tensorboard to capture loss for checks
chore: lint
chore: lint
improve logprob masking and shift in trainer
more fixes
try tests for kd on l40s
don't shift student logits for kd
no batching for kd chat templates
make sure to truncate logprobs if there are more than top_k
change up logic so we always truncate to top_k
use iter instead of tuple
fix finding the top-k rather than assuming first position has the correct val
apply z-score scaling to kd
kd loss needs to be calculated in full precision
Always re-normalize teacher distribution
various fixes

* support for configurable top-k/softmax ordering

* add attribute check for filter rows and lint

* fix logic

* handle none case for conversion to int

* fix student logit off by one

* set kd_temp to 1.0 for test loss

* address PR feedback

2025-01-31 20:18:52 -05:00

.github

KD Trainer w logprobs (#2303 )

2025-01-31 20:18:52 -05:00

.vscode

feat: enable trl's autounwrap (#1060 )

2024-01-11 08:43:41 -05:00

cicd

KD Trainer w logprobs (#2303 )

2025-01-31 20:18:52 -05:00

deepspeed_configs

add deepspeed example with torch compile enabled (#2212 ) [skip ci]

2024-12-22 12:11:39 -05:00

devtools

remove fastchat and sharegpt (#2021 )

2024-11-08 13:45:49 -05:00

docker

Ray Train Axolotl Integration (#2251 )

2025-01-29 00:10:19 -05:00

docs

KD Trainer w logprobs (#2303 )

2025-01-31 20:18:52 -05:00

examples

updating to fused (#2293 )

2025-01-30 11:45:56 -05:00

image

Readme updates v2 (#2078 )

2024-11-18 14:58:03 -05:00

scripts

native support for modal cloud from CLI (#2237 )

2025-01-30 11:34:02 -05:00

src

KD Trainer w logprobs (#2303 )

2025-01-31 20:18:52 -05:00

tests

KD Trainer w logprobs (#2303 )

2025-01-31 20:18:52 -05:00

_quarto.yml

refactor README; hardcode links to quarto docs; add additional quarto doc pages (#2295 )

2025-01-30 12:49:21 -05:00

.bandit

Add bandit

2023-05-31 02:53:53 +09:00

.editorconfig

WIP for axolotl trainer

2023-04-14 00:20:05 -04:00

.flake8

Update ignores

2023-05-31 02:53:22 +09:00

.gitattributes

make it work with pythia in the cloud

2023-04-14 07:24:55 -04:00

.gitignore

add outputs (symlink) to gitignore [skip ci] (#2205 )

2024-12-19 20:14:43 -05:00

.isort.cfg

Comet integration (#1939 )

2024-10-09 16:03:37 -04:00

.mypy.ini

Liger Kernel integration (#1861 )

2024-08-23 12:21:51 -04:00

.pre-commit-config.yaml

Process reward models (#2241 )

2025-01-29 00:08:33 -05:00

.pylintrc

Fixing OSX installation (#2231 )

2025-01-07 13:42:01 +00:00

docker-compose.yaml

add git environment variables to compose: avoid checkout failure error 128 on build (#534 )

2023-09-08 15:59:49 -04:00

FAQS.md

Update FAQS.md

2023-06-10 23:36:14 -07:00

favicon.jpg

Bootstrap Hosted Axolotl Docs w/Quarto (#1429 )

2024-03-21 22:28:36 -07:00

index.qmd

fix toc

2024-04-03 12:05:49 -07:00

LICENSE

add apache 2.0 license

2023-07-21 09:49:29 -04:00

MANIFEST.in

fix build w pyproject to respect insalled torch version (#2168 )

2024-12-10 16:25:25 -05:00

pyproject.toml

fix build w pyproject to respect insalled torch version (#2168 )

2024-12-10 16:25:25 -05:00

README.md

refactor README; hardcode links to quarto docs; add additional quarto doc pages (#2295 )

2025-01-30 12:49:21 -05:00

requirements_env.txt

wip add new proposed message structure (#1904 )

2024-10-13 12:15:18 -04:00

requirements-dev.txt

fix optimizer reset for relora sft (#1414 )

2024-12-03 08:58:23 -05:00

requirements-tests.txt

fix optimizer reset for relora sft (#1414 )

2024-12-03 08:58:23 -05:00

requirements.txt

native support for modal cloud from CLI (#2237 )

2025-01-30 11:34:02 -05:00

setup.py

Ray Train Axolotl Integration (#2251 )

2025-01-29 00:10:19 -05:00

styles.css

refactor README; hardcode links to quarto docs; add additional quarto doc pages (#2295 )

2025-01-30 12:49:21 -05:00

TODO.md

fdsp config dict fix, todo list, add torchdistx support

2023-04-30 13:32:07 -04:00

README.md

Axolotl is a tool designed to streamline post-training for various AI models. Post-training refers to any modifications or additional training performed on pre-trained models - including full model fine-tuning, parameter-efficient tuning (like LoRA and QLoRA), supervised fine-tuning (SFT), instruction tuning, and alignment techniques. With support for multiple model architectures and training configurations, Axolotl makes it easy to get started with these techniques.

Axolotl is designed to work with YAML config files that contain everything you need to preprocess a dataset, train or fine-tune a model, run model inference or evaluation, and much more.

Features:

Train various Huggingface models such as llama, pythia, falcon, mpt
Supports fullfinetune, lora, qlora, relora, and gptq
Customize configurations using a simple yaml file or CLI overwrite
Load different dataset formats, use custom formats, or bring your own tokenized datasets
Integrated with xformers, flash attention, liger kernel, rope scaling, and multipacking
Works with single GPU or multiple GPUs via FSDP or Deepspeed
Easily run with Docker locally or on the cloud
Log results and optionally checkpoints to wandb, mlflow or Comet
And more!

🚀 Quick Start

Requirements:

NVIDIA GPU (Ampere or newer for bf16 and Flash Attention) or AMD GPU
Python ≥3.10
PyTorch ≥2.4.1

Installation

pip3 install --no-build-isolation axolotl[flash-attn,deepspeed]

# Download example axolotl configs, deepspeed configs
axolotl fetch examples
axolotl fetch deepspeed_configs  # OPTIONAL

Other installation approaches are described here.

Your First Fine-tune

# Fetch axolotl examples
axolotl fetch examples

# Or, specify a custom path
axolotl fetch examples --dest path/to/folder

# Train a model using LoRA
axolotl train examples/llama-3/lora-1b.yml

That's it! Check out our Getting Started Guide for a more detailed walkthrough.

✨ Key Features

Multiple Model Support: Train various models like LLaMA, Mistral, Mixtral, Pythia, and more
Training Methods: Full fine-tuning, LoRA, QLoRA, and more
Easy Configuration: Simple YAML files to control your training setup
Performance Optimizations: Flash Attention, xformers, multi-GPU training
Flexible Dataset Handling: Use various formats and custom datasets
Cloud Ready: Run on cloud platforms or local hardware

📚 Documentation

Installation Options - Detailed setup instructions for different environments
Configuration Guide - Full configuration options and examples
Dataset Guide - Supported formats and how to use them
Multi-GPU Training
Multi-Node Training
Multipacking
FAQ - Frequently asked questions

🤝 Getting Help

Join our Discord community for support
Check out our Examples directory
Read our Debugging Guide
Need dedicated support? Please contact ✉️wing@axolotl.ai for options

🌟 Contributing

Contributions are welcome! Please see our Contributing Guide for details.

Supported Models

	fp16/fp32	lora	qlora	gptq	gptq w/flash attn	flash attn	xformers attn
llama	✅	✅	✅	✅	✅	✅	✅
Mistral	✅	✅	✅	✅	✅	✅	✅
Mixtral-MoE	✅	✅	✅	❓	❓	❓	❓
Mixtral8X22	✅	✅	✅	❓	❓	❓	❓
Pythia	✅	✅	✅	❌	❌	❌	❓
cerebras	✅	✅	✅	❌	❌	❌	❓
btlm	✅	✅	✅	❌	❌	❌	❓
mpt	✅	❌	❓	❌	❌	❌	❓
falcon	✅	✅	✅	❌	❌	❌	❓
gpt-j	✅	✅	✅	❌	❌	❓	❓
XGen	✅	❓	✅	❓	❓	❓	✅
phi	✅	✅	✅	❓	❓	❓	❓
RWKV	✅	❓	❓	❓	❓	❓	❓
Qwen	✅	✅	✅	❓	❓	❓	❓
Gemma	✅	✅	✅	❓	❓	✅	❓
Jamba	✅	✅	✅	❓	❓	✅	❓

✅: supported ❌: not supported ❓: untested

❤️ Sponsors

Thank you to our sponsors who help make Axolotl possible:

Modal - Modal lets you run jobs in the cloud, by just writing a few lines of Python. Customers use Modal to deploy Gen AI models at large scale, fine-tune large language models, run protein folding simulations, and much more.

Interested in sponsoring? Contact us at wing@axolotl.ai

📜 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.