Finetune Allenai's Olmo 3 with Axolotl
Olmo 3 is a family of open-source 7B and 32B models trained by the Allen Institute for Artificial Intelligence (Ai2).
This guide shows how to fine-tune these models with Axolotl on multi-turn conversations with proper masking.
Getting started
- Install Axolotl following the installation guide.
- Install Cut Cross Entropy to reduce training VRAM usage.
- Run the finetuning example:

```bash
axolotl train examples/olmo3/olmo3-7b-qlora.yaml
```
This uses about 11.3 GiB of VRAM. Let us know how it goes. Happy finetuning! 🚀
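To give a sense of what such a config contains, below is a minimal QLoRA sketch in Axolotl's YAML format. It is illustrative only, not a copy of `examples/olmo3/olmo3-7b-qlora.yaml`: the model id, hyperparameters, and paths are assumptions, and the shipped example may differ.

```yaml
# Minimal illustrative QLoRA config; values are assumptions, not the shipped example.
base_model: allenai/Olmo-3-7B   # hypothetical model id; use the one from the example
load_in_4bit: true              # quantize the frozen base weights (QLoRA)
adapter: qlora

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true        # attach LoRA adapters to all linear layers

datasets:
  - path: your/dataset          # hypothetical; see the dataset tips below
    type: chat_template

sequence_len: 2048
micro_batch_size: 1
gradient_accumulation_steps: 4
num_epochs: 1
optimizer: adamw_torch
learning_rate: 0.0002
output_dir: ./outputs/olmo3-qlora

# Cut Cross Entropy (step 2 above) is enabled through Axolotl's plugin system:
plugins:
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
```

Note that `adapter` and `load_in_4bit` are the two keys the tips below refer to when switching between QLoRA and full finetuning.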
Tips
- The example config can be re-used for Olmo and Olmo 2.
- You can run a full finetune by removing `adapter: qlora` and `load_in_4bit: true` from the config.
- Read more on how to load your own dataset in the docs.
- The dataset format follows the OpenAI messages format, as seen here; a sketch of a matching `datasets` section follows this list.
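For illustration, here is roughly what a `datasets` entry for your own messages-format file could look like. This is a sketch, not taken from the shipped example: the file path and field values are assumptions, so check Axolotl's dataset docs for the exact options your version supports.

```yaml
# Hypothetical datasets section for a local JSONL file where each row is
# an OpenAI-style conversation, e.g.:
# {"messages": [{"role": "user", "content": "Hi"},
#               {"role": "assistant", "content": "Hello!"}]}
datasets:
  - path: ./data/conversations.jsonl  # hypothetical path to your own file
    type: chat_template               # parse rows with the model's chat template
    field_messages: messages          # key that holds the list of turns
```

With `type: chat_template`, Axolotl applies the model's chat template and by default trains only on assistant turns, which provides the multi-turn masking mentioned at the top of this guide.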
Optimization Guides
Please check the Optimizations doc.