Finetune Allenai's Olmo 3 with Axolotl
Olmo 3 is a family of open-source 7B and 32B models trained by the Allen Institute for Artificial Intelligence (Ai2).
This guide shows how to fine-tune these models with Axolotl on multi-turn conversations with proper masking.
Getting started
- Install Axolotl following the installation guide.
- Install Cut Cross Entropy to reduce training VRAM usage.
- Run the finetuning example:

```bash
axolotl train examples/olmo3/olmo3-7b-qlora.yaml
```
This uses about 11.3 GiB of VRAM. Let us know how it goes. Happy finetuning! 🚀
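To give a sense of what such a config contains, below is a minimal QLoRA sketch in Axolotl's YAML format. It is illustrative only, not a copy of `examples/olmo3/olmo3-7b-qlora.yaml`: the model id, hyperparameters, and paths are assumptions, and the shipped example may differ.

```yaml
# Minimal illustrative QLoRA config; values are assumptions, not the shipped example.
base_model: allenai/Olmo-3-7B   # hypothetical model id; use the one from the example
load_in_4bit: true              # quantize the frozen base weights (QLoRA)
adapter: qlora

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true        # attach LoRA adapters to all linear layers

datasets:
  - path: your/dataset          # hypothetical; see the dataset tips below
    type: chat_template

sequence_len: 2048
micro_batch_size: 1
gradient_accumulation_steps: 4
num_epochs: 1
optimizer: adamw_torch
learning_rate: 0.0002
output_dir: ./outputs/olmo3-qlora

# Cut Cross Entropy (step 2 above) is enabled through Axolotl's plugin system:
plugins:
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
```

Note that `adapter` and `load_in_4bit` are the two keys the tips below refer to when switching between QLoRA and full finetuning.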
Tips
- The example config can be re-used for Olmo and Olmo 2.
- You can run a full finetune by removing `adapter: qlora` and `load_in_4bit: true` from the config.
- Read more on how to load your own dataset in the docs.
- The dataset format follows the OpenAI messages format, as seen here; a sketch of a matching `datasets` section follows this list.
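For illustration, here is roughly what a `datasets` entry for your own messages-format file could look like. This is a sketch, not taken from the shipped example: the file path and field values are assumptions, so check Axolotl's dataset docs for the exact options your version supports.

```yaml
# Hypothetical datasets section for a local JSONL file where each row is
# an OpenAI-style conversation, e.g.:
# {"messages": [{"role": "user", "content": "Hi"},
#               {"role": "assistant", "content": "Hello!"}]}
datasets:
  - path: ./data/conversations.jsonl  # hypothetical path to your own file
    type: chat_template               # parse rows with the model's chat template
    field_messages: messages          # key that holds the list of turns
```

With `type: chat_template`, Axolotl applies the model's chat template and by default trains only on assistant turns, which provides the multi-turn masking mentioned at the top of this guide.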
Optimization Guides
Please check the Optimizations doc.