%%capture
# This step can take ~5-10 minutes to install dependencies
!pip install --no-build-isolation axolotl[flash-attn]>=0.9.1
-!pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@bb8d9f8"
2025/07: Voxtral with mistral-common tokenizer support has been integrated in Axolotl. Read the docs!
-
2025/07: TiledMLP support for single-GPU to multi-GPU training with DDP, DeepSpeed and FSDP support has been added to support Arctic Long Sequence Training. (ALST). See examples for using ALST with Axolotl!
-
2025/06: Magistral with mistral-common tokenizer support has been added to Axolotl. See examples to start training your own Magistral models with Axolotl!
+
2025/07:
+
+
ND Parallelism support has been added into Axolotl. Compose Context Parallelism (CP), Tensor Parallelism (TP), and Fully Sharded Data Parallelism (FSDP) within a single node and across multiple nodes. Check out the blog post for more info.
TiledMLP support for single-GPU to multi-GPU training with DDP, DeepSpeed and FSDP support has been added to support Arctic Long Sequence Training. (ALST). See examples for using ALST with Axolotl!
+
2025/05: Quantization Aware Training (QAT) support has been added to Axolotl. Explore the docs to learn more!
-
2025/04: Llama 4 support has been added in Axolotl. See examples to start training your own Llama 4 models with Axolotl’s linearized version!
2025/03: Axolotl has implemented Sequence Parallelism (SP) support. Read the blog and docs to learn how to scale your context length when fine-tuning.
+
+
+
+Expand older updates
+
+
+
2025/06: Magistral with mistral-common tokenizer support has been added to Axolotl. See examples to start training your own Magistral models with Axolotl!
+
2025/04: Llama 4 support has been added in Axolotl. See examples to start training your own Llama 4 models with Axolotl’s linearized version!
2025/03: (Beta) Fine-tuning Multimodal models is now supported in Axolotl. Check out the docs to fine-tune your own!
2025/02: Axolotl has added LoRA optimizations to reduce memory usage and improve training speed for LoRA and QLoRA in single GPU and multi-GPU training (DDP and DeepSpeed). Jump into the docs to give it a try.
2025/02: Axolotl has added GRPO support. Dive into our blog and GRPO example and have some fun!
2025/01: Axolotl has added Reward Modelling / Process Reward Modelling fine-tuning support. See docs.
+
✨ Overview
diff --git a/search.json b/search.json
index 0641d7fcd..635d672f4 100644
--- a/search.json
+++ b/search.json
@@ -18,7 +18,7 @@
"href": "index.html#latest-updates",
"title": "Axolotl",
"section": "🎉 Latest Updates",
- "text": "🎉 Latest Updates\n\n2025/07: Voxtral with mistral-common tokenizer support has been integrated in Axolotl. Read the docs!\n2025/07: TiledMLP support for single-GPU to multi-GPU training with DDP, DeepSpeed and FSDP support has been added to support Arctic Long Sequence Training. (ALST). See examples for using ALST with Axolotl!\n2025/06: Magistral with mistral-common tokenizer support has been added to Axolotl. See examples to start training your own Magistral models with Axolotl!\n2025/05: Quantization Aware Training (QAT) support has been added to Axolotl. Explore the docs to learn more!\n2025/04: Llama 4 support has been added in Axolotl. See examples to start training your own Llama 4 models with Axolotl’s linearized version!\n2025/03: Axolotl has implemented Sequence Parallelism (SP) support. Read the blog and docs to learn how to scale your context length when fine-tuning.\n2025/03: (Beta) Fine-tuning Multimodal models is now supported in Axolotl. Check out the docs to fine-tune your own!\n2025/02: Axolotl has added LoRA optimizations to reduce memory usage and improve training speed for LoRA and QLoRA in single GPU and multi-GPU training (DDP and DeepSpeed). Jump into the docs to give it a try.\n2025/02: Axolotl has added GRPO support. Dive into our blog and GRPO example and have some fun!\n2025/01: Axolotl has added Reward Modelling / Process Reward Modelling fine-tuning support. See docs.",
+ "text": "🎉 Latest Updates\n\n2025/07:\n\nND Parallelism support has been added into Axolotl. Compose Context Parallelism (CP), Tensor Parallelism (TP), and Fully Sharded Data Parallelism (FSDP) within a single node and across multiple nodes. Check out the blog post for more info.\nAxolotl adds more models: GPT-OSS, Gemma 3n, Liquid Foundation Model 2 (LFM2), and Arcee Foundation Models (AFM).\nFP8 finetuning with fp8 gather op is now possible in Axolotl via torchao. Get started here!\nVoxtral, Magistral 1.1, and Devstral with mistral-common tokenizer support has been integrated in Axolotl!\nTiledMLP support for single-GPU to multi-GPU training with DDP, DeepSpeed and FSDP support has been added to support Arctic Long Sequence Training. (ALST). See examples for using ALST with Axolotl!\n\n2025/05: Quantization Aware Training (QAT) support has been added to Axolotl. Explore the docs to learn more!\n2025/03: Axolotl has implemented Sequence Parallelism (SP) support. Read the blog and docs to learn how to scale your context length when fine-tuning.\n\n\n\nExpand older updates\n\n\n2025/06: Magistral with mistral-common tokenizer support has been added to Axolotl. See examples to start training your own Magistral models with Axolotl!\n2025/04: Llama 4 support has been added in Axolotl. See examples to start training your own Llama 4 models with Axolotl’s linearized version!\n2025/03: (Beta) Fine-tuning Multimodal models is now supported in Axolotl. Check out the docs to fine-tune your own!\n2025/02: Axolotl has added LoRA optimizations to reduce memory usage and improve training speed for LoRA and QLoRA in single GPU and multi-GPU training (DDP and DeepSpeed). Jump into the docs to give it a try.\n2025/02: Axolotl has added GRPO support. Dive into our blog and GRPO example and have some fun!\n2025/01: Axolotl has added Reward Modelling / Process Reward Modelling fine-tuning support. See docs.",
"crumbs": [
"Home"
]
@@ -2929,7 +2929,7 @@
"href": "docs/custom_integrations.html#cut-cross-entropy",
"title": "Custom Integrations",
"section": "Cut Cross Entropy",
- "text": "Cut Cross Entropy\nCut Cross Entropy (CCE) reduces VRAM usage through optimization on the cross-entropy operation during loss calculation.\nSee https://github.com/apple/ml-cross-entropy\n\nRequirements\n\nPyTorch 2.4.0 or higher\n\n\n\nInstallation\nRun the following command to install cut_cross_entropy[transformers] if you don’t have it already.\n\nIf you are in dev environment\n\npython scripts/cutcrossentropy_install.py | sh\n\nIf you are installing from pip\n\npip3 uninstall -y cut-cross-entropy && pip3 install \"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@bb8d9f8\"\n\n\nUsage\nplugins:\n - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin\n\n\nSupported Models\n\narcee\ncohere\ncohere2\ngemma\ngemma2\ngemma3\ngemma3_text\ngemma3n\ngemma3n_text\nglm\nglm4\ngpt_oss\ngranite\ngranitemoe\nhunyuan_v1_dense\nhunyuan_v1_moe\nllama\nllama4\nllama4_text\nmistral\nmistral3\nmixtral\nmllama\nphi\nphi3\nphi4_multimodal\nqwen2\nqwen2_vl\nqwen2_moe\nqwen2_5_vl\nqwen3\nqwen3_moe\nsmollm3\nvoxtral\n\n\n\nCitation\n@article{wijmans2024cut,\n author = {Erik Wijmans and\n Brody Huval and\n Alexander Hertzberg and\n Vladlen Koltun and\n Philipp Kr\\\"ahenb\\\"uhl},\n title = {Cut Your Losses in Large-Vocabulary Language Models},\n journal = {arXiv},\n year = {2024},\n url = {https://arxiv.org/abs/2411.09009},\n}\nPlease see reference here",
+ "text": "Cut Cross Entropy\nCut Cross Entropy (CCE) reduces VRAM usage through optimization on the cross-entropy operation during loss calculation.\nSee https://github.com/apple/ml-cross-entropy\n\nRequirements\n\nPyTorch 2.4.0 or higher\n\n\n\nInstallation\nRun the following command to install cut_cross_entropy[transformers] if you don’t have it already.\n\nIf you are in dev environment\n\npython scripts/cutcrossentropy_install.py | sh\n\nIf you are installing from pip\n\npip3 uninstall -y cut-cross-entropy && pip3 install \"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@0ee9ee8\"\n\n\nUsage\nplugins:\n - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin\n\n\nSupported Models\n\narcee\ncohere\ncohere2\ngemma\ngemma2\ngemma3\ngemma3_text\ngemma3n\ngemma3n_text\nglm\nglm4\ngpt_oss\ngranite\ngranitemoe\nhunyuan_v1_dense\nhunyuan_v1_moe\nllama\nllama4\nllama4_text\nmistral\nmistral3\nmixtral\nmllama\nphi\nphi3\nphi4_multimodal\nqwen2\nqwen2_vl\nqwen2_moe\nqwen2_5_vl\nqwen3\nqwen3_moe\nsmollm3\nvoxtral\n\n\n\nCitation\n@article{wijmans2024cut,\n author = {Erik Wijmans and\n Brody Huval and\n Alexander Hertzberg and\n Vladlen Koltun and\n Philipp Kr\\\"ahenb\\\"uhl},\n title = {Cut Your Losses in Large-Vocabulary Language Models},\n journal = {arXiv},\n year = {2024},\n url = {https://arxiv.org/abs/2411.09009},\n}\nPlease see reference here",
"crumbs": [
"Advanced Features",
"Custom Integrations"
diff --git a/sitemap.xml b/sitemap.xml
index 3401980d2..881bc6f33 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,794 +2,794 @@
https://docs.axolotl.ai/TODO.html
- 2025-08-08T12:15:25.300Z
+ 2025-08-08T12:24:17.832Zhttps://docs.axolotl.ai/index.html
- 2025-08-08T12:15:25.321Z
+ 2025-08-08T12:24:17.854Zhttps://docs.axolotl.ai/docs/debugging.html
- 2025-08-08T12:15:25.302Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/amd_hpc.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.833Zhttps://docs.axolotl.ai/docs/api/utils.callbacks.mlflow_.html
- 2025-08-08T12:18:41.102Z
+ 2025-08-08T12:27:42.672Zhttps://docs.axolotl.ai/docs/api/monkeypatch.llama_expand_mask.html
- 2025-08-08T12:18:40.512Z
+ 2025-08-08T12:27:42.098Zhttps://docs.axolotl.ai/docs/api/loaders.patch_manager.html
- 2025-08-08T12:18:40.141Z
+ 2025-08-08T12:27:41.739Zhttps://docs.axolotl.ai/docs/api/core.chat.format.llama3x.html
- 2025-08-08T12:18:39.817Z
+ 2025-08-08T12:27:41.424Zhttps://docs.axolotl.ai/docs/api/cli.train.html
- 2025-08-08T12:18:39.874Z
+ 2025-08-08T12:27:41.480Zhttps://docs.axolotl.ai/docs/api/utils.callbacks.perplexity.html
- 2025-08-08T12:18:41.093Z
+ 2025-08-08T12:27:42.663Zhttps://docs.axolotl.ai/docs/api/core.chat.messages.html
- 2025-08-08T12:18:39.814Z
+ 2025-08-08T12:27:41.421Zhttps://docs.axolotl.ai/docs/api/utils.callbacks.lisa.html
- 2025-08-08T12:18:41.098Z
+ 2025-08-08T12:27:42.668Zhttps://docs.axolotl.ai/docs/api/cli.merge_sharded_fsdp_weights.html
- 2025-08-08T12:18:39.972Z
+ 2025-08-08T12:27:41.573Zhttps://docs.axolotl.ai/docs/api/monkeypatch.mixtral.html
- 2025-08-08T12:18:40.572Z
+ 2025-08-08T12:27:42.157Zhttps://docs.axolotl.ai/docs/api/utils.chat_templates.html
- 2025-08-08T12:18:40.610Z
+ 2025-08-08T12:27:42.194Zhttps://docs.axolotl.ai/docs/api/core.chat.format.shared.html
- 2025-08-08T12:18:39.819Z
+ 2025-08-08T12:27:41.425Zhttps://docs.axolotl.ai/docs/api/core.trainers.mixins.optimizer.html
- 2025-08-08T12:18:40.149Z
+ 2025-08-08T12:27:41.746Zhttps://docs.axolotl.ai/docs/api/utils.collators.mamba.html
- 2025-08-08T12:18:41.040Z
+ 2025-08-08T12:27:42.612Zhttps://docs.axolotl.ai/docs/api/logging_config.html
- 2025-08-08T12:18:39.763Z
+ 2025-08-08T12:27:41.370Zhttps://docs.axolotl.ai/docs/api/utils.collators.mm_chat.html
- 2025-08-08T12:18:41.045Z
+ 2025-08-08T12:27:42.616Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.completion.html
- 2025-08-08T12:18:40.274Z
+ 2025-08-08T12:27:41.867Zhttps://docs.axolotl.ai/docs/api/kernels.utils.html
- 2025-08-08T12:18:40.496Z
+ 2025-08-08T12:27:42.083Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chat_template.html
- 2025-08-08T12:18:40.308Z
+ 2025-08-08T12:27:41.900Zhttps://docs.axolotl.ai/docs/api/kernels.swiglu.html
- 2025-08-08T12:18:40.487Z
+ 2025-08-08T12:27:42.074Zhttps://docs.axolotl.ai/docs/api/common.const.html
- 2025-08-08T12:18:40.999Z
+ 2025-08-08T12:27:42.572Zhttps://docs.axolotl.ai/docs/api/cli.cloud.base.html
- 2025-08-08T12:18:39.995Z
+ 2025-08-08T12:27:41.596Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.orpo.chat_template.html
- 2025-08-08T12:18:40.372Z
+ 2025-08-08T12:27:41.963Zhttps://docs.axolotl.ai/docs/api/core.builders.rl.html
- 2025-08-08T12:18:39.778Z
+ 2025-08-08T12:27:41.386Zhttps://docs.axolotl.ai/docs/api/utils.dict.html
- 2025-08-08T12:18:40.703Z
+ 2025-08-08T12:27:42.285Zhttps://docs.axolotl.ai/docs/api/utils.schemas.integrations.html
- 2025-08-08T12:18:40.818Z
+ 2025-08-08T12:27:42.395Zhttps://docs.axolotl.ai/docs/api/core.trainers.utils.html
- 2025-08-08T12:18:40.106Z
+ 2025-08-08T12:27:41.704Zhttps://docs.axolotl.ai/docs/api/monkeypatch.trainer_fsdp_optim.html
- 2025-08-08T12:18:40.561Z
+ 2025-08-08T12:27:42.146Zhttps://docs.axolotl.ai/docs/api/cli.evaluate.html
- 2025-08-08T12:18:39.883Z
+ 2025-08-08T12:27:41.488Zhttps://docs.axolotl.ai/docs/api/core.builders.causal.html
- 2025-08-08T12:18:39.774Z
+ 2025-08-08T12:27:41.381Zhttps://docs.axolotl.ai/docs/api/monkeypatch.multipack.html
- 2025-08-08T12:18:40.506Z
+ 2025-08-08T12:27:42.093Zhttps://docs.axolotl.ai/docs/api/monkeypatch.llama_patch_multipack.html
- 2025-08-08T12:18:40.552Z
+ 2025-08-08T12:27:42.137Zhttps://docs.axolotl.ai/docs/api/cli.delinearize_llama4.html
- 2025-08-08T12:18:39.936Z
+ 2025-08-08T12:27:41.539Zhttps://docs.axolotl.ai/docs/api/utils.schemas.trl.html
- 2025-08-08T12:18:40.800Z
+ 2025-08-08T12:27:42.378Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.dpo.zephyr.html
- 2025-08-08T12:18:40.330Z
+ 2025-08-08T12:27:41.922Zhttps://docs.axolotl.ai/docs/api/integrations.kd.trainer.html
- 2025-08-08T12:18:40.986Z
+ 2025-08-08T12:27:42.559Zhttps://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_disk.html
- 2025-08-08T12:18:40.601Z
+ 2025-08-08T12:27:42.186Zhttps://docs.axolotl.ai/docs/api/utils.optimizers.adopt.html
- 2025-08-08T12:18:40.711Z
+ 2025-08-08T12:27:42.292Zhttps://docs.axolotl.ai/docs/api/monkeypatch.data.batch_dataset_fetcher.html
- 2025-08-08T12:18:40.571Z
+ 2025-08-08T12:27:42.155Zhttps://docs.axolotl.ai/docs/api/cli.cloud.modal_.html
- 2025-08-08T12:18:40.002Z
+ 2025-08-08T12:27:41.602Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_chat.html
- 2025-08-08T12:18:40.232Z
+ 2025-08-08T12:27:41.827Zhttps://docs.axolotl.ai/docs/api/utils.freeze.html
- 2025-08-08T12:18:40.632Z
+ 2025-08-08T12:27:42.216Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.bradley_terry.llama3.html
- 2025-08-08T12:18:40.376Z
+ 2025-08-08T12:27:41.967Zhttps://docs.axolotl.ai/docs/api/integrations.base.html
- 2025-08-08T12:18:40.973Z
+ 2025-08-08T12:27:42.547Zhttps://docs.axolotl.ai/docs/api/monkeypatch.unsloth_.html
- 2025-08-08T12:18:40.569Z
+ 2025-08-08T12:27:42.154Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.kto.chatml.html
- 2025-08-08T12:18:40.350Z
+ 2025-08-08T12:27:41.941Zhttps://docs.axolotl.ai/docs/api/cli.main.html
- 2025-08-08T12:18:39.866Z
+ 2025-08-08T12:27:41.472Zhttps://docs.axolotl.ai/docs/api/common.datasets.html
- 2025-08-08T12:18:41.015Z
+ 2025-08-08T12:27:42.587Zhttps://docs.axolotl.ai/docs/api/train.html
- 2025-08-08T12:18:39.676Z
+ 2025-08-08T12:27:41.286Zhttps://docs.axolotl.ai/docs/api/utils.trainer.html
- 2025-08-08T12:18:40.650Z
+ 2025-08-08T12:27:42.233Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.llama2_chat.html
- 2025-08-08T12:18:40.267Z
+ 2025-08-08T12:27:41.861Zhttps://docs.axolotl.ai/docs/api/index.html
- 2025-08-08T12:18:39.614Z
+ 2025-08-08T12:27:41.225Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.chat_template.html
- 2025-08-08T12:18:40.218Z
+ 2025-08-08T12:27:41.814Zhttps://docs.axolotl.ai/docs/api/core.training_args.html
- 2025-08-08T12:18:39.791Z
+ 2025-08-08T12:27:41.398Zhttps://docs.axolotl.ai/docs/api/kernels.quantize.html
- 2025-08-08T12:18:40.495Z
+ 2025-08-08T12:27:42.082Zhttps://docs.axolotl.ai/docs/api/convert.html
- 2025-08-08T12:18:39.711Z
+ 2025-08-08T12:27:41.320Zhttps://docs.axolotl.ai/docs/api/integrations.grokfast.optimizer.html
- 2025-08-08T12:18:40.978Z
+ 2025-08-08T12:27:42.551Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.stepwise_supervised.html
- 2025-08-08T12:18:40.284Z
+ 2025-08-08T12:27:41.877Zhttps://docs.axolotl.ai/docs/api/utils.schemas.model.html
- 2025-08-08T12:18:40.763Z
+ 2025-08-08T12:27:42.342Zhttps://docs.axolotl.ai/docs/api/utils.callbacks.qat.html
- 2025-08-08T12:18:41.112Z
+ 2025-08-08T12:27:42.682Zhttps://docs.axolotl.ai/docs/api/loaders.constants.html
- 2025-08-08T12:18:40.143Z
+ 2025-08-08T12:27:41.740Zhttps://docs.axolotl.ai/docs/api/cli.utils.sweeps.html
- 2025-08-08T12:18:40.032Z
+ 2025-08-08T12:27:41.632Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.dpo.llama3.html
- 2025-08-08T12:18:40.318Z
+ 2025-08-08T12:27:41.910Zhttps://docs.axolotl.ai/docs/api/core.datasets.transforms.chat_builder.html
- 2025-08-08T12:18:39.832Z
+ 2025-08-08T12:27:41.438Zhttps://docs.axolotl.ai/docs/api/cli.utils.fetch.html
- 2025-08-08T12:18:40.021Z
+ 2025-08-08T12:27:41.620Zhttps://docs.axolotl.ai/docs/api/core.trainers.mamba.html
- 2025-08-08T12:18:40.074Z
+ 2025-08-08T12:27:41.673Zhttps://docs.axolotl.ai/docs/api/utils.schemas.enums.html
- 2025-08-08T12:18:40.829Z
+ 2025-08-08T12:27:42.406Zhttps://docs.axolotl.ai/docs/api/utils.callbacks.profiler.html
- 2025-08-08T12:18:41.097Z
+ 2025-08-08T12:27:42.667Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.metharme.html
- 2025-08-08T12:18:40.291Z
+ 2025-08-08T12:27:41.884Zhttps://docs.axolotl.ai/docs/api/core.trainers.trl.html
- 2025-08-08T12:18:40.069Z
+ 2025-08-08T12:27:41.668Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.orcamini.html
- 2025-08-08T12:18:40.295Z
+ 2025-08-08T12:27:41.888Zhttps://docs.axolotl.ai/docs/api/utils.samplers.multipack.html
- 2025-08-08T12:18:41.086Z
+ 2025-08-08T12:27:42.657Zhttps://docs.axolotl.ai/docs/api/utils.schedulers.html
- 2025-08-08T12:18:40.678Z
+ 2025-08-08T12:27:42.260Zhttps://docs.axolotl.ai/docs/api/core.trainers.grpo.trainer.html
- 2025-08-08T12:18:40.092Z
+ 2025-08-08T12:27:41.691Zhttps://docs.axolotl.ai/docs/api/prompt_tokenizers.html
- 2025-08-08T12:18:39.753Z
+ 2025-08-08T12:27:41.361Zhttps://docs.axolotl.ai/docs/config-reference.html
- 2025-08-08T12:18:55.501Z
+ 2025-08-08T12:27:56.123Zhttps://docs.axolotl.ai/docs/multimodal.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/mixed_precision.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/unsloth.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/ray-integration.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/dataset-formats/stepwise_supervised.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/dataset-formats/template_free.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/dataset-formats/index.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/dataset-formats/pretraining.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/nd_parallelism.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/sequence_parallelism.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/inference.html
- 2025-08-08T12:15:25.304Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/fsdp_qlora.html
- 2025-08-08T12:15:25.302Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/multi-node.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/lora_optims.html
- 2025-08-08T12:15:25.304Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/getting-started.html
- 2025-08-08T12:15:25.302Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/dataset_loading.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/lr_groups.html
- 2025-08-08T12:15:25.304Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/input_output.html
- 2025-08-08T12:15:25.304Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/src/axolotl/integrations/LICENSE.html
- 2025-08-08T12:15:25.325Z
+ 2025-08-08T12:24:17.858Zhttps://docs.axolotl.ai/src/axolotl/integrations/cut_cross_entropy/ACKNOWLEDGEMENTS.html
- 2025-08-08T12:15:25.325Z
+ 2025-08-08T12:24:17.858Zhttps://docs.axolotl.ai/docs/mac.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/optimizers.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/gradient_checkpointing.html
- 2025-08-08T12:15:25.302Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/qat.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/faq.html
- 2025-08-08T12:15:25.302Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/dataset_preprocessing.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/nccl.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/cli.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.833Zhttps://docs.axolotl.ai/docs/torchao.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/multi-gpu.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/rlhf.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/dataset-formats/tokenized.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/dataset-formats/conversation.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/dataset-formats/inst_tune.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/reward_modelling.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/docker.html
- 2025-08-08T12:15:25.302Z
+ 2025-08-08T12:24:17.834Zhttps://docs.axolotl.ai/docs/installation.html
- 2025-08-08T12:15:25.304Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/docs/quantize.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.838Zhttps://docs.axolotl.ai/docs/custom_integrations.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.833Zhttps://docs.axolotl.ai/docs/batch_vs_grad.html
- 2025-08-08T12:15:25.301Z
+ 2025-08-08T12:24:17.833Zhttps://docs.axolotl.ai/docs/api/cli.utils.train.html
- 2025-08-08T12:18:40.043Z
+ 2025-08-08T12:27:41.642Zhttps://docs.axolotl.ai/docs/api/cli.art.html
- 2025-08-08T12:18:39.906Z
+ 2025-08-08T12:27:41.510Zhttps://docs.axolotl.ai/docs/api/core.trainers.grpo.sampler.html
- 2025-08-08T12:18:40.105Z
+ 2025-08-08T12:27:41.703Zhttps://docs.axolotl.ai/docs/api/loaders.model.html
- 2025-08-08T12:18:40.116Z
+ 2025-08-08T12:27:41.714Zhttps://docs.axolotl.ai/docs/api/cli.preprocess.html
- 2025-08-08T12:18:39.980Z
+ 2025-08-08T12:27:41.581Zhttps://docs.axolotl.ai/docs/api/cli.utils.html
- 2025-08-08T12:18:40.003Z
+ 2025-08-08T12:27:41.604Zhttps://docs.axolotl.ai/docs/api/cli.inference.html
- 2025-08-08T12:18:39.950Z
+ 2025-08-08T12:27:41.553Zhttps://docs.axolotl.ai/docs/api/monkeypatch.btlm_attn_hijack_flash.html
- 2025-08-08T12:18:40.550Z
+ 2025-08-08T12:27:42.136Zhttps://docs.axolotl.ai/docs/api/datasets.html
- 2025-08-08T12:18:39.698Z
+ 2025-08-08T12:27:41.307Zhttps://docs.axolotl.ai/docs/api/monkeypatch.transformers_fa_utils.html
- 2025-08-08T12:18:40.567Z
+ 2025-08-08T12:27:42.152Zhttps://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_flash.html
- 2025-08-08T12:18:40.502Z
+ 2025-08-08T12:27:42.089Zhttps://docs.axolotl.ai/docs/api/monkeypatch.relora.html
- 2025-08-08T12:18:40.510Z
+ 2025-08-08T12:27:42.097Zhttps://docs.axolotl.ai/docs/api/monkeypatch.stablelm_attn_hijack_flash.html
- 2025-08-08T12:18:40.558Z
+ 2025-08-08T12:27:42.143Zhttps://docs.axolotl.ai/docs/api/loaders.adapter.html
- 2025-08-08T12:18:40.132Z
+ 2025-08-08T12:27:41.729Zhttps://docs.axolotl.ai/docs/api/core.trainers.dpo.trainer.html
- 2025-08-08T12:18:40.081Z
+ 2025-08-08T12:27:41.680Zhttps://docs.axolotl.ai/docs/api/integrations.cut_cross_entropy.args.html
- 2025-08-08T12:18:40.977Z
+ 2025-08-08T12:27:42.550Zhttps://docs.axolotl.ai/docs/api/monkeypatch.utils.html
- 2025-08-08T12:18:40.549Z
+ 2025-08-08T12:27:42.134Zhttps://docs.axolotl.ai/docs/api/loaders.processor.html
- 2025-08-08T12:18:40.126Z
+ 2025-08-08T12:27:41.724Zhttps://docs.axolotl.ai/docs/api/cli.config.html
- 2025-08-08T12:18:39.931Z
+ 2025-08-08T12:27:41.534Zhttps://docs.axolotl.ai/docs/api/integrations.liger.args.html
- 2025-08-08T12:18:40.989Z
+ 2025-08-08T12:27:42.562Zhttps://docs.axolotl.ai/docs/api/loaders.tokenizer.html
- 2025-08-08T12:18:40.124Z
+ 2025-08-08T12:27:41.722Zhttps://docs.axolotl.ai/docs/api/utils.schemas.config.html
- 2025-08-08T12:18:40.756Z
+ 2025-08-08T12:27:42.335Zhttps://docs.axolotl.ai/docs/api/utils.ctx_managers.sequence_parallel.html
- 2025-08-08T12:18:40.183Z
+ 2025-08-08T12:27:41.779Zhttps://docs.axolotl.ai/docs/api/core.trainers.mixins.scheduler.html
- 2025-08-08T12:18:40.159Z
+ 2025-08-08T12:27:41.756Zhttps://docs.axolotl.ai/docs/api/core.trainers.base.html
- 2025-08-08T12:18:40.054Z
+ 2025-08-08T12:27:41.653Zhttps://docs.axolotl.ai/docs/api/cli.utils.args.html
- 2025-08-08T12:18:40.015Z
+ 2025-08-08T12:27:41.615Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.messages.chat.html
- 2025-08-08T12:18:40.306Z
+ 2025-08-08T12:27:41.899Zhttps://docs.axolotl.ai/docs/api/monkeypatch.lora_kernels.html
- 2025-08-08T12:18:40.541Z
+ 2025-08-08T12:27:42.126Zhttps://docs.axolotl.ai/docs/api/kernels.lora.html
- 2025-08-08T12:18:40.466Z
+ 2025-08-08T12:27:42.054Zhttps://docs.axolotl.ai/docs/api/cli.vllm_serve.html
- 2025-08-08T12:18:39.992Z
+ 2025-08-08T12:27:41.593Zhttps://docs.axolotl.ai/docs/api/utils.schemas.multimodal.html
- 2025-08-08T12:18:40.806Z
+ 2025-08-08T12:27:42.383Zhttps://docs.axolotl.ai/docs/api/utils.schemas.utils.html
- 2025-08-08T12:18:40.835Z
+ 2025-08-08T12:27:42.411Zhttps://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_xformers.html
- 2025-08-08T12:18:40.503Z
+ 2025-08-08T12:27:42.090Zhttps://docs.axolotl.ai/docs/api/integrations.lm_eval.args.html
- 2025-08-08T12:18:40.992Z
+ 2025-08-08T12:27:42.566Zhttps://docs.axolotl.ai/docs/api/monkeypatch.mistral_attn_hijack_flash.html
- 2025-08-08T12:18:40.505Z
+ 2025-08-08T12:27:42.092Zhttps://docs.axolotl.ai/docs/api/utils.collators.core.html
- 2025-08-08T12:18:41.017Z
+ 2025-08-08T12:27:42.589Zhttps://docs.axolotl.ai/docs/api/core.chat.format.chatml.html
- 2025-08-08T12:18:39.816Z
+ 2025-08-08T12:27:41.422Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.dpo.passthrough.html
- 2025-08-08T12:18:40.333Z
+ 2025-08-08T12:27:41.925Zhttps://docs.axolotl.ai/docs/api/core.datasets.chat.html
- 2025-08-08T12:18:39.824Z
+ 2025-08-08T12:27:41.430Zhttps://docs.axolotl.ai/docs/api/utils.bench.html
- 2025-08-08T12:18:40.624Z
+ 2025-08-08T12:27:42.208Zhttps://docs.axolotl.ai/docs/api/utils.schemas.training.html
- 2025-08-08T12:18:40.770Z
+ 2025-08-08T12:27:42.349Zhttps://docs.axolotl.ai/docs/api/utils.collators.batching.html
- 2025-08-08T12:18:41.036Z
+ 2025-08-08T12:27:42.608Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.input_output.html
- 2025-08-08T12:18:40.280Z
+ 2025-08-08T12:27:41.873Zhttps://docs.axolotl.ai/docs/api/utils.lora.html
- 2025-08-08T12:18:40.615Z
+ 2025-08-08T12:27:42.199Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.base.html
- 2025-08-08T12:18:40.185Z
+ 2025-08-08T12:27:41.781Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_w_system.html
- 2025-08-08T12:18:40.246Z
+ 2025-08-08T12:27:41.841Zhttps://docs.axolotl.ai/docs/api/utils.schemas.datasets.html
- 2025-08-08T12:18:40.788Z
+ 2025-08-08T12:27:42.366Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.dpo.user_defined.html
- 2025-08-08T12:18:40.332Z
+ 2025-08-08T12:27:41.924Zhttps://docs.axolotl.ai/docs/api/utils.schemas.peft.html
- 2025-08-08T12:18:40.797Z
+ 2025-08-08T12:27:42.375Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.pygmalion.html
- 2025-08-08T12:18:40.302Z
+ 2025-08-08T12:27:41.894Zhttps://docs.axolotl.ai/docs/api/common.architectures.html
- 2025-08-08T12:18:40.998Z
+ 2025-08-08T12:27:42.570Zhttps://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_cpu.html
- 2025-08-08T12:18:40.575Z
+ 2025-08-08T12:27:42.160Zhttps://docs.axolotl.ai/docs/api/utils.callbacks.comet_.html
- 2025-08-08T12:18:41.105Z
+ 2025-08-08T12:27:42.675Zhttps://docs.axolotl.ai/docs/api/integrations.spectrum.args.html
- 2025-08-08T12:18:40.996Z
+ 2025-08-08T12:27:42.569Zhttps://docs.axolotl.ai/docs/api/cli.quantize.html
- 2025-08-08T12:18:39.985Z
+ 2025-08-08T12:27:41.586Zhttps://docs.axolotl.ai/docs/api/cli.checks.html
- 2025-08-08T12:18:39.913Z
+ 2025-08-08T12:27:41.516Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.kto.llama3.html
- 2025-08-08T12:18:40.342Z
+ 2025-08-08T12:27:41.933Zhttps://docs.axolotl.ai/docs/api/utils.model_shard_quant.html
- 2025-08-08T12:18:40.621Z
+ 2025-08-08T12:27:42.204Zhttps://docs.axolotl.ai/docs/api/utils.quantization.html
- 2025-08-08T12:18:40.741Z
+ 2025-08-08T12:27:42.321Zhttps://docs.axolotl.ai/docs/api/core.trainers.mixins.rng_state_loader.html
- 2025-08-08T12:18:40.152Z
+ 2025-08-08T12:27:41.749Zhttps://docs.axolotl.ai/docs/api/kernels.geglu.html
- 2025-08-08T12:18:40.477Z
+ 2025-08-08T12:27:42.064Zhttps://docs.axolotl.ai/docs/api/utils.data.pretraining.html
- 2025-08-08T12:18:40.713Z
+ 2025-08-08T12:27:42.294Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.kto.user_defined.html
- 2025-08-08T12:18:40.351Z
+ 2025-08-08T12:27:41.943Zhttps://docs.axolotl.ai/docs/api/core.builders.base.html
- 2025-08-08T12:18:39.769Z
+ 2025-08-08T12:27:41.377Zhttps://docs.axolotl.ai/docs/api/cli.merge_lora.html
- 2025-08-08T12:18:39.959Z
+ 2025-08-08T12:27:41.561Zhttps://docs.axolotl.ai/docs/api/cli.utils.load.html
- 2025-08-08T12:18:40.026Z
+ 2025-08-08T12:27:41.626Zhttps://docs.axolotl.ai/docs/api/utils.data.sft.html
- 2025-08-08T12:18:40.720Z
+ 2025-08-08T12:27:42.300Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.user_defined.html
- 2025-08-08T12:18:40.254Z
+ 2025-08-08T12:27:41.849Zhttps://docs.axolotl.ai/docs/api/utils.tokenization.html
- 2025-08-08T12:18:40.608Z
+ 2025-08-08T12:27:42.193Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chatml.html
- 2025-08-08T12:18:40.329Z
+ 2025-08-08T12:27:41.921Zhttps://docs.axolotl.ai/docs/api/models.mamba.modeling_mamba.html
- 2025-08-08T12:18:41.016Z
+ 2025-08-08T12:27:42.588Zhttps://docs.axolotl.ai/docs/api/cli.args.html
- 2025-08-08T12:18:39.903Z
+ 2025-08-08T12:27:41.507Zhttps://docs.axolotl.ai/docs/api/evaluate.html
- 2025-08-08T12:18:39.687Z
+ 2025-08-08T12:27:41.296Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_instruct.html
- 2025-08-08T12:18:40.234Z
+ 2025-08-08T12:27:41.829Zhttps://docs.axolotl.ai/docs/api/utils.distributed.html
- 2025-08-08T12:18:40.698Z
+ 2025-08-08T12:27:42.279Zhttps://docs.axolotl.ai/docs/multipack.html
- 2025-08-08T12:15:25.305Z
+ 2025-08-08T12:24:17.837Zhttps://docs.axolotl.ai/examples/colab-notebooks/colab-axolotl-example.html
- 2025-08-08T12:15:25.309Z
+ 2025-08-08T12:24:17.842Zhttps://docs.axolotl.ai/FAQS.html
- 2025-08-08T12:15:25.299Z
+ 2025-08-08T12:24:17.832Z