diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 580c4047c..a3a24537c 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -8,6 +8,9 @@ on: - "v*" workflow_dispatch: +permissions: + contents: read + jobs: build-axolotl: if: ${{ ! contains(github.event.commits[0].message, '[skip docker]') && github.repository_owner == 'axolotl-ai-cloud' }} diff --git a/.nojekyll b/.nojekyll index cfedc3aba..09ce6e542 100644 --- a/.nojekyll +++ b/.nojekyll @@ -1 +1 @@ -6e9883a7 \ No newline at end of file +756ab801 \ No newline at end of file diff --git a/docs/attention.html b/docs/attention.html index e3be939e8..0f42d2b5c 100644 --- a/docs/attention.html +++ b/docs/attention.html @@ -756,9 +756,11 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true}); diff --git a/search.json b/search.json index dbb4fd340..dc49843fe 100644 --- a/search.json +++ b/search.json @@ -1247,11 +1247,11 @@ ] }, { - "objectID": "docs/attention.html#flash-attention-2", - "href": "docs/attention.html#flash-attention-2", + "objectID": "docs/attention.html#flash-attention", + "href": "docs/attention.html#flash-attention", "title": "Attention", - "section": "Flash Attention 2", - "text": "Flash Attention 2\nUses efficient kernels to compute attention.\nflash_attention: true\nFor more details: Flash Attention\n\nNvidia\nRequirements: Ampere, Ada, or Hopper GPUs\nNote: For Turing GPUs or lower, please use other attention methods.\npip install flash-attn --no-build-isolation\n\n\n\n\n\n\nTip\n\n\n\nIf you get undefined symbol while training, ensure you installed PyTorch prior to Axolotl. 
Alternatively, try reinstall or downgrade a version.\n\n\n\nFlash Attention 3\nRequirements: Hopper only and CUDA 12.8 (recommended)\ngit clone https://github.com/Dao-AILab/flash-attention.git\ncd flash-attention/hopper\n\npython setup.py install\n\n\n\nAMD\nRequirements: ROCm 6.0 and above.\nSee Flash Attention AMD docs.", + "section": "Flash Attention", + "text": "Flash Attention\nAxolotl supports Flash Attention 2, 3, and 4. The best available version is used automatically\nbased on your installed packages and GPU.\nflash_attention: true\nFor more details: Flash Attention\n\nFlash Attention 2\nRequirements: Ampere, Ada, or Hopper GPUs (Turing or lower not supported)\npip install flash-attn --no-build-isolation\n\n\n\n\n\n\nTip\n\n\n\nIf you get undefined symbol while training, ensure you installed PyTorch prior to Axolotl.\nAlternatively, try reinstall or downgrade a version.\n\n\n\n\nFlash Attention 3\nRequirements: Hopper only and CUDA 12.8 (recommended)\ngit clone https://github.com/Dao-AILab/flash-attention.git\ncd flash-attention/hopper\n\npython setup.py install\n\n\nFlash Attention 4\nRequirements: Hopper or Blackwell GPUs\npip install flash-attn-4\nOr from source:\ngit clone https://github.com/Dao-AILab/flash-attention.git\ncd flash-attention/flash_attn/cute\n\npip install -e .\n\n# FA2's flash_attn package includes a cute/ stub that shadows FA4.\n# Remove it so Python can find the real FA4 module:\nrm -r $(python -c \"import flash_attn; print(flash_attn.__path__[0])\")/cute\n\n\n\n\n\n\nNote\n\n\n\nHopper (SM90) users: The backward kernel is not yet included in the pip package. To use FA4\nfor training on Hopper, install from source using the instructions above.\n\n\n\n\n\n\n\n\nWarning\n\n\n\nFA4 only supports head dimensions up to 128 (d ≤ 128). The DeepSeek shape (192, 128) is\nalso supported but only on Blackwell. 
Axolotl automatically detects incompatible head dimensions\nand falls back to FA2/3.\n\n\nFor more details: flash-attention/flash_attn/cute\n\n\nAMD\nRequirements: ROCm 6.0 and above.\nSee Flash Attention AMD docs.", "crumbs": [ "Core Concepts", "Attention" @@ -3109,7 +3109,7 @@ "href": "index.html#overview", "title": "Axolotl", "section": "✨ Overview", - "text": "✨ Overview\nAxolotl is a free and open-source tool designed to streamline post-training and fine-tuning for the latest large language models (LLMs).\nFeatures:\n\nMultiple Model Support: Train various models like GPT-OSS, LLaMA, Mistral, Mixtral, Pythia, and many more models available on the Hugging Face Hub.\nMultimodal Training: Fine-tune vision-language models (VLMs) including LLaMA-Vision, Qwen2-VL, Pixtral, LLaVA, SmolVLM2, GLM-4.6V, InternVL 3.5, Gemma 3n, and audio models like Voxtral with image, video, and audio support.\nTraining Methods: Full fine-tuning, LoRA, QLoRA, GPTQ, QAT, Preference Tuning (DPO, IPO, KTO, ORPO), RL (GRPO, GDPO), and Reward Modelling (RM) / Process Reward Modelling (PRM).\nEasy Configuration: Re-use a single YAML configuration file across the full fine-tuning pipeline: dataset preprocessing, training, evaluation, quantization, and inference.\nPerformance Optimizations: Multipacking, Flash Attention, Xformers, Flex Attention, SageAttention, Liger Kernel, Cut Cross Entropy, ScatterMoE, Sequence Parallelism (SP), LoRA optimizations, Multi-GPU training (FSDP1, FSDP2, DeepSpeed), Multi-node training (Torchrun, Ray), and many more!\nFlexible Dataset Handling: Load from local, HuggingFace, and cloud (S3, Azure, GCP, OCI) datasets.\nCloud Ready: We ship Docker images and also PyPI packages for use on cloud platforms and local hardware.", + "text": "✨ Overview\nAxolotl is a free and open-source tool designed to streamline post-training and fine-tuning for the latest large language models (LLMs).\nFeatures:\n\nMultiple Model Support: Train various models like GPT-OSS, LLaMA, 
Mistral, Mixtral, Pythia, and many more models available on the Hugging Face Hub.\nMultimodal Training: Fine-tune vision-language models (VLMs) including LLaMA-Vision, Qwen2-VL, Pixtral, LLaVA, SmolVLM2, GLM-4.6V, InternVL 3.5, Gemma 3n, and audio models like Voxtral with image, video, and audio support.\nTraining Methods: Full fine-tuning, LoRA, QLoRA, GPTQ, QAT, Preference Tuning (DPO, IPO, KTO, ORPO), RL (GRPO, GDPO), and Reward Modelling (RM) / Process Reward Modelling (PRM).\nEasy Configuration: Re-use a single YAML configuration file across the full fine-tuning pipeline: dataset preprocessing, training, evaluation, quantization, and inference.\nPerformance Optimizations: Multipacking, Flash Attention 2/3/4, Xformers, Flex Attention, SageAttention, Liger Kernel, Cut Cross Entropy, ScatterMoE, Sequence Parallelism (SP), LoRA optimizations, Multi-GPU training (FSDP1, FSDP2, DeepSpeed), Multi-node training (Torchrun, Ray), and many more!\nFlexible Dataset Handling: Load from local, HuggingFace, and cloud (S3, Azure, GCP, OCI) datasets.\nCloud Ready: We ship Docker images and also PyPI packages for use on cloud platforms and local hardware.", "crumbs": [ "Home" ] diff --git a/sitemap.xml b/sitemap.xml index 4a6d9e258..c66221807 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -2,950 +2,950 @@ https://docs.axolotl.ai/examples/colab-notebooks/colab-axolotl-example.html - 2026-03-16T02:11:28.988Z + 2026-03-16T04:14:26.693Z https://docs.axolotl.ai/src/axolotl/integrations/cut_cross_entropy/ACKNOWLEDGEMENTS.html - 2026-03-16T02:11:29.010Z + 2026-03-16T04:14:26.717Z https://docs.axolotl.ai/docs/inference.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/expert_quantization.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/installation.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/models/ministral3/think.html - 2026-03-16T02:15:32.142Z + 
2026-03-16T04:18:58.114Z https://docs.axolotl.ai/docs/models/granite4.html - 2026-03-16T02:15:32.152Z + 2026-03-16T04:18:58.124Z https://docs.axolotl.ai/docs/models/seed-oss.html - 2026-03-16T02:15:32.150Z + 2026-03-16T04:18:58.123Z https://docs.axolotl.ai/docs/models/orpheus.html - 2026-03-16T02:15:32.153Z + 2026-03-16T04:18:58.125Z https://docs.axolotl.ai/docs/models/internvl3_5.html - 2026-03-16T02:15:32.139Z + 2026-03-16T04:18:58.111Z https://docs.axolotl.ai/docs/models/magistral/vision.html - 2026-03-16T02:15:32.145Z + 2026-03-16T04:18:58.117Z https://docs.axolotl.ai/docs/models/mimo.html - 2026-03-16T02:15:32.139Z + 2026-03-16T04:18:58.110Z https://docs.axolotl.ai/docs/models/gpt-oss.html - 2026-03-16T02:15:32.150Z + 2026-03-16T04:18:58.122Z https://docs.axolotl.ai/docs/models/qwen3-next.html - 2026-03-16T02:15:32.148Z + 2026-03-16T04:18:58.120Z https://docs.axolotl.ai/docs/models/llama-2.html - 2026-03-16T02:15:32.148Z + 2026-03-16T04:18:58.120Z https://docs.axolotl.ai/docs/models/kimi-linear.html - 2026-03-16T02:15:32.138Z + 2026-03-16T04:18:58.109Z https://docs.axolotl.ai/docs/models/smolvlm2.html - 2026-03-16T02:15:32.151Z + 2026-03-16T04:18:58.123Z https://docs.axolotl.ai/docs/models/olmo3.html - 2026-03-16T02:15:32.140Z + 2026-03-16T04:18:58.112Z https://docs.axolotl.ai/docs/models/jamba.html - 2026-03-16T02:15:32.153Z + 2026-03-16T04:18:58.125Z https://docs.axolotl.ai/docs/models/mistral-small.html - 2026-03-16T02:15:32.145Z + 2026-03-16T04:18:58.117Z https://docs.axolotl.ai/docs/models/devstral.html - 2026-03-16T02:15:32.147Z + 2026-03-16T04:18:58.118Z https://docs.axolotl.ai/docs/models/index.html - 2026-03-16T02:15:32.153Z + 2026-03-16T04:18:58.125Z https://docs.axolotl.ai/docs/lora_optims.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/cli.html - 2026-03-16T02:11:28.979Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/gradient_checkpointing.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z 
https://docs.axolotl.ai/docs/dataset_preprocessing.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/docker.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/attention.html - 2026-03-16T02:11:28.979Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/api/prompt_strategies.input_output.html - 2026-03-16T02:15:10.774Z + 2026-03-16T04:18:34.775Z https://docs.axolotl.ai/docs/api/loaders.adapter.html - 2026-03-16T02:15:10.580Z + 2026-03-16T04:18:34.583Z https://docs.axolotl.ai/docs/api/monkeypatch.btlm_attn_hijack_flash.html - 2026-03-16T02:15:11.111Z + 2026-03-16T04:18:35.111Z https://docs.axolotl.ai/docs/api/monkeypatch.llama_patch_multipack.html - 2026-03-16T02:15:11.113Z + 2026-03-16T04:18:35.113Z https://docs.axolotl.ai/docs/api/core.trainers.utils.html - 2026-03-16T02:15:10.548Z + 2026-03-16T04:18:34.552Z https://docs.axolotl.ai/docs/api/utils.data.sft.html - 2026-03-16T02:15:11.331Z + 2026-03-16T04:18:35.332Z https://docs.axolotl.ai/docs/api/utils.schemas.peft.html - 2026-03-16T02:15:11.426Z + 2026-03-16T04:18:35.425Z https://docs.axolotl.ai/docs/api/monkeypatch.lora_kernels.html - 2026-03-16T02:15:11.099Z + 2026-03-16T04:18:35.099Z https://docs.axolotl.ai/docs/api/utils.collators.mamba.html - 2026-03-16T02:15:11.730Z + 2026-03-16T04:18:35.728Z https://docs.axolotl.ai/docs/api/prompt_strategies.base.html - 2026-03-16T02:15:10.656Z + 2026-03-16T04:18:34.658Z https://docs.axolotl.ai/docs/api/loaders.processor.html - 2026-03-16T02:15:10.573Z + 2026-03-16T04:18:34.576Z https://docs.axolotl.ai/docs/api/core.training_args.html - 2026-03-16T02:15:10.155Z + 2026-03-16T04:18:34.161Z https://docs.axolotl.ai/docs/api/loaders.tokenizer.html - 2026-03-16T02:15:10.571Z + 2026-03-16T04:18:34.575Z https://docs.axolotl.ai/docs/api/prompt_strategies.completion.html - 2026-03-16T02:15:10.766Z + 2026-03-16T04:18:34.767Z https://docs.axolotl.ai/docs/api/loaders.constants.html - 2026-03-16T02:15:10.604Z + 
2026-03-16T04:18:34.607Z https://docs.axolotl.ai/docs/api/cli.utils.train.html - 2026-03-16T02:15:10.465Z + 2026-03-16T04:18:34.469Z https://docs.axolotl.ai/docs/api/monkeypatch.stablelm_attn_hijack_flash.html - 2026-03-16T02:15:11.121Z + 2026-03-16T04:18:35.120Z https://docs.axolotl.ai/docs/api/utils.schemas.datasets.html - 2026-03-16T02:15:11.415Z + 2026-03-16T04:18:35.414Z https://docs.axolotl.ai/docs/api/prompt_strategies.llama2_chat.html - 2026-03-16T02:15:10.758Z + 2026-03-16T04:18:34.760Z https://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_xformers.html - 2026-03-16T02:15:11.054Z + 2026-03-16T04:18:35.054Z https://docs.axolotl.ai/docs/api/common.datasets.html - 2026-03-16T02:15:11.699Z + 2026-03-16T04:18:35.697Z https://docs.axolotl.ai/docs/api/logging_config.html - 2026-03-16T02:15:10.119Z + 2026-03-16T04:18:34.125Z https://docs.axolotl.ai/docs/api/monkeypatch.unsloth_.html - 2026-03-16T02:15:11.135Z + 2026-03-16T04:18:35.133Z https://docs.axolotl.ai/docs/api/integrations.liger.args.html - 2026-03-16T02:15:11.668Z + 2026-03-16T04:18:35.666Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.llama3.html - 2026-03-16T02:15:10.828Z + 2026-03-16T04:18:34.829Z https://docs.axolotl.ai/docs/api/core.trainers.grpo.sampler.html - 2026-03-16T02:15:10.546Z + 2026-03-16T04:18:34.550Z https://docs.axolotl.ai/docs/api/core.chat.messages.html - 2026-03-16T02:15:10.184Z + 2026-03-16T04:18:34.190Z https://docs.axolotl.ai/docs/api/integrations.base.html - 2026-03-16T02:15:11.648Z + 2026-03-16T04:18:35.646Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chat_template.html - 2026-03-16T02:15:10.815Z + 2026-03-16T04:18:34.816Z https://docs.axolotl.ai/docs/api/train.html - 2026-03-16T02:15:10.016Z + 2026-03-16T04:18:34.023Z https://docs.axolotl.ai/docs/api/utils.distributed.html - 2026-03-16T02:15:11.305Z + 2026-03-16T04:18:35.306Z https://docs.axolotl.ai/docs/api/core.builders.causal.html - 2026-03-16T02:15:10.133Z + 2026-03-16T04:18:34.139Z 
https://docs.axolotl.ai/docs/api/core.builders.rl.html - 2026-03-16T02:15:10.139Z + 2026-03-16T04:18:34.144Z https://docs.axolotl.ai/docs/api/utils.collators.core.html - 2026-03-16T02:15:11.702Z + 2026-03-16T04:18:35.700Z https://docs.axolotl.ai/docs/api/utils.schemas.model.html - 2026-03-16T02:15:11.383Z + 2026-03-16T04:18:35.382Z https://docs.axolotl.ai/docs/api/kernels.quantize.html - 2026-03-16T02:15:11.043Z + 2026-03-16T04:18:35.043Z https://docs.axolotl.ai/docs/api/utils.schemas.enums.html - 2026-03-16T02:15:11.468Z + 2026-03-16T04:18:35.467Z https://docs.axolotl.ai/docs/api/prompt_strategies.metharme.html - 2026-03-16T02:15:10.788Z + 2026-03-16T04:18:34.789Z https://docs.axolotl.ai/docs/api/utils.callbacks.lisa.html - 2026-03-16T02:15:11.802Z + 2026-03-16T04:18:35.799Z https://docs.axolotl.ai/docs/api/cli.preprocess.html - 2026-03-16T02:15:10.386Z + 2026-03-16T04:18:34.391Z https://docs.axolotl.ai/docs/api/loaders.model.html - 2026-03-16T02:15:10.560Z + 2026-03-16T04:18:34.564Z https://docs.axolotl.ai/docs/api/cli.merge_lora.html - 2026-03-16T02:15:10.362Z + 2026-03-16T04:18:34.366Z https://docs.axolotl.ai/docs/api/cli.utils.sweeps.html - 2026-03-16T02:15:10.450Z + 2026-03-16T04:18:34.455Z https://docs.axolotl.ai/docs/api/utils.bench.html - 2026-03-16T02:15:11.214Z + 2026-03-16T04:18:35.213Z https://docs.axolotl.ai/docs/api/core.trainers.mamba.html - 2026-03-16T02:15:10.509Z + 2026-03-16T04:18:34.513Z https://docs.axolotl.ai/docs/api/cli.vllm_serve.html - 2026-03-16T02:15:10.401Z + 2026-03-16T04:18:34.404Z https://docs.axolotl.ai/docs/api/kernels.utils.html - 2026-03-16T02:15:11.045Z + 2026-03-16T04:18:35.045Z https://docs.axolotl.ai/docs/api/prompt_strategies.chat_template.html - 2026-03-16T02:15:10.698Z + 2026-03-16T04:18:34.700Z https://docs.axolotl.ai/docs/api/utils.callbacks.perplexity.html - 2026-03-16T02:15:11.795Z + 2026-03-16T04:18:35.793Z https://docs.axolotl.ai/docs/api/utils.collators.mm_chat.html - 2026-03-16T02:15:11.736Z + 
2026-03-16T04:18:35.734Z https://docs.axolotl.ai/docs/api/utils.schemas.integrations.html - 2026-03-16T02:15:11.457Z + 2026-03-16T04:18:35.456Z https://docs.axolotl.ai/docs/api/prompt_strategies.orpo.chat_template.html - 2026-03-16T02:15:10.896Z + 2026-03-16T04:18:34.896Z https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_w_system.html - 2026-03-16T02:15:10.732Z + 2026-03-16T04:18:34.733Z https://docs.axolotl.ai/docs/api/cli.evaluate.html - 2026-03-16T02:15:10.268Z + 2026-03-16T04:18:34.273Z https://docs.axolotl.ai/docs/api/core.datasets.chat.html - 2026-03-16T02:15:10.196Z + 2026-03-16T04:18:34.202Z https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_chat.html - 2026-03-16T02:15:10.715Z + 2026-03-16T04:18:34.717Z https://docs.axolotl.ai/docs/api/prompt_strategies.orcamini.html - 2026-03-16T02:15:10.793Z + 2026-03-16T04:18:34.794Z https://docs.axolotl.ai/docs/api/monkeypatch.transformers_fa_utils.html - 2026-03-16T02:15:11.133Z + 2026-03-16T04:18:35.132Z https://docs.axolotl.ai/docs/api/kernels.lora.html - 2026-03-16T02:15:11.008Z + 2026-03-16T04:18:35.008Z https://docs.axolotl.ai/docs/api/utils.callbacks.profiler.html - 2026-03-16T02:15:11.800Z + 2026-03-16T04:18:35.797Z https://docs.axolotl.ai/docs/api/utils.callbacks.mlflow_.html - 2026-03-16T02:15:11.806Z + 2026-03-16T04:18:35.804Z https://docs.axolotl.ai/docs/api/utils.freeze.html - 2026-03-16T02:15:11.224Z + 2026-03-16T04:18:35.223Z https://docs.axolotl.ai/docs/api/integrations.kd.trainer.html - 2026-03-16T02:15:11.664Z + 2026-03-16T04:18:35.661Z https://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_disk.html - 2026-03-16T02:15:11.186Z + 2026-03-16T04:18:35.185Z https://docs.axolotl.ai/docs/api/utils.data.streaming.html - 2026-03-16T02:15:11.324Z + 2026-03-16T04:18:35.324Z https://docs.axolotl.ai/docs/api/prompt_tokenizers.html - 2026-03-16T02:15:10.107Z + 2026-03-16T04:18:34.113Z https://docs.axolotl.ai/docs/api/core.trainers.mixins.rng_state_loader.html - 
2026-03-16T02:15:10.616Z + 2026-03-16T04:18:34.618Z https://docs.axolotl.ai/docs/api/cli.cloud.modal_.html - 2026-03-16T02:15:10.413Z + 2026-03-16T04:18:34.417Z https://docs.axolotl.ai/docs/api/core.trainers.mixins.scheduler.html - 2026-03-16T02:15:10.624Z + 2026-03-16T04:18:34.627Z https://docs.axolotl.ai/docs/api/convert.html - 2026-03-16T02:15:10.054Z + 2026-03-16T04:18:34.061Z https://docs.axolotl.ai/docs/api/models.mamba.modeling_mamba.html - 2026-03-16T02:15:11.701Z + 2026-03-16T04:18:35.698Z https://docs.axolotl.ai/docs/api/cli.args.html - 2026-03-16T02:15:10.293Z + 2026-03-16T04:18:34.298Z https://docs.axolotl.ai/docs/api/core.chat.format.shared.html - 2026-03-16T02:15:10.190Z + 2026-03-16T04:18:34.195Z https://docs.axolotl.ai/docs/api/prompt_strategies.bradley_terry.llama3.html - 2026-03-16T02:15:10.900Z + 2026-03-16T04:18:34.901Z https://docs.axolotl.ai/docs/api/index.html - 2026-03-16T02:15:09.939Z + 2026-03-16T04:18:33.945Z https://docs.axolotl.ai/docs/fsdp_qlora.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/dataset-formats/stepwise_supervised.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/dataset-formats/template_free.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/dataset-formats/index.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/telemetry.html - 2026-03-16T02:11:28.984Z + 2026-03-16T04:14:26.688Z https://docs.axolotl.ai/docs/config-reference.html - 2026-03-16T02:15:31.274Z + 2026-03-16T04:18:57.037Z https://docs.axolotl.ai/docs/ray-integration.html - 2026-03-16T02:11:28.983Z + 2026-03-16T04:14:26.687Z https://docs.axolotl.ai/docs/streaming.html - 2026-03-16T02:11:28.984Z + 2026-03-16T04:14:26.688Z https://docs.axolotl.ai/docs/sequence_parallelism.html - 2026-03-16T02:11:28.984Z + 2026-03-16T04:14:26.688Z https://docs.axolotl.ai/docs/unsloth.html - 2026-03-16T02:11:28.984Z + 
2026-03-16T04:14:26.688Z https://docs.axolotl.ai/docs/mixed_precision.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/amd_hpc.html - 2026-03-16T02:11:28.979Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/lr_groups.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/optimizations.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.687Z https://docs.axolotl.ai/docs/mac.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/index.html - 2026-03-16T02:11:29.004Z + 2026-03-16T04:14:26.711Z https://docs.axolotl.ai/docs/optimizers.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.687Z https://docs.axolotl.ai/docs/getting-started.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/multi-node.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/input_output.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/nd_parallelism.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.687Z https://docs.axolotl.ai/docs/dataset_loading.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/quantize.html - 2026-03-16T02:11:28.983Z + 2026-03-16T04:14:26.687Z https://docs.axolotl.ai/docs/rlhf.html - 2026-03-16T02:11:28.983Z + 2026-03-16T04:14:26.687Z https://docs.axolotl.ai/docs/custom_integrations.html - 2026-03-16T02:11:28.979Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/qat.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.687Z https://docs.axolotl.ai/docs/checkpoint_saving.html - 2026-03-16T02:11:28.979Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/dataset-formats/conversation.html - 2026-03-16T02:11:28.979Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/dataset-formats/inst_tune.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z 
https://docs.axolotl.ai/docs/dataset-formats/tokenized.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/dataset-formats/pretraining.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/api/cli.main.html - 2026-03-16T02:15:10.248Z + 2026-03-16T04:18:34.253Z https://docs.axolotl.ai/docs/api/utils.schemas.trl.html - 2026-03-16T02:15:11.430Z + 2026-03-16T04:18:35.429Z https://docs.axolotl.ai/docs/api/core.datasets.transforms.chat_builder.html - 2026-03-16T02:15:10.206Z + 2026-03-16T04:18:34.211Z https://docs.axolotl.ai/docs/api/common.const.html - 2026-03-16T02:15:11.680Z + 2026-03-16T04:18:35.678Z https://docs.axolotl.ai/docs/api/cli.utils.load.html - 2026-03-16T02:15:10.443Z + 2026-03-16T04:18:34.447Z https://docs.axolotl.ai/docs/api/loaders.patch_manager.html - 2026-03-16T02:15:10.602Z + 2026-03-16T04:18:34.605Z https://docs.axolotl.ai/docs/api/utils.quantization.html - 2026-03-16T02:15:11.356Z + 2026-03-16T04:18:35.356Z https://docs.axolotl.ai/docs/api/monkeypatch.utils.html - 2026-03-16T02:15:11.110Z + 2026-03-16T04:18:35.109Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.user_defined.html - 2026-03-16T02:15:10.845Z + 2026-03-16T04:18:34.846Z https://docs.axolotl.ai/docs/api/cli.quantize.html - 2026-03-16T02:15:10.392Z + 2026-03-16T04:18:34.397Z https://docs.axolotl.ai/docs/api/prompt_strategies.user_defined.html - 2026-03-16T02:15:10.742Z + 2026-03-16T04:18:34.744Z https://docs.axolotl.ai/docs/api/integrations.lm_eval.args.html - 2026-03-16T02:15:11.672Z + 2026-03-16T04:18:35.670Z https://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_cpu.html - 2026-03-16T02:15:11.152Z + 2026-03-16T04:18:35.152Z https://docs.axolotl.ai/docs/api/utils.schedulers.html - 2026-03-16T02:15:11.280Z + 2026-03-16T04:18:35.280Z https://docs.axolotl.ai/docs/api/kernels.geglu.html - 2026-03-16T02:15:11.021Z + 2026-03-16T04:18:35.021Z 
https://docs.axolotl.ai/docs/api/monkeypatch.trainer_fsdp_optim.html - 2026-03-16T02:15:11.125Z + 2026-03-16T04:18:35.124Z https://docs.axolotl.ai/docs/api/prompt_strategies.pygmalion.html - 2026-03-16T02:15:10.801Z + 2026-03-16T04:18:34.802Z https://docs.axolotl.ai/docs/api/common.architectures.html - 2026-03-16T02:15:11.679Z + 2026-03-16T04:18:35.676Z https://docs.axolotl.ai/docs/api/cli.utils.html - 2026-03-16T02:15:10.415Z + 2026-03-16T04:18:34.419Z https://docs.axolotl.ai/docs/api/monkeypatch.llama_expand_mask.html - 2026-03-16T02:15:11.064Z + 2026-03-16T04:18:35.064Z https://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_flash.html - 2026-03-16T02:15:11.052Z + 2026-03-16T04:18:35.052Z https://docs.axolotl.ai/docs/api/prompt_strategies.kto.user_defined.html - 2026-03-16T02:15:10.870Z + 2026-03-16T04:18:34.870Z https://docs.axolotl.ai/docs/api/cli.checks.html - 2026-03-16T02:15:10.306Z + 2026-03-16T04:18:34.310Z https://docs.axolotl.ai/docs/api/core.trainers.trl.html - 2026-03-16T02:15:10.502Z + 2026-03-16T04:18:34.506Z https://docs.axolotl.ai/docs/api/utils.tokenization.html - 2026-03-16T02:15:11.194Z + 2026-03-16T04:18:35.193Z https://docs.axolotl.ai/docs/api/cli.utils.fetch.html - 2026-03-16T02:15:10.436Z + 2026-03-16T04:18:34.440Z https://docs.axolotl.ai/docs/api/core.builders.base.html - 2026-03-16T02:15:10.127Z + 2026-03-16T04:18:34.133Z https://docs.axolotl.ai/docs/api/monkeypatch.mistral_attn_hijack_flash.html - 2026-03-16T02:15:11.056Z + 2026-03-16T04:18:35.055Z https://docs.axolotl.ai/docs/api/utils.trainer.html - 2026-03-16T02:15:11.245Z + 2026-03-16T04:18:35.246Z https://docs.axolotl.ai/docs/api/cli.train.html - 2026-03-16T02:15:10.258Z + 2026-03-16T04:18:34.263Z https://docs.axolotl.ai/docs/api/core.chat.format.llama3x.html - 2026-03-16T02:15:10.188Z + 2026-03-16T04:18:34.193Z https://docs.axolotl.ai/docs/api/utils.lora.html - 2026-03-16T02:15:11.202Z + 2026-03-16T04:18:35.201Z https://docs.axolotl.ai/docs/api/utils.schemas.multimodal.html 
- 2026-03-16T02:15:11.437Z + 2026-03-16T04:18:35.436Z https://docs.axolotl.ai/docs/api/monkeypatch.mixtral.html - 2026-03-16T02:15:11.148Z + 2026-03-16T04:18:35.148Z https://docs.axolotl.ai/docs/api/core.trainers.dpo.trainer.html - 2026-03-16T02:15:10.517Z + 2026-03-16T04:18:34.521Z https://docs.axolotl.ai/docs/api/utils.samplers.multipack.html - 2026-03-16T02:15:11.787Z + 2026-03-16T04:18:35.785Z https://docs.axolotl.ai/docs/api/core.trainers.grpo.trainer.html - 2026-03-16T02:15:10.531Z + 2026-03-16T04:18:34.535Z https://docs.axolotl.ai/docs/api/utils.ctx_managers.sequence_parallel.html - 2026-03-16T02:15:10.654Z + 2026-03-16T04:18:34.656Z https://docs.axolotl.ai/docs/api/prompt_strategies.kto.chatml.html - 2026-03-16T02:15:10.867Z + 2026-03-16T04:18:34.868Z https://docs.axolotl.ai/docs/api/prompt_strategies.kto.llama3.html - 2026-03-16T02:15:10.857Z + 2026-03-16T04:18:34.858Z https://docs.axolotl.ai/docs/api/datasets.html - 2026-03-16T02:15:10.037Z + 2026-03-16T04:18:34.044Z https://docs.axolotl.ai/docs/api/integrations.grokfast.optimizer.html - 2026-03-16T02:15:11.654Z + 2026-03-16T04:18:35.652Z https://docs.axolotl.ai/docs/api/cli.art.html - 2026-03-16T02:15:10.298Z + 2026-03-16T04:18:34.302Z https://docs.axolotl.ai/docs/api/utils.callbacks.qat.html - 2026-03-16T02:15:11.819Z + 2026-03-16T04:18:35.817Z https://docs.axolotl.ai/docs/api/monkeypatch.relora.html - 2026-03-16T02:15:11.062Z + 2026-03-16T04:18:35.062Z https://docs.axolotl.ai/docs/api/prompt_strategies.messages.chat.html - 2026-03-16T02:15:10.807Z + 2026-03-16T04:18:34.807Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.passthrough.html - 2026-03-16T02:15:10.847Z + 2026-03-16T04:18:34.847Z https://docs.axolotl.ai/docs/api/core.chat.format.chatml.html - 2026-03-16T02:15:10.186Z + 2026-03-16T04:18:34.192Z https://docs.axolotl.ai/docs/api/cli.utils.args.html - 2026-03-16T02:15:10.429Z + 2026-03-16T04:18:34.434Z https://docs.axolotl.ai/docs/api/core.trainers.base.html - 2026-03-16T02:15:10.483Z + 
2026-03-16T04:18:34.487Z https://docs.axolotl.ai/docs/api/utils.schemas.training.html - 2026-03-16T02:15:11.391Z + 2026-03-16T04:18:35.391Z https://docs.axolotl.ai/docs/api/evaluate.html - 2026-03-16T02:15:10.030Z + 2026-03-16T04:18:34.036Z https://docs.axolotl.ai/docs/api/cli.config.html - 2026-03-16T02:15:10.328Z + 2026-03-16T04:18:34.332Z https://docs.axolotl.ai/docs/api/integrations.spectrum.args.html - 2026-03-16T02:15:11.677Z + 2026-03-16T04:18:35.674Z https://docs.axolotl.ai/docs/api/prompt_strategies.stepwise_supervised.html - 2026-03-16T02:15:10.779Z + 2026-03-16T04:18:34.781Z https://docs.axolotl.ai/docs/api/utils.collators.batching.html - 2026-03-16T02:15:11.725Z + 2026-03-16T04:18:35.723Z https://docs.axolotl.ai/docs/api/cli.cloud.base.html - 2026-03-16T02:15:10.405Z + 2026-03-16T04:18:34.409Z https://docs.axolotl.ai/docs/api/utils.schemas.config.html - 2026-03-16T02:15:11.374Z + 2026-03-16T04:18:35.373Z https://docs.axolotl.ai/docs/api/monkeypatch.multipack.html - 2026-03-16T02:15:11.057Z + 2026-03-16T04:18:35.057Z https://docs.axolotl.ai/docs/api/utils.dict.html - 2026-03-16T02:15:11.312Z + 2026-03-16T04:18:35.313Z https://docs.axolotl.ai/docs/api/integrations.cut_cross_entropy.args.html - 2026-03-16T02:15:11.653Z + 2026-03-16T04:18:35.651Z https://docs.axolotl.ai/docs/api/cli.delinearize_llama4.html - 2026-03-16T02:15:10.334Z + 2026-03-16T04:18:34.338Z https://docs.axolotl.ai/docs/api/cli.merge_sharded_fsdp_weights.html - 2026-03-16T02:15:10.376Z + 2026-03-16T04:18:34.380Z https://docs.axolotl.ai/docs/api/utils.model_shard_quant.html - 2026-03-16T02:15:11.209Z + 2026-03-16T04:18:35.209Z https://docs.axolotl.ai/docs/api/monkeypatch.data.batch_dataset_fetcher.html - 2026-03-16T02:15:11.146Z + 2026-03-16T04:18:35.146Z https://docs.axolotl.ai/docs/api/utils.schemas.utils.html - 2026-03-16T02:15:11.474Z + 2026-03-16T04:18:35.474Z https://docs.axolotl.ai/docs/api/utils.callbacks.comet_.html - 2026-03-16T02:15:11.811Z + 2026-03-16T04:18:35.807Z 
https://docs.axolotl.ai/docs/api/utils.optimizers.adopt.html - 2026-03-16T02:15:11.322Z + 2026-03-16T04:18:35.322Z https://docs.axolotl.ai/docs/api/kernels.swiglu.html - 2026-03-16T02:15:11.033Z + 2026-03-16T04:18:35.033Z https://docs.axolotl.ai/docs/api/core.trainers.mixins.optimizer.html - 2026-03-16T02:15:10.611Z + 2026-03-16T04:18:34.614Z https://docs.axolotl.ai/docs/api/cli.inference.html - 2026-03-16T02:15:10.351Z + 2026-03-16T04:18:34.355Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chatml.html - 2026-03-16T02:15:10.841Z + 2026-03-16T04:18:34.842Z https://docs.axolotl.ai/docs/api/utils.chat_templates.html - 2026-03-16T02:15:11.196Z + 2026-03-16T04:18:35.194Z https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_instruct.html - 2026-03-16T02:15:10.717Z + 2026-03-16T04:18:34.719Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.zephyr.html - 2026-03-16T02:15:10.843Z + 2026-03-16T04:18:34.844Z https://docs.axolotl.ai/docs/multipack.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/torchao.html - 2026-03-16T02:11:28.984Z + 2026-03-16T04:14:26.688Z https://docs.axolotl.ai/docs/reward_modelling.html - 2026-03-16T02:11:28.983Z + 2026-03-16T04:14:26.687Z https://docs.axolotl.ai/docs/nccl.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/multi-gpu.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/batch_vs_grad.html - 2026-03-16T02:11:28.979Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/multimodal.html - 2026-03-16T02:11:28.982Z + 2026-03-16T04:14:26.686Z https://docs.axolotl.ai/docs/models/LiquidAI.html - 2026-03-16T02:15:32.152Z + 2026-03-16T04:18:58.124Z https://docs.axolotl.ai/docs/models/mistral.html - 2026-03-16T02:15:32.147Z + 2026-03-16T04:18:58.119Z https://docs.axolotl.ai/docs/models/trinity.html - 2026-03-16T02:15:32.141Z + 2026-03-16T04:18:58.112Z https://docs.axolotl.ai/docs/models/hunyuan.html - 
2026-03-16T02:15:32.152Z + 2026-03-16T04:18:58.125Z https://docs.axolotl.ai/docs/models/phi.html - 2026-03-16T02:15:32.151Z + 2026-03-16T04:18:58.123Z https://docs.axolotl.ai/docs/models/apertus.html - 2026-03-16T02:15:32.149Z + 2026-03-16T04:18:58.122Z https://docs.axolotl.ai/docs/models/plano.html - 2026-03-16T02:15:32.139Z + 2026-03-16T04:18:58.110Z https://docs.axolotl.ai/docs/models/gemma3n.html - 2026-03-16T02:15:32.149Z + 2026-03-16T04:18:58.121Z https://docs.axolotl.ai/docs/models/arcee.html - 2026-03-16T02:15:32.141Z + 2026-03-16T04:18:58.113Z https://docs.axolotl.ai/docs/models/ministral3.html - 2026-03-16T02:15:32.142Z + 2026-03-16T04:18:58.114Z https://docs.axolotl.ai/docs/models/magistral/think.html - 2026-03-16T02:15:32.144Z + 2026-03-16T04:18:58.116Z https://docs.axolotl.ai/docs/models/llama-4.html - 2026-03-16T02:15:32.148Z + 2026-03-16T04:18:58.119Z https://docs.axolotl.ai/docs/models/voxtral.html - 2026-03-16T02:15:32.146Z + 2026-03-16T04:18:58.118Z https://docs.axolotl.ai/docs/models/magistral.html - 2026-03-16T02:15:32.144Z + 2026-03-16T04:18:58.116Z https://docs.axolotl.ai/docs/models/qwen3.html - 2026-03-16T02:15:32.149Z + 2026-03-16T04:18:58.121Z https://docs.axolotl.ai/docs/models/ministral.html - 2026-03-16T02:15:32.145Z + 2026-03-16T04:18:58.117Z https://docs.axolotl.ai/docs/models/ministral3/vision.html - 2026-03-16T02:15:32.143Z + 2026-03-16T04:18:58.115Z https://docs.axolotl.ai/docs/debugging.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/docs/faq.html - 2026-03-16T02:11:28.980Z + 2026-03-16T04:14:26.683Z https://docs.axolotl.ai/src/axolotl/integrations/LICENSE.html - 2026-03-16T02:11:29.009Z + 2026-03-16T04:14:26.717Z https://docs.axolotl.ai/FAQS.html - 2026-03-16T02:11:28.977Z + 2026-03-16T04:14:26.680Z
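
Note on the `docs/attention.html#flash-attention` entry in the `search.json` hunk above: the new text says the best available Flash Attention version is chosen from the installed packages and the GPU, and that incompatible head dimensions fall back to FA2/3. The sketch below restates those rules as a small Python function. It is illustrative only and is not Axolotl's implementation: the names `FA_ARCHS`, `fa4_supports_head_dims`, and `pick_flash_attention` are hypothetical, "best available" is assumed to mean the highest version number, and the DeepSeek shape (192, 128) is assumed to mean (query/key head dim, value head dim).

```python
from typing import Optional, Set

# GPU architectures per Flash Attention version, copied from the requirements
# listed in the docs text in this diff. All names here are hypothetical.
FA_ARCHS = {
    2: {"ampere", "ada", "hopper"},
    3: {"hopper"},
    4: {"hopper", "blackwell"},
}


def fa4_supports_head_dims(arch: str, qk_dim: int, v_dim: int) -> bool:
    """Return True if FA4 can handle these head dimensions per the docs.

    Assumption: (192, 128) is read as (qk_dim, v_dim).
    """
    if qk_dim <= 128 and v_dim <= 128:
        return True
    # The DeepSeek shape is documented as supported only on Blackwell.
    return arch == "blackwell" and (qk_dim, v_dim) == (192, 128)


def pick_flash_attention(
    arch: str, installed: Set[int], qk_dim: int, v_dim: int
) -> Optional[int]:
    """Pick the highest usable Flash Attention version, or None.

    Assumption: "best available" means the highest version that is
    installed, supports the GPU, and supports the head dimensions.
    """
    for version in (4, 3, 2):
        if version not in installed or arch not in FA_ARCHS[version]:
            continue
        if version == 4 and not fa4_supports_head_dims(arch, qk_dim, v_dim):
            continue  # fall back to FA3/FA2 as the docs describe
        return version
    return None


if __name__ == "__main__":
    print(pick_flash_attention("hopper", {2, 3, 4}, 128, 128))
    print(pick_flash_attention("hopper", {2, 4}, 256, 256))
    print(pick_flash_attention("blackwell", {4}, 192, 128))
    print(pick_flash_attention("turing", {2}, 64, 64))
```

Under these assumptions, a Hopper GPU with all three packages installed selects FA4 for head dimension 128, and drops to FA2 or FA3 once the head dimension exceeds 128. Because the FA2 requirements in the docs list only Ampere, Ada, and Hopper, this sketch returns `None` on Blackwell when FA4 cannot handle the shape; whether Axolotl behaves that way is not stated in the diff.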