diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index e081f2127..0e1ccb89a 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -38,7 +38,7 @@ jobs:
           cuda_version: 12.9.1
           python_version: "3.12"
           pytorch: 2.9.1
-          axolotl_extras:
+          axolotl_extras: vllm
           platforms: "linux/amd64,linux/arm64"
       - cuda: 130
         cuda_version: 13.0.0
diff --git a/.nojekyll b/.nojekyll
index 90e665f42..a1619d541 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-a5d2a80a
\ No newline at end of file
+cad3747d
\ No newline at end of file
diff --git a/docs/models/apertus.html b/docs/models/apertus.html
index 649bb8049..ae09e9d64 100644
--- a/docs/models/apertus.html
+++ b/docs/models/apertus.html
@@ -786,7 +786,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
 git clone https://github.com/axolotl-ai-cloud/axolotl.git
 cd axolotl

-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==26.0 setuptools==75.8.0 wheel ninja
 pip3 install --no-build-isolation -e '.[flash-attn]'

 # Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy
diff --git a/docs/models/arcee.html b/docs/models/arcee.html
index 319189121..313a8bd17 100644
--- a/docs/models/arcee.html
+++ b/docs/models/arcee.html
@@ -786,7 +786,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
 git clone https://github.com/axolotl-ai-cloud/axolotl.git
 cd axolotl

-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==26.0 setuptools==75.8.0 wheel ninja
 pip3 install --no-build-isolation -e '.[flash-attn]'

 # Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy
diff --git a/docs/models/devstral.html b/docs/models/devstral.html
index 09f3aa08c..fd3c2dd45 100644
--- a/docs/models/devstral.html
+++ b/docs/models/devstral.html
@@ -786,7 +786,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
Here is an example of how to install from pip:
# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==26.0 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'
Here is an example of how to install from pip:
# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==26.0 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'
Here is an example of how to install from pip:
# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==26.0 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'
Here is an example of how to install from pip:
# Ensure you have Pytorch installed (Pytorch 2.7.0 min)
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==26.0 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'
Here is an example of how to install from pip:
# Ensure you have Pytorch installed (Pytorch 2.6.0 min)
-pip3 install packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install packaging==26.0 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'
-pip3 install -U packaging==23.2 setuptools==75.8.0 wheel ninja
+pip3 install -U packaging==26.0 setuptools==75.8.0 wheel ninja
pip3 install --no-build-isolation axolotl[flash-attn,deepspeed]
# Download example axolotl configs, deepspeed configs
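The two-step pattern repeated throughout these hunks (pin build tools, then install with `--no-build-isolation`) exists because `--no-build-isolation` makes pip build extensions like flash-attn in the current environment instead of an isolated one, so packaging/setuptools/wheel must already be importable (ninja is found on PATH). A minimal preflight sketch; the helper name is my own, not part of Axolotl:

```python
import importlib.util

def missing_build_deps(mods=("packaging", "setuptools", "wheel")):
    """Return build-time modules that are not importable in this env.

    With --no-build-isolation, pip skips creating a build venv, so any
    module in the package's build requirements must already be installed.
    """
    return [m for m in mods if importlib.util.find_spec(m) is None]

# Example: "sys" always resolves; a made-up name does not.
print(missing_build_deps(("sys", "not_a_real_build_dep")))  # → ['not_a_real_build_dep']
```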
diff --git a/search.json b/search.json
index f5be2a372..162fc1f9b 100644
--- a/search.json
+++ b/search.json
@@ -1174,7 +1174,7 @@
"href": "docs/models/apertus.html#getting-started",
"title": "Apertus",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as Apertus is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\n(Optional, highly recommended) Install XIELU CUDA\n\n## Recommended for reduced VRAM and faster speeds\n\n# Point to CUDA toolkit directory\n# For those using our Docker image, use the below path.\nexport CUDA_HOME=/usr/local/cuda\n\npip3 install git+https://github.com/nickjbrowning/XIELU@59d6031 --no-build-isolation --no-deps\nFor any installation errors, see XIELU Installation Issues\n\nRun the finetuning example:\n\naxolotl train examples/apertus/apertus-8b-qlora.yaml\nThis config uses about 8.7 GiB VRAM.\nLet us know how it goes. Happy finetuning! 
🚀\n\nTips\n\nFor inference, the official Apertus team recommends top_p=0.9 and temperature=0.8.\nYou can instead use full paremter fine-tuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.\n\n\n\nXIELU Installation Issues\n\nModuleNotFoundError: No module named 'torch'\nPlease check these one by one:\n- Running in correct environment\n- Env has PyTorch installed\n- CUDA toolkit is at CUDA_HOME\nIf those didn’t help, please try the below solutions:\n\nPass env for CMAKE and try install again:\nPython_EXECUTABLE=$(which python) pip3 install git+https://github.com/nickjbrowning/XIELU@59d6031 --no-build-isolation --no-deps\nGit clone the repo and manually hardcode python path:\ngit clone https://github.com/nickjbrowning/XIELU\ncd xielu\ngit checkout 59d6031\n\ncd xielu\nnano CMakeLists.txt # or vi depending on your preference\nexecute_process(\n- COMMAND ${Python_EXECUTABLE} -c \"import torch.utils; print(torch.utils.cmake_prefix_path)\"\n+ COMMAND /root/miniconda3/envs/py3.11/bin/python -c \"import torch.utils; print(torch.utils.cmake_prefix_path)\"\n RESULT_VARIABLE TORCH_CMAKE_PATH_RESULT\n OUTPUT_VARIABLE TORCH_CMAKE_PATH_OUTPUT\n ERROR_VARIABLE TORCH_CMAKE_PATH_ERROR\n)\npip3 install . --no-build-isolation --no-deps",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as Apertus is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\n(Optional, highly recommended) Install XIELU CUDA\n\n## Recommended for reduced VRAM and faster speeds\n\n# Point to CUDA toolkit directory\n# For those using our Docker image, use the below path.\nexport CUDA_HOME=/usr/local/cuda\n\npip3 install git+https://github.com/nickjbrowning/XIELU@59d6031 --no-build-isolation --no-deps\nFor any installation errors, see XIELU Installation Issues\n\nRun the finetuning example:\n\naxolotl train examples/apertus/apertus-8b-qlora.yaml\nThis config uses about 8.7 GiB VRAM.\nLet us know how it goes. Happy finetuning! 
🚀\n\nTips\n\nFor inference, the official Apertus team recommends top_p=0.9 and temperature=0.8.\nYou can instead use full parameter fine-tuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.\n\n\n\nXIELU Installation Issues\n\nModuleNotFoundError: No module named 'torch'\nPlease check these one by one:\n- Running in correct environment\n- Env has PyTorch installed\n- CUDA toolkit is at CUDA_HOME\nIf those didn’t help, please try the below solutions:\n\nPass env for CMAKE and try install again:\nPython_EXECUTABLE=$(which python) pip3 install git+https://github.com/nickjbrowning/XIELU@59d6031 --no-build-isolation --no-deps\nGit clone the repo and manually hardcode python path:\ngit clone https://github.com/nickjbrowning/XIELU\ncd xielu\ngit checkout 59d6031\n\ncd xielu\nnano CMakeLists.txt # or vi depending on your preference\nexecute_process(\n- COMMAND ${Python_EXECUTABLE} -c \"import torch.utils; print(torch.utils.cmake_prefix_path)\"\n+ COMMAND /root/miniconda3/envs/py3.11/bin/python -c \"import torch.utils; print(torch.utils.cmake_prefix_path)\"\n RESULT_VARIABLE TORCH_CMAKE_PATH_RESULT\n OUTPUT_VARIABLE TORCH_CMAKE_PATH_OUTPUT\n ERROR_VARIABLE TORCH_CMAKE_PATH_ERROR\n)\npip3 install . --no-build-isolation --no-deps",
"crumbs": [
"Getting Started",
"Model Guides",
@@ -1274,7 +1274,7 @@
"href": "docs/models/gpt-oss.html#getting-started",
"title": "GPT-OSS",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nChoose one of the following configs below for training the 20B model. (for 120B, see below)\n\n# LoRA SFT linear layers (1x48GB @ ~44GiB)\naxolotl train examples/gpt-oss/gpt-oss-20b-sft-lora-singlegpu.yaml\n\n# FFT SFT with offloading (2x24GB @ ~21GiB/GPU)\naxolotl train examples/gpt-oss/gpt-oss-20b-fft-fsdp2-offload.yaml\n\n# FFT SFT (8x48GB @ ~36GiB/GPU or 4x80GB @ ~46GiB/GPU)\naxolotl train examples/gpt-oss/gpt-oss-20b-fft-fsdp2.yaml\nNote: Memory usage taken from device_mem_reserved(gib) from logs.\n\nTraining 120B\nOn 8xH100s, make sure you have ~3TB of free disk space. With each checkpoint clocking in at ~720GB, along with the base\nmodel, and final model output, you may need at least 3TB of free disk space to keep at least 2 checkpoints.\n# FFT SFT with offloading (8x80GB @ ~49GiB/GPU)\naxolotl train examples/gpt-oss/gpt-oss-120b-fft-fsdp2-offload.yaml\nTo simplify fine-tuning across 2 nodes × 8x H100 (80GB) GPUs, we’ve partnered with Baseten to showcase multi-node\ntraining of the 120B model using Baseten Truss. You can read more about this recipe on\nBaseten’s blog. The recipe can\nbe found on their\nGitHub.\nERRATA: Transformers saves the model Architecture prefixed with FSDP which needs to be manually renamed in config.json.\nSee https://github.com/huggingface/transformers/pull/40207 for the status of this issue.\nsed -i 's/FSDPGptOssForCausalLM/GptOssForCausalLM/g' ./outputs/gpt-oss-out/config.json\nWhen using SHARDED_STATE_DICT with FSDP, the final checkpoint should automatically merge the sharded weights to your\nconfigured output_dir. 
However, if that step fails due to a disk space error, you can take an additional step to\nmerge the sharded weights. This step will automatically determine the last checkpoint directory and merge the sharded\nweights to {output_dir}/merged.\naxolotl merge-sharded-fsdp-weights examples/gpt-oss/gpt-oss-120b-fft-fsdp2-offload.yaml\nmv ./outputs/gpt-oss-out/merged/* ./outputs/gpt-oss-out/\n\n\nHow to set reasoning_effort in template?\nThe harmony template has a feature to set the reasoning_effort during prompt building. The default is medium. If you would like to adjust this, you can add the following to your config:\nchat_template_kwargs:\n reasoning_effort: \"high\" # low | medium | high\nCurrently, this applies globally. There is no method to apply per sample yet. If you are interested in adding this, please feel free to create an Issue to discuss.\n\n\nInferencing your fine-tuned model\n\nvLLM\nGPT-OSS support in vLLM does not exist in a stable release yet. See https://x.com/MaziyarPanahi/status/1955741905515323425\nfor more information about using a special vllm-openai docker image for inferencing with vLLM.\nOptionally, vLLM can be installed from nightly:\npip install --no-build-isolation --pre -U vllm --extra-index-url https://wheels.vllm.ai/nightly\nand the vLLM server can be started with the following command (modify --tensor-parallel-size 8 to match your environment):\nvllm serve ./outputs/gpt-oss-out/ --served-model-name axolotl/gpt-oss-20b --host 0.0.0.0 --port 8888 --tensor-parallel-size 8\n\n\nSGLang\nSGLang has 0-day support in main, see https://github.com/sgl-project/sglang/issues/8833 for infomation on installing\nSGLang from source. Once you’ve installed SGLang, run the following command to launch a SGLang server:\npython3 -m sglang.launch_server --model ./outputs/gpt-oss-out/ --served-model-name axolotl/gpt-oss-120b --host 0.0.0.0 --port 8888 --tp 8\n\n\n\nTool use\nGPT-OSS has a comprehensive tool understanding. 
Axolotl supports tool calling datasets for Supervised Fine-tuning.\nHere is an example dataset config:\ndatasets:\n - path: Nanobit/text-tools-2k-test\n type: chat_template\nSee Nanobit/text-tools-2k-test for the sample dataset.\nRefer to our docs for more info.\n\n\nThinking and chat_template masking conflict\nOpenAI’s Harmony template hides thinking in all non-final turns, which conflicts with Axolotl’s chat_template masking.\nIf your dataset has thinking content mid-turn, there are two paths we recommend:\n\nTrain only on the last turn. This can be accomplished via chat_template’s train on last doc.\nAdjust your dataset to only have thinking content in the last turn.\n\n\n\nTIPS\n\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nChoose one of the following configs below for training the 20B model. (for 120B, see below)\n\n# LoRA SFT linear layers (1x48GB @ ~44GiB)\naxolotl train examples/gpt-oss/gpt-oss-20b-sft-lora-singlegpu.yaml\n\n# FFT SFT with offloading (2x24GB @ ~21GiB/GPU)\naxolotl train examples/gpt-oss/gpt-oss-20b-fft-fsdp2-offload.yaml\n\n# FFT SFT (8x48GB @ ~36GiB/GPU or 4x80GB @ ~46GiB/GPU)\naxolotl train examples/gpt-oss/gpt-oss-20b-fft-fsdp2.yaml\nNote: Memory usage taken from device_mem_reserved(gib) from logs.\n\nTraining 120B\nOn 8xH100s, make sure you have ~3TB of free disk space. With each checkpoint clocking in at ~720GB, along with the base\nmodel, and final model output, you may need at least 3TB of free disk space to keep at least 2 checkpoints.\n# FFT SFT with offloading (8x80GB @ ~49GiB/GPU)\naxolotl train examples/gpt-oss/gpt-oss-120b-fft-fsdp2-offload.yaml\nTo simplify fine-tuning across 2 nodes × 8x H100 (80GB) GPUs, we’ve partnered with Baseten to showcase multi-node\ntraining of the 120B model using Baseten Truss. You can read more about this recipe on\nBaseten’s blog. The recipe can\nbe found on their\nGitHub.\nERRATA: Transformers saves the model Architecture prefixed with FSDP which needs to be manually renamed in config.json.\nSee https://github.com/huggingface/transformers/pull/40207 for the status of this issue.\nsed -i 's/FSDPGptOssForCausalLM/GptOssForCausalLM/g' ./outputs/gpt-oss-out/config.json\nWhen using SHARDED_STATE_DICT with FSDP, the final checkpoint should automatically merge the sharded weights to your\nconfigured output_dir. 
However, if that step fails due to a disk space error, you can take an additional step to\nmerge the sharded weights. This step will automatically determine the last checkpoint directory and merge the sharded\nweights to {output_dir}/merged.\naxolotl merge-sharded-fsdp-weights examples/gpt-oss/gpt-oss-120b-fft-fsdp2-offload.yaml\nmv ./outputs/gpt-oss-out/merged/* ./outputs/gpt-oss-out/\n\n\nHow to set reasoning_effort in template?\nThe harmony template has a feature to set the reasoning_effort during prompt building. The default is medium. If you would like to adjust this, you can add the following to your config:\nchat_template_kwargs:\n reasoning_effort: \"high\" # low | medium | high\nCurrently, this applies globally. There is no method to apply per sample yet. If you are interested in adding this, please feel free to create an Issue to discuss.\n\n\nInferencing your fine-tuned model\n\nvLLM\nGPT-OSS support in vLLM does not exist in a stable release yet. See https://x.com/MaziyarPanahi/status/1955741905515323425\nfor more information about using a special vllm-openai docker image for inferencing with vLLM.\nOptionally, vLLM can be installed from nightly:\npip install --no-build-isolation --pre -U vllm --extra-index-url https://wheels.vllm.ai/nightly\nand the vLLM server can be started with the following command (modify --tensor-parallel-size 8 to match your environment):\nvllm serve ./outputs/gpt-oss-out/ --served-model-name axolotl/gpt-oss-20b --host 0.0.0.0 --port 8888 --tensor-parallel-size 8\n\n\nSGLang\nSGLang has 0-day support in main, see https://github.com/sgl-project/sglang/issues/8833 for information on installing\nSGLang from source. Once you’ve installed SGLang, run the following command to launch a SGLang server:\npython3 -m sglang.launch_server --model ./outputs/gpt-oss-out/ --served-model-name axolotl/gpt-oss-120b --host 0.0.0.0 --port 8888 --tp 8\n\n\n\nTool use\nGPT-OSS has a comprehensive tool understanding. 
Axolotl supports tool calling datasets for Supervised Fine-tuning.\nHere is an example dataset config:\ndatasets:\n - path: Nanobit/text-tools-2k-test\n type: chat_template\nSee Nanobit/text-tools-2k-test for the sample dataset.\nRefer to our docs for more info.\n\n\nThinking and chat_template masking conflict\nOpenAI’s Harmony template hides thinking in all non-final turns, which conflicts with Axolotl’s chat_template masking.\nIf your dataset has thinking content mid-turn, there are two paths we recommend:\n\nTrain only on the last turn. This can be accomplished via chat_template’s train on last doc.\nAdjust your dataset to only have thinking content in the last turn.\n\n\n\nTIPS\n\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.",
"crumbs": [
"Getting Started",
"Model Guides",
@@ -1382,7 +1382,7 @@
"href": "docs/models/granite4.html#getting-started",
"title": "Granite 4",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as Granite4 is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.7.1 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/granite4/granite-4.0-tiny-fft.yaml\nThis config uses about 40.8GiB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.\n\n\n\nLimitation\nAdapter finetuning does not work at the moment. It would error with\nRuntimeError: mat1 and mat2 shapes cannot be multiplied (4096x3072 and 1x1179648)\nIn addition, if adapter training works, lora_target_linear: true will not work due to:\nValueError: Target module GraniteMoeHybridParallelExperts() is not supported.",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as Granite4 is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.7.1 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/granite4/granite-4.0-tiny-fft.yaml\nThis config uses about 40.8GiB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.\n\n\n\nLimitation\nAdapter finetuning does not work at the moment. It would error with\nRuntimeError: mat1 and mat2 shapes cannot be multiplied (4096x3072 and 1x1179648)\nIn addition, if adapter training works, lora_target_linear: true will not work due to:\nValueError: Target module GraniteMoeHybridParallelExperts() is not supported.",
"crumbs": [
"Getting Started",
"Model Guides",
@@ -1586,7 +1586,7 @@
"href": "docs/models/hunyuan.html#getting-started",
"title": "Hunyuan",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as HunYuan is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/hunyuan/hunyuan-v1-dense-qlora.yaml\nThis config uses about 4.7 GB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nDataset\nHunYuan Instruct models can choose to enter a slow think or fast think pattern. For best performance on fine-tuning their Instruct models, your dataset should be adjusted to match their pattern.\n# fast think pattern\nmessages = [\n {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n {\"role\": \"user\", \"content\": \"/no_think What color is the sun?\" },\n {\"role\": \"assistant\", \"content\": \"<think>\\n\\n</think>\\n<answer>\\nThe sun is yellow.\\n</answer>\"}\n]\n\n# slow think pattern\nmessages = [\n {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n {\"role\": \"user\", \"content\": \"/no_think What color is the sun?\" },\n {\"role\": \"assistant\", \"content\": \"<think>\\nThe user is asking about the color of the sun. 
I need to ...\\n</think>\\n<answer>\\nThe sun is yellow.\\n</answer>\"}\n]\n\n\nTIPS\n\nFor inference, the official Tencent team recommends\n\n\n{\n \"do_sample\": true,\n \"top_k\": 20,\n \"top_p\": 0.8,\n \"repetition_penalty\": 1.05,\n \"temperature\": 0.7\n}\n\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as HunYuan is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/hunyuan/hunyuan-v1-dense-qlora.yaml\nThis config uses about 4.7 GB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nDataset\nHunYuan Instruct models can choose to enter a slow think or fast think pattern. For best performance on fine-tuning their Instruct models, your dataset should be adjusted to match their pattern.\n# fast think pattern\nmessages = [\n {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n {\"role\": \"user\", \"content\": \"/no_think What color is the sun?\" },\n {\"role\": \"assistant\", \"content\": \"<think>\\n\\n</think>\\n<answer>\\nThe sun is yellow.\\n</answer>\"}\n]\n\n# slow think pattern\nmessages = [\n {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n {\"role\": \"user\", \"content\": \"/no_think What color is the sun?\" },\n {\"role\": \"assistant\", \"content\": \"<think>\\nThe user is asking about the color of the sun. 
I need to ...\\n</think>\\n<answer>\\nThe sun is yellow.\\n</answer>\"}\n]\n\n\nTIPS\n\nFor inference, the official Tencent team recommends\n\n\n{\n \"do_sample\": true,\n \"top_k\": 20,\n \"top_p\": 0.8,\n \"repetition_penalty\": 1.05,\n \"temperature\": 0.7\n}\n\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.",
"crumbs": [
"Getting Started",
"Model Guides",
@@ -3126,7 +3126,7 @@
"href": "index.html#quick-start---llm-fine-tuning-in-minutes",
"title": "Axolotl",
"section": "🚀 Quick Start - LLM Fine-tuning in Minutes",
- "text": "🚀 Quick Start - LLM Fine-tuning in Minutes\nRequirements:\n\nNVIDIA GPU (Ampere or newer for bf16 and Flash Attention) or AMD GPU\nPython 3.11\nPyTorch ≥2.8.0\n\n\nGoogle Colab\n\n\n\nOpen In Colab\n\n\n\n\nInstallation\n\nUsing pip\npip3 install -U packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation axolotl[flash-attn,deepspeed]\n\n# Download example axolotl configs, deepspeed configs\naxolotl fetch examples\naxolotl fetch deepspeed_configs # OPTIONAL\n\n\nUsing Docker\nInstalling with Docker can be less error prone than installing in your own environment.\ndocker run --gpus '\"all\"' --rm -it axolotlai/axolotl:main-latest\nOther installation approaches are described here.\n\n\nCloud Providers\n\n\nRunPod\nVast.ai\nPRIME Intellect\nModal\nNovita\nJarvisLabs.ai\nLatitude.sh\n\n\n\n\n\nYour First Fine-tune\n# Fetch axolotl examples\naxolotl fetch examples\n\n# Or, specify a custom path\naxolotl fetch examples --dest path/to/folder\n\n# Train a model using LoRA\naxolotl train examples/llama-3/lora-1b.yml\nThat’s it! Check out our Getting Started Guide for a more detailed walkthrough.",
+ "text": "🚀 Quick Start - LLM Fine-tuning in Minutes\nRequirements:\n\nNVIDIA GPU (Ampere or newer for bf16 and Flash Attention) or AMD GPU\nPython 3.11\nPyTorch ≥2.8.0\n\n\nGoogle Colab\n\n\n\nOpen In Colab\n\n\n\n\nInstallation\n\nUsing pip\npip3 install -U packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation axolotl[flash-attn,deepspeed]\n\n# Download example axolotl configs, deepspeed configs\naxolotl fetch examples\naxolotl fetch deepspeed_configs # OPTIONAL\n\n\nUsing Docker\nInstalling with Docker can be less error prone than installing in your own environment.\ndocker run --gpus '\"all\"' --rm -it axolotlai/axolotl:main-latest\nOther installation approaches are described here.\n\n\nCloud Providers\n\n\nRunPod\nVast.ai\nPRIME Intellect\nModal\nNovita\nJarvisLabs.ai\nLatitude.sh\n\n\n\n\n\nYour First Fine-tune\n# Fetch axolotl examples\naxolotl fetch examples\n\n# Or, specify a custom path\naxolotl fetch examples --dest path/to/folder\n\n# Train a model using LoRA\naxolotl train examples/llama-3/lora-1b.yml\nThat’s it! Check out our Getting Started Guide for a more detailed walkthrough.",
"crumbs": [
"Home"
]
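The Quick Start requirements above (Python 3.11, PyTorch ≥2.8.0) can be checked before installing anything. A sketch, assuming a "3.11 or newer" reading of the Python requirement; the helper is illustrative, not an Axolotl command:

```python
import sys

def meets_python_requirement(py=sys.version_info, min_py=(3, 11)):
    """Check the interpreter against the Quick Start's Python requirement.

    PyTorch (>= 2.8.0) is checked separately, e.g. with
    python -c "import torch; print(torch.__version__)", since torch
    may not be installed yet at this point.
    """
    return tuple(py[:2]) >= min_py

print(meets_python_requirement())
```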
@@ -4351,7 +4351,7 @@
"href": "docs/models/gemma3n.html#getting-started",
"title": "Gemma 3n",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nIn addition to Axolotl’s requirements, Gemma-3n requires:\n\npip3 install timm==1.0.17\n\n# for loading audio data\npip3 install librosa==0.11.0\n\nDownload sample dataset files\n\n# for text + vision + audio only\nwget https://huggingface.co/datasets/Nanobit/text-vision-audio-2k-test/resolve/main/African_elephant.jpg\nwget https://huggingface.co/datasets/Nanobit/text-vision-audio-2k-test/resolve/main/En-us-African_elephant.oga\n\nRun the finetuning example:\n\n# text only\naxolotl train examples/gemma3n/gemma-3n-e2b-qlora.yml\n\n# text + vision\naxolotl train examples/gemma3n/gemma-3n-e2b-vision-qlora.yml\n\n# text + vision + audio\naxolotl train examples/gemma3n/gemma-3n-e2b-vision-audio-qlora.yml\nLet us know how it goes. Happy finetuning! 🚀\nWARNING: The loss and grad norm will be much higher than normal. We suspect this to be inherent to the model as of the moment. If anyone would like to submit a fix for this, we are happy to take a look.\n\nTIPS\n\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe text dataset format follows the OpenAI Messages format as seen here.\nThe multimodal dataset format follows the OpenAI multi-content Messages format as seen here.",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nIn addition to Axolotl’s requirements, Gemma-3n requires:\n\npip3 install timm==1.0.17\n\n# for loading audio data\npip3 install librosa==0.11.0\n\nDownload sample dataset files\n\n# for text + vision + audio only\nwget https://huggingface.co/datasets/Nanobit/text-vision-audio-2k-test/resolve/main/African_elephant.jpg\nwget https://huggingface.co/datasets/Nanobit/text-vision-audio-2k-test/resolve/main/En-us-African_elephant.oga\n\nRun the finetuning example:\n\n# text only\naxolotl train examples/gemma3n/gemma-3n-e2b-qlora.yml\n\n# text + vision\naxolotl train examples/gemma3n/gemma-3n-e2b-vision-qlora.yml\n\n# text + vision + audio\naxolotl train examples/gemma3n/gemma-3n-e2b-vision-audio-qlora.yml\nLet us know how it goes. Happy finetuning! 🚀\nWARNING: The loss and grad norm will be much higher than normal. We suspect this to be inherent to the model as of the moment. If anyone would like to submit a fix for this, we are happy to take a look.\n\nTIPS\n\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe text dataset format follows the OpenAI Messages format as seen here.\nThe multimodal dataset format follows the OpenAI multi-content Messages format as seen here.",
"crumbs": [
"Getting Started",
"Model Guides",
@@ -4399,7 +4399,7 @@
"href": "docs/models/qwen3-next.html#getting-started",
"title": "Qwen 3 Next",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as Qwen3-Next is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nInstall Qwen3-Next transformers commit\n\npip3 uninstall -y transformers && pip3 install \"git+https://github.com/huggingface/transformers.git@b9282355bea846b54ed850a066901496b19da654\"\n\nInstall FLA for improved performance\n\npip3 uninstall -y causal-conv1d && pip3 install flash-linear-attention==0.3.2\n\nRun the finetuning example:\n\naxolotl train examples/qwen3-next/qwen3-next-80b-a3b-qlora.yaml\nThis config uses about 45.62 GiB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nFor inference, you can experiment with temperature: 0.7, top_p: 0.8, top_k: 20, and min_p: 0.\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config. See Multi-GPU section below.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as Qwen3-Next is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nInstall Qwen3-Next transformers commit\n\npip3 uninstall -y transformers && pip3 install \"git+https://github.com/huggingface/transformers.git@b9282355bea846b54ed850a066901496b19da654\"\n\nInstall FLA for improved performance\n\npip3 uninstall -y causal-conv1d && pip3 install flash-linear-attention==0.3.2\n\nRun the finetuning example:\n\naxolotl train examples/qwen3-next/qwen3-next-80b-a3b-qlora.yaml\nThis config uses about 45.62 GiB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nFor inference, you can experiment with temperature: 0.7, top_p: 0.8, top_k: 20, and min_p: 0.\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config. See Multi-GPU section below.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.",
"crumbs": [
"Getting Started",
"Model Guides",
@@ -4614,7 +4614,7 @@
"href": "docs/models/arcee.html#getting-started",
"title": "Arcee AFM",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as AFM is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/arcee/afm-4.5b-qlora.yaml\nThis config uses about 7.8GiB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nFor inference, the official Arcee.ai team recommends top_p: 0.95, temperature: 0.5, top_k: 50, and repeat_penalty: 1.1.\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide. You need to install from main as AFM is only on nightly or use our latest Docker images.\nHere is an example of how to install from main for pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\ngit clone https://github.com/axolotl-ai-cloud/axolotl.git\ncd axolotl\n\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation -e '.[flash-attn]'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/arcee/afm-4.5b-qlora.yaml\nThis config uses about 7.8GiB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nFor inference, the official Arcee.ai team recommends top_p: 0.95, temperature: 0.5, top_k: 50, and repeat_penalty: 1.1.\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.",
"crumbs": [
"Getting Started",
"Model Guides",
@@ -4711,7 +4711,7 @@
"href": "docs/models/magistral.html#getting-started",
"title": "Magistral",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.7.0 min)\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nInstall Cut Cross Entropy to reduce training VRAM usage\n\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/magistral/magistral-small-qlora.yaml\nThis config uses about 24GB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nThinking\nMistralAI has released their 2507 model with thinking capabilities, enabling Chain-of-Thought reasoning with explicit thinking steps.\n📚 See the Thinking fine-tuning guide →\n\n\nVision\nMistralAI has released their 2509 model with vision capabilities.\n📚 See the Vision fine-tuning guide →\n\n\nTips\n\nWe recommend adding the same/similar SystemPrompt that the model is tuned for. You can find this within the repo’s files titled SYSTEM_PROMPT.txt.\nFor inference, the official MistralAI team recommends top_p: 0.95 and temperature: 0.7 with max_tokens: 40960.\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe text dataset format follows the OpenAI Messages format as seen here.",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.7.0 min)\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nInstall Cut Cross Entropy to reduce training VRAM usage\n\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/magistral/magistral-small-qlora.yaml\nThis config uses about 24GB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nThinking\nMistralAI has released their 2507 model with thinking capabilities, enabling Chain-of-Thought reasoning with explicit thinking steps.\n📚 See the Thinking fine-tuning guide →\n\n\nVision\nMistralAI has released their 2509 model with vision capabilities.\n📚 See the Vision fine-tuning guide →\n\n\nTips\n\nWe recommend adding the same/similar SystemPrompt that the model is tuned for. You can find this within the repo’s files titled SYSTEM_PROMPT.txt.\nFor inference, the official MistralAI team recommends top_p: 0.95 and temperature: 0.7 with max_tokens: 40960.\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe text dataset format follows the OpenAI Messages format as seen here.",
"crumbs": [
"Getting Started",
"Model Guides",
@@ -4788,7 +4788,7 @@
"href": "docs/models/voxtral.html#getting-started",
"title": "Voxtral",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nPlease install the below.\n\n# audio\npip3 install librosa==0.11.0\npip3 install 'mistral_common[audio]==1.8.3'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nDownload sample dataset files\n\n# for text + audio only\nwget https://huggingface.co/datasets/Nanobit/text-audio-2k-test/resolve/main/En-us-African_elephant.oga\n\nRun the finetuning example:\n\n# text only\naxolotl train examples/voxtral/voxtral-mini-qlora.yml\n\n# text + audio\naxolotl train examples/voxtral/voxtral-mini-audio-qlora.yml\nThese configs use about 4.8 GB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nFor inference, the official MistralAI team recommends temperature: 0.2 and top_p: 0.95 for audio understanding and temperature: 0.0 for transcription.\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe text dataset format follows the OpenAI Messages format as seen here.\nThe multimodal dataset format follows the OpenAI multi-content Messages format as seen here.",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nPlease install the below.\n\n# audio\npip3 install librosa==0.11.0\npip3 install 'mistral_common[audio]==1.8.3'\n\n# Install CCE https://docs.axolotl.ai/docs/custom_integrations.html#cut-cross-entropy\npython scripts/cutcrossentropy_install.py | sh\n\nDownload sample dataset files\n\n# for text + audio only\nwget https://huggingface.co/datasets/Nanobit/text-audio-2k-test/resolve/main/En-us-African_elephant.oga\n\nRun the finetuning example:\n\n# text only\naxolotl train examples/voxtral/voxtral-mini-qlora.yml\n\n# text + audio\naxolotl train examples/voxtral/voxtral-mini-audio-qlora.yml\nThese configs use about 4.8 GB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nFor inference, the official MistralAI team recommends temperature: 0.2 and top_p: 0.95 for audio understanding and temperature: 0.0 for transcription.\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe text dataset format follows the OpenAI Messages format as seen here.\nThe multimodal dataset format follows the OpenAI multi-content Messages format as seen here.",
"crumbs": [
"Getting Started",
"Model Guides",
@@ -5040,7 +5040,7 @@
"href": "docs/models/devstral.html#getting-started",
"title": "Devstral",
"section": "Getting started",
- "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\npip3 install packaging==23.2 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nInstall Cut Cross Entropy to reduce training VRAM usage\n\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/devstral/devstral-small-qlora.yml\nThis config uses about 21GB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.\nLearn how to use function calling with Axolotl at docs.",
+ "text": "Getting started\n\nInstall Axolotl following the installation guide.\nHere is an example of how to install from pip:\n\n# Ensure you have Pytorch installed (Pytorch 2.6.0 min)\npip3 install packaging==26.0 setuptools==75.8.0 wheel ninja\npip3 install --no-build-isolation 'axolotl[flash-attn]>=0.12.0'\n\nInstall Cut Cross Entropy to reduce training VRAM usage\n\npython scripts/cutcrossentropy_install.py | sh\n\nRun the finetuning example:\n\naxolotl train examples/devstral/devstral-small-qlora.yml\nThis config uses about 21GB VRAM.\nLet us know how it goes. Happy finetuning! 🚀\n\nTIPS\n\nYou can run a full finetuning by removing the adapter: qlora and load_in_4bit: true from the config.\nRead more on how to load your own dataset at docs.\nThe dataset format follows the OpenAI Messages format as seen here.\nLearn how to use function calling with Axolotl at docs.",
"crumbs": [
"Getting Started",
"Model Guides",
diff --git a/sitemap.xml b/sitemap.xml
index 69a09ea21..261d89090 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,942 +2,942 @@
https://docs.axolotl.ai/src/axolotl/integrations/cut_cross_entropy/ACKNOWLEDGEMENTS.html
- 2026-01-21T22:25:37.044Z
+ 2026-01-22T01:01:27.576Z
https://docs.axolotl.ai/docs/mac.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/cli.html
- 2026-01-21T22:25:37.012Z
+ 2026-01-22T01:01:27.537Z
https://docs.axolotl.ai/docs/mixed_precision.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/installation.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/dataset_loading.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/sequence_parallelism.html
- 2026-01-21T22:25:37.018Z
+ 2026-01-22T01:01:27.544Z
https://docs.axolotl.ai/docs/optimizations.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.543Z
https://docs.axolotl.ai/docs/gradient_checkpointing.html
- 2026-01-21T22:25:37.014Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/streaming.html
- 2026-01-21T22:25:37.018Z
+ 2026-01-22T01:01:27.544Z
https://docs.axolotl.ai/docs/lora_optims.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/amd_hpc.html
- 2026-01-21T22:25:37.012Z
+ 2026-01-22T01:01:27.537Z
https://docs.axolotl.ai/docs/debugging.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/dataset-formats/conversation.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.537Z
https://docs.axolotl.ai/docs/dataset-formats/inst_tune.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/dataset-formats/index.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/config-reference.html
- 2026-01-21T22:29:08.085Z
+ 2026-01-22T01:05:22.214Z
https://docs.axolotl.ai/docs/multimodal.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/ray-integration.html
- 2026-01-21T22:25:37.018Z
+ 2026-01-22T01:01:27.543Z
https://docs.axolotl.ai/docs/faq.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/dataset_preprocessing.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/torchao.html
- 2026-01-21T22:25:37.019Z
+ 2026-01-22T01:01:27.544Z
https://docs.axolotl.ai/docs/optimizers.html
- 2026-01-21T22:25:37.018Z
+ 2026-01-22T01:01:27.543Z
https://docs.axolotl.ai/docs/models/apertus.html
- 2026-01-21T22:29:08.616Z
+ 2026-01-22T01:05:22.799Z
https://docs.axolotl.ai/docs/models/ministral3/think.html
- 2026-01-21T22:29:08.611Z
+ 2026-01-22T01:05:22.790Z
https://docs.axolotl.ai/docs/models/gpt-oss.html
- 2026-01-21T22:29:08.616Z
+ 2026-01-22T01:05:22.799Z
https://docs.axolotl.ai/docs/models/phi.html
- 2026-01-21T22:29:08.618Z
+ 2026-01-22T01:05:22.800Z
https://docs.axolotl.ai/docs/models/olmo3.html
- 2026-01-21T22:29:08.609Z
+ 2026-01-22T01:05:22.788Z
https://docs.axolotl.ai/docs/models/granite4.html
- 2026-01-21T22:29:08.618Z
+ 2026-01-22T01:05:22.801Z
https://docs.axolotl.ai/docs/models/seed-oss.html
- 2026-01-21T22:29:08.617Z
+ 2026-01-22T01:05:22.799Z
https://docs.axolotl.ai/docs/models/qwen3.html
- 2026-01-21T22:29:08.615Z
+ 2026-01-22T01:05:22.798Z
https://docs.axolotl.ai/docs/models/orpheus.html
- 2026-01-21T22:29:08.620Z
+ 2026-01-22T01:05:22.802Z
https://docs.axolotl.ai/docs/models/hunyuan.html
- 2026-01-21T22:29:08.619Z
+ 2026-01-22T01:05:22.802Z
https://docs.axolotl.ai/docs/models/mistral.html
- 2026-01-21T22:29:08.614Z
+ 2026-01-22T01:05:22.794Z
https://docs.axolotl.ai/docs/models/mistral-small.html
- 2026-01-21T22:29:08.613Z
+ 2026-01-22T01:05:22.793Z
https://docs.axolotl.ai/docs/models/smolvlm2.html
- 2026-01-21T22:29:08.618Z
+ 2026-01-22T01:05:22.800Z
https://docs.axolotl.ai/docs/models/llama-2.html
- 2026-01-21T22:29:08.615Z
+ 2026-01-22T01:05:22.794Z
https://docs.axolotl.ai/docs/models/magistral/vision.html
- 2026-01-21T22:29:08.612Z
+ 2026-01-22T01:05:22.792Z
https://docs.axolotl.ai/docs/models/jamba.html
- 2026-01-21T22:29:08.619Z
+ 2026-01-22T01:05:22.802Z
https://docs.axolotl.ai/docs/models/mimo.html
- 2026-01-21T22:29:08.608Z
+ 2026-01-22T01:05:22.788Z
https://docs.axolotl.ai/docs/api/utils.schedulers.html
- 2026-01-21T22:28:51.891Z
+ 2026-01-22T01:05:05.076Z
https://docs.axolotl.ai/docs/api/cli.utils.sweeps.html
- 2026-01-21T22:28:51.083Z
+ 2026-01-22T01:05:04.270Z
https://docs.axolotl.ai/docs/api/datasets.html
- 2026-01-21T22:28:50.676Z
+ 2026-01-22T01:05:03.862Z
https://docs.axolotl.ai/docs/api/utils.tokenization.html
- 2026-01-21T22:28:51.807Z
+ 2026-01-22T01:05:04.993Z
https://docs.axolotl.ai/docs/api/loaders.tokenizer.html
- 2026-01-21T22:28:51.200Z
+ 2026-01-22T01:05:04.386Z
https://docs.axolotl.ai/docs/api/monkeypatch.llama_expand_mask.html
- 2026-01-21T22:28:51.678Z
+ 2026-01-22T01:05:04.865Z
https://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_cpu.html
- 2026-01-21T22:28:51.766Z
+ 2026-01-22T01:05:04.952Z
https://docs.axolotl.ai/docs/api/utils.data.sft.html
- 2026-01-21T22:28:51.942Z
+ 2026-01-22T01:05:05.128Z
https://docs.axolotl.ai/docs/api/monkeypatch.transformers_fa_utils.html
- 2026-01-21T22:28:51.746Z
+ 2026-01-22T01:05:04.932Z
https://docs.axolotl.ai/docs/api/loaders.patch_manager.html
- 2026-01-21T22:28:51.228Z
+ 2026-01-22T01:05:04.414Z
https://docs.axolotl.ai/docs/api/integrations.liger.args.html
- 2026-01-21T22:28:52.269Z
+ 2026-01-22T01:05:05.452Z
https://docs.axolotl.ai/docs/api/utils.schemas.peft.html
- 2026-01-21T22:28:52.033Z
+ 2026-01-22T01:05:05.217Z
https://docs.axolotl.ai/docs/api/prompt_strategies.pygmalion.html
- 2026-01-21T22:28:51.422Z
+ 2026-01-22T01:05:04.608Z
https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_instruct.html
- 2026-01-21T22:28:51.339Z
+ 2026-01-22T01:05:04.525Z
https://docs.axolotl.ai/docs/api/cli.cloud.base.html
- 2026-01-21T22:28:51.039Z
+ 2026-01-22T01:05:04.226Z
https://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_disk.html
- 2026-01-21T22:28:51.798Z
+ 2026-01-22T01:05:04.984Z
https://docs.axolotl.ai/docs/api/kernels.swiglu.html
- 2026-01-21T22:28:51.649Z
+ 2026-01-22T01:05:04.835Z
https://docs.axolotl.ai/docs/api/integrations.cut_cross_entropy.args.html
- 2026-01-21T22:28:52.254Z
+ 2026-01-22T01:05:05.437Z
https://docs.axolotl.ai/docs/api/prompt_strategies.kto.user_defined.html
- 2026-01-21T22:28:51.489Z
+ 2026-01-22T01:05:04.675Z
https://docs.axolotl.ai/docs/api/monkeypatch.utils.html
- 2026-01-21T22:28:51.723Z
+ 2026-01-22T01:05:04.909Z
https://docs.axolotl.ai/docs/api/core.builders.rl.html
- 2026-01-21T22:28:50.776Z
+ 2026-01-22T01:05:03.962Z
https://docs.axolotl.ai/docs/api/loaders.processor.html
- 2026-01-21T22:28:51.202Z
+ 2026-01-22T01:05:04.388Z
https://docs.axolotl.ai/docs/api/utils.callbacks.lisa.html
- 2026-01-21T22:28:52.399Z
+ 2026-01-22T01:05:05.582Z
https://docs.axolotl.ai/docs/api/core.training_args.html
- 2026-01-21T22:28:50.792Z
+ 2026-01-22T01:05:03.979Z
https://docs.axolotl.ai/docs/api/loaders.adapter.html
- 2026-01-21T22:28:51.209Z
+ 2026-01-22T01:05:04.395Z
https://docs.axolotl.ai/docs/api/cli.merge_sharded_fsdp_weights.html
- 2026-01-21T22:28:51.010Z
+ 2026-01-22T01:05:04.197Z
https://docs.axolotl.ai/docs/api/cli.train.html
- 2026-01-21T22:28:50.894Z
+ 2026-01-22T01:05:04.080Z
https://docs.axolotl.ai/docs/api/core.trainers.mixins.rng_state_loader.html
- 2026-01-21T22:28:51.241Z
+ 2026-01-22T01:05:04.427Z
https://docs.axolotl.ai/docs/api/prompt_strategies.completion.html
- 2026-01-21T22:28:51.387Z
+ 2026-01-22T01:05:04.574Z
https://docs.axolotl.ai/docs/api/prompt_strategies.stepwise_supervised.html
- 2026-01-21T22:28:51.400Z
+ 2026-01-22T01:05:04.587Z
https://docs.axolotl.ai/docs/api/monkeypatch.lora_kernels.html
- 2026-01-21T22:28:51.713Z
+ 2026-01-22T01:05:04.899Z
https://docs.axolotl.ai/docs/api/prompt_strategies.messages.chat.html
- 2026-01-21T22:28:51.427Z
+ 2026-01-22T01:05:04.613Z
https://docs.axolotl.ai/docs/api/prompt_strategies.user_defined.html
- 2026-01-21T22:28:51.364Z
+ 2026-01-22T01:05:04.550Z
https://docs.axolotl.ai/docs/api/core.chat.messages.html
- 2026-01-21T22:28:50.822Z
+ 2026-01-22T01:05:04.008Z
https://docs.axolotl.ai/docs/api/core.trainers.mixins.scheduler.html
- 2026-01-21T22:28:51.249Z
+ 2026-01-22T01:05:04.435Z
https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.user_defined.html
- 2026-01-21T22:28:51.465Z
+ 2026-01-22T01:05:04.651Z
https://docs.axolotl.ai/docs/api/prompt_strategies.kto.llama3.html
- 2026-01-21T22:28:51.477Z
+ 2026-01-22T01:05:04.663Z
https://docs.axolotl.ai/docs/api/utils.schemas.integrations.html
- 2026-01-21T22:28:52.064Z
+ 2026-01-22T01:05:05.248Z
https://docs.axolotl.ai/docs/api/convert.html
- 2026-01-21T22:28:50.692Z
+ 2026-01-22T01:05:03.879Z
https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.passthrough.html
- 2026-01-21T22:28:51.467Z
+ 2026-01-22T01:05:04.653Z
https://docs.axolotl.ai/docs/api/utils.schemas.config.html
- 2026-01-21T22:28:51.982Z
+ 2026-01-22T01:05:05.167Z
https://docs.axolotl.ai/docs/api/utils.schemas.enums.html
- 2026-01-21T22:28:52.075Z
+ 2026-01-22T01:05:05.259Z
https://docs.axolotl.ai/docs/api/monkeypatch.btlm_attn_hijack_flash.html
- 2026-01-21T22:28:51.725Z
+ 2026-01-22T01:05:04.911Z
https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chat_template.html
- 2026-01-21T22:28:51.435Z
+ 2026-01-22T01:05:04.621Z
https://docs.axolotl.ai/docs/api/core.trainers.grpo.trainer.html
- 2026-01-21T22:28:51.162Z
+ 2026-01-22T01:05:04.348Z
https://docs.axolotl.ai/docs/api/integrations.lm_eval.args.html
- 2026-01-21T22:28:52.273Z
+ 2026-01-22T01:05:05.456Z
https://docs.axolotl.ai/docs/api/utils.collators.core.html
- 2026-01-21T22:28:52.302Z
+ 2026-01-22T01:05:05.485Z
https://docs.axolotl.ai/docs/api/core.chat.format.shared.html
- 2026-01-21T22:28:50.827Z
+ 2026-01-22T01:05:04.014Z
https://docs.axolotl.ai/docs/api/prompt_strategies.orpo.chat_template.html
- 2026-01-21T22:28:51.515Z
+ 2026-01-22T01:05:04.701Z
https://docs.axolotl.ai/docs/api/utils.samplers.multipack.html
- 2026-01-21T22:28:52.385Z
+ 2026-01-22T01:05:05.568Z
https://docs.axolotl.ai/docs/api/utils.callbacks.qat.html
- 2026-01-21T22:28:52.417Z
+ 2026-01-22T01:05:05.599Z
https://docs.axolotl.ai/docs/api/prompt_strategies.chat_template.html
- 2026-01-21T22:28:51.320Z
+ 2026-01-22T01:05:04.506Z
https://docs.axolotl.ai/docs/api/utils.schemas.multimodal.html
- 2026-01-21T22:28:52.044Z
+ 2026-01-22T01:05:05.228Z
https://docs.axolotl.ai/docs/api/utils.callbacks.comet_.html
- 2026-01-21T22:28:52.408Z
+ 2026-01-22T01:05:05.591Z
https://docs.axolotl.ai/docs/api/prompt_strategies.base.html
- 2026-01-21T22:28:51.279Z
+ 2026-01-22T01:05:04.465Z
https://docs.axolotl.ai/docs/api/kernels.utils.html
- 2026-01-21T22:28:51.660Z
+ 2026-01-22T01:05:04.846Z
https://docs.axolotl.ai/docs/api/cli.merge_lora.html
- 2026-01-21T22:28:50.996Z
+ 2026-01-22T01:05:04.182Z
https://docs.axolotl.ai/docs/api/cli.utils.html
- 2026-01-21T22:28:51.049Z
+ 2026-01-22T01:05:04.235Z
https://docs.axolotl.ai/docs/api/utils.ctx_managers.sequence_parallel.html
- 2026-01-21T22:28:51.277Z
+ 2026-01-22T01:05:04.463Z
https://docs.axolotl.ai/docs/api/index.html
- 2026-01-21T22:28:50.577Z
+ 2026-01-22T01:05:03.763Z
https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.llama3.html
- 2026-01-21T22:28:51.448Z
+ 2026-01-22T01:05:04.635Z
https://docs.axolotl.ai/docs/api/monkeypatch.mixtral.html
- 2026-01-21T22:28:51.761Z
+ 2026-01-22T01:05:04.947Z
https://docs.axolotl.ai/docs/api/prompt_strategies.orcamini.html
- 2026-01-21T22:28:51.413Z
+ 2026-01-22T01:05:04.600Z
https://docs.axolotl.ai/docs/api/core.trainers.grpo.sampler.html
- 2026-01-21T22:28:51.177Z
+ 2026-01-22T01:05:04.363Z
https://docs.axolotl.ai/docs/api/utils.lora.html
- 2026-01-21T22:28:51.815Z
+ 2026-01-22T01:05:05.001Z
https://docs.axolotl.ai/docs/api/core.trainers.mixins.optimizer.html
- 2026-01-21T22:28:51.237Z
+ 2026-01-22T01:05:04.423Z
https://docs.axolotl.ai/docs/api/cli.config.html
- 2026-01-21T22:28:50.962Z
+ 2026-01-22T01:05:04.149Z
https://docs.axolotl.ai/docs/api/monkeypatch.multipack.html
- 2026-01-21T22:28:51.672Z
+ 2026-01-22T01:05:04.858Z
https://docs.axolotl.ai/docs/api/utils.collators.batching.html
- 2026-01-21T22:28:52.325Z
+ 2026-01-22T01:05:05.508Z
https://docs.axolotl.ai/docs/api/utils.quantization.html
- 2026-01-21T22:28:51.966Z
+ 2026-01-22T01:05:05.151Z
https://docs.axolotl.ai/docs/api/utils.dict.html
- 2026-01-21T22:28:51.923Z
+ 2026-01-22T01:05:05.108Z
https://docs.axolotl.ai/docs/api/kernels.quantize.html
- 2026-01-21T22:28:51.658Z
+ 2026-01-22T01:05:04.844Z
https://docs.axolotl.ai/docs/api/utils.schemas.training.html
- 2026-01-21T22:28:52.000Z
+ 2026-01-22T01:05:05.184Z
https://docs.axolotl.ai/docs/api/train.html
- 2026-01-21T22:28:50.656Z
+ 2026-01-22T01:05:03.841Z
https://docs.axolotl.ai/docs/api/core.datasets.transforms.chat_builder.html
- 2026-01-21T22:28:50.843Z
+ 2026-01-22T01:05:04.029Z
https://docs.axolotl.ai/docs/inference.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/FAQS.html
- 2026-01-21T22:25:37.010Z
+ 2026-01-22T01:01:27.535Z
https://docs.axolotl.ai/examples/colab-notebooks/colab-axolotl-example.html
- 2026-01-21T22:25:37.024Z
+ 2026-01-22T01:01:27.549Z
https://docs.axolotl.ai/index.html
- 2026-01-21T22:25:37.039Z
+ 2026-01-22T01:01:27.568Z
https://docs.axolotl.ai/docs/custom_integrations.html
- 2026-01-21T22:25:37.012Z
+ 2026-01-22T01:01:27.537Z
https://docs.axolotl.ai/docs/api/utils.schemas.utils.html
- 2026-01-21T22:28:52.082Z
+ 2026-01-22T01:05:05.265Z
https://docs.axolotl.ai/docs/api/kernels.geglu.html
- 2026-01-21T22:28:51.637Z
+ 2026-01-22T01:05:04.823Z
https://docs.axolotl.ai/docs/api/core.builders.causal.html
- 2026-01-21T22:28:50.770Z
+ 2026-01-22T01:05:03.956Z
https://docs.axolotl.ai/docs/api/core.trainers.mamba.html
- 2026-01-21T22:28:51.140Z
+ 2026-01-22T01:05:04.327Z
https://docs.axolotl.ai/docs/api/prompt_strategies.bradley_terry.llama3.html
- 2026-01-21T22:28:51.520Z
+ 2026-01-22T01:05:04.706Z
https://docs.axolotl.ai/docs/api/core.datasets.chat.html
- 2026-01-21T22:28:50.834Z
+ 2026-01-22T01:05:04.020Z
https://docs.axolotl.ai/docs/api/utils.collators.mm_chat.html
- 2026-01-21T22:28:52.335Z
+ 2026-01-22T01:05:05.518Z
https://docs.axolotl.ai/docs/api/prompt_strategies.llama2_chat.html
- 2026-01-21T22:28:51.380Z
+ 2026-01-22T01:05:04.566Z
https://docs.axolotl.ai/docs/api/common.const.html
- 2026-01-21T22:28:52.281Z
+ 2026-01-22T01:05:05.464Z
https://docs.axolotl.ai/docs/api/cli.quantize.html
- 2026-01-21T22:28:51.026Z
+ 2026-01-22T01:05:04.213Z
https://docs.axolotl.ai/docs/api/utils.trainer.html
- 2026-01-21T22:28:51.857Z
+ 2026-01-22T01:05:05.043Z
https://docs.axolotl.ai/docs/api/cli.delinearize_llama4.html
- 2026-01-21T22:28:50.968Z
+ 2026-01-22T01:05:04.155Z
https://docs.axolotl.ai/docs/api/evaluate.html
- 2026-01-21T22:28:50.668Z
+ 2026-01-22T01:05:03.854Z
https://docs.axolotl.ai/docs/api/monkeypatch.mistral_attn_hijack_flash.html
- 2026-01-21T22:28:51.670Z
+ 2026-01-22T01:05:04.856Z
https://docs.axolotl.ai/docs/api/loaders.model.html
- 2026-01-21T22:28:51.190Z
+ 2026-01-22T01:05:04.376Z
https://docs.axolotl.ai/docs/api/utils.distributed.html
- 2026-01-21T22:28:51.916Z
+ 2026-01-22T01:05:05.101Z
https://docs.axolotl.ai/docs/api/utils.model_shard_quant.html
- 2026-01-21T22:28:51.822Z
+ 2026-01-22T01:05:05.008Z
https://docs.axolotl.ai/docs/api/kernels.lora.html
- 2026-01-21T22:28:51.624Z
+ 2026-01-22T01:05:04.810Z
https://docs.axolotl.ai/docs/api/cli.main.html
- 2026-01-21T22:28:50.884Z
+ 2026-01-22T01:05:04.070Z
https://docs.axolotl.ai/docs/api/integrations.spectrum.args.html
- 2026-01-21T22:28:52.277Z
+ 2026-01-22T01:05:05.460Z
https://docs.axolotl.ai/docs/api/utils.optimizers.adopt.html
- 2026-01-21T22:28:51.933Z
+ 2026-01-22T01:05:05.118Z
https://docs.axolotl.ai/docs/api/cli.cloud.modal_.html
- 2026-01-21T22:28:51.047Z
+ 2026-01-22T01:05:04.233Z
https://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_flash.html
- 2026-01-21T22:28:51.667Z
+ 2026-01-22T01:05:04.853Z
https://docs.axolotl.ai/docs/api/core.builders.base.html
- 2026-01-21T22:28:50.765Z
+ 2026-01-22T01:05:03.951Z
https://docs.axolotl.ai/docs/api/utils.schemas.trl.html
- 2026-01-21T22:28:52.038Z
+ 2026-01-22T01:05:05.221Z
https://docs.axolotl.ai/docs/api/cli.utils.args.html
- 2026-01-21T22:28:51.063Z
+ 2026-01-22T01:05:04.249Z
https://docs.axolotl.ai/docs/api/core.trainers.base.html
- 2026-01-21T22:28:51.115Z
+ 2026-01-22T01:05:04.302Z
https://docs.axolotl.ai/docs/api/monkeypatch.llama_patch_multipack.html
- 2026-01-21T22:28:51.727Z
+ 2026-01-22T01:05:04.913Z
https://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_xformers.html
- 2026-01-21T22:28:51.668Z
+ 2026-01-22T01:05:04.855Z
https://docs.axolotl.ai/docs/api/utils.schemas.model.html
- 2026-01-21T22:28:51.991Z
+ 2026-01-22T01:05:05.176Z
https://docs.axolotl.ai/docs/api/prompt_strategies.kto.chatml.html
- 2026-01-21T22:28:51.487Z
+ 2026-01-22T01:05:04.674Z
https://docs.axolotl.ai/docs/api/utils.callbacks.mlflow_.html
- 2026-01-21T22:28:52.404Z
+ 2026-01-22T01:05:05.587Z
https://docs.axolotl.ai/docs/api/common.datasets.html
- 2026-01-21T22:28:52.299Z
+ 2026-01-22T01:05:05.482Z
https://docs.axolotl.ai/docs/api/utils.schemas.datasets.html
- 2026-01-21T22:28:52.022Z
+ 2026-01-22T01:05:05.207Z
https://docs.axolotl.ai/docs/api/cli.utils.fetch.html
- 2026-01-21T22:28:51.069Z
+ 2026-01-22T01:05:04.256Z
https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chatml.html
- 2026-01-21T22:28:51.461Z
+ 2026-01-22T01:05:04.648Z
https://docs.axolotl.ai/docs/api/monkeypatch.relora.html
- 2026-01-21T22:28:51.677Z
+ 2026-01-22T01:05:04.863Z
https://docs.axolotl.ai/docs/api/cli.evaluate.html
- 2026-01-21T22:28:50.904Z
+ 2026-01-22T01:05:04.090Z
https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.zephyr.html
- 2026-01-21T22:28:51.463Z
+ 2026-01-22T01:05:04.650Z
https://docs.axolotl.ai/docs/api/core.trainers.utils.html
- 2026-01-21T22:28:51.178Z
+ 2026-01-22T01:05:04.364Z
https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_w_system.html
- 2026-01-21T22:28:51.354Z
+ 2026-01-22T01:05:04.540Z
https://docs.axolotl.ai/docs/api/utils.chat_templates.html
- 2026-01-21T22:28:51.809Z
+ 2026-01-22T01:05:04.995Z
https://docs.axolotl.ai/docs/api/utils.data.streaming.html
- 2026-01-21T22:28:51.935Z
+ 2026-01-22T01:05:05.120Z
https://docs.axolotl.ai/docs/api/utils.bench.html
- 2026-01-21T22:28:51.826Z
+ 2026-01-22T01:05:05.012Z
https://docs.axolotl.ai/docs/api/common.architectures.html
- 2026-01-21T22:28:52.279Z
+ 2026-01-22T01:05:05.462Z
https://docs.axolotl.ai/docs/api/cli.checks.html
- 2026-01-21T22:28:50.941Z
+ 2026-01-22T01:05:04.128Z
https://docs.axolotl.ai/docs/api/core.trainers.dpo.trainer.html
- 2026-01-21T22:28:51.148Z
+ 2026-01-22T01:05:04.335Z
https://docs.axolotl.ai/docs/api/integrations.base.html
- 2026-01-21T22:28:52.250Z
+ 2026-01-22T01:05:05.433Z
https://docs.axolotl.ai/docs/api/cli.utils.train.html
- 2026-01-21T22:28:51.097Z
+ 2026-01-22T01:05:04.284Z
https://docs.axolotl.ai/docs/api/utils.collators.mamba.html
- 2026-01-21T22:28:52.329Z
+ 2026-01-22T01:05:05.512Z
https://docs.axolotl.ai/docs/api/cli.art.html
- 2026-01-21T22:28:50.933Z
+ 2026-01-22T01:05:04.120Z
https://docs.axolotl.ai/docs/api/monkeypatch.trainer_fsdp_optim.html
- 2026-01-21T22:28:51.738Z
+ 2026-01-22T01:05:04.924Z
https://docs.axolotl.ai/docs/api/logging_config.html
- 2026-01-21T22:28:50.757Z
+ 2026-01-22T01:05:03.943Z
https://docs.axolotl.ai/docs/api/utils.freeze.html
- 2026-01-21T22:28:51.836Z
+ 2026-01-22T01:05:05.022Z
https://docs.axolotl.ai/docs/api/prompt_strategies.metharme.html
- 2026-01-21T22:28:51.409Z
+ 2026-01-22T01:05:04.595Z
https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_chat.html
- 2026-01-21T22:28:51.337Z
+ 2026-01-22T01:05:04.523Z
https://docs.axolotl.ai/docs/api/monkeypatch.stablelm_attn_hijack_flash.html
- 2026-01-21T22:28:51.734Z
+ 2026-01-22T01:05:04.920Z
https://docs.axolotl.ai/docs/api/models.mamba.modeling_mamba.html
- 2026-01-21T22:28:52.301Z
+ 2026-01-22T01:05:05.483Z
https://docs.axolotl.ai/docs/api/core.trainers.trl.html
- 2026-01-21T22:28:51.133Z
+ 2026-01-22T01:05:04.320Z
https://docs.axolotl.ai/docs/api/prompt_strategies.input_output.html
- 2026-01-21T22:28:51.394Z
+ 2026-01-22T01:05:04.581Z
https://docs.axolotl.ai/docs/api/loaders.constants.html
- 2026-01-21T22:28:51.230Z
+ 2026-01-22T01:05:04.416Z
https://docs.axolotl.ai/docs/api/monkeypatch.data.batch_dataset_fetcher.html
- 2026-01-21T22:28:51.759Z
+ 2026-01-22T01:05:04.945Z
https://docs.axolotl.ai/docs/api/cli.vllm_serve.html
- 2026-01-21T22:28:51.035Z
+ 2026-01-22T01:05:04.221Z
https://docs.axolotl.ai/docs/api/prompt_tokenizers.html
- 2026-01-21T22:28:50.745Z
+ 2026-01-22T01:05:03.931Z
https://docs.axolotl.ai/docs/api/cli.args.html
- 2026-01-21T22:28:50.929Z
+ 2026-01-22T01:05:04.115Z
https://docs.axolotl.ai/docs/api/cli.inference.html
- 2026-01-21T22:28:50.986Z
+ 2026-01-22T01:05:04.172Z
https://docs.axolotl.ai/docs/api/cli.utils.load.html
- 2026-01-21T22:28:51.076Z
+ 2026-01-22T01:05:04.262Z
https://docs.axolotl.ai/docs/api/cli.preprocess.html
- 2026-01-21T22:28:51.020Z
+ 2026-01-22T01:05:04.207Z
https://docs.axolotl.ai/docs/api/utils.callbacks.profiler.html
- 2026-01-21T22:28:52.397Z
+ 2026-01-22T01:05:05.580Z
https://docs.axolotl.ai/docs/api/utils.callbacks.perplexity.html
- 2026-01-21T22:28:52.393Z
+ 2026-01-22T01:05:05.576Z
https://docs.axolotl.ai/docs/api/core.chat.format.chatml.html
- 2026-01-21T22:28:50.824Z
+ 2026-01-22T01:05:04.010Z
https://docs.axolotl.ai/docs/api/integrations.grokfast.optimizer.html
- 2026-01-21T22:28:52.255Z
+ 2026-01-22T01:05:05.438Z
https://docs.axolotl.ai/docs/api/integrations.kd.trainer.html
- 2026-01-21T22:28:52.265Z
+ 2026-01-22T01:05:05.447Z
https://docs.axolotl.ai/docs/api/monkeypatch.unsloth_.html
- 2026-01-21T22:28:51.748Z
+ 2026-01-22T01:05:04.934Z
https://docs.axolotl.ai/docs/api/core.chat.format.llama3x.html
- 2026-01-21T22:28:50.825Z
+ 2026-01-22T01:05:04.012Z
https://docs.axolotl.ai/docs/models/gemma3n.html
- 2026-01-21T22:29:08.616Z
+ 2026-01-22T01:05:22.798Z
https://docs.axolotl.ai/docs/models/qwen3-next.html
- 2026-01-21T22:29:08.615Z
+ 2026-01-22T01:05:22.798Z
https://docs.axolotl.ai/docs/models/index.html
- 2026-01-21T22:29:08.620Z
+ 2026-01-22T01:05:22.802Z
https://docs.axolotl.ai/docs/models/magistral/think.html
- 2026-01-21T22:29:08.612Z
+ 2026-01-22T01:05:22.791Z
https://docs.axolotl.ai/docs/models/kimi-linear.html
- 2026-01-21T22:29:08.607Z
+ 2026-01-22T01:05:22.787Z
https://docs.axolotl.ai/docs/models/internvl3_5.html
- 2026-01-21T22:29:08.608Z
+ 2026-01-22T01:05:22.788Z
https://docs.axolotl.ai/docs/models/arcee.html
- 2026-01-21T22:29:08.610Z
+ 2026-01-22T01:05:22.789Z
https://docs.axolotl.ai/docs/models/LiquidAI.html
- 2026-01-21T22:29:08.618Z
+ 2026-01-22T01:05:22.801Z
https://docs.axolotl.ai/docs/models/magistral.html
- 2026-01-21T22:29:08.612Z
+ 2026-01-22T01:05:22.791Z
https://docs.axolotl.ai/docs/models/voxtral.html
- 2026-01-21T22:29:08.613Z
+ 2026-01-22T01:05:22.793Z
https://docs.axolotl.ai/docs/models/trinity.html
- 2026-01-21T22:29:08.609Z
+ 2026-01-22T01:05:22.789Z
https://docs.axolotl.ai/docs/models/ministral.html
- 2026-01-21T22:29:08.613Z
+ 2026-01-22T01:05:22.792Z
https://docs.axolotl.ai/docs/models/llama-4.html
- 2026-01-21T22:29:08.614Z
+ 2026-01-22T01:05:22.794Z
https://docs.axolotl.ai/docs/models/devstral.html
- 2026-01-21T22:29:08.614Z
+ 2026-01-22T01:05:22.793Z
https://docs.axolotl.ai/docs/models/ministral3.html
- 2026-01-21T22:29:08.610Z
+ 2026-01-22T01:05:22.790Z
https://docs.axolotl.ai/docs/models/ministral3/vision.html
- 2026-01-21T22:29:08.611Z
+ 2026-01-22T01:05:22.790Z
https://docs.axolotl.ai/docs/models/plano.html
- 2026-01-21T22:29:08.608Z
+ 2026-01-22T01:05:22.787Z
https://docs.axolotl.ai/docs/reward_modelling.html
- 2026-01-21T22:25:37.018Z
+ 2026-01-22T01:01:27.543Z
https://docs.axolotl.ai/docs/quantize.html
- 2026-01-21T22:25:37.018Z
+ 2026-01-22T01:01:27.543Z
https://docs.axolotl.ai/docs/fsdp_qlora.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/nd_parallelism.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.543Z
https://docs.axolotl.ai/docs/batch_vs_grad.html
- 2026-01-21T22:25:37.012Z
+ 2026-01-22T01:01:27.537Z
https://docs.axolotl.ai/docs/multi-node.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/rlhf.html
- 2026-01-21T22:25:37.018Z
+ 2026-01-22T01:01:27.543Z
https://docs.axolotl.ai/docs/dataset-formats/stepwise_supervised.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/dataset-formats/pretraining.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/dataset-formats/tokenized.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/dataset-formats/template_free.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/multi-gpu.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/input_output.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/docker.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/checkpoint_saving.html
- 2026-01-21T22:25:37.012Z
+ 2026-01-22T01:01:27.537Z
https://docs.axolotl.ai/docs/multipack.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/qat.html
- 2026-01-21T22:25:37.018Z
+ 2026-01-22T01:01:27.543Z
https://docs.axolotl.ai/docs/lr_groups.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/getting-started.html
- 2026-01-21T22:25:37.013Z
+ 2026-01-22T01:01:27.538Z
https://docs.axolotl.ai/docs/nccl.html
- 2026-01-21T22:25:37.017Z
+ 2026-01-22T01:01:27.542Z
https://docs.axolotl.ai/docs/telemetry.html
- 2026-01-21T22:25:37.018Z
+ 2026-01-22T01:01:27.544Z
https://docs.axolotl.ai/docs/unsloth.html
- 2026-01-21T22:25:37.019Z
+ 2026-01-22T01:01:27.544Z
https://docs.axolotl.ai/src/axolotl/integrations/LICENSE.html
- 2026-01-21T22:25:37.044Z
+ 2026-01-22T01:01:27.576Z