From 372f4717923dd24fa87ad04104b1028be9930a98 Mon Sep 17 00:00:00 2001 From: Quarto GHA Workflow Runner Date: Fri, 22 Aug 2025 18:34:46 +0000 Subject: [PATCH] Built site for gh-pages --- .nojekyll | 2 +- docs/api/index.html | 2 +- ...onkeypatch.data.batch_dataset_fetcher.html | 98 ++++- search.json | 13 +- sitemap.xml | 394 +++++++++--------- 5 files changed, 305 insertions(+), 204 deletions(-) diff --git a/.nojekyll b/.nojekyll index fc6ad5e48..841139098 100644 --- a/.nojekyll +++ b/.nojekyll @@ -1 +1 @@ -59230ab8 \ No newline at end of file +29a669bc \ No newline at end of file diff --git a/docs/api/index.html b/docs/api/index.html index abb4e4431..a58fca786 100644 --- a/docs/api/index.html +++ b/docs/api/index.html @@ -953,7 +953,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true}); monkeypatch.data.batch_dataset_fetcher -monkey patches for the dataset fetcher to handle batches of packed indexes +Monkey patches for the dataset fetcher to handle batches of packed indexes. monkeypatch.mixtral diff --git a/docs/api/monkeypatch.data.batch_dataset_fetcher.html b/docs/api/monkeypatch.data.batch_dataset_fetcher.html index 8e2ff246b..976b080dc 100644 --- a/docs/api/monkeypatch.data.batch_dataset_fetcher.html +++ b/docs/api/monkeypatch.data.batch_dataset_fetcher.html @@ -20,6 +20,41 @@ ul.task-list li input[type="checkbox"] { margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */ vertical-align: middle; } +/* CSS for syntax highlighting */ +html { -webkit-text-size-adjust: 100%; } +pre > code.sourceCode { white-space: pre; position: relative; } +pre > code.sourceCode > span { display: inline-block; line-height: 1.25; } +pre > code.sourceCode > span:empty { height: 1.2em; } +.sourceCode { overflow: visible; } +code.sourceCode > span { color: inherit; text-decoration: inherit; } +div.sourceCode { margin: 1em 0; } +pre.sourceCode { margin: 0; } +@media screen { +div.sourceCode { overflow: auto; } +} +@media print 
{ +pre > code.sourceCode { white-space: pre-wrap; } +pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; } +} +pre.numberSource code + { counter-reset: source-line 0; } +pre.numberSource code > span + { position: relative; left: -4em; counter-increment: source-line; } +pre.numberSource code > span > a:first-child::before + { content: counter(source-line); + position: relative; left: -1em; text-align: right; vertical-align: baseline; + border: none; display: inline-block; + -webkit-touch-callout: none; -webkit-user-select: none; + -khtml-user-select: none; -moz-user-select: none; + -ms-user-select: none; user-select: none; + padding: 0 4px; width: 4em; + } +pre.numberSource { margin-left: 3em; padding-left: 4px; } +div.sourceCode + { } +@media screen { +pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; } +} @@ -456,7 +491,16 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});

On this page

@@ -469,9 +513,59 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});

monkeypatch.data.batch_dataset_fetcher

monkeypatch.data.batch_dataset_fetcher

-

monkey patches for the dataset fetcher to handle batches of packed indexes

+

Monkey patches for the dataset fetcher to handle batches of packed indexes.

+
+

Functions

+ + + + + + + + + + + + + + + + + + + + + + + + + +
NameDescription
apply_multipack_dataloader_patchThis patch allows DataLoader to correctly process batches that contain multiple bins
patch_fetchersApply patches to PyTorch’s DataLoader components.
patched_worker_loopWorker loop that ensures patches are applied in worker processes.
remove_multipack_dataloader_patchRemove the monkeypatch and restore original PyTorch DataLoader behavior.
+
+

apply_multipack_dataloader_patch

+
monkeypatch.data.batch_dataset_fetcher.apply_multipack_dataloader_patch()
+

This patch allows DataLoader to correctly process batches that contain multiple bins +of packed sequences.

+
+
+
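The apply/remove pair documented on this page follows the standard reversible-monkeypatch pattern: save a handle to the original method, swap in a replacement that understands bins of packed indexes, and restore the original on removal. A minimal sketch of that pattern — `Fetcher`, `apply_multipack_patch`, and `remove_multipack_patch` are stand-in names, since the real patch targets a PyTorch-internal fetcher whose interface is not shown on this page:

```python
# Sketch of a reversible monkeypatch for a dataset fetcher.
# `Fetcher` is a stand-in; the real patch targets PyTorch's internal
# dataset fetcher, whose exact interface is not shown here.

class Fetcher:
    def __init__(self, dataset):
        self.dataset = dataset

    def fetch(self, indices):
        # Original behavior: a flat list of indices -> one batch.
        return [self.dataset[i] for i in indices]

_original_fetch = Fetcher.fetch  # keep a handle so the patch is reversible

def apply_multipack_patch():
    def patched_fetch(self, indices):
        # Patched behavior: `indices` may be a batch of bins (lists of
        # packed indexes); fetch each bin separately, preserving structure.
        if indices and isinstance(indices[0], list):
            return [[self.dataset[i] for i in bin_] for bin_ in indices]
        return _original_fetch(self, indices)
    Fetcher.fetch = patched_fetch

def remove_multipack_patch():
    # Restore the original, unpatched behavior.
    Fetcher.fetch = _original_fetch
```

Usage under this sketch: after `apply_multipack_patch()`, `Fetcher(list(range(100))).fetch([[0, 1], [2, 3]])` returns `[[0, 1], [2, 3]]` with the bin structure intact, and `remove_multipack_patch()` restores flat-index fetching.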

patch_fetchers

+
monkeypatch.data.batch_dataset_fetcher.patch_fetchers()
+

Apply patches to PyTorch’s DataLoader components.

+
+
+

patched_worker_loop

+
monkeypatch.data.batch_dataset_fetcher.patched_worker_loop(*args, **kwargs)
+

Worker loop that ensures patches are applied in worker processes.

+
+
+

remove_multipack_dataloader_patch

+
monkeypatch.data.batch_dataset_fetcher.remove_multipack_dataloader_patch()
+

Remove the monkeypatch and restore original PyTorch DataLoader behavior.

+
+
diff --git a/search.json b/search.json index 942879b78..b239925ad 100644 --- a/search.json +++ b/search.json @@ -1069,7 +1069,7 @@ "href": "docs/api/index.html", "title": "API Reference", "section": "", - "text": "Core functionality for training\n\n\n\ntrain\nPrepare and train a model on a dataset. Can also infer from a model or merge lora\n\n\nevaluate\nModule for evaluating models.\n\n\ndatasets\nModule containing Dataset functionality\n\n\nconvert\nModule containing File Reader, File Writer, Json Parser, and Jsonl Serializer classes\n\n\nprompt_tokenizers\nModule containing PromptTokenizingStrategy and Prompter classes\n\n\nlogging_config\nCommon logging module for axolotl\n\n\ncore.builders.base\nBase class for trainer builder\n\n\ncore.builders.causal\nBuilder for causal trainers\n\n\ncore.builders.rl\nBuilder for RLHF trainers\n\n\ncore.training_args\nextra axolotl specific training args\n\n\ncore.chat.messages\ninternal message representations of chat messages\n\n\ncore.chat.format.chatml\nChatML transformation functions for MessageContents\n\n\ncore.chat.format.llama3x\nLlama 3.x chat formatting functions for MessageContents\n\n\ncore.chat.format.shared\nshared functions for format transforms\n\n\ncore.datasets.chat\nchat dataset module\n\n\ncore.datasets.transforms.chat_builder\nThis module contains a function that builds a transform that takes a row from the dataset and converts it to a Chat.\n\n\n\n\n\n\nCommand-line interface\n\n\n\ncli.main\nClick CLI definitions for various axolotl commands.\n\n\ncli.train\nCLI to run training on a model.\n\n\ncli.evaluate\nCLI to run evaluation on a model.\n\n\ncli.args\nModule for axolotl CLI command arguments.\n\n\ncli.art\nAxolotl ASCII logo utils.\n\n\ncli.checks\nVarious checks for Axolotl CLI.\n\n\ncli.config\nConfiguration loading and processing.\n\n\ncli.delinearize_llama4\nCLI tool to delinearize quantized/Linearized Llama-4 models.\n\n\ncli.inference\nCLI to run inference on a trained 
model.\n\n\ncli.merge_lora\nCLI to merge a trained LoRA into a base model.\n\n\ncli.merge_sharded_fsdp_weights\nCLI to merge sharded FSDP model checkpoints into a single combined checkpoint.\n\n\ncli.preprocess\nCLI to run preprocessing of a dataset.\n\n\ncli.quantize\nCLI to post-training quantize a model using torchao\n\n\ncli.vllm_serve\nCLI to start the vllm server for online RL\n\n\ncli.cloud.base\nbase class for cloud platforms from cli\n\n\ncli.cloud.modal_\nModal Cloud support from CLI\n\n\ncli.utils\nInit for axolotl.cli.utils module.\n\n\ncli.utils.args\nUtilities for axolotl CLI args.\n\n\ncli.utils.fetch\nUtilities for axolotl fetch CLI command.\n\n\ncli.utils.load\nUtilities for model, tokenizer, etc. loading.\n\n\ncli.utils.sweeps\nUtilities for handling sweeps over configs for axolotl train CLI command\n\n\ncli.utils.train\nUtilities for axolotl train CLI command.\n\n\n\n\n\n\nTraining implementations\n\n\n\ncore.trainers.base\nModule for customized trainers\n\n\ncore.trainers.trl\nModule for TRL RL trainers\n\n\ncore.trainers.mamba\nModule for mamba trainer\n\n\ncore.trainers.dpo.trainer\nDPO trainer for axolotl\n\n\ncore.trainers.grpo.trainer\nAxolotl GRPO trainers (with and without sequence parallelism handling)\n\n\ncore.trainers.grpo.sampler\nRepeat random sampler (similar to the one implemented in\n\n\ncore.trainers.utils\nUtils for Axolotl trainers\n\n\n\n\n\n\nFunctionality for loading and patching models, tokenizers, etc.\n\n\n\nloaders.model\nModel loader class implementation for loading, configuring, and patching various models.\n\n\nloaders.tokenizer\nTokenizer loading functionality and associated utils\n\n\nloaders.processor\nProcessor loading functionality for multi-modal models\n\n\nloaders.adapter\nAdapter loading functionality, including LoRA / QLoRA and associated utils\n\n\nloaders.patch_manager\nPatch manager class implementation to complement axolotl.loaders.ModelLoader.\n\n\nloaders.constants\nShared constants for 
axolotl.loaders module\n\n\n\n\n\n\nMixin classes for augmenting trainers\n\n\n\ncore.trainers.mixins.optimizer\nModule for Axolotl trainer optimizer mixin\n\n\ncore.trainers.mixins.rng_state_loader\nTemporary fix/override for bug in resume from checkpoint\n\n\ncore.trainers.mixins.scheduler\nModule for Axolotl trainer scheduler mixin\n\n\n\n\n\n\nContext managers for altering trainer behaviors\n\n\n\nutils.ctx_managers.sequence_parallel\nModule for Axolotl trainer sequence parallelism manager and utilities\n\n\n\n\n\n\nPrompt formatting strategies\n\n\n\nprompt_strategies.base\nmodule for base dataset transform strategies\n\n\nprompt_strategies.chat_template\nHF Chat Templates prompt strategy\n\n\nprompt_strategies.alpaca_chat\nModule for Alpaca prompt strategy classes\n\n\nprompt_strategies.alpaca_instruct\nModule loading the AlpacaInstructPromptTokenizingStrategy class\n\n\nprompt_strategies.alpaca_w_system\nPrompt strategies loader for alpaca instruction datasets with system prompts\n\n\nprompt_strategies.user_defined\nUser Defined prompts with configuration from the YML config\n\n\nprompt_strategies.llama2_chat\nPrompt Strategy for finetuning Llama2 chat models\n\n\nprompt_strategies.completion\nBasic completion text\n\n\nprompt_strategies.input_output\nModule for plain input/output prompt pairs\n\n\nprompt_strategies.stepwise_supervised\nModule for stepwise datasets, typically including a prompt and reasoning traces,\n\n\nprompt_strategies.metharme\nModule containing the MetharmenPromptTokenizingStrategy and MetharmePrompter class\n\n\nprompt_strategies.orcamini\nPrompt Strategy for finetuning Orca Mini (v2) models\n\n\nprompt_strategies.pygmalion\nModule containing the PygmalionPromptTokenizingStrategy and PygmalionPrompter class\n\n\nprompt_strategies.messages.chat\nChat dataset wrapping strategy for new internal messages representations\n\n\nprompt_strategies.dpo.chat_template\nDPO prompt strategies for using tokenizer chat 
templates.\n\n\nprompt_strategies.dpo.llama3\nDPO strategies for llama-3 chat template\n\n\nprompt_strategies.dpo.chatml\nDPO strategies for chatml\n\n\nprompt_strategies.dpo.zephyr\nDPO strategies for zephyr\n\n\nprompt_strategies.dpo.user_defined\nUser-defined DPO strategies\n\n\nprompt_strategies.dpo.passthrough\nDPO prompt strategies passthrough/zero-processing strategy\n\n\nprompt_strategies.kto.llama3\nKTO strategies for llama-3 chat template\n\n\nprompt_strategies.kto.chatml\nKTO strategies for chatml\n\n\nprompt_strategies.kto.user_defined\nUser-defined KTO strategies\n\n\nprompt_strategies.orpo.chat_template\nchatml prompt tokenization strategy for ORPO\n\n\nprompt_strategies.bradley_terry.llama3\nchatml transforms for datasets with system, input, chosen, rejected to match llama3 chat template\n\n\n\n\n\n\nLow-level performance optimizations\n\n\n\nkernels.lora\nModule for definition of Low-Rank Adaptation (LoRA) Triton kernels.\n\n\nkernels.geglu\nModule for definition of GEGLU Triton kernels.\n\n\nkernels.swiglu\nModule for definition of SwiGLU Triton kernels.\n\n\nkernels.quantize\nDequantization utilities for bitsandbytes integration.\n\n\nkernels.utils\nUtilities for axolotl.kernels submodules.\n\n\n\n\n\n\nRuntime patches for model optimizations\n\n\n\nmonkeypatch.llama_attn_hijack_flash\nFlash attention monkey patch for llama model\n\n\nmonkeypatch.llama_attn_hijack_xformers\nDirectly copied the code from https://raw.githubusercontent.com/oobabooga/text-generation-webui/main/modules/llama_attn_hijack.py and made some adjustments\n\n\nmonkeypatch.mistral_attn_hijack_flash\nFlash attention monkey patch for mistral model\n\n\nmonkeypatch.multipack\nmultipack patching for v2 of sample packing\n\n\nmonkeypatch.relora\nImplements the ReLoRA training procedure from https://arxiv.org/abs/2307.05695, minus the initial full fine-tune.\n\n\nmonkeypatch.llama_expand_mask\nexpands the binary attention mask per 3.2.2 of 
https://arxiv.org/pdf/2107.02027.pdf\n\n\nmonkeypatch.lora_kernels\nModule for patching custom LoRA Triton kernels and torch.autograd functions.\n\n\nmonkeypatch.utils\nShared utils for the monkeypatches\n\n\nmonkeypatch.btlm_attn_hijack_flash\nFlash attention monkey patch for cerebras btlm model\n\n\nmonkeypatch.llama_patch_multipack\nPatched LlamaAttention to use torch.nn.functional.scaled_dot_product_attention\n\n\nmonkeypatch.stablelm_attn_hijack_flash\nPyTorch StableLM Epoch model.\n\n\nmonkeypatch.trainer_fsdp_optim\nfix for FSDP optimizer save in trainer w 4.47.0\n\n\nmonkeypatch.transformers_fa_utils\nsee https://github.com/huggingface/transformers/pull/35834\n\n\nmonkeypatch.unsloth_\nmodule for patching with unsloth optimizations\n\n\nmonkeypatch.data.batch_dataset_fetcher\nmonkey patches for the dataset fetcher to handle batches of packed indexes\n\n\nmonkeypatch.mixtral\nPatches to support multipack for mixtral\n\n\nmonkeypatch.gradient_checkpointing.offload_cpu\nCPU offloaded checkpointing\n\n\nmonkeypatch.gradient_checkpointing.offload_disk\nDISCO - DIsk-based Storage and Checkpointing with Optimized prefetching\n\n\n\n\n\n\nUtility functions\n\n\n\nutils.tokenization\nModule for tokenization utilities\n\n\nutils.chat_templates\nThis module provides functionality for selecting chat templates based on user choices.\n\n\nutils.lora\nmodule to get the state dict of a merged lora model\n\n\nutils.model_shard_quant\nmodule to handle loading model on cpu/meta device for FSDP\n\n\nutils.bench\nBenchmarking and measurement utilities\n\n\nutils.freeze\nmodule to freeze/unfreeze parameters by name\n\n\nutils.trainer\nModule containing the Trainer class and related functions\n\n\nutils.schedulers\nModule for custom LRScheduler class\n\n\nutils.distributed\nUtilities for distributed functionality.\n\n\nutils.dict\nModule containing the DictDefault class\n\n\nutils.optimizers.adopt\nCopied from https://github.com/iShohei220/adopt\n\n\nutils.data.pretraining\ndata 
handling specific to pretraining\n\n\nutils.data.sft\nData handling specific to SFT.\n\n\nutils.quantization\nUtilities for quantization including QAT and PTQ using torchao.\n\n\n\n\n\n\nPydantic data models for Axolotl config\n\n\n\nutils.schemas.config\nModule with Pydantic models for configuration.\n\n\nutils.schemas.model\nPydantic models for model input / output, etc. configuration\n\n\nutils.schemas.training\nPydantic models for training hyperparameters\n\n\nutils.schemas.datasets\nPydantic models for datasets-related configuration\n\n\nutils.schemas.peft\nPydantic models for PEFT-related configuration\n\n\nutils.schemas.trl\nPydantic models for TRL trainer configuration\n\n\nutils.schemas.multimodal\nPydantic models for multimodal-related configuration\n\n\nutils.schemas.integrations\nPydantic models for Axolotl integrations\n\n\nutils.schemas.enums\nEnums for Axolotl input config\n\n\nutils.schemas.utils\nUtilities for Axolotl Pydantic models\n\n\n\n\n\n\nThird-party integrations and extensions\n\n\n\nintegrations.base\nBase class for all plugins.\n\n\nintegrations.cut_cross_entropy.args\nModule for handling Cut Cross Entropy input arguments.\n\n\nintegrations.grokfast.optimizer\n\n\n\nintegrations.kd.trainer\nKD trainer\n\n\nintegrations.liger.args\nModule for handling LIGER input arguments.\n\n\nintegrations.lm_eval.args\nModule for handling lm eval harness input arguments.\n\n\nintegrations.spectrum.args\nModule for handling Spectrum input arguments.\n\n\n\n\n\n\nCommon utilities and shared functionality\n\n\n\ncommon.architectures\nCommon architecture specific constants\n\n\ncommon.const\nVarious shared constants\n\n\ncommon.datasets\nDataset loading utilities.\n\n\n\n\n\n\nCustom model implementations\n\n\n\nmodels.mamba.modeling_mamba\n\n\n\n\n\n\n\nData processing utilities\n\n\n\nutils.collators.core\nbasic shared collator constants\n\n\nutils.collators.batching\nData collators for axolotl to pad labels and position_ids for packed 
sequences\n\n\nutils.collators.mamba\ncollators for Mamba\n\n\nutils.collators.mm_chat\nCollators for multi-modal chat messages and packing\n\n\nutils.samplers.multipack\nMultipack Batch Sampler - An efficient batch sampler for packing variable-length sequences\n\n\n\n\n\n\nTraining callbacks\n\n\n\nutils.callbacks.perplexity\ncallback to calculate perplexity as an evaluation metric.\n\n\nutils.callbacks.profiler\nHF Trainer callback for creating pytorch profiling snapshots\n\n\nutils.callbacks.lisa\nmodule for LISA\n\n\nutils.callbacks.mlflow_\nMLFlow module for trainer callbacks\n\n\nutils.callbacks.comet_\nComet module for trainer callbacks\n\n\nutils.callbacks.qat\nQAT Callback for HF Causal Trainer" + "text": "Core functionality for training\n\n\n\ntrain\nPrepare and train a model on a dataset. Can also infer from a model or merge lora\n\n\nevaluate\nModule for evaluating models.\n\n\ndatasets\nModule containing Dataset functionality\n\n\nconvert\nModule containing File Reader, File Writer, Json Parser, and Jsonl Serializer classes\n\n\nprompt_tokenizers\nModule containing PromptTokenizingStrategy and Prompter classes\n\n\nlogging_config\nCommon logging module for axolotl\n\n\ncore.builders.base\nBase class for trainer builder\n\n\ncore.builders.causal\nBuilder for causal trainers\n\n\ncore.builders.rl\nBuilder for RLHF trainers\n\n\ncore.training_args\nextra axolotl specific training args\n\n\ncore.chat.messages\ninternal message representations of chat messages\n\n\ncore.chat.format.chatml\nChatML transformation functions for MessageContents\n\n\ncore.chat.format.llama3x\nLlama 3.x chat formatting functions for MessageContents\n\n\ncore.chat.format.shared\nshared functions for format transforms\n\n\ncore.datasets.chat\nchat dataset module\n\n\ncore.datasets.transforms.chat_builder\nThis module contains a function that builds a transform that takes a row from the dataset and converts it to a Chat.\n\n\n\n\n\n\nCommand-line interface\n\n\n\ncli.main\nClick CLI 
definitions for various axolotl commands.\n\n\ncli.train\nCLI to run training on a model.\n\n\ncli.evaluate\nCLI to run evaluation on a model.\n\n\ncli.args\nModule for axolotl CLI command arguments.\n\n\ncli.art\nAxolotl ASCII logo utils.\n\n\ncli.checks\nVarious checks for Axolotl CLI.\n\n\ncli.config\nConfiguration loading and processing.\n\n\ncli.delinearize_llama4\nCLI tool to delinearize quantized/Linearized Llama-4 models.\n\n\ncli.inference\nCLI to run inference on a trained model.\n\n\ncli.merge_lora\nCLI to merge a trained LoRA into a base model.\n\n\ncli.merge_sharded_fsdp_weights\nCLI to merge sharded FSDP model checkpoints into a single combined checkpoint.\n\n\ncli.preprocess\nCLI to run preprocessing of a dataset.\n\n\ncli.quantize\nCLI to post-training quantize a model using torchao\n\n\ncli.vllm_serve\nCLI to start the vllm server for online RL\n\n\ncli.cloud.base\nbase class for cloud platforms from cli\n\n\ncli.cloud.modal_\nModal Cloud support from CLI\n\n\ncli.utils\nInit for axolotl.cli.utils module.\n\n\ncli.utils.args\nUtilities for axolotl CLI args.\n\n\ncli.utils.fetch\nUtilities for axolotl fetch CLI command.\n\n\ncli.utils.load\nUtilities for model, tokenizer, etc. 
loading.\n\n\ncli.utils.sweeps\nUtilities for handling sweeps over configs for axolotl train CLI command\n\n\ncli.utils.train\nUtilities for axolotl train CLI command.\n\n\n\n\n\n\nTraining implementations\n\n\n\ncore.trainers.base\nModule for customized trainers\n\n\ncore.trainers.trl\nModule for TRL RL trainers\n\n\ncore.trainers.mamba\nModule for mamba trainer\n\n\ncore.trainers.dpo.trainer\nDPO trainer for axolotl\n\n\ncore.trainers.grpo.trainer\nAxolotl GRPO trainers (with and without sequence parallelism handling)\n\n\ncore.trainers.grpo.sampler\nRepeat random sampler (similar to the one implemented in\n\n\ncore.trainers.utils\nUtils for Axolotl trainers\n\n\n\n\n\n\nFunctionality for loading and patching models, tokenizers, etc.\n\n\n\nloaders.model\nModel loader class implementation for loading, configuring, and patching various models.\n\n\nloaders.tokenizer\nTokenizer loading functionality and associated utils\n\n\nloaders.processor\nProcessor loading functionality for multi-modal models\n\n\nloaders.adapter\nAdapter loading functionality, including LoRA / QLoRA and associated utils\n\n\nloaders.patch_manager\nPatch manager class implementation to complement axolotl.loaders.ModelLoader.\n\n\nloaders.constants\nShared constants for axolotl.loaders module\n\n\n\n\n\n\nMixin classes for augmenting trainers\n\n\n\ncore.trainers.mixins.optimizer\nModule for Axolotl trainer optimizer mixin\n\n\ncore.trainers.mixins.rng_state_loader\nTemporary fix/override for bug in resume from checkpoint\n\n\ncore.trainers.mixins.scheduler\nModule for Axolotl trainer scheduler mixin\n\n\n\n\n\n\nContext managers for altering trainer behaviors\n\n\n\nutils.ctx_managers.sequence_parallel\nModule for Axolotl trainer sequence parallelism manager and utilities\n\n\n\n\n\n\nPrompt formatting strategies\n\n\n\nprompt_strategies.base\nmodule for base dataset transform strategies\n\n\nprompt_strategies.chat_template\nHF Chat Templates prompt 
strategy\n\n\nprompt_strategies.alpaca_chat\nModule for Alpaca prompt strategy classes\n\n\nprompt_strategies.alpaca_instruct\nModule loading the AlpacaInstructPromptTokenizingStrategy class\n\n\nprompt_strategies.alpaca_w_system\nPrompt strategies loader for alpaca instruction datasets with system prompts\n\n\nprompt_strategies.user_defined\nUser Defined prompts with configuration from the YML config\n\n\nprompt_strategies.llama2_chat\nPrompt Strategy for finetuning Llama2 chat models\n\n\nprompt_strategies.completion\nBasic completion text\n\n\nprompt_strategies.input_output\nModule for plain input/output prompt pairs\n\n\nprompt_strategies.stepwise_supervised\nModule for stepwise datasets, typically including a prompt and reasoning traces,\n\n\nprompt_strategies.metharme\nModule containing the MetharmenPromptTokenizingStrategy and MetharmePrompter class\n\n\nprompt_strategies.orcamini\nPrompt Strategy for finetuning Orca Mini (v2) models\n\n\nprompt_strategies.pygmalion\nModule containing the PygmalionPromptTokenizingStrategy and PygmalionPrompter class\n\n\nprompt_strategies.messages.chat\nChat dataset wrapping strategy for new internal messages representations\n\n\nprompt_strategies.dpo.chat_template\nDPO prompt strategies for using tokenizer chat templates.\n\n\nprompt_strategies.dpo.llama3\nDPO strategies for llama-3 chat template\n\n\nprompt_strategies.dpo.chatml\nDPO strategies for chatml\n\n\nprompt_strategies.dpo.zephyr\nDPO strategies for zephyr\n\n\nprompt_strategies.dpo.user_defined\nUser-defined DPO strategies\n\n\nprompt_strategies.dpo.passthrough\nDPO prompt strategies passthrough/zero-processing strategy\n\n\nprompt_strategies.kto.llama3\nKTO strategies for llama-3 chat template\n\n\nprompt_strategies.kto.chatml\nKTO strategies for chatml\n\n\nprompt_strategies.kto.user_defined\nUser-defined KTO strategies\n\n\nprompt_strategies.orpo.chat_template\nchatml prompt tokenization strategy for ORPO\n\n\nprompt_strategies.bradley_terry.llama3\nchatml 
transforms for datasets with system, input, chosen, rejected to match llama3 chat template\n\n\n\n\n\n\nLow-level performance optimizations\n\n\n\nkernels.lora\nModule for definition of Low-Rank Adaptation (LoRA) Triton kernels.\n\n\nkernels.geglu\nModule for definition of GEGLU Triton kernels.\n\n\nkernels.swiglu\nModule for definition of SwiGLU Triton kernels.\n\n\nkernels.quantize\nDequantization utilities for bitsandbytes integration.\n\n\nkernels.utils\nUtilities for axolotl.kernels submodules.\n\n\n\n\n\n\nRuntime patches for model optimizations\n\n\n\nmonkeypatch.llama_attn_hijack_flash\nFlash attention monkey patch for llama model\n\n\nmonkeypatch.llama_attn_hijack_xformers\nDirectly copied the code from https://raw.githubusercontent.com/oobabooga/text-generation-webui/main/modules/llama_attn_hijack.py and made some adjustments\n\n\nmonkeypatch.mistral_attn_hijack_flash\nFlash attention monkey patch for mistral model\n\n\nmonkeypatch.multipack\nmultipack patching for v2 of sample packing\n\n\nmonkeypatch.relora\nImplements the ReLoRA training procedure from https://arxiv.org/abs/2307.05695, minus the initial full fine-tune.\n\n\nmonkeypatch.llama_expand_mask\nexpands the binary attention mask per 3.2.2 of https://arxiv.org/pdf/2107.02027.pdf\n\n\nmonkeypatch.lora_kernels\nModule for patching custom LoRA Triton kernels and torch.autograd functions.\n\n\nmonkeypatch.utils\nShared utils for the monkeypatches\n\n\nmonkeypatch.btlm_attn_hijack_flash\nFlash attention monkey patch for cerebras btlm model\n\n\nmonkeypatch.llama_patch_multipack\nPatched LlamaAttention to use torch.nn.functional.scaled_dot_product_attention\n\n\nmonkeypatch.stablelm_attn_hijack_flash\nPyTorch StableLM Epoch model.\n\n\nmonkeypatch.trainer_fsdp_optim\nfix for FSDP optimizer save in trainer w 4.47.0\n\n\nmonkeypatch.transformers_fa_utils\nsee https://github.com/huggingface/transformers/pull/35834\n\n\nmonkeypatch.unsloth_\nmodule for patching with unsloth 
optimizations\n\n\nmonkeypatch.data.batch_dataset_fetcher\nMonkey patches for the dataset fetcher to handle batches of packed indexes.\n\n\nmonkeypatch.mixtral\nPatches to support multipack for mixtral\n\n\nmonkeypatch.gradient_checkpointing.offload_cpu\nCPU offloaded checkpointing\n\n\nmonkeypatch.gradient_checkpointing.offload_disk\nDISCO - DIsk-based Storage and Checkpointing with Optimized prefetching\n\n\n\n\n\n\nUtility functions\n\n\n\nutils.tokenization\nModule for tokenization utilities\n\n\nutils.chat_templates\nThis module provides functionality for selecting chat templates based on user choices.\n\n\nutils.lora\nmodule to get the state dict of a merged lora model\n\n\nutils.model_shard_quant\nmodule to handle loading model on cpu/meta device for FSDP\n\n\nutils.bench\nBenchmarking and measurement utilities\n\n\nutils.freeze\nmodule to freeze/unfreeze parameters by name\n\n\nutils.trainer\nModule containing the Trainer class and related functions\n\n\nutils.schedulers\nModule for custom LRScheduler class\n\n\nutils.distributed\nUtilities for distributed functionality.\n\n\nutils.dict\nModule containing the DictDefault class\n\n\nutils.optimizers.adopt\nCopied from https://github.com/iShohei220/adopt\n\n\nutils.data.pretraining\ndata handling specific to pretraining\n\n\nutils.data.sft\nData handling specific to SFT.\n\n\nutils.quantization\nUtilities for quantization including QAT and PTQ using torchao.\n\n\n\n\n\n\nPydantic data models for Axolotl config\n\n\n\nutils.schemas.config\nModule with Pydantic models for configuration.\n\n\nutils.schemas.model\nPydantic models for model input / output, etc. 
configuration\n\n\nutils.schemas.training\nPydantic models for training hyperparameters\n\n\nutils.schemas.datasets\nPydantic models for datasets-related configuration\n\n\nutils.schemas.peft\nPydantic models for PEFT-related configuration\n\n\nutils.schemas.trl\nPydantic models for TRL trainer configuration\n\n\nutils.schemas.multimodal\nPydantic models for multimodal-related configuration\n\n\nutils.schemas.integrations\nPydantic models for Axolotl integrations\n\n\nutils.schemas.enums\nEnums for Axolotl input config\n\n\nutils.schemas.utils\nUtilities for Axolotl Pydantic models\n\n\n\n\n\n\nThird-party integrations and extensions\n\n\n\nintegrations.base\nBase class for all plugins.\n\n\nintegrations.cut_cross_entropy.args\nModule for handling Cut Cross Entropy input arguments.\n\n\nintegrations.grokfast.optimizer\n\n\n\nintegrations.kd.trainer\nKD trainer\n\n\nintegrations.liger.args\nModule for handling LIGER input arguments.\n\n\nintegrations.lm_eval.args\nModule for handling lm eval harness input arguments.\n\n\nintegrations.spectrum.args\nModule for handling Spectrum input arguments.\n\n\n\n\n\n\nCommon utilities and shared functionality\n\n\n\ncommon.architectures\nCommon architecture specific constants\n\n\ncommon.const\nVarious shared constants\n\n\ncommon.datasets\nDataset loading utilities.\n\n\n\n\n\n\nCustom model implementations\n\n\n\nmodels.mamba.modeling_mamba\n\n\n\n\n\n\n\nData processing utilities\n\n\n\nutils.collators.core\nbasic shared collator constants\n\n\nutils.collators.batching\nData collators for axolotl to pad labels and position_ids for packed sequences\n\n\nutils.collators.mamba\ncollators for Mamba\n\n\nutils.collators.mm_chat\nCollators for multi-modal chat messages and packing\n\n\nutils.samplers.multipack\nMultipack Batch Sampler - An efficient batch sampler for packing variable-length sequences\n\n\n\n\n\n\nTraining callbacks\n\n\n\nutils.callbacks.perplexity\ncallback to calculate perplexity as an evaluation 
metric.\n\n\nutils.callbacks.profiler\nHF Trainer callback for creating pytorch profiling snapshots\n\n\nutils.callbacks.lisa\nmodule for LISA\n\n\nutils.callbacks.mlflow_\nMLFlow module for trainer callbacks\n\n\nutils.callbacks.comet_\nComet module for trainer callbacks\n\n\nutils.callbacks.qat\nQAT Callback for HF Causal Trainer" }, { "objectID": "docs/api/index.html#core", @@ -1132,7 +1132,7 @@ "href": "docs/api/index.html#monkey-patches", "title": "API Reference", "section": "", - "text": "Runtime patches for model optimizations\n\n\n\nmonkeypatch.llama_attn_hijack_flash\nFlash attention monkey patch for llama model\n\n\nmonkeypatch.llama_attn_hijack_xformers\nDirectly copied the code from https://raw.githubusercontent.com/oobabooga/text-generation-webui/main/modules/llama_attn_hijack.py and made some adjustments\n\n\nmonkeypatch.mistral_attn_hijack_flash\nFlash attention monkey patch for mistral model\n\n\nmonkeypatch.multipack\nmultipack patching for v2 of sample packing\n\n\nmonkeypatch.relora\nImplements the ReLoRA training procedure from https://arxiv.org/abs/2307.05695, minus the initial full fine-tune.\n\n\nmonkeypatch.llama_expand_mask\nexpands the binary attention mask per 3.2.2 of https://arxiv.org/pdf/2107.02027.pdf\n\n\nmonkeypatch.lora_kernels\nModule for patching custom LoRA Triton kernels and torch.autograd functions.\n\n\nmonkeypatch.utils\nShared utils for the monkeypatches\n\n\nmonkeypatch.btlm_attn_hijack_flash\nFlash attention monkey patch for cerebras btlm model\n\n\nmonkeypatch.llama_patch_multipack\nPatched LlamaAttention to use torch.nn.functional.scaled_dot_product_attention\n\n\nmonkeypatch.stablelm_attn_hijack_flash\nPyTorch StableLM Epoch model.\n\n\nmonkeypatch.trainer_fsdp_optim\nfix for FSDP optimizer save in trainer w 4.47.0\n\n\nmonkeypatch.transformers_fa_utils\nsee https://github.com/huggingface/transformers/pull/35834\n\n\nmonkeypatch.unsloth_\nmodule for patching with unsloth 
optimizations\n\n\nmonkeypatch.data.batch_dataset_fetcher\nmonkey patches for the dataset fetcher to handle batches of packed indexes\n\n\nmonkeypatch.mixtral\nPatches to support multipack for mixtral\n\n\nmonkeypatch.gradient_checkpointing.offload_cpu\nCPU offloaded checkpointing\n\n\nmonkeypatch.gradient_checkpointing.offload_disk\nDISCO - DIsk-based Storage and Checkpointing with Optimized prefetching" + "text": "Runtime patches for model optimizations\n\n\n\nmonkeypatch.llama_attn_hijack_flash\nFlash attention monkey patch for llama model\n\n\nmonkeypatch.llama_attn_hijack_xformers\nDirectly copied the code from https://raw.githubusercontent.com/oobabooga/text-generation-webui/main/modules/llama_attn_hijack.py and made some adjustments\n\n\nmonkeypatch.mistral_attn_hijack_flash\nFlash attention monkey patch for mistral model\n\n\nmonkeypatch.multipack\nmultipack patching for v2 of sample packing\n\n\nmonkeypatch.relora\nImplements the ReLoRA training procedure from https://arxiv.org/abs/2307.05695, minus the initial full fine-tune.\n\n\nmonkeypatch.llama_expand_mask\nexpands the binary attention mask per 3.2.2 of https://arxiv.org/pdf/2107.02027.pdf\n\n\nmonkeypatch.lora_kernels\nModule for patching custom LoRA Triton kernels and torch.autograd functions.\n\n\nmonkeypatch.utils\nShared utils for the monkeypatches\n\n\nmonkeypatch.btlm_attn_hijack_flash\nFlash attention monkey patch for cerebras btlm model\n\n\nmonkeypatch.llama_patch_multipack\nPatched LlamaAttention to use torch.nn.functional.scaled_dot_product_attention\n\n\nmonkeypatch.stablelm_attn_hijack_flash\nPyTorch StableLM Epoch model.\n\n\nmonkeypatch.trainer_fsdp_optim\nfix for FSDP optimizer save in trainer w 4.47.0\n\n\nmonkeypatch.transformers_fa_utils\nsee https://github.com/huggingface/transformers/pull/35834\n\n\nmonkeypatch.unsloth_\nmodule for patching with unsloth optimizations\n\n\nmonkeypatch.data.batch_dataset_fetcher\nMonkey patches for the dataset fetcher to handle batches of packed 
indexes.\n\n\nmonkeypatch.mixtral\nPatches to support multipack for mixtral\n\n\nmonkeypatch.gradient_checkpointing.offload_cpu\nCPU offloaded checkpointing\n\n\nmonkeypatch.gradient_checkpointing.offload_disk\nDISCO - DIsk-based Storage and Checkpointing with Optimized prefetching" }, { "objectID": "docs/api/index.html#utils", @@ -3052,7 +3052,14 @@ "href": "docs/api/monkeypatch.data.batch_dataset_fetcher.html", "title": "monkeypatch.data.batch_dataset_fetcher", "section": "", - "text": "monkeypatch.data.batch_dataset_fetcher\nmonkeypatch.data.batch_dataset_fetcher\nmonkey patches for the dataset fetcher to handle batches of packed indexes" + "text": "monkeypatch.data.batch_dataset_fetcher\nMonkey patches for the dataset fetcher to handle batches of packed indexes.\n\n\n\n\n\nName\nDescription\n\n\n\n\napply_multipack_dataloader_patch\nThis patch allows DataLoader to correctly process batches that contain multiple bins\n\n\npatch_fetchers\nApply patches to PyTorch’s DataLoader components.\n\n\npatched_worker_loop\nWorker loop that ensures patches are applied in worker processes.\n\n\nremove_multipack_dataloader_patch\nRemove the monkeypatch and restore original PyTorch DataLoader behavior.\n\n\n\n\n\nmonkeypatch.data.batch_dataset_fetcher.apply_multipack_dataloader_patch()\nThis patch allows DataLoader to correctly process batches that contain multiple bins\nof packed sequences.\n\n\n\nmonkeypatch.data.batch_dataset_fetcher.patch_fetchers()\nApply patches to PyTorch’s DataLoader components.\n\n\n\nmonkeypatch.data.batch_dataset_fetcher.patched_worker_loop(*args, **kwargs)\nWorker loop that ensures patches are applied in worker processes.\n\n\n\nmonkeypatch.data.batch_dataset_fetcher.remove_multipack_dataloader_patch()\nRemove the monkeypatch and restore original PyTorch DataLoader behavior." 
+ }, + { + "objectID": "docs/api/monkeypatch.data.batch_dataset_fetcher.html#functions", + "href": "docs/api/monkeypatch.data.batch_dataset_fetcher.html#functions", + "title": "monkeypatch.data.batch_dataset_fetcher", + "section": "", + "text": "Name\nDescription\n\n\n\n\napply_multipack_dataloader_patch\nThis patch allows DataLoader to correctly process batches that contain multiple bins\n\n\npatch_fetchers\nApply patches to PyTorch’s DataLoader components.\n\n\npatched_worker_loop\nWorker loop that ensures patches are applied in worker processes.\n\n\nremove_multipack_dataloader_patch\nRemove the monkeypatch and restore original PyTorch DataLoader behavior.\n\n\n\n\n\nmonkeypatch.data.batch_dataset_fetcher.apply_multipack_dataloader_patch()\nThis patch allows DataLoader to correctly process batches that contain multiple bins\nof packed sequences.\n\n\n\nmonkeypatch.data.batch_dataset_fetcher.patch_fetchers()\nApply patches to PyTorch’s DataLoader components.\n\n\n\nmonkeypatch.data.batch_dataset_fetcher.patched_worker_loop(*args, **kwargs)\nWorker loop that ensures patches are applied in worker processes.\n\n\n\nmonkeypatch.data.batch_dataset_fetcher.remove_multipack_dataloader_patch()\nRemove the monkeypatch and restore original PyTorch DataLoader behavior." 
}, { "objectID": "docs/api/utils.lora.html", diff --git a/sitemap.xml b/sitemap.xml index 1f98ccbb9..dd2292cc8 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -2,790 +2,790 @@ https://docs.axolotl.ai/index.html - 2025-08-22T11:26:37.313Z + 2025-08-22T18:29:19.174Z https://docs.axolotl.ai/src/axolotl/integrations/LICENSE.html - 2025-08-22T11:26:37.317Z + 2025-08-22T18:29:19.178Z https://docs.axolotl.ai/docs/gradient_checkpointing.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/mixed_precision.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/sequence_parallelism.html - 2025-08-22T11:26:37.297Z + 2025-08-22T18:29:19.158Z https://docs.axolotl.ai/docs/docker.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/torchao.html - 2025-08-22T11:26:37.297Z + 2025-08-22T18:29:19.158Z https://docs.axolotl.ai/docs/multi-gpu.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/dataset_preprocessing.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/debugging.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/rlhf.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.158Z https://docs.axolotl.ai/docs/lr_groups.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/multimodal.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/ray-integration.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/input_output.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/inference.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/fsdp_qlora.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/multipack.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z 
https://docs.axolotl.ai/docs/api/prompt_strategies.input_output.html - 2025-08-22T11:29:50.982Z + 2025-08-22T18:32:33.192Z https://docs.axolotl.ai/docs/api/monkeypatch.llama_patch_multipack.html - 2025-08-22T11:29:51.247Z + 2025-08-22T18:32:33.459Z https://docs.axolotl.ai/docs/api/cli.art.html - 2025-08-22T11:29:50.615Z + 2025-08-22T18:32:32.820Z https://docs.axolotl.ai/docs/api/cli.quantize.html - 2025-08-22T11:29:50.692Z + 2025-08-22T18:32:32.900Z https://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_flash.html - 2025-08-22T11:29:51.198Z + 2025-08-22T18:32:33.410Z https://docs.axolotl.ai/docs/api/utils.callbacks.profiler.html - 2025-08-22T11:29:51.791Z + 2025-08-22T18:32:34.002Z https://docs.axolotl.ai/docs/api/prompt_strategies.stepwise_supervised.html - 2025-08-22T11:29:50.986Z + 2025-08-22T18:32:33.196Z https://docs.axolotl.ai/docs/api/integrations.cut_cross_entropy.args.html - 2025-08-22T11:29:51.672Z + 2025-08-22T18:32:33.885Z https://docs.axolotl.ai/docs/api/utils.data.sft.html - 2025-08-22T11:29:51.413Z + 2025-08-22T18:32:33.632Z https://docs.axolotl.ai/docs/api/monkeypatch.unsloth_.html - 2025-08-22T11:29:51.263Z + 2025-08-22T18:32:33.477Z https://docs.axolotl.ai/docs/api/kernels.geglu.html - 2025-08-22T11:29:51.173Z + 2025-08-22T18:32:33.385Z https://docs.axolotl.ai/docs/api/prompt_strategies.orpo.chat_template.html - 2025-08-22T11:29:51.072Z + 2025-08-22T18:32:33.283Z https://docs.axolotl.ai/docs/api/cli.utils.sweeps.html - 2025-08-22T11:29:50.738Z + 2025-08-22T18:32:32.946Z https://docs.axolotl.ai/docs/api/cli.delinearize_llama4.html - 2025-08-22T11:29:50.645Z + 2025-08-22T18:32:32.851Z https://docs.axolotl.ai/docs/api/prompt_strategies.pygmalion.html - 2025-08-22T11:29:51.003Z + 2025-08-22T18:32:33.214Z https://docs.axolotl.ai/docs/api/evaluate.html - 2025-08-22T11:29:50.399Z + 2025-08-22T18:32:32.602Z https://docs.axolotl.ai/docs/api/utils.data.pretraining.html - 2025-08-22T11:29:51.406Z + 2025-08-22T18:32:33.625Z 
https://docs.axolotl.ai/docs/api/index.html - 2025-08-22T11:29:50.330Z + 2025-08-22T18:32:32.534Z https://docs.axolotl.ai/docs/api/monkeypatch.stablelm_attn_hijack_flash.html - 2025-08-22T11:29:51.252Z + 2025-08-22T18:32:33.465Z https://docs.axolotl.ai/docs/api/monkeypatch.utils.html - 2025-08-22T11:29:51.244Z + 2025-08-22T18:32:33.456Z https://docs.axolotl.ai/docs/api/cli.checks.html - 2025-08-22T11:29:50.622Z + 2025-08-22T18:32:32.827Z https://docs.axolotl.ai/docs/api/utils.chat_templates.html - 2025-08-22T11:29:51.304Z + 2025-08-22T18:32:33.524Z https://docs.axolotl.ai/docs/api/core.builders.rl.html - 2025-08-22T11:29:50.490Z + 2025-08-22T18:32:32.692Z https://docs.axolotl.ai/docs/api/prompt_strategies.messages.chat.html - 2025-08-22T11:29:51.007Z + 2025-08-22T18:32:33.218Z https://docs.axolotl.ai/docs/api/core.trainers.mixins.optimizer.html - 2025-08-22T11:29:50.854Z + 2025-08-22T18:32:33.062Z https://docs.axolotl.ai/docs/api/prompt_strategies.orcamini.html - 2025-08-22T11:29:50.997Z + 2025-08-22T18:32:33.207Z https://docs.axolotl.ai/docs/api/core.trainers.mixins.scheduler.html - 2025-08-22T11:29:50.864Z + 2025-08-22T18:32:33.073Z https://docs.axolotl.ai/docs/api/cli.utils.fetch.html - 2025-08-22T11:29:50.727Z + 2025-08-22T18:32:32.935Z https://docs.axolotl.ai/docs/api/utils.schemas.datasets.html - 2025-08-22T11:29:51.481Z + 2025-08-22T18:32:33.699Z https://docs.axolotl.ai/docs/api/cli.cloud.base.html - 2025-08-22T11:29:50.702Z + 2025-08-22T18:32:32.910Z https://docs.axolotl.ai/docs/api/cli.utils.args.html - 2025-08-22T11:29:50.721Z + 2025-08-22T18:32:32.929Z https://docs.axolotl.ai/docs/api/utils.callbacks.comet_.html - 2025-08-22T11:29:51.799Z + 2025-08-22T18:32:34.011Z https://docs.axolotl.ai/docs/api/utils.callbacks.mlflow_.html - 2025-08-22T11:29:51.796Z + 2025-08-22T18:32:34.007Z https://docs.axolotl.ai/docs/api/core.builders.causal.html - 2025-08-22T11:29:50.485Z + 2025-08-22T18:32:32.688Z https://docs.axolotl.ai/docs/api/cli.train.html - 
2025-08-22T11:29:50.585Z + 2025-08-22T18:32:32.788Z https://docs.axolotl.ai/docs/api/utils.schemas.integrations.html - 2025-08-22T11:29:51.511Z + 2025-08-22T18:32:33.728Z https://docs.axolotl.ai/docs/api/integrations.lm_eval.args.html - 2025-08-22T11:29:51.688Z + 2025-08-22T18:32:33.901Z https://docs.axolotl.ai/docs/api/cli.evaluate.html - 2025-08-22T11:29:50.593Z + 2025-08-22T18:32:32.797Z https://docs.axolotl.ai/docs/api/utils.trainer.html - 2025-08-22T11:29:51.343Z + 2025-08-22T18:32:33.563Z https://docs.axolotl.ai/docs/api/prompt_strategies.kto.llama3.html - 2025-08-22T11:29:51.042Z + 2025-08-22T18:32:33.252Z https://docs.axolotl.ai/docs/api/convert.html - 2025-08-22T11:29:50.423Z + 2025-08-22T18:32:32.626Z https://docs.axolotl.ai/docs/api/utils.schemas.multimodal.html - 2025-08-22T11:29:51.498Z + 2025-08-22T18:32:33.716Z https://docs.axolotl.ai/docs/api/loaders.patch_manager.html - 2025-08-22T11:29:50.847Z + 2025-08-22T18:32:33.055Z https://docs.axolotl.ai/docs/api/utils.schemas.training.html - 2025-08-22T11:29:51.463Z + 2025-08-22T18:32:33.681Z https://docs.axolotl.ai/docs/api/utils.schemas.config.html - 2025-08-22T11:29:51.449Z + 2025-08-22T18:32:33.667Z https://docs.axolotl.ai/docs/api/prompt_strategies.kto.user_defined.html - 2025-08-22T11:29:51.052Z + 2025-08-22T18:32:33.262Z https://docs.axolotl.ai/docs/api/prompt_strategies.bradley_terry.llama3.html - 2025-08-22T11:29:51.076Z + 2025-08-22T18:32:33.287Z https://docs.axolotl.ai/docs/api/cli.vllm_serve.html - 2025-08-22T11:29:50.699Z + 2025-08-22T18:32:32.907Z https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_w_system.html - 2025-08-22T11:29:50.949Z + 2025-08-22T18:32:33.159Z https://docs.axolotl.ai/docs/api/cli.merge_lora.html - 2025-08-22T11:29:50.667Z + 2025-08-22T18:32:32.874Z https://docs.axolotl.ai/docs/api/utils.ctx_managers.sequence_parallel.html - 2025-08-22T11:29:50.887Z + 2025-08-22T18:32:33.096Z https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_instruct.html - 
2025-08-22T11:29:50.937Z + 2025-08-22T18:32:33.147Z https://docs.axolotl.ai/docs/api/utils.bench.html - 2025-08-22T11:29:51.318Z + 2025-08-22T18:32:33.538Z https://docs.axolotl.ai/docs/api/common.datasets.html - 2025-08-22T11:29:51.709Z + 2025-08-22T18:32:33.922Z https://docs.axolotl.ai/docs/api/cli.utils.train.html - 2025-08-22T11:29:50.750Z + 2025-08-22T18:32:32.958Z https://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_xformers.html - 2025-08-22T11:29:51.199Z + 2025-08-22T18:32:33.411Z https://docs.axolotl.ai/docs/api/core.chat.messages.html - 2025-08-22T11:29:50.525Z + 2025-08-22T18:32:32.728Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chat_template.html - 2025-08-22T11:29:51.009Z + 2025-08-22T18:32:33.219Z https://docs.axolotl.ai/docs/api/core.trainers.trl.html - 2025-08-22T11:29:50.775Z + 2025-08-22T18:32:32.984Z https://docs.axolotl.ai/docs/api/cli.preprocess.html - 2025-08-22T11:29:50.687Z + 2025-08-22T18:32:32.895Z https://docs.axolotl.ai/docs/api/kernels.swiglu.html - 2025-08-22T11:29:51.183Z + 2025-08-22T18:32:33.395Z https://docs.axolotl.ai/docs/api/kernels.quantize.html - 2025-08-22T11:29:51.191Z + 2025-08-22T18:32:33.403Z https://docs.axolotl.ai/docs/api/prompt_strategies.chat_template.html - 2025-08-22T11:29:50.922Z + 2025-08-22T18:32:33.132Z https://docs.axolotl.ai/docs/api/prompt_strategies.kto.chatml.html - 2025-08-22T11:29:51.050Z + 2025-08-22T18:32:33.261Z https://docs.axolotl.ai/docs/api/core.trainers.grpo.trainer.html - 2025-08-22T11:29:50.798Z + 2025-08-22T18:32:33.007Z https://docs.axolotl.ai/docs/api/monkeypatch.mistral_attn_hijack_flash.html - 2025-08-22T11:29:51.201Z + 2025-08-22T18:32:33.413Z https://docs.axolotl.ai/docs/api/core.datasets.chat.html - 2025-08-22T11:29:50.535Z + 2025-08-22T18:32:32.737Z https://docs.axolotl.ai/docs/api/cli.args.html - 2025-08-22T11:29:50.612Z + 2025-08-22T18:32:32.817Z https://docs.axolotl.ai/docs/api/cli.main.html - 2025-08-22T11:29:50.577Z + 2025-08-22T18:32:32.780Z 
https://docs.axolotl.ai/docs/api/core.trainers.dpo.trainer.html - 2025-08-22T11:29:50.788Z + 2025-08-22T18:32:32.996Z https://docs.axolotl.ai/docs/api/utils.schemas.trl.html - 2025-08-22T11:29:51.493Z + 2025-08-22T18:32:33.711Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.passthrough.html - 2025-08-22T11:29:51.034Z + 2025-08-22T18:32:33.244Z https://docs.axolotl.ai/docs/api/prompt_tokenizers.html - 2025-08-22T11:29:50.465Z + 2025-08-22T18:32:32.668Z https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_chat.html - 2025-08-22T11:29:50.935Z + 2025-08-22T18:32:33.146Z https://docs.axolotl.ai/docs/api/logging_config.html - 2025-08-22T11:29:50.474Z + 2025-08-22T18:32:32.677Z https://docs.axolotl.ai/docs/dataset-formats/tokenized.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/dataset-formats/index.html - 2025-08-22T11:26:37.292Z + 2025-08-22T18:29:19.153Z https://docs.axolotl.ai/docs/dataset-formats/pretraining.html - 2025-08-22T11:26:37.292Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/dataset-formats/inst_tune.html - 2025-08-22T11:26:37.292Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/qat.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/examples/colab-notebooks/colab-axolotl-example.html - 2025-08-22T11:26:37.301Z + 2025-08-22T18:29:19.162Z https://docs.axolotl.ai/FAQS.html - 2025-08-22T11:26:37.291Z + 2025-08-22T18:29:19.152Z https://docs.axolotl.ai/docs/installation.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/dataset-formats/stepwise_supervised.html - 2025-08-22T11:26:37.292Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/dataset-formats/template_free.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/dataset-formats/conversation.html - 2025-08-22T11:26:37.292Z + 2025-08-22T18:29:19.153Z https://docs.axolotl.ai/docs/api/utils.dict.html - 2025-08-22T11:29:51.397Z + 
2025-08-22T18:32:33.616Z https://docs.axolotl.ai/docs/api/prompt_strategies.completion.html - 2025-08-22T11:29:50.976Z + 2025-08-22T18:32:33.186Z https://docs.axolotl.ai/docs/api/utils.collators.core.html - 2025-08-22T11:29:51.711Z + 2025-08-22T18:32:33.925Z https://docs.axolotl.ai/docs/api/cli.inference.html - 2025-08-22T11:29:50.659Z + 2025-08-22T18:32:32.865Z https://docs.axolotl.ai/docs/api/utils.freeze.html - 2025-08-22T11:29:51.326Z + 2025-08-22T18:32:33.546Z https://docs.axolotl.ai/docs/api/core.trainers.grpo.sampler.html - 2025-08-22T11:29:50.810Z + 2025-08-22T18:32:33.019Z https://docs.axolotl.ai/docs/api/core.trainers.mixins.rng_state_loader.html - 2025-08-22T11:29:50.857Z + 2025-08-22T18:32:33.066Z https://docs.axolotl.ai/docs/api/cli.utils.html - 2025-08-22T11:29:50.710Z + 2025-08-22T18:32:32.918Z https://docs.axolotl.ai/docs/api/core.chat.format.shared.html - 2025-08-22T11:29:50.530Z + 2025-08-22T18:32:32.732Z https://docs.axolotl.ai/docs/api/utils.callbacks.lisa.html - 2025-08-22T11:29:51.792Z + 2025-08-22T18:32:34.004Z https://docs.axolotl.ai/docs/api/utils.collators.mm_chat.html - 2025-08-22T11:29:51.738Z + 2025-08-22T18:32:33.952Z https://docs.axolotl.ai/docs/api/core.trainers.utils.html - 2025-08-22T11:29:50.812Z + 2025-08-22T18:32:33.021Z https://docs.axolotl.ai/docs/api/utils.optimizers.adopt.html - 2025-08-22T11:29:51.404Z + 2025-08-22T18:32:33.624Z https://docs.axolotl.ai/docs/api/integrations.base.html - 2025-08-22T11:29:51.669Z + 2025-08-22T18:32:33.882Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.user_defined.html - 2025-08-22T11:29:51.032Z + 2025-08-22T18:32:33.243Z https://docs.axolotl.ai/docs/api/monkeypatch.btlm_attn_hijack_flash.html - 2025-08-22T11:29:51.245Z + 2025-08-22T18:32:33.458Z https://docs.axolotl.ai/docs/api/utils.quantization.html - 2025-08-22T11:29:51.435Z + 2025-08-22T18:32:33.653Z https://docs.axolotl.ai/docs/api/utils.callbacks.qat.html - 2025-08-22T11:29:51.806Z + 2025-08-22T18:32:34.017Z 
https://docs.axolotl.ai/docs/api/core.builders.base.html - 2025-08-22T11:29:50.481Z + 2025-08-22T18:32:32.683Z https://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_cpu.html - 2025-08-22T11:29:51.270Z + 2025-08-22T18:32:33.490Z https://docs.axolotl.ai/docs/api/integrations.kd.trainer.html - 2025-08-22T11:29:51.681Z + 2025-08-22T18:32:33.894Z https://docs.axolotl.ai/docs/api/integrations.liger.args.html - 2025-08-22T11:29:51.684Z + 2025-08-22T18:32:33.897Z https://docs.axolotl.ai/docs/api/utils.collators.mamba.html - 2025-08-22T11:29:51.734Z + 2025-08-22T18:32:33.947Z https://docs.axolotl.ai/docs/api/loaders.model.html - 2025-08-22T11:29:50.822Z + 2025-08-22T18:32:33.030Z https://docs.axolotl.ai/docs/api/utils.schedulers.html - 2025-08-22T11:29:51.371Z + 2025-08-22T18:32:33.591Z https://docs.axolotl.ai/docs/api/kernels.lora.html - 2025-08-22T11:29:51.163Z + 2025-08-22T18:32:33.374Z https://docs.axolotl.ai/docs/api/utils.model_shard_quant.html - 2025-08-22T11:29:51.314Z + 2025-08-22T18:32:33.535Z https://docs.axolotl.ai/docs/api/core.chat.format.llama3x.html - 2025-08-22T11:29:50.528Z + 2025-08-22T18:32:32.731Z https://docs.axolotl.ai/docs/api/core.trainers.mamba.html - 2025-08-22T11:29:50.781Z + 2025-08-22T18:32:32.989Z https://docs.axolotl.ai/docs/api/utils.schemas.enums.html - 2025-08-22T11:29:51.521Z + 2025-08-22T18:32:33.739Z https://docs.axolotl.ai/docs/api/monkeypatch.mixtral.html - 2025-08-22T11:29:51.266Z + 2025-08-22T18:32:33.487Z https://docs.axolotl.ai/docs/api/kernels.utils.html - 2025-08-22T11:29:51.192Z + 2025-08-22T18:32:33.404Z https://docs.axolotl.ai/docs/api/core.training_args.html - 2025-08-22T11:29:50.502Z + 2025-08-22T18:32:32.705Z https://docs.axolotl.ai/docs/api/utils.callbacks.perplexity.html - 2025-08-22T11:29:51.787Z + 2025-08-22T18:32:33.999Z https://docs.axolotl.ai/docs/api/cli.cloud.modal_.html - 2025-08-22T11:29:50.708Z + 2025-08-22T18:32:32.916Z https://docs.axolotl.ai/docs/api/cli.utils.load.html - 
2025-08-22T11:29:50.732Z + 2025-08-22T18:32:32.940Z https://docs.axolotl.ai/docs/api/train.html - 2025-08-22T11:29:50.389Z + 2025-08-22T18:32:32.592Z https://docs.axolotl.ai/docs/api/integrations.grokfast.optimizer.html - 2025-08-22T11:29:51.673Z + 2025-08-22T18:32:33.887Z https://docs.axolotl.ai/docs/api/utils.samplers.multipack.html - 2025-08-22T11:29:51.781Z + 2025-08-22T18:32:33.992Z https://docs.axolotl.ai/docs/api/prompt_strategies.metharme.html - 2025-08-22T11:29:50.993Z + 2025-08-22T18:32:33.203Z https://docs.axolotl.ai/docs/api/monkeypatch.llama_expand_mask.html - 2025-08-22T11:29:51.207Z + 2025-08-22T18:32:33.419Z https://docs.axolotl.ai/docs/api/monkeypatch.data.batch_dataset_fetcher.html - 2025-08-22T11:29:51.265Z + 2025-08-22T18:32:33.486Z https://docs.axolotl.ai/docs/api/utils.lora.html - 2025-08-22T11:29:51.309Z + 2025-08-22T18:32:33.529Z https://docs.axolotl.ai/docs/api/loaders.tokenizer.html - 2025-08-22T11:29:50.830Z + 2025-08-22T18:32:33.039Z https://docs.axolotl.ai/docs/api/core.chat.format.chatml.html - 2025-08-22T11:29:50.527Z + 2025-08-22T18:32:32.729Z https://docs.axolotl.ai/docs/api/utils.collators.batching.html - 2025-08-22T11:29:51.730Z + 2025-08-22T18:32:33.944Z https://docs.axolotl.ai/docs/api/cli.merge_sharded_fsdp_weights.html - 2025-08-22T11:29:50.679Z + 2025-08-22T18:32:32.886Z https://docs.axolotl.ai/docs/api/prompt_strategies.llama2_chat.html - 2025-08-22T11:29:50.970Z + 2025-08-22T18:32:33.180Z https://docs.axolotl.ai/docs/api/utils.tokenization.html - 2025-08-22T11:29:51.302Z + 2025-08-22T18:32:33.523Z https://docs.axolotl.ai/docs/api/common.architectures.html - 2025-08-22T11:29:51.692Z + 2025-08-22T18:32:33.905Z https://docs.axolotl.ai/docs/api/core.datasets.transforms.chat_builder.html - 2025-08-22T11:29:50.542Z + 2025-08-22T18:32:32.745Z https://docs.axolotl.ai/docs/api/core.trainers.base.html - 2025-08-22T11:29:50.760Z + 2025-08-22T18:32:32.969Z https://docs.axolotl.ai/docs/api/monkeypatch.lora_kernels.html - 
2025-08-22T11:29:51.236Z + 2025-08-22T18:32:33.448Z https://docs.axolotl.ai/docs/api/utils.schemas.utils.html - 2025-08-22T11:29:51.527Z + 2025-08-22T18:32:33.744Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.llama3.html - 2025-08-22T11:29:51.019Z + 2025-08-22T18:32:33.230Z https://docs.axolotl.ai/docs/api/cli.config.html - 2025-08-22T11:29:50.640Z + 2025-08-22T18:32:32.846Z https://docs.axolotl.ai/docs/api/utils.schemas.peft.html - 2025-08-22T11:29:51.490Z + 2025-08-22T18:32:33.707Z https://docs.axolotl.ai/docs/api/prompt_strategies.user_defined.html - 2025-08-22T11:29:50.957Z + 2025-08-22T18:32:33.167Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.zephyr.html - 2025-08-22T11:29:51.031Z + 2025-08-22T18:32:33.241Z https://docs.axolotl.ai/docs/api/monkeypatch.multipack.html - 2025-08-22T11:29:51.202Z + 2025-08-22T18:32:33.414Z https://docs.axolotl.ai/docs/api/prompt_strategies.base.html - 2025-08-22T11:29:50.889Z + 2025-08-22T18:32:33.098Z https://docs.axolotl.ai/docs/api/models.mamba.modeling_mamba.html - 2025-08-22T11:29:51.710Z + 2025-08-22T18:32:33.923Z https://docs.axolotl.ai/docs/api/monkeypatch.relora.html - 2025-08-22T11:29:51.206Z + 2025-08-22T18:32:33.418Z https://docs.axolotl.ai/docs/api/common.const.html - 2025-08-22T11:29:51.694Z + 2025-08-22T18:32:33.907Z https://docs.axolotl.ai/docs/api/monkeypatch.trainer_fsdp_optim.html - 2025-08-22T11:29:51.256Z + 2025-08-22T18:32:33.469Z https://docs.axolotl.ai/docs/api/utils.distributed.html - 2025-08-22T11:29:51.391Z + 2025-08-22T18:32:33.611Z https://docs.axolotl.ai/docs/api/loaders.constants.html - 2025-08-22T11:29:50.848Z + 2025-08-22T18:32:33.057Z https://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_disk.html - 2025-08-22T11:29:51.296Z + 2025-08-22T18:32:33.516Z https://docs.axolotl.ai/docs/api/datasets.html - 2025-08-22T11:29:50.410Z + 2025-08-22T18:32:32.613Z https://docs.axolotl.ai/docs/api/monkeypatch.transformers_fa_utils.html - 2025-08-22T11:29:51.262Z + 
2025-08-22T18:32:33.475Z https://docs.axolotl.ai/docs/api/loaders.processor.html - 2025-08-22T11:29:50.832Z + 2025-08-22T18:32:33.040Z https://docs.axolotl.ai/docs/api/integrations.spectrum.args.html - 2025-08-22T11:29:51.691Z + 2025-08-22T18:32:33.904Z https://docs.axolotl.ai/docs/api/loaders.adapter.html - 2025-08-22T11:29:50.837Z + 2025-08-22T18:32:33.046Z https://docs.axolotl.ai/docs/api/utils.schemas.model.html - 2025-08-22T11:29:51.456Z + 2025-08-22T18:32:33.674Z https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chatml.html - 2025-08-22T11:29:51.030Z + 2025-08-22T18:32:33.240Z https://docs.axolotl.ai/docs/batch_vs_grad.html - 2025-08-22T11:26:37.292Z + 2025-08-22T18:29:19.153Z https://docs.axolotl.ai/docs/mac.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/nd_parallelism.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/dataset_loading.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/lora_optims.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/unsloth.html - 2025-08-22T11:26:37.297Z + 2025-08-22T18:29:19.158Z https://docs.axolotl.ai/docs/config-reference.html - 2025-08-22T11:30:04.826Z + 2025-08-22T18:32:49.071Z https://docs.axolotl.ai/docs/custom_integrations.html - 2025-08-22T11:26:37.292Z + 2025-08-22T18:29:19.153Z https://docs.axolotl.ai/docs/faq.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/amd_hpc.html - 2025-08-22T11:26:37.292Z + 2025-08-22T18:29:19.153Z https://docs.axolotl.ai/docs/multi-node.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/cli.html - 2025-08-22T11:26:37.292Z + 2025-08-22T18:29:19.153Z https://docs.axolotl.ai/docs/nccl.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/optimizers.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z 
https://docs.axolotl.ai/docs/getting-started.html - 2025-08-22T11:26:37.293Z + 2025-08-22T18:29:19.154Z https://docs.axolotl.ai/docs/quantize.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/docs/reward_modelling.html - 2025-08-22T11:26:37.296Z + 2025-08-22T18:29:19.157Z https://docs.axolotl.ai/src/axolotl/integrations/cut_cross_entropy/ACKNOWLEDGEMENTS.html - 2025-08-22T11:26:37.317Z + 2025-08-22T18:29:19.178Z