Adjust position IDs for a sliced sequence to maintain proper relative positions.
This handles the case where position IDs might not be contiguous due to sample
packing.
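A minimal sketch of what such an adjustment could look like (an illustrative stand-in, not axolotl's actual implementation; it assumes packed position IDs reset to 0 at each sample boundary, and that a shard's leading, possibly partial, sample was renumbered from 0 and must be shifted back by the slice offset):

```python
def adjust_position_ids_for_slice(position_ids, start_idx):
    """Shift only the leading segment (tokens before the first reset to 0)
    by start_idx, so a mid-sample slice keeps its original in-sample
    positions; later packed samples already start at 0 and stay intact."""
    out = list(position_ids)
    for i, pos in enumerate(out):
        if i > 0 and pos == 0:  # boundary of the next packed sample
            break
        out[i] = pos + start_idx
    return out

# Packed positions for samples of length 4 and 3 are [0,1,2,3,0,1,2].
# A shard starting at token 2 whose leading segment was renumbered as
# [0,1,...] is shifted back to match the original pack:
print(adjust_position_ids_for_slice([0, 1, 0, 1, 2], start_idx=2))
# → [2, 3, 0, 1, 2]
```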
diff --git a/search.json b/search.json
index be1b9ec67..1640867b8 100644
--- a/search.json
+++ b/search.json
@@ -2422,7 +2422,7 @@
"href": "docs/api/utils.collators.batching.html",
"title": "utils.collators.batching",
"section": "",
- "text": "utils.collators.batching\nData collators for axolotl to pad labels and position_ids for packed sequences. Also\nincludes logic for handling sequence parallelism collation.\n\n\n\n\n\nName\nDescription\n\n\n\n\nBatchSamplerDataCollatorForSeq2Seq\nCollator for multipack specific to the using the BatchSampler\n\n\nDataCollatorForSeq2Seq\nData collator that will dynamically pad the inputs received, as well as the labels and position_ids\n\n\nPretrainingBatchSamplerDataCollatorForSeq2Seq\nCollator for multipack specific to the using the BatchSampler\n\n\nV2BatchSamplerDataCollatorForSeq2Seq\nCollator for multipack specific to the using the BatchSampler\n\n\n\n\n\nutils.collators.batching.BatchSamplerDataCollatorForSeq2Seq(\n self,\n tokenizer,\n model=None,\n padding=True,\n max_length=None,\n pad_to_multiple_of=None,\n label_pad_token_id=-100,\n position_pad_token_id=0,\n return_tensors='pt',\n sequence_parallel_degree=1,\n)\nCollator for multipack specific to the using the BatchSampler\n\n\n\nutils.collators.batching.DataCollatorForSeq2Seq(\n self,\n tokenizer,\n model=None,\n padding=True,\n max_length=None,\n pad_to_multiple_of=None,\n label_pad_token_id=-100,\n position_pad_token_id=0,\n return_tensors='pt',\n sequence_parallel_degree=1,\n)\nData collator that will dynamically pad the inputs received, as well as the labels and position_ids\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ntokenizer\n[PreTrainedTokenizer] or [PreTrainedTokenizerFast]\nThe tokenizer used for encoding the data.\nrequired\n\n\nmodel\n[PreTrainedModel]\nThe model that is being trained. 
If set and has the prepare_decoder_input_ids_from_labels, use it to prepare the decoder_input_ids This is useful when using label_smoothing to avoid calculating loss twice.\nNone\n\n\npadding\nbool, str or [~utils.PaddingStrategy], optional, defaults to True\nSelect a strategy to pad the returned sequences (according to the model’s padding side and padding index) among: - True or 'longest' (default): Pad to the longest sequence in the batch (or no padding if only a single sequence is provided). - 'max_length': Pad to a maximum length specified with the argument max_length or to the maximum acceptable input length for the model if that argument is not provided. - False or 'do_not_pad': No padding (i.e., can output a batch with sequences of different lengths).\nTrue\n\n\nmax_length\nint, optional\nMaximum length of the returned list and optionally padding length (see above).\nNone\n\n\npad_to_multiple_of\nint, optional\nIf set will pad the sequence to a multiple of the provided value. This is especially useful to enable the use of Tensor Cores on NVIDIA hardware with compute capability >= 7.5 (Volta).\nNone\n\n\nlabel_pad_token_id\nint, optional, defaults to -100\nThe id to use when padding the labels (-100 will be automatically ignored by PyTorch loss functions).\n-100\n\n\nreturn_tensors\nstr\nThe type of Tensor to return. Allowable values are “np”, “pt” and “tf”.\n'pt'\n\n\nsequence_parallel_degree\nint\nThe degree of sequence parallelism. 
Default to 1 for no sequence parallelism.\n1\n\n\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\napply_sequence_parallelism\nApply sequence parallelism slicing to a batch.\n\n\n\n\n\nutils.collators.batching.DataCollatorForSeq2Seq.apply_sequence_parallelism(\n batch,\n)\nApply sequence parallelism slicing to a batch.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nbatch\ndict[str, torch.Tensor]\nBatch dictionary from parent collator.\nrequired\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntorch.Tensor\nSliced batch dictionary.\n\n\n\n\n\n\n\n\n\nutils.collators.batching.PretrainingBatchSamplerDataCollatorForSeq2Seq(\n self,\n *args,\n multipack_attn=True,\n **kwargs,\n)\nCollator for multipack specific to the using the BatchSampler\n\n\n\nutils.collators.batching.V2BatchSamplerDataCollatorForSeq2Seq(\n self,\n tokenizer,\n model=None,\n padding=True,\n max_length=None,\n pad_to_multiple_of=None,\n label_pad_token_id=-100,\n position_pad_token_id=0,\n return_tensors='pt',\n sequence_parallel_degree=1,\n)\nCollator for multipack specific to the using the BatchSampler\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\nadjust_position_ids_for_slice\nAdjust position IDs for a sliced sequence to maintain proper relative positions.\n\n\n\n\n\nutils.collators.batching.adjust_position_ids_for_slice(position_ids, start_idx)\nAdjust position IDs for a sliced sequence to maintain proper relative positions.\nThis handles the case where position IDs might not be contiguous due to sample\npacking."
+ "text": "utils.collators.batching\nData collators for axolotl to pad labels and position_ids for packed sequences. Also\nincludes logic for handling sequence parallelism collation.\n\n\n\n\n\nName\nDescription\n\n\n\n\nBatchSamplerDataCollatorForSeq2Seq\nCollator for multipack specific to the using the BatchSampler\n\n\nDataCollatorForSeq2Seq\nData collator that will dynamically pad the inputs received, as well as the labels and position_ids\n\n\nPretrainingBatchSamplerDataCollatorForSeq2Seq\nCollator for multipack specific to the using the BatchSampler\n\n\nV2BatchSamplerDataCollatorForSeq2Seq\nCollator for multipack specific to the using the BatchSampler\n\n\n\n\n\nutils.collators.batching.BatchSamplerDataCollatorForSeq2Seq(\n self,\n tokenizer,\n model=None,\n padding=True,\n max_length=None,\n pad_to_multiple_of=None,\n label_pad_token_id=-100,\n position_pad_token_id=0,\n return_tensors='pt',\n sequence_parallel_degree=1,\n)\nCollator for multipack specific to the using the BatchSampler\n\n\n\nutils.collators.batching.DataCollatorForSeq2Seq(\n self,\n tokenizer,\n model=None,\n padding=True,\n max_length=None,\n pad_to_multiple_of=None,\n label_pad_token_id=-100,\n position_pad_token_id=0,\n return_tensors='pt',\n sequence_parallel_degree=1,\n)\nData collator that will dynamically pad the inputs received, as well as the labels and position_ids\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ntokenizer\n[PreTrainedTokenizer] or [PreTrainedTokenizerFast]\nThe tokenizer used for encoding the data.\nrequired\n\n\nmodel\n[PreTrainedModel]\nThe model that is being trained. 
If set and has the prepare_decoder_input_ids_from_labels, use it to prepare the decoder_input_ids This is useful when using label_smoothing to avoid calculating loss twice.\nNone\n\n\npadding\nbool, str or [~utils.PaddingStrategy], optional, defaults to True\nSelect a strategy to pad the returned sequences (according to the model’s padding side and padding index) among: - True or 'longest' (default): Pad to the longest sequence in the batch (or no padding if only a single sequence is provided). - 'max_length': Pad to a maximum length specified with the argument max_length or to the maximum acceptable input length for the model if that argument is not provided. - False or 'do_not_pad': No padding (i.e., can output a batch with sequences of different lengths).\nTrue\n\n\nmax_length\nint, optional\nMaximum length of the returned list and optionally padding length (see above).\nNone\n\n\npad_to_multiple_of\nint, optional\nIf set will pad the sequence to a multiple of the provided value. This is especially useful to enable the use of Tensor Cores on NVIDIA hardware with compute capability >= 7.5 (Volta).\nNone\n\n\nlabel_pad_token_id\nint, optional, defaults to -100\nThe id to use when padding the labels (-100 will be automatically ignored by PyTorch loss functions).\n-100\n\n\nreturn_tensors\nstr\nThe type of Tensor to return. Allowable values are “np”, “pt” and “tf”.\n'pt'\n\n\nsequence_parallel_degree\nint\nThe degree of sequence parallelism. 
Default to 1 for no sequence parallelism.\n1\n\n\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\napply_sequence_parallelism\nApply sequence parallelism slicing to a batch.\n\n\n\n\n\nutils.collators.batching.DataCollatorForSeq2Seq.apply_sequence_parallelism(\n batch,\n)\nApply sequence parallelism slicing to a batch.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nbatch\ndict[str, torch.Tensor]\nBatch dictionary from parent collator.\nrequired\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntorch.Tensor\nSliced batch dictionary.\n\n\n\n\n\n\n\n\n\nutils.collators.batching.PretrainingBatchSamplerDataCollatorForSeq2Seq(\n self,\n *args,\n multipack_attn=True,\n **kwargs,\n)\nCollator for multipack specific to the using the BatchSampler\n\n\n\nutils.collators.batching.V2BatchSamplerDataCollatorForSeq2Seq(\n self,\n tokenizer,\n model=None,\n padding=True,\n max_length=None,\n pad_to_multiple_of=None,\n label_pad_token_id=-100,\n position_pad_token_id=0,\n return_tensors='pt',\n sequence_parallel_degree=1,\n)\nCollator for multipack specific to the using the BatchSampler"
},
{
"objectID": "docs/api/utils.collators.batching.html#classes",
@@ -2431,13 +2431,6 @@
"section": "",
"text": "Name\nDescription\n\n\n\n\nBatchSamplerDataCollatorForSeq2Seq\nCollator for multipack specific to the using the BatchSampler\n\n\nDataCollatorForSeq2Seq\nData collator that will dynamically pad the inputs received, as well as the labels and position_ids\n\n\nPretrainingBatchSamplerDataCollatorForSeq2Seq\nCollator for multipack specific to the using the BatchSampler\n\n\nV2BatchSamplerDataCollatorForSeq2Seq\nCollator for multipack specific to the using the BatchSampler\n\n\n\n\n\nutils.collators.batching.BatchSamplerDataCollatorForSeq2Seq(\n self,\n tokenizer,\n model=None,\n padding=True,\n max_length=None,\n pad_to_multiple_of=None,\n label_pad_token_id=-100,\n position_pad_token_id=0,\n return_tensors='pt',\n sequence_parallel_degree=1,\n)\nCollator for multipack specific to the using the BatchSampler\n\n\n\nutils.collators.batching.DataCollatorForSeq2Seq(\n self,\n tokenizer,\n model=None,\n padding=True,\n max_length=None,\n pad_to_multiple_of=None,\n label_pad_token_id=-100,\n position_pad_token_id=0,\n return_tensors='pt',\n sequence_parallel_degree=1,\n)\nData collator that will dynamically pad the inputs received, as well as the labels and position_ids\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\ntokenizer\n[PreTrainedTokenizer] or [PreTrainedTokenizerFast]\nThe tokenizer used for encoding the data.\nrequired\n\n\nmodel\n[PreTrainedModel]\nThe model that is being trained. If set and has the prepare_decoder_input_ids_from_labels, use it to prepare the decoder_input_ids This is useful when using label_smoothing to avoid calculating loss twice.\nNone\n\n\npadding\nbool, str or [~utils.PaddingStrategy], optional, defaults to True\nSelect a strategy to pad the returned sequences (according to the model’s padding side and padding index) among: - True or 'longest' (default): Pad to the longest sequence in the batch (or no padding if only a single sequence is provided). 
- 'max_length': Pad to a maximum length specified with the argument max_length or to the maximum acceptable input length for the model if that argument is not provided. - False or 'do_not_pad': No padding (i.e., can output a batch with sequences of different lengths).\nTrue\n\n\nmax_length\nint, optional\nMaximum length of the returned list and optionally padding length (see above).\nNone\n\n\npad_to_multiple_of\nint, optional\nIf set will pad the sequence to a multiple of the provided value. This is especially useful to enable the use of Tensor Cores on NVIDIA hardware with compute capability >= 7.5 (Volta).\nNone\n\n\nlabel_pad_token_id\nint, optional, defaults to -100\nThe id to use when padding the labels (-100 will be automatically ignored by PyTorch loss functions).\n-100\n\n\nreturn_tensors\nstr\nThe type of Tensor to return. Allowable values are “np”, “pt” and “tf”.\n'pt'\n\n\nsequence_parallel_degree\nint\nThe degree of sequence parallelism. Default to 1 for no sequence parallelism.\n1\n\n\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\napply_sequence_parallelism\nApply sequence parallelism slicing to a batch.\n\n\n\n\n\nutils.collators.batching.DataCollatorForSeq2Seq.apply_sequence_parallelism(\n batch,\n)\nApply sequence parallelism slicing to a batch.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nbatch\ndict[str, torch.Tensor]\nBatch dictionary from parent collator.\nrequired\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntorch.Tensor\nSliced batch dictionary.\n\n\n\n\n\n\n\n\n\nutils.collators.batching.PretrainingBatchSamplerDataCollatorForSeq2Seq(\n self,\n *args,\n multipack_attn=True,\n **kwargs,\n)\nCollator for multipack specific to the using the BatchSampler\n\n\n\nutils.collators.batching.V2BatchSamplerDataCollatorForSeq2Seq(\n self,\n tokenizer,\n model=None,\n padding=True,\n max_length=None,\n pad_to_multiple_of=None,\n label_pad_token_id=-100,\n position_pad_token_id=0,\n return_tensors='pt',\n 
sequence_parallel_degree=1,\n)\nCollator for multipack specific to the using the BatchSampler"
},
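The padding behavior the parameter table above describes (labels padded with -100 so PyTorch loss functions ignore them, position_ids padded with 0, lengths rounded up to a multiple of pad_to_multiple_of) can be sketched as follows; `pad_batch` is a hypothetical helper, not the collator's real code:

```python
def pad_batch(features, pad_to_multiple_of=8, input_pad=0,
              label_pad=-100, position_pad=0):
    """Right-pad input_ids, labels, and position_ids to a shared length
    rounded up to a multiple of pad_to_multiple_of."""
    longest = max(len(f["input_ids"]) for f in features)
    target = -(-longest // pad_to_multiple_of) * pad_to_multiple_of  # ceil
    batch = {"input_ids": [], "labels": [], "position_ids": []}
    for f in features:
        n = target - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [input_pad] * n)
        # -100 entries are skipped by PyTorch's cross-entropy loss
        batch["labels"].append(f["labels"] + [label_pad] * n)
        batch["position_ids"].append(f["position_ids"] + [position_pad] * n)
    return batch
```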
- {
- "objectID": "docs/api/utils.collators.batching.html#functions",
- "href": "docs/api/utils.collators.batching.html#functions",
- "title": "utils.collators.batching",
- "section": "",
- "text": "Name\nDescription\n\n\n\n\nadjust_position_ids_for_slice\nAdjust position IDs for a sliced sequence to maintain proper relative positions.\n\n\n\n\n\nutils.collators.batching.adjust_position_ids_for_slice(position_ids, start_idx)\nAdjust position IDs for a sliced sequence to maintain proper relative positions.\nThis handles the case where position IDs might not be contiguous due to sample\npacking."
- },
{
"objectID": "docs/api/prompt_strategies.completion.html",
"href": "docs/api/prompt_strategies.completion.html",
diff --git a/sitemap.xml b/sitemap.xml
index c7811e7c5..f863e6a88 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,682 +2,682 @@
[sitemap.xml hunk: every `<lastmod>` timestamp is refreshed from the 2025-04-07T16:41–16:42Z build to the 2025-04-07T18:48Z build; the page URLs themselves are unchanged]
- 2025-04-07T16:42:03.885Z
+ 2025-04-07T18:48:41.207Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/core.trainers.grpo.trainer.html
- 2025-04-07T16:42:03.338Z
+ 2025-04-07T18:48:40.664Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/core.chat.format.chatml.html
- 2025-04-07T16:42:03.059Z
+ 2025-04-07T18:48:40.431Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/monkeypatch.lora_kernels.html
- 2025-04-07T16:42:03.691Z
+ 2025-04-07T18:48:41.015Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/api/prompt_strategies.base.html
- 2025-04-07T16:42:03.339Z
+ 2025-04-07T18:48:40.666Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/rlhf.html
- 2025-04-07T16:41:27.524Z
+ 2025-04-07T18:48:09.452Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/cli.html
- 2025-04-07T16:41:27.520Z
+ 2025-04-07T18:48:09.448Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/unsloth.html
- 2025-04-07T16:41:27.524Z
+ 2025-04-07T18:48:09.452Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/fsdp_qlora.html
- 2025-04-07T16:41:27.521Z
+ 2025-04-07T18:48:09.449Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset_preprocessing.html
- 2025-04-07T16:41:27.521Z
+ 2025-04-07T18:48:09.449Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/custom_integrations.html
- 2025-04-07T16:41:27.521Z
+ 2025-04-07T18:48:09.448Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/mac.html
- 2025-04-07T16:41:27.524Z
+ 2025-04-07T18:48:09.452Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/docker.html
- 2025-04-07T16:41:27.521Z
+ 2025-04-07T18:48:09.449Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/ray-integration.html
- 2025-04-07T16:41:27.524Z
+ 2025-04-07T18:48:09.452Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/index.html
- 2025-04-07T16:41:27.521Z
+ 2025-04-07T18:48:09.448Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/conversation.html
- 2025-04-07T16:41:27.521Z
+ 2025-04-07T18:48:09.448Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/pretraining.html
- 2025-04-07T16:41:27.521Z
+ 2025-04-07T18:48:09.449Zhttps://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/inst_tune.html
- 2025-04-07T16:41:27.521Z
+ 2025-04-07T18:48:09.448Z