diff --git a/.nojekyll b/.nojekyll
index 2e3a335b7..6b71a7f0e 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-0496a1b7
\ No newline at end of file
+d5d7dce8
\ No newline at end of file
diff --git a/docs/api/utils.ctx_managers.sequence_parallel.html b/docs/api/utils.ctx_managers.sequence_parallel.html
index 468cfc0f0..4a9b55b07 100644
--- a/docs/api/utils.ctx_managers.sequence_parallel.html
+++ b/docs/api/utils.ctx_managers.sequence_parallel.html
@@ -685,7 +685,8 @@ from the full gradient tensor.
Context manager for sequence parallelism operations.
This class provides a context that will automatically apply sequence parallelism
during model forward passes using a pre-forward hook, and gather outputs from
@@ -738,6 +739,12 @@ across the sequence parallelism group using a post-forward hook.
Sequence parallelism K head stride size. Passed through to varlen_llama3 ring_flash_attn implementation.
required
+
+
gather_outputs
+
bool
+
Whether to gather outputs after model forward pass across the sequence parallel group.
+
required
+
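The rows added above document the new `gather_outputs` flag on `SequenceParallelContextManager`: when enabled, a post-forward hook all-gathers each rank's output shard across the sequence-parallel group and concatenates along the sequence dimension. As a plain-Python sketch of that behavior only (the `post_forward_gather` helper below is hypothetical — the real implementation operates on tensors via `torch.distributed`):

```python
def post_forward_gather(shards, gather_outputs, rank):
    """Toy stand-in for the post-forward hook.

    `shards[r]` plays the role of rank r's output slice along the
    sequence dimension; gathering concatenates all slices in rank order.
    """
    if not gather_outputs:
        # leave each rank holding only its local sequence shard
        return shards[rank]
    gathered = []
    for shard in shards:  # analogue of all_gather + concat on the seq dim
        gathered.extend(shard)
    return gathered
```

With two ranks holding `[1, 2]` and `[3, 4]`, gathering yields `[1, 2, 3, 4]` on every rank, while `gather_outputs=False` leaves rank 1 with just `[3, 4]`.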
diff --git a/search.json b/search.json
index 7909bf089..36689c9ed 100644
--- a/search.json
+++ b/search.json
@@ -1482,14 +1482,14 @@
"href": "docs/api/utils.ctx_managers.sequence_parallel.html",
"title": "utils.ctx_managers.sequence_parallel",
"section": "",
- "text": "utils.ctx_managers.sequence_parallel\nModule for Axolotl trainer sequence parallelism manager and utilities\n\n\n\n\n\nName\nDescription\n\n\n\n\nAllGatherWithGrad\nCustom autograd function for all-gather to preserve gradients.\n\n\nSequenceParallelContextManager\nContext manager for sequence parallelism operations.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad()\nCustom autograd function for all-gather to preserve gradients.\n\n\n\n\n\nName\nDescription\n\n\n\n\nbackward\nBackward pass for all-gather operation.\n\n\nforward\nForward pass of all-gather of data with sequence dimension.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad.backward(\n ctx,\n grad_output,\n)\nBackward pass for all-gather operation.\nExtracts the gradient slice corresponding to this rank’s original input\nfrom the full gradient tensor.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nctx\ntorch.autograd.function.FunctionCtx\ntorch.autograd function context.\nrequired\n\n\ngrad_output\ntorch.Tensor\nGradient from subsequent layers with respect to the concatenated output tensor.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntuple[torch.Tensor, None]\nTuple containing the gradient slice for this rank’s input tensor and None for the process group parameter which doesn’t require gradients.\n\n\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad.forward(\n ctx,\n input_tensor,\n group,\n)\nForward pass of all-gather of data with sequence dimension.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nctx\ntorch.autograd.function.FunctionCtx\ntorch.autograd function context.\nrequired\n\n\ninput_tensor\ntorch.Tensor\nTensor from model output with sequence dimension.\nrequired\n\n\ngroup\ndist.ProcessGroup\ntorch.distributed process group.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntorch.Tensor\nTensor from gathering the input_tensor from across the 
process group and concatenating along the sequence dimension.\n\n\n\n\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.SequenceParallelContextManager(\n models,\n sequence_parallel_degree,\n gradient_accumulation_steps,\n ring_attn_func,\n heads_k_stride,\n)\nContext manager for sequence parallelism operations.\nThis class provides a context that will automatically apply sequence parallelism\nduring model forward passes using a pre-forward hook, and gather outputs from\nacross the sequence parallelism group using a post-forward hook.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nmodels\nlist[nn.Module]\nList of models to apply sequence parallelism to pre- and post- forward hooks.\nrequired\n\n\nsequence_parallel_degree\nint\nNumber of processes to split sequences over.\nrequired\n\n\ngradient_accumulation_steps\nint\nNumber of steps to accumulate gradients over.\nrequired\n\n\nring_attn_func\nRingAttnFunc\nWhich ring attention function to use. Currently unused.\nrequired\n\n\nheads_k_stride\nint | None\nSequence parallelism K head stride size. 
Passed through to varlen_llama3 ring_flash_attn implementation.\nrequired\n\n\n\n\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\napply_sequence_parallelism\nApply sequence parallelism slicing to a batch.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.apply_sequence_parallelism(\n batch,\n local_rank,\n local_world_size,\n gradient_accumulation_steps,\n ring_attn_func,\n)\nApply sequence parallelism slicing to a batch.\nSpecial handling is implemented for integer logits_to_keep, which indicates\nto only keep the last N tokens in the sequence during generation.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nbatch\ndict[str, torch.Tensor]\nBatch dictionary (e.g., input_ids, attention_mask, etc.).\nrequired\n\n\nlocal_rank\nint\nLocal rank in the sequence parallel group.\nrequired\n\n\nlocal_world_size\nint\nWorld size of the sequence parallel group.\nrequired\n\n\ngradient_accumulation_steps\nint\nNumber of steps to accumulate gradients over.\nrequired\n\n\nring_attn_func\nRingAttnFunc\nWhich ring attention function to use. Currently unused, but related to above TODO.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntuple[dict[str, torch.Tensor], int, int]\ntuple of: - Batch dictionary with sliced tensors. - The original sequence length before padding. - The number of padding tokens added."
+ "text": "utils.ctx_managers.sequence_parallel\nModule for Axolotl trainer sequence parallelism manager and utilities\n\n\n\n\n\nName\nDescription\n\n\n\n\nAllGatherWithGrad\nCustom autograd function for all-gather to preserve gradients.\n\n\nSequenceParallelContextManager\nContext manager for sequence parallelism operations.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad()\nCustom autograd function for all-gather to preserve gradients.\n\n\n\n\n\nName\nDescription\n\n\n\n\nbackward\nBackward pass for all-gather operation.\n\n\nforward\nForward pass of all-gather of data with sequence dimension.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad.backward(\n ctx,\n grad_output,\n)\nBackward pass for all-gather operation.\nExtracts the gradient slice corresponding to this rank’s original input\nfrom the full gradient tensor.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nctx\ntorch.autograd.function.FunctionCtx\ntorch.autograd function context.\nrequired\n\n\ngrad_output\ntorch.Tensor\nGradient from subsequent layers with respect to the concatenated output tensor.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntuple[torch.Tensor, None]\nTuple containing the gradient slice for this rank’s input tensor and None for the process group parameter which doesn’t require gradients.\n\n\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad.forward(\n ctx,\n input_tensor,\n group,\n)\nForward pass of all-gather of data with sequence dimension.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nctx\ntorch.autograd.function.FunctionCtx\ntorch.autograd function context.\nrequired\n\n\ninput_tensor\ntorch.Tensor\nTensor from model output with sequence dimension.\nrequired\n\n\ngroup\ndist.ProcessGroup\ntorch.distributed process group.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntorch.Tensor\nTensor from gathering the input_tensor from across the 
process group and concatenating along the sequence dimension.\n\n\n\n\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.SequenceParallelContextManager(\n models,\n sequence_parallel_degree,\n gradient_accumulation_steps,\n ring_attn_func,\n heads_k_stride,\n gather_outputs,\n)\nContext manager for sequence parallelism operations.\nThis class provides a context that will automatically apply sequence parallelism\nduring model forward passes using a pre-forward hook, and gather outputs from\nacross the sequence parallelism group using a post-forward hook.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nmodels\nlist[nn.Module]\nList of models to apply sequence parallelism to pre- and post- forward hooks.\nrequired\n\n\nsequence_parallel_degree\nint\nNumber of processes to split sequences over.\nrequired\n\n\ngradient_accumulation_steps\nint\nNumber of steps to accumulate gradients over.\nrequired\n\n\nring_attn_func\nRingAttnFunc\nWhich ring attention function to use. Currently unused.\nrequired\n\n\nheads_k_stride\nint | None\nSequence parallelism K head stride size. 
Passed through to varlen_llama3 ring_flash_attn implementation.\nrequired\n\n\ngather_outputs\nbool\nWhether to gather outputs after model forward pass across the sequence parallel group.\nrequired\n\n\n\n\n\n\n\n\n\n\n\nName\nDescription\n\n\n\n\napply_sequence_parallelism\nApply sequence parallelism slicing to a batch.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.apply_sequence_parallelism(\n batch,\n local_rank,\n local_world_size,\n gradient_accumulation_steps,\n ring_attn_func,\n)\nApply sequence parallelism slicing to a batch.\nSpecial handling is implemented for integer logits_to_keep, which indicates\nto only keep the last N tokens in the sequence during generation.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nbatch\ndict[str, torch.Tensor]\nBatch dictionary (e.g., input_ids, attention_mask, etc.).\nrequired\n\n\nlocal_rank\nint\nLocal rank in the sequence parallel group.\nrequired\n\n\nlocal_world_size\nint\nWorld size of the sequence parallel group.\nrequired\n\n\ngradient_accumulation_steps\nint\nNumber of steps to accumulate gradients over.\nrequired\n\n\nring_attn_func\nRingAttnFunc\nWhich ring attention function to use. Currently unused, but related to above TODO.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntuple[dict[str, torch.Tensor], int, int]\ntuple of: - Batch dictionary with sliced tensors. - The original sequence length before padding. - The number of padding tokens added."
},
{
"objectID": "docs/api/utils.ctx_managers.sequence_parallel.html#classes",
"href": "docs/api/utils.ctx_managers.sequence_parallel.html#classes",
"title": "utils.ctx_managers.sequence_parallel",
"section": "",
- "text": "Name\nDescription\n\n\n\n\nAllGatherWithGrad\nCustom autograd function for all-gather to preserve gradients.\n\n\nSequenceParallelContextManager\nContext manager for sequence parallelism operations.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad()\nCustom autograd function for all-gather to preserve gradients.\n\n\n\n\n\nName\nDescription\n\n\n\n\nbackward\nBackward pass for all-gather operation.\n\n\nforward\nForward pass of all-gather of data with sequence dimension.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad.backward(\n ctx,\n grad_output,\n)\nBackward pass for all-gather operation.\nExtracts the gradient slice corresponding to this rank’s original input\nfrom the full gradient tensor.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nctx\ntorch.autograd.function.FunctionCtx\ntorch.autograd function context.\nrequired\n\n\ngrad_output\ntorch.Tensor\nGradient from subsequent layers with respect to the concatenated output tensor.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntuple[torch.Tensor, None]\nTuple containing the gradient slice for this rank’s input tensor and None for the process group parameter which doesn’t require gradients.\n\n\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad.forward(\n ctx,\n input_tensor,\n group,\n)\nForward pass of all-gather of data with sequence dimension.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nctx\ntorch.autograd.function.FunctionCtx\ntorch.autograd function context.\nrequired\n\n\ninput_tensor\ntorch.Tensor\nTensor from model output with sequence dimension.\nrequired\n\n\ngroup\ndist.ProcessGroup\ntorch.distributed process group.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntorch.Tensor\nTensor from gathering the input_tensor from across the process group and concatenating along the sequence 
dimension.\n\n\n\n\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.SequenceParallelContextManager(\n models,\n sequence_parallel_degree,\n gradient_accumulation_steps,\n ring_attn_func,\n heads_k_stride,\n)\nContext manager for sequence parallelism operations.\nThis class provides a context that will automatically apply sequence parallelism\nduring model forward passes using a pre-forward hook, and gather outputs from\nacross the sequence parallelism group using a post-forward hook.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nmodels\nlist[nn.Module]\nList of models to apply sequence parallelism to pre- and post- forward hooks.\nrequired\n\n\nsequence_parallel_degree\nint\nNumber of processes to split sequences over.\nrequired\n\n\ngradient_accumulation_steps\nint\nNumber of steps to accumulate gradients over.\nrequired\n\n\nring_attn_func\nRingAttnFunc\nWhich ring attention function to use. Currently unused.\nrequired\n\n\nheads_k_stride\nint | None\nSequence parallelism K head stride size. Passed through to varlen_llama3 ring_flash_attn implementation.\nrequired"
+ "text": "Name\nDescription\n\n\n\n\nAllGatherWithGrad\nCustom autograd function for all-gather to preserve gradients.\n\n\nSequenceParallelContextManager\nContext manager for sequence parallelism operations.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad()\nCustom autograd function for all-gather to preserve gradients.\n\n\n\n\n\nName\nDescription\n\n\n\n\nbackward\nBackward pass for all-gather operation.\n\n\nforward\nForward pass of all-gather of data with sequence dimension.\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad.backward(\n ctx,\n grad_output,\n)\nBackward pass for all-gather operation.\nExtracts the gradient slice corresponding to this rank’s original input\nfrom the full gradient tensor.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nctx\ntorch.autograd.function.FunctionCtx\ntorch.autograd function context.\nrequired\n\n\ngrad_output\ntorch.Tensor\nGradient from subsequent layers with respect to the concatenated output tensor.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntuple[torch.Tensor, None]\nTuple containing the gradient slice for this rank’s input tensor and None for the process group parameter which doesn’t require gradients.\n\n\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.AllGatherWithGrad.forward(\n ctx,\n input_tensor,\n group,\n)\nForward pass of all-gather of data with sequence dimension.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nctx\ntorch.autograd.function.FunctionCtx\ntorch.autograd function context.\nrequired\n\n\ninput_tensor\ntorch.Tensor\nTensor from model output with sequence dimension.\nrequired\n\n\ngroup\ndist.ProcessGroup\ntorch.distributed process group.\nrequired\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\n\n\n\n\n\ntorch.Tensor\nTensor from gathering the input_tensor from across the process group and concatenating along the sequence 
dimension.\n\n\n\n\n\n\n\n\n\nutils.ctx_managers.sequence_parallel.SequenceParallelContextManager(\n models,\n sequence_parallel_degree,\n gradient_accumulation_steps,\n ring_attn_func,\n heads_k_stride,\n gather_outputs,\n)\nContext manager for sequence parallelism operations.\nThis class provides a context that will automatically apply sequence parallelism\nduring model forward passes using a pre-forward hook, and gather outputs from\nacross the sequence parallelism group using a post-forward hook.\n\n\n\n\n\n\n\n\n\n\n\nName\nType\nDescription\nDefault\n\n\n\n\nmodels\nlist[nn.Module]\nList of models to apply sequence parallelism to pre- and post- forward hooks.\nrequired\n\n\nsequence_parallel_degree\nint\nNumber of processes to split sequences over.\nrequired\n\n\ngradient_accumulation_steps\nint\nNumber of steps to accumulate gradients over.\nrequired\n\n\nring_attn_func\nRingAttnFunc\nWhich ring attention function to use. Currently unused.\nrequired\n\n\nheads_k_stride\nint | None\nSequence parallelism K head stride size. Passed through to varlen_llama3 ring_flash_attn implementation.\nrequired\n\n\ngather_outputs\nbool\nWhether to gather outputs after model forward pass across the sequence parallel group.\nrequired"
},
{
"objectID": "docs/api/utils.ctx_managers.sequence_parallel.html#functions",
diff --git a/sitemap.xml b/sitemap.xml
index 1c87bb7dd..65114b4a0 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,758 +2,758 @@
https://docs.axolotl.ai/docs/unsloth.html
- 2025-06-24T18:59:39.482Z
+ 2025-06-25T12:34:07.138Z
- 2025-06-24T19:02:47.231Z
+ 2025-06-25T12:37:10.021Zhttps://docs.axolotl.ai/docs/api/core.chat.messages.html
- 2025-06-24T19:02:46.047Z
+ 2025-06-25T12:37:08.826Zhttps://docs.axolotl.ai/docs/api/core.builders.causal.html
- 2025-06-24T19:02:46.004Z
+ 2025-06-25T12:37:08.783Zhttps://docs.axolotl.ai/docs/api/core.trainers.relora.html
- 2025-06-24T19:02:46.296Z
+ 2025-06-25T12:37:09.079Zhttps://docs.axolotl.ai/docs/api/models.mamba.modeling_mamba.html
- 2025-06-24T19:02:47.254Z
+ 2025-06-25T12:37:10.044Zhttps://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_cpu.html
- 2025-06-24T19:02:46.820Z
+ 2025-06-25T12:37:09.610Zhttps://docs.axolotl.ai/docs/api/core.trainers.mamba.html
- 2025-06-24T19:02:46.291Z
+ 2025-06-25T12:37:09.074Zhttps://docs.axolotl.ai/docs/api/core.datasets.transforms.chat_builder.html
- 2025-06-24T19:02:46.065Z
+ 2025-06-25T12:37:08.844Zhttps://docs.axolotl.ai/docs/api/loaders.processor.html
- 2025-06-24T19:02:46.346Z
+ 2025-06-25T12:37:09.131Zhttps://docs.axolotl.ai/docs/api/core.chat.format.llama3x.html
- 2025-06-24T19:02:46.050Z
+ 2025-06-25T12:37:08.829Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.messages.chat.html
- 2025-06-24T19:02:46.519Z
+ 2025-06-25T12:37:09.309Zhttps://docs.axolotl.ai/docs/api/cli.train.html
- 2025-06-24T19:02:46.105Z
+ 2025-06-25T12:37:08.885Zhttps://docs.axolotl.ai/docs/api/core.trainers.mixins.optimizer.html
- 2025-06-24T19:02:46.367Z
+ 2025-06-25T12:37:09.153Zhttps://docs.axolotl.ai/docs/api/utils.collators.mamba.html
- 2025-06-24T19:02:47.277Z
+ 2025-06-25T12:37:10.067Zhttps://docs.axolotl.ai/docs/api/monkeypatch.unsloth_.html
- 2025-06-24T19:02:46.813Z
+ 2025-06-25T12:37:09.603Zhttps://docs.axolotl.ai/docs/api/utils.dict.html
- 2025-06-24T19:02:46.953Z
+ 2025-06-25T12:37:09.740Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.user_defined.html
- 2025-06-24T19:02:46.469Z
+ 2025-06-25T12:37:09.258Zhttps://docs.axolotl.ai/docs/api/core.training_args.html
- 2025-06-24T19:02:46.024Z
+ 2025-06-25T12:37:08.803Zhttps://docs.axolotl.ai/docs/api/prompt_strategies.dpo.user_defined.html
- 2025-06-24T19:02:46.546Z
+ 2025-06-25T12:37:09.334Zhttps://docs.axolotl.ai/docs/api/prompt_tokenizers.html
- 2025-06-24T19:02:45.984Z
+ 2025-06-25T12:37:08.762Zhttps://docs.axolotl.ai/docs/api/common.const.html
- 2025-06-24T19:02:47.238Z
+ 2025-06-25T12:37:10.028Zhttps://docs.axolotl.ai/docs/fsdp_qlora.html
- 2025-06-24T18:59:39.478Z
+ 2025-06-25T12:34:07.135Zhttps://docs.axolotl.ai/docs/custom_integrations.html
- 2025-06-24T18:59:39.477Z
+ 2025-06-25T12:34:07.134Zhttps://docs.axolotl.ai/docs/getting-started.html
- 2025-06-24T18:59:39.478Z
+ 2025-06-25T12:34:07.135Zhttps://docs.axolotl.ai/docs/faq.html
- 2025-06-24T18:59:39.478Z
+ 2025-06-25T12:34:07.135Zhttps://docs.axolotl.ai/docs/lora_optims.html
- 2025-06-24T18:59:39.481Z
+ 2025-06-25T12:34:07.138Zhttps://docs.axolotl.ai/docs/rlhf.html
- 2025-06-24T18:59:39.481Z
+ 2025-06-25T12:34:07.138Zhttps://docs.axolotl.ai/docs/amd_hpc.html
- 2025-06-24T18:59:39.477Z
+ 2025-06-25T12:34:07.134Zhttps://docs.axolotl.ai/docs/installation.html
- 2025-06-24T18:59:39.481Z
+ 2025-06-25T12:34:07.138Zhttps://docs.axolotl.ai/docs/multipack.html
- 2025-06-24T18:59:39.481Z
+ 2025-06-25T12:34:07.138Zhttps://docs.axolotl.ai/docs/dataset_preprocessing.html
- 2025-06-24T18:59:39.478Z
+ 2025-06-25T12:34:07.135Zhttps://docs.axolotl.ai/docs/dataset_loading.html
- 2025-06-24T18:59:39.478Z
+ 2025-06-25T12:34:07.135Zhttps://docs.axolotl.ai/docs/dataset-formats/inst_tune.html
- 2025-06-24T18:59:39.478Z
+ 2025-06-25T12:34:07.134Zhttps://docs.axolotl.ai/docs/dataset-formats/template_free.html
- 2025-06-24T18:59:39.478Z
+ 2025-06-25T12:34:07.135Zhttps://docs.axolotl.ai/docs/dataset-formats/index.html
- 2025-06-24T18:59:39.477Z
+ 2025-06-25T12:34:07.134Zhttps://docs.axolotl.ai/docs/dataset-formats/pretraining.html
- 2025-06-24T18:59:39.478Z
+ 2025-06-25T12:34:07.134Z