Built site for gh-pages
@@ -536,7 +536,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</tr>
|
||||
<tr class="even">
|
||||
<td><a href="#axolotl.cli.inference.do_inference">do_inference</a></td>
|
||||
<td>Runs inference on the command line in a loop. User input is accepted, a chat template</td>
|
||||
<td>Runs inference on the command line in a loop. User input is accepted, a chat</td>
|
||||
</tr>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.cli.inference.do_inference_gradio">do_inference_gradio</a></td>
|
||||
@@ -589,9 +589,9 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.cli.inference.do_inference" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.cli.inference.do_inference">do_inference</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>cli.inference.do_inference(cfg, cli_args)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Runs inference on the command line in a loop. User input is accepted, a chat template
|
||||
is (optionally) applied, and the model specified in the <code>axolotl</code> config is used to
|
||||
generate completions according to a default generation config.</p>
|
||||
<p>Runs inference on the command line in a loop. User input is accepted, a chat
|
||||
template is (optionally) applied, and the model specified in the <code>axolotl</code> config is
|
||||
used to generate completions according to a default generation config.</p>
|
||||
<section id="parameters-1" class="level4 doc-section doc-section-parameters">
|
||||
<h4 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters-1">Parameters</h4>
|
||||
<table class="caption-top table">
|
||||
|
||||
@@ -561,6 +561,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<td><a href="#axolotl.core.trainers.base.AxolotlTrainer.push_to_hub">push_to_hub</a></td>
|
||||
<td>Overwrite the <code>push_to_hub</code> method in order to force-add the tags when pushing the</td>
|
||||
</tr>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.trainers.base.AxolotlTrainer.store_metrics">store_metrics</a></td>
|
||||
<td>Store metrics with specified reduction type.</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<section id="axolotl.core.trainers.base.AxolotlTrainer.log" class="level5">
|
||||
@@ -606,6 +610,47 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.base.AxolotlTrainer.push_to_hub(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Overwrite the <code>push_to_hub</code> method in order to force-add the tags when pushing the
|
||||
model on the Hub. Please refer to <code>~transformers.Trainer.push_to_hub</code> for more details.</p>
|
||||
</section>
|
||||
<section id="axolotl.core.trainers.base.AxolotlTrainer.store_metrics" class="level5">
|
||||
<h5 class="anchored" data-anchor-id="axolotl.core.trainers.base.AxolotlTrainer.store_metrics">store_metrics</h5>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.trainers.base.AxolotlTrainer.store_metrics(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> metrics,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> train_eval<span class="op">=</span><span class="st">'train'</span>,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> reduction<span class="op">=</span><span class="st">'mean'</span>,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Store metrics with specified reduction type.</p>
|
||||
<section id="parameters-1" class="level6 doc-section doc-section-parameters">
|
||||
<h6 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters-1">Parameters</h6>
|
||||
<table class="caption-top table">
|
||||
<colgroup>
|
||||
<col style="width: 6%">
|
||||
<col style="width: 35%">
|
||||
<col style="width: 50%">
|
||||
<col style="width: 6%">
|
||||
</colgroup>
|
||||
<thead>
|
||||
<tr class="header">
|
||||
<th>Name</th>
|
||||
<th>Type</th>
|
||||
<th>Description</th>
|
||||
<th>Default</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td>metrics</td>
|
||||
<td>dict[str, float] | dict[str, tuple[int | float, str]]</td>
|
||||
<td>Dictionary of metric names to values, or metric names to (value, reduction_type) tuples.</td>
|
||||
<td><em>required</em></td>
|
||||
</tr>
|
||||
<tr class="even">
|
||||
<td>train_eval</td>
|
||||
<td>Literal['train', 'eval']</td>
|
||||
<td>Whether this is for training or evaluation.</td>
|
||||
<td><code>'train'</code></td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
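<p>The two accepted shapes of <code>metrics</code> may be easier to see in a small sketch. The code below is not Axolotl's implementation: the <code>MetricStore</code> class, its <code>reduce</code> method, and the <code>sum</code>, <code>min</code>, and <code>max</code> reduction names are assumptions made for illustration. Only the <code>store_metrics</code> signature and the plain-value and <code>(value, reduction_type)</code> forms come from the table above.</p>

```python
from collections import defaultdict


class MetricStore:
    """Hypothetical stand-in for a trainer that buffers metrics (not Axolotl code)."""

    def __init__(self):
        # split -> metric name -> list of recorded values
        self._values = {"train": defaultdict(list), "eval": defaultdict(list)}
        # split -> metric name -> reduction applied when reporting
        self._reductions = {"train": {}, "eval": {}}

    def store_metrics(self, metrics, train_eval="train", reduction="mean"):
        for name, entry in metrics.items():
            if isinstance(entry, tuple):
                # (value, reduction_type): the per-metric reduction wins
                value, kind = entry
            else:
                # bare value: fall back to the call-level default
                value, kind = entry, reduction
            self._values[train_eval][name].append(float(value))
            self._reductions[train_eval][name] = kind

    def reduce(self, train_eval="train"):
        # Assumed set of reduction names; only "mean" appears in the docs.
        reducers = {
            "mean": lambda vals: sum(vals) / len(vals),
            "sum": sum,
            "min": min,
            "max": max,
        }
        return {
            name: reducers[self._reductions[train_eval][name]](vals)
            for name, vals in self._values[train_eval].items()
        }


store = MetricStore()
store.store_metrics({"loss": 1.0, "num_masked_tokens": (10, "sum")})
store.store_metrics({"loss": 3.0, "num_masked_tokens": (5, "sum")})
reduced = store.reduce("train")
```

<p>In this sketch a bare float inherits the <code>reduction</code> argument, while a tuple carries its own reduction, so a count and an average can be stored in the same call.</p>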
|
||||
|
||||
|
||||
</section>
|
||||
@@ -613,6 +658,7 @@ model on the Hub. Please refer to <code>~transformers.Trainer.push_to_hub</code>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
</main> <!-- /main -->
|
||||
<script id="quarto-html-after-body" type="application/javascript">
|
||||
|
||||
@@ -1047,9 +1047,9 @@ callbacks that require access to the model or trainer.</p>
|
||||
<h6 class="doc-section doc-section-returns anchored" data-anchor-id="returns-5">Returns</h6>
|
||||
<table class="caption-top table">
|
||||
<colgroup>
|
||||
<col style="width: 9%">
|
||||
<col style="width: 20%">
|
||||
<col style="width: 69%">
|
||||
<col style="width: 8%">
|
||||
<col style="width: 27%">
|
||||
<col style="width: 63%">
|
||||
</colgroup>
|
||||
<thead>
|
||||
<tr class="header">
|
||||
@@ -1061,7 +1061,7 @@ callbacks that require access to the model or trainer.</p>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td></td>
|
||||
<td>Trainer | None</td>
|
||||
<td>type[Trainer] | None</td>
|
||||
<td>The first non-<code>None</code> trainer class returned by a plugin.</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
|
||||
@@ -506,6 +506,22 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<li><a href="#citation" id="toc-citation" class="nav-link" data-scroll-target="#citation">Citation</a></li>
|
||||
</ul></li>
|
||||
<li><a href="#densemixer" id="toc-densemixer" class="nav-link" data-scroll-target="#densemixer">DenseMixer</a></li>
|
||||
<li><a href="#diffusion-lm-training-plugin-for-axolotl" id="toc-diffusion-lm-training-plugin-for-axolotl" class="nav-link" data-scroll-target="#diffusion-lm-training-plugin-for-axolotl">Diffusion LM Training Plugin for Axolotl</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#overview" id="toc-overview" class="nav-link" data-scroll-target="#overview">Overview</a></li>
|
||||
<li><a href="#installation-1" id="toc-installation-1" class="nav-link" data-scroll-target="#installation-1">Installation</a></li>
|
||||
<li><a href="#quickstart" id="toc-quickstart" class="nav-link" data-scroll-target="#quickstart">Quickstart</a></li>
|
||||
<li><a href="#basic-configuration" id="toc-basic-configuration" class="nav-link" data-scroll-target="#basic-configuration">Basic Configuration</a></li>
|
||||
<li><a href="#supported-models-1" id="toc-supported-models-1" class="nav-link" data-scroll-target="#supported-models-1">Supported Models</a></li>
|
||||
<li><a href="#how-it-works" id="toc-how-it-works" class="nav-link" data-scroll-target="#how-it-works">How It Works</a></li>
|
||||
<li><a href="#random-masking" id="toc-random-masking" class="nav-link" data-scroll-target="#random-masking">Random Masking</a></li>
|
||||
<li><a href="#diffusion-loss" id="toc-diffusion-loss" class="nav-link" data-scroll-target="#diffusion-loss">Diffusion Loss</a></li>
|
||||
<li><a href="#sample-generation" id="toc-sample-generation" class="nav-link" data-scroll-target="#sample-generation">Sample Generation</a></li>
|
||||
<li><a href="#inference" id="toc-inference" class="nav-link" data-scroll-target="#inference">Inference</a></li>
|
||||
<li><a href="#metrics-and-monitoring" id="toc-metrics-and-monitoring" class="nav-link" data-scroll-target="#metrics-and-monitoring">Metrics and Monitoring</a></li>
|
||||
<li><a href="#limitations" id="toc-limitations" class="nav-link" data-scroll-target="#limitations">Limitations</a></li>
|
||||
<li><a href="#references" id="toc-references" class="nav-link" data-scroll-target="#references">References</a></li>
|
||||
</ul></li>
|
||||
<li><a href="#grokfast" id="toc-grokfast" class="nav-link" data-scroll-target="#grokfast">Grokfast</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#usage-1" id="toc-usage-1" class="nav-link" data-scroll-target="#usage-1">Usage</a></li>
|
||||
@@ -532,12 +548,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<li><a href="#liger-kernels" id="toc-liger-kernels" class="nav-link" data-scroll-target="#liger-kernels">Liger Kernels</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#usage-5" id="toc-usage-5" class="nav-link" data-scroll-target="#usage-5">Usage</a></li>
|
||||
<li><a href="#supported-models-1" id="toc-supported-models-1" class="nav-link" data-scroll-target="#supported-models-1">Supported Models</a></li>
|
||||
<li><a href="#supported-models-2" id="toc-supported-models-2" class="nav-link" data-scroll-target="#supported-models-2">Supported Models</a></li>
|
||||
<li><a href="#citation-3" id="toc-citation-3" class="nav-link" data-scroll-target="#citation-3">Citation</a></li>
|
||||
</ul></li>
|
||||
<li><a href="#spectrum" id="toc-spectrum" class="nav-link" data-scroll-target="#spectrum">Spectrum</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#overview" id="toc-overview" class="nav-link" data-scroll-target="#overview">Overview</a></li>
|
||||
<li><a href="#overview-1" id="toc-overview-1" class="nav-link" data-scroll-target="#overview-1">Overview</a></li>
|
||||
<li><a href="#usage-6" id="toc-usage-6" class="nav-link" data-scroll-target="#usage-6">Usage</a></li>
|
||||
<li><a href="#citation-4" id="toc-citation-4" class="nav-link" data-scroll-target="#citation-4">Citation</a></li>
|
||||
</ul></li>
|
||||
@@ -662,25 +678,162 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.densemixer.DenseMixerPlugin</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Please see reference <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/densemixer">here</a></p>
|
||||
</section>
|
||||
<section id="diffusion-lm-training-plugin-for-axolotl" class="level2">
|
||||
<h2 class="anchored" data-anchor-id="diffusion-lm-training-plugin-for-axolotl">Diffusion LM Training Plugin for Axolotl</h2>
|
||||
<p>This plugin enables diffusion language model training using an approach inspired by
|
||||
LLaDA (Large Language Diffusion Models) within Axolotl.</p>
|
||||
<section id="overview" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="overview">Overview</h3>
|
||||
<p>LLaDA is a diffusion-based approach to language model training that uses:</p>
<ul>
<li><strong>Random token masking</strong> during training instead of next-token prediction</li>
<li><strong>Bidirectional attention</strong> to allow the model to attend to the full context</li>
<li><strong>Importance weighting</strong> based on masking probabilities for stable training</li>
</ul>
|
||||
<p>This approach can lead to more robust language models with better understanding of
|
||||
bidirectional context.</p>
|
||||
</section>
|
||||
<section id="installation-1" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="installation-1">Installation</h3>
|
||||
<p>The plugin is included with Axolotl. See our
|
||||
<a href="https://docs.axolotl.ai/docs/installation.html">installation docs</a>.</p>
|
||||
</section>
|
||||
<section id="quickstart" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="quickstart">Quickstart</h3>
|
||||
<p>Train with an example config (Llama‑3.2 1B):</p>
<ul>
<li>Pretrain: <code>axolotl train examples/llama-3/diffusion-3.2-1b-pretrain.yaml</code></li>
<li>SFT: <code>axolotl train examples/llama-3/diffusion-3.2-1b-sft.yaml</code></li>
</ul>
|
||||
</section>
|
||||
<section id="basic-configuration" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="basic-configuration">Basic Configuration</h3>
|
||||
<p>You can also modify an existing config to enable or customize diffusion training.</p>
|
||||
<p>Add the following to your Axolotl config:</p>
|
||||
<div class="sourceCode" id="cb6"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.diffusion.DiffusionPlugin</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Then configure the nested <code>diffusion</code> block (defaults shown):</p>
|
||||
<div class="sourceCode" id="cb7"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="fu">diffusion</span><span class="kw">:</span></span>
|
||||
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">noise_schedule</span><span class="kw">:</span><span class="at"> linear</span><span class="co"> # or "cosine"</span></span>
|
||||
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">min_mask_ratio</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.1</span></span>
|
||||
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">max_mask_ratio</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.9</span></span>
|
||||
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">num_diffusion_steps</span><span class="kw">:</span><span class="at"> </span><span class="dv">128</span></span>
|
||||
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">eps</span><span class="kw">:</span><span class="at"> </span><span class="fl">1e-3</span></span>
|
||||
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">importance_weighting</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a><span class="co"> # Mask token (training auto-adds if missing, avoid pad/eos)</span></span>
|
||||
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">mask_token_str</span><span class="kw">:</span><span class="at"> </span><span class="st">"<|diffusion_mask|>"</span></span>
|
||||
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a><span class="co"> # Or use an existing special token id (e.g., 128002 for Llama-3.x)</span></span>
|
||||
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a><span class="co"> # mask_token_id: 128002</span></span>
|
||||
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a><span class="co"> # Sample generation during training (optional)</span></span>
|
||||
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">generate_samples</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb7-16"><a href="#cb7-16" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">generation_interval</span><span class="kw">:</span><span class="at"> </span><span class="dv">100</span></span>
|
||||
<span id="cb7-17"><a href="#cb7-17" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">num_generation_samples</span><span class="kw">:</span><span class="at"> </span><span class="dv">3</span></span>
|
||||
<span id="cb7-18"><a href="#cb7-18" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">generation_steps</span><span class="kw">:</span><span class="at"> </span><span class="dv">128</span></span>
|
||||
<span id="cb7-19"><a href="#cb7-19" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">generation_temperature</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.0</span></span>
|
||||
<span id="cb7-20"><a href="#cb7-20" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">generation_max_length</span><span class="kw">:</span><span class="at"> </span><span class="dv">100</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
</section>
|
||||
<section id="supported-models-1" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="supported-models-1">Supported Models</h3>
|
||||
<p>Any model that supports 4D attention masks should work out of the box. If yours does not, please
|
||||
create an <a href="https://github.com/axolotl-ai-cloud/axolotl/issues">issue</a> or open a
|
||||
<a href="https://github.com/axolotl-ai-cloud/axolotl/compare">PR</a>!</p>
|
||||
</section>
|
||||
<section id="how-it-works" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="how-it-works">How It Works</h3>
|
||||
</section>
|
||||
<section id="random-masking" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="random-masking">Random Masking</h3>
|
||||
<p>During training, tokens are randomly masked:</p>
<ul>
<li>Sample timestep <code>t</code> uniformly from [0, 1]</li>
<li>Calculate masking probability: <code>p = (1 - eps) * t + eps</code></li>
<li>Randomly mask tokens with probability <code>p</code></li>
</ul>
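<p>The three steps above can be sketched in plain Python. This is an illustration only, not the plugin's code: the function name <code>mask_tokens</code>, its arguments, and the use of the standard <code>random</code> module are assumptions, and the <code>min_mask_ratio</code> and <code>max_mask_ratio</code> settings are left out.</p>

```python
import random


def mask_tokens(token_ids, mask_token_id, eps=1e-3, rng=None):
    """Mask each token independently with probability p = (1 - eps) * t + eps.

    Hypothetical helper for illustration; names and signature are assumptions.
    """
    if rng is None:
        rng = random.Random()
    t = rng.random()                 # timestep t sampled uniformly from [0, 1)
    p = (1.0 - eps) * t + eps        # masking probability, never below eps
    is_masked = [rng.random() < p for _ in token_ids]
    noisy = [
        mask_token_id if masked else tok
        for tok, masked in zip(token_ids, is_masked)
    ]
    return noisy, is_masked, p


noisy, is_masked, p = mask_tokens(
    [11, 12, 13, 14, 15], mask_token_id=-1, rng=random.Random(0)
)
```

<p>Because <code>p</code> is bounded below by <code>eps</code>, dividing by it later (as importance weighting does) stays numerically safe even when <code>t</code> is close to zero.</p>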
|
||||
</section>
|
||||
<section id="diffusion-loss" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="diffusion-loss">Diffusion Loss</h3>
|
||||
<p>Loss is computed only on masked tokens with (optional) importance weighting:</p>
|
||||
<div class="sourceCode" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>loss <span class="op">=</span> <span class="bu">sum</span>(cross_entropy(pred, target) <span class="op">/</span> p_mask) <span class="op">/</span> total_tokens</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
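<p>As a worked instance of the formula above, here is a minimal sketch in plain Python. It assumes that <code>total_tokens</code> is the sequence length and that <code>target_probs[i]</code> is the probability the model assigns to the true token at position <code>i</code>; both names are illustrative and not taken from the plugin.</p>

```python
import math


def diffusion_loss(target_probs, is_masked, p_mask):
    """Importance-weighted cross-entropy summed over masked positions only.

    Hypothetical helper: implements sum(ce / p_mask) / total_tokens, where
    total_tokens is assumed to be the full sequence length.
    """
    total_tokens = len(target_probs)
    weighted_sum = 0.0
    for prob, masked in zip(target_probs, is_masked):
        if not masked:
            continue                   # unmasked positions contribute nothing
        ce = -math.log(prob)           # cross-entropy for the true token
        weighted_sum += ce / p_mask    # importance weight is 1 / p_mask
    return weighted_sum / total_tokens


loss = diffusion_loss(
    target_probs=[0.5, 0.25, 1.0],
    is_masked=[True, False, True],
    p_mask=0.5,
)
```

<p>Here only the first and third positions are masked. The third is predicted perfectly and adds zero, so the loss reduces to <code>(ln 2 / 0.5) / 3</code>.</p>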
|
||||
</section>
|
||||
<section id="sample-generation" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="sample-generation">Sample Generation</h3>
|
||||
<p>When <code>diffusion.generate_samples: true</code>, the plugin generates samples during training:</p>
|
||||
<pre><code>Sample 1:
|
||||
Original (45 tokens): The quick brown fox jumps over the lazy dog...
|
||||
Masked (18/45 tokens, 40.0%): The [MASK] [MASK] fox [MASK] over [MASK] lazy [MASK]...
|
||||
Generated: The quick brown fox jumps over the lazy dog...</code></pre>
|
||||
<p>Samples are logged to console and wandb (if enabled).</p>
|
||||
</section>
|
||||
<section id="inference" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="inference">Inference</h3>
|
||||
<p>Diffusion inference is integrated into the standard Axolotl CLI. Use the same config
|
||||
you trained with and run:</p>
|
||||
<pre><code>axolotl inference path/to/your-config.yaml</code></pre>
|
||||
<p>Optionally, pass <code>--gradio</code> to use a simple web interface.</p>
|
||||
<p>Interactive controls (prefix the prompt with commands):</p>
<ul>
<li><code>:complete N</code> → completion mode with N new masked tokens appended (default 64)</li>
<li><code>:mask R</code> → random masking mode with target mask ratio R in [0.0, 1.0]</li>
</ul>
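<p>To make the command syntax concrete, the sketch below parses a prompt prefix into a mode, a numeric argument, and the remaining text. It is a hypothetical parser, not the one used by the Axolotl CLI: the function name <code>parse_prompt</code>, its return shape, and the choice to treat a prompt without a command as completion mode are all assumptions.</p>

```python
def parse_prompt(text, default_tokens=64):
    """Split an optional leading ':complete N' or ':mask R' from the prompt.

    Hypothetical helper for illustration; returns (mode, argument, prompt).
    """
    stripped = text.strip()
    parts = stripped.split(maxsplit=2)
    if parts and parts[0] == ":complete":
        # N is optional; without it the default token count applies.
        if len(parts) > 1 and parts[1].isdigit():
            rest = parts[2] if len(parts) > 2 else ""
            return "complete", int(parts[1]), rest
        return "complete", default_tokens, " ".join(parts[1:])
    if parts and parts[0] == ":mask":
        if len(parts) < 2:
            raise ValueError(":mask requires a ratio R")
        ratio = float(parts[1])
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("mask ratio must lie in [0.0, 1.0]")
        rest = parts[2] if len(parts) > 2 else ""
        return "mask", ratio, rest
    # Assumption: a prompt with no command is handled as a default completion.
    return "complete", default_tokens, stripped


masked = parse_prompt(":mask 0.4 The quick brown fox")
completed = parse_prompt(":complete 32 Once upon")
```

<p>With this sketch, <code>:mask 0.4 The quick brown fox</code> yields mode <code>mask</code> with ratio 0.4, and a ratio outside [0.0, 1.0] raises a <code>ValueError</code> instead of being clamped silently.</p>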
|
||||
<p>Example session:</p>
|
||||
<pre><code>================================================================================
|
||||
Commands:
|
||||
:complete N -> completion mode with N tokens (default 64)
|
||||
:mask R -> random masking with ratio R (0.0–1.0)
|
||||
================================================================================
|
||||
Give me an instruction (Ctrl + D to submit):
|
||||
|
||||
:mask 0.4 The quick brown fox jumps over the lazy dog
|
||||
|
||||
Masked (40.0%):
|
||||
The [MASK] brown [MASK] jumps over the [MASK] dog
|
||||
|
||||
Generated:
|
||||
The quick brown fox jumps over the loud dog</code></pre>
|
||||
</section>
|
||||
<section id="metrics-and-monitoring" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="metrics-and-monitoring">Metrics and Monitoring</h3>
|
||||
<p>The plugin adds (or modifies) several metrics to track diffusion training:</p>
|
||||
<ul>
|
||||
<li><code>train/loss</code>: Weighted diffusion loss</li>
|
||||
<li><code>train/accuracy</code>: Accuracy on masked tokens</li>
|
||||
<li><code>train/mask_ratio</code>: Average fraction of tokens masked</li>
|
||||
<li><code>train/num_masked_tokens</code>: Number of tokens masked</li>
|
||||
<li><code>train/avg_p_mask</code>: Average masking probability</li>
|
||||
<li><code>train/ce_loss</code>: Unweighted cross-entropy loss</li>
|
||||
<li><code>train/importance_weight_avg</code>: Average importance weight</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section id="limitations" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="limitations">Limitations</h3>
|
||||
<ul>
|
||||
<li>No flash attention support</li>
|
||||
<li>No RL training support</li>
|
||||
</ul>
|
||||
</section>
|
||||
<section id="references" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="references">References</h3>
|
||||
<ul>
|
||||
<li><a href="https://arxiv.org/abs/2404.10406">LLaDA Paper</a></li>
|
||||
<li><a href="https://docs.axolotl.ai/">Axolotl Documentation</a></li>
|
||||
<li><a href="https://docs.axolotl.ai/docs/api/integrations.diffusion.args.html#axolotl.integrations.diffusion.args">API reference for plugin</a></li>
|
||||
</ul>
|
||||
<p>Please see reference <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/diffusion">here</a></p>
|
||||
</section>
|
||||
</section>
|
||||
<section id="grokfast" class="level2">
|
||||
<h2 class="anchored" data-anchor-id="grokfast">Grokfast</h2>
|
||||
<p>See https://github.com/ironjr/grokfast</p>
|
||||
<section id="usage-1" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="usage-1">Usage</h3>
|
||||
<div class="sourceCode" id="cb6"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.grokfast.GrokfastPlugin</span></span>
|
||||
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a><span class="fu">grokfast_alpha</span><span class="kw">:</span><span class="at"> </span><span class="fl">2.0</span></span>
|
||||
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a><span class="fu">grokfast_lamb</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.98</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb12"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.grokfast.GrokfastPlugin</span></span>
|
||||
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a><span class="fu">grokfast_alpha</span><span class="kw">:</span><span class="at"> </span><span class="fl">2.0</span></span>
|
||||
<span id="cb12-5"><a href="#cb12-5" aria-hidden="true" tabindex="-1"></a><span class="fu">grokfast_lamb</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.98</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
</section>
|
||||
<section id="citation-1" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="citation-1">Citation</h3>
|
||||
<div class="sourceCode" id="cb7"><pre class="sourceCode bib code-with-copy"><code class="sourceCode bibtex"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="va">@article</span>{<span class="ot">lee2024grokfast</span>,</span>
|
||||
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> <span class="dt">title</span>={{Grokfast}: Accelerated Grokking by Amplifying Slow Gradients},</span>
|
||||
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> <span class="dt">author</span>={Lee, Jaerin and Kang, Bong Gyun and Kim, Kihoon and Lee, Kyoung Mu},</span>
|
||||
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> <span class="dt">journal</span>={arXiv preprint arXiv:2405.20233},</span>
|
||||
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a> <span class="dt">year</span>={2024}</span>
|
||||
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb13"><pre class="sourceCode bib code-with-copy"><code class="sourceCode bibtex"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="va">@article</span>{<span class="ot">lee2024grokfast</span>,</span>
|
||||
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a> <span class="dt">title</span>={{Grokfast}: Accelerated Grokking by Amplifying Slow Gradients},</span>
|
||||
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a> <span class="dt">author</span>={Lee, Jaerin and Kang, Bong Gyun and Kim, Kihoon and Lee, Kyoung Mu},</span>
|
||||
<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a> <span class="dt">journal</span>={arXiv preprint arXiv:2405.20233},</span>
|
||||
<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a> <span class="dt">year</span>={2024}</span>
|
||||
<span id="cb13-6"><a href="#cb13-6" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Please see reference <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/grokfast">here</a></p>
|
||||
</section>
|
||||
</section>
|
||||
@@ -688,21 +841,21 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<h2 class="anchored" data-anchor-id="knowledge-distillation-kd">Knowledge Distillation (KD)</h2>
|
||||
<section id="usage-2" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="usage-2">Usage</h3>
|
||||
<div class="sourceCode" id="cb8"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> </span><span class="st">"axolotl.integrations.kd.KDPlugin"</span></span>
|
||||
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a><span class="fu">kd_trainer</span><span class="kw">:</span><span class="at"> </span><span class="ch">True</span></span>
|
||||
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a><span class="fu">kd_ce_alpha</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.1</span></span>
|
||||
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a><span class="fu">kd_alpha</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.9</span></span>
|
||||
<span id="cb8-7"><a href="#cb8-7" aria-hidden="true" tabindex="-1"></a><span class="fu">kd_temperature</span><span class="kw">:</span><span class="at"> </span><span class="fl">1.0</span></span>
|
||||
<span id="cb8-8"><a href="#cb8-8" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb8-9"><a href="#cb8-9" aria-hidden="true" tabindex="-1"></a><span class="fu">torch_compile</span><span class="kw">:</span><span class="at"> </span><span class="ch">True</span><span class="co"> # torch>=2.6.0, recommended to reduce vram</span></span>
|
||||
<span id="cb8-10"><a href="#cb8-10" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb8-11"><a href="#cb8-11" aria-hidden="true" tabindex="-1"></a><span class="fu">datasets</span><span class="kw">:</span></span>
|
||||
<span id="cb8-12"><a href="#cb8-12" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> </span><span class="fu">path</span><span class="kw">:</span><span class="at"> ...</span></span>
|
||||
<span id="cb8-13"><a href="#cb8-13" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">type</span><span class="kw">:</span><span class="at"> </span><span class="st">"axolotl.integrations.kd.chat_template"</span></span>
|
||||
<span id="cb8-14"><a href="#cb8-14" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">field_messages</span><span class="kw">:</span><span class="at"> </span><span class="st">"messages_combined"</span></span>
|
||||
<span id="cb8-15"><a href="#cb8-15" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">logprobs_field</span><span class="kw">:</span><span class="at"> </span><span class="st">"llm_text_generation_vllm_logprobs"</span><span class="co"> # for kd only, field of logprobs</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb14"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> </span><span class="st">"axolotl.integrations.kd.KDPlugin"</span></span>
|
||||
<span id="cb14-3"><a href="#cb14-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb14-4"><a href="#cb14-4" aria-hidden="true" tabindex="-1"></a><span class="fu">kd_trainer</span><span class="kw">:</span><span class="at"> </span><span class="ch">True</span></span>
|
||||
<span id="cb14-5"><a href="#cb14-5" aria-hidden="true" tabindex="-1"></a><span class="fu">kd_ce_alpha</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.1</span></span>
|
||||
<span id="cb14-6"><a href="#cb14-6" aria-hidden="true" tabindex="-1"></a><span class="fu">kd_alpha</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.9</span></span>
|
||||
<span id="cb14-7"><a href="#cb14-7" aria-hidden="true" tabindex="-1"></a><span class="fu">kd_temperature</span><span class="kw">:</span><span class="at"> </span><span class="fl">1.0</span></span>
|
||||
<span id="cb14-8"><a href="#cb14-8" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb14-9"><a href="#cb14-9" aria-hidden="true" tabindex="-1"></a><span class="fu">torch_compile</span><span class="kw">:</span><span class="at"> </span><span class="ch">True</span><span class="co"> # torch>=2.6.0, recommended to reduce vram</span></span>
|
||||
<span id="cb14-10"><a href="#cb14-10" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb14-11"><a href="#cb14-11" aria-hidden="true" tabindex="-1"></a><span class="fu">datasets</span><span class="kw">:</span></span>
|
||||
<span id="cb14-12"><a href="#cb14-12" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> </span><span class="fu">path</span><span class="kw">:</span><span class="at"> ...</span></span>
|
||||
<span id="cb14-13"><a href="#cb14-13" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">type</span><span class="kw">:</span><span class="at"> </span><span class="st">"axolotl.integrations.kd.chat_template"</span></span>
|
||||
<span id="cb14-14"><a href="#cb14-14" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">field_messages</span><span class="kw">:</span><span class="at"> </span><span class="st">"messages_combined"</span></span>
|
||||
<span id="cb14-15"><a href="#cb14-15" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">logprobs_field</span><span class="kw">:</span><span class="at"> </span><span class="st">"llm_text_generation_vllm_logprobs"</span><span class="co"> # for kd only, field of logprobs</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>An example dataset can be found at <a href="https://huggingface.co/datasets/axolotl-ai-co/evolkit-logprobs-pipeline-75k-v2-sample"><code>axolotl-ai-co/evolkit-logprobs-pipeline-75k-v2-sample</code></a>.</p>
|
||||
<p>For the reference implementation, see the <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/kd">KD integration source</a>.</p>
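<p>To make the roles of <code>kd_ce_alpha</code>, <code>kd_alpha</code>, and <code>kd_temperature</code> more concrete, here is a minimal sketch of a distillation loss on a toy three-token vocabulary. It assumes the two alphas weight a cross-entropy term and a teacher-to-student KL term that are simply added together; that reading of the config keys is an assumption, and the function names are hypothetical rather than taken from the plugin.</p>

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, target_index):
    """Next-token cross-entropy against the ground-truth token."""
    return -math.log(softmax(student_logits)[target_index])


def kl_divergence(teacher_probs, student_probs):
    """KL(teacher || student) over a shared vocabulary."""
    return sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0.0
    )


def combined_kd_loss(student_logits, teacher_logits, target_index,
                     kd_ce_alpha=0.1, kd_alpha=0.9, kd_temperature=1.0):
    """Weighted sum of the CE and KD terms (assumed combination rule)."""
    ce = cross_entropy(student_logits, target_index)
    teacher = softmax(teacher_logits, kd_temperature)
    student = softmax(student_logits, kd_temperature)
    kd = kl_divergence(teacher, student)
    return kd_ce_alpha * ce + kd_alpha * kd


# Toy logits; in the real dataset the teacher signal comes from stored logprobs.
student = [2.0, 0.5, -1.0]
teacher = [2.5, 0.0, -1.5]
loss = combined_kd_loss(student, teacher, target_index=0)
print(f"combined loss: {loss:.4f}")
```

<p>When the student already matches the teacher, the KL term vanishes and only the down-weighted cross-entropy term remains, which is why a small <code>kd_ce_alpha</code> lets the teacher signal dominate.</p>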
|
||||
</section>
|
||||
@@ -717,7 +870,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<h3 class="anchored" data-anchor-id="requirements-1">Requirements</h3>
|
||||
<ul>
|
||||
<li><p>Axolotl with <code>llmcompressor</code> extras:</p>
|
||||
<div class="sourceCode" id="cb9"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="ex">pip</span> install <span class="st">"axolotl[llmcompressor]"</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div></li>
|
||||
<div class="sourceCode" id="cb15"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="ex">pip</span> install <span class="st">"axolotl[llmcompressor]"</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div></li>
|
||||
<li><p>Requires <code>llmcompressor >= 0.5.1</code></p></li>
|
||||
</ul>
|
||||
<p>Installing the <code>llmcompressor</code> extras pulls in all dependencies needed to fine-tune sparsified models with this integration.</p>
|
||||
@@ -726,25 +879,25 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="usage-3" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="usage-3">Usage</h3>
|
||||
<p>To enable sparse fine-tuning with this integration, include the plugin in your Axolotl config:</p>
|
||||
<div class="sourceCode" id="cb10"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.llm_compressor.LLMCompressorPlugin</span></span>
|
||||
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a><span class="fu">llmcompressor</span><span class="kw">:</span></span>
|
||||
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">recipe</span><span class="kw">:</span></span>
|
||||
<span id="cb10-6"><a href="#cb10-6" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">finetuning_stage</span><span class="kw">:</span></span>
|
||||
<span id="cb10-7"><a href="#cb10-7" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">finetuning_modifiers</span><span class="kw">:</span></span>
|
||||
<span id="cb10-8"><a href="#cb10-8" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">ConstantPruningModifier</span><span class="kw">:</span></span>
|
||||
<span id="cb10-9"><a href="#cb10-9" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">targets</span><span class="kw">:</span><span class="at"> </span><span class="kw">[</span></span>
|
||||
<span id="cb10-10"><a href="#cb10-10" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*q_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb10-11"><a href="#cb10-11" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*k_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb10-12"><a href="#cb10-12" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*v_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb10-13"><a href="#cb10-13" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*o_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb10-14"><a href="#cb10-14" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*gate_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb10-15"><a href="#cb10-15" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*up_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb10-16"><a href="#cb10-16" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*down_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb10-17"><a href="#cb10-17" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">]</span></span>
|
||||
<span id="cb10-18"><a href="#cb10-18" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">start</span><span class="kw">:</span><span class="at"> </span><span class="dv">0</span></span>
|
||||
<span id="cb10-19"><a href="#cb10-19" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">save_compressed</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb16"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.llm_compressor.LLMCompressorPlugin</span></span>
|
||||
<span id="cb16-3"><a href="#cb16-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb16-4"><a href="#cb16-4" aria-hidden="true" tabindex="-1"></a><span class="fu">llmcompressor</span><span class="kw">:</span></span>
|
||||
<span id="cb16-5"><a href="#cb16-5" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">recipe</span><span class="kw">:</span></span>
|
||||
<span id="cb16-6"><a href="#cb16-6" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">finetuning_stage</span><span class="kw">:</span></span>
|
||||
<span id="cb16-7"><a href="#cb16-7" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">finetuning_modifiers</span><span class="kw">:</span></span>
|
||||
<span id="cb16-8"><a href="#cb16-8" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">ConstantPruningModifier</span><span class="kw">:</span></span>
|
||||
<span id="cb16-9"><a href="#cb16-9" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">targets</span><span class="kw">:</span><span class="at"> </span><span class="kw">[</span></span>
|
||||
<span id="cb16-10"><a href="#cb16-10" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*q_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb16-11"><a href="#cb16-11" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*k_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb16-12"><a href="#cb16-12" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*v_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb16-13"><a href="#cb16-13" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*o_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb16-14"><a href="#cb16-14" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*gate_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb16-15"><a href="#cb16-15" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*up_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb16-16"><a href="#cb16-16" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="st">'re:.*down_proj.weight'</span><span class="kw">,</span></span>
|
||||
<span id="cb16-17"><a href="#cb16-17" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">]</span></span>
|
||||
<span id="cb16-18"><a href="#cb16-18" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">start</span><span class="kw">:</span><span class="at"> </span><span class="dv">0</span></span>
|
||||
<span id="cb16-19"><a href="#cb16-19" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">save_compressed</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>This plugin <strong>does not apply pruning or sparsification itself</strong> — it is intended for <strong>fine-tuning models that have already been sparsified</strong>.</p>
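<p>The idea of fine-tuning while keeping an existing sparsity pattern fixed can be sketched in plain Python. This is an illustration only, not the llmcompressor implementation: the helper names are hypothetical, and the sketch assumes that "constant pruning" means weights that start at zero are masked back to zero after every update.</p>

```python
def sparsity(weights):
    """Fraction of entries that are exactly zero."""
    flat = [w for row in weights for w in row]
    return sum(1 for w in flat if w == 0.0) / len(flat)


def sparsity_mask(weights):
    """1.0 where the weight is nonzero, 0.0 where it was already pruned."""
    return [[0.0 if w == 0.0 else 1.0 for w in row] for row in weights]


def masked_sgd_step(weights, grads, mask, lr=0.1):
    """One SGD update that re-applies the mask so pruned entries stay zero."""
    return [
        [(w - lr * g) * m for w, g, m in zip(w_row, g_row, m_row)]
        for w_row, g_row, m_row in zip(weights, grads, mask)
    ]


# A toy pre-sparsified weight matrix: half of the entries are already zero.
weights = [[0.5, 0.0, -0.3, 0.0], [0.0, 0.8, 0.0, 0.1]]
grads = [[0.1, 0.2, -0.1, 0.3], [0.4, -0.2, 0.1, 0.0]]

mask = sparsity_mask(weights)  # captured once, before training starts
updated = masked_sgd_step(weights, grads, mask)
print(sparsity(weights), sparsity(updated))
```

<p>The mask is computed once from the incoming checkpoint, so the sketch never creates new zeros; it only preserves the ones that were already there.</p>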
|
||||
<p>Pre-sparsified checkpoints can be:
|
||||
- Generated using <a href="https://github.com/vllm-project/llm-compressor">LLMCompressor</a>
|
||||
@@ -771,22 +924,22 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<p>After fine-tuning your sparse model, you can leverage vLLM for efficient inference.
|
||||
You can also use LLMCompressor to apply additional quantization to your fine-tuned
|
||||
sparse model before inference for even greater performance benefits:</p>
|
||||
<div class="sourceCode" id="cb11"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> vllm <span class="im">import</span> LLM, SamplingParams</span>
|
||||
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a>prompts <span class="op">=</span> [</span>
|
||||
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a> <span class="st">"Hello, my name is"</span>,</span>
|
||||
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a> <span class="st">"The president of the United States is"</span>,</span>
|
||||
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a> <span class="st">"The capital of France is"</span>,</span>
|
||||
<span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a> <span class="st">"The future of AI is"</span>,</span>
|
||||
<span id="cb11-8"><a href="#cb11-8" aria-hidden="true" tabindex="-1"></a>]</span>
|
||||
<span id="cb11-9"><a href="#cb11-9" aria-hidden="true" tabindex="-1"></a>sampling_params <span class="op">=</span> SamplingParams(temperature<span class="op">=</span><span class="fl">0.8</span>, top_p<span class="op">=</span><span class="fl">0.95</span>)</span>
|
||||
<span id="cb11-10"><a href="#cb11-10" aria-hidden="true" tabindex="-1"></a>llm <span class="op">=</span> LLM(<span class="st">"path/to/your/sparse/model"</span>)</span>
|
||||
<span id="cb11-11"><a href="#cb11-11" aria-hidden="true" tabindex="-1"></a>outputs <span class="op">=</span> llm.generate(prompts, sampling_params)</span>
|
||||
<span id="cb11-12"><a href="#cb11-12" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb11-13"><a href="#cb11-13" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> output <span class="kw">in</span> outputs:</span>
|
||||
<span id="cb11-14"><a href="#cb11-14" aria-hidden="true" tabindex="-1"></a> prompt <span class="op">=</span> output.prompt</span>
|
||||
<span id="cb11-15"><a href="#cb11-15" aria-hidden="true" tabindex="-1"></a> generated_text <span class="op">=</span> output.outputs[<span class="dv">0</span>].text</span>
|
||||
<span id="cb11-16"><a href="#cb11-16" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(<span class="ss">f"Prompt: </span><span class="sc">{</span>prompt<span class="sc">!r}</span><span class="ss">, Generated text: </span><span class="sc">{</span>generated_text<span class="sc">!r}</span><span class="ss">"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb17"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> vllm <span class="im">import</span> LLM, SamplingParams</span>
|
||||
<span id="cb17-2"><a href="#cb17-2" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb17-3"><a href="#cb17-3" aria-hidden="true" tabindex="-1"></a>prompts <span class="op">=</span> [</span>
|
||||
<span id="cb17-4"><a href="#cb17-4" aria-hidden="true" tabindex="-1"></a> <span class="st">"Hello, my name is"</span>,</span>
|
||||
<span id="cb17-5"><a href="#cb17-5" aria-hidden="true" tabindex="-1"></a> <span class="st">"The president of the United States is"</span>,</span>
|
||||
<span id="cb17-6"><a href="#cb17-6" aria-hidden="true" tabindex="-1"></a> <span class="st">"The capital of France is"</span>,</span>
|
||||
<span id="cb17-7"><a href="#cb17-7" aria-hidden="true" tabindex="-1"></a> <span class="st">"The future of AI is"</span>,</span>
|
||||
<span id="cb17-8"><a href="#cb17-8" aria-hidden="true" tabindex="-1"></a>]</span>
|
||||
<span id="cb17-9"><a href="#cb17-9" aria-hidden="true" tabindex="-1"></a>sampling_params <span class="op">=</span> SamplingParams(temperature<span class="op">=</span><span class="fl">0.8</span>, top_p<span class="op">=</span><span class="fl">0.95</span>)</span>
|
||||
<span id="cb17-10"><a href="#cb17-10" aria-hidden="true" tabindex="-1"></a>llm <span class="op">=</span> LLM(<span class="st">"path/to/your/sparse/model"</span>)</span>
|
||||
<span id="cb17-11"><a href="#cb17-11" aria-hidden="true" tabindex="-1"></a>outputs <span class="op">=</span> llm.generate(prompts, sampling_params)</span>
|
||||
<span id="cb17-12"><a href="#cb17-12" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb17-13"><a href="#cb17-13" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> output <span class="kw">in</span> outputs:</span>
|
||||
<span id="cb17-14"><a href="#cb17-14" aria-hidden="true" tabindex="-1"></a> prompt <span class="op">=</span> output.prompt</span>
|
||||
<span id="cb17-15"><a href="#cb17-15" aria-hidden="true" tabindex="-1"></a> generated_text <span class="op">=</span> output.outputs[<span class="dv">0</span>].text</span>
|
||||
<span id="cb17-16"><a href="#cb17-16" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(<span class="ss">f"Prompt: </span><span class="sc">{</span>prompt<span class="sc">!r}</span><span class="ss">, Generated text: </span><span class="sc">{</span>generated_text<span class="sc">!r}</span><span class="ss">"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>For more details on vLLM’s capabilities and advanced configuration options, see the <a href="https://docs.vllm.ai/">official vLLM documentation</a>.</p>
|
||||
</section>
|
||||
<section id="learn-more" class="level3">
|
||||
@@ -802,29 +955,29 @@ sparse model before inference for even greater performance benefits.:</p>
|
||||
<p>See https://github.com/EleutherAI/lm-evaluation-harness</p>
|
||||
<section id="usage-4" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="usage-4">Usage</h3>
|
||||
<div class="sourceCode" id="cb12"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.lm_eval.LMEvalPlugin</span></span>
|
||||
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a><span class="fu">lm_eval_tasks</span><span class="kw">:</span></span>
|
||||
<span id="cb12-5"><a href="#cb12-5" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> gsm8k</span></span>
|
||||
<span id="cb12-6"><a href="#cb12-6" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> hellaswag</span></span>
|
||||
<span id="cb12-7"><a href="#cb12-7" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> arc_easy</span></span>
|
||||
<span id="cb12-8"><a href="#cb12-8" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb12-9"><a href="#cb12-9" aria-hidden="true" tabindex="-1"></a><span class="fu">lm_eval_batch_size</span><span class="kw">:</span><span class="co"> # Batch size for evaluation</span></span>
|
||||
<span id="cb12-10"><a href="#cb12-10" aria-hidden="true" tabindex="-1"></a><span class="fu">output_dir</span><span class="kw">:</span><span class="co"> # Directory to save evaluation results</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb18"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.lm_eval.LMEvalPlugin</span></span>
|
||||
<span id="cb18-3"><a href="#cb18-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb18-4"><a href="#cb18-4" aria-hidden="true" tabindex="-1"></a><span class="fu">lm_eval_tasks</span><span class="kw">:</span></span>
|
||||
<span id="cb18-5"><a href="#cb18-5" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> gsm8k</span></span>
|
||||
<span id="cb18-6"><a href="#cb18-6" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> hellaswag</span></span>
|
||||
<span id="cb18-7"><a href="#cb18-7" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> arc_easy</span></span>
|
||||
<span id="cb18-8"><a href="#cb18-8" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb18-9"><a href="#cb18-9" aria-hidden="true" tabindex="-1"></a><span class="fu">lm_eval_batch_size</span><span class="kw">:</span><span class="co"> # Batch size for evaluation</span></span>
|
||||
<span id="cb18-10"><a href="#cb18-10" aria-hidden="true" tabindex="-1"></a><span class="fu">output_dir</span><span class="kw">:</span><span class="co"> # Directory to save evaluation results</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
</section>
|
||||
<section id="citation-2" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="citation-2">Citation</h3>
|
||||
<div class="sourceCode" id="cb13"><pre class="sourceCode bib code-with-copy"><code class="sourceCode bibtex"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="va">@misc</span>{<span class="ot">eval</span>-<span class="ot">harness</span>,</span>
|
||||
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a> <span class="dt">author</span> = {Gao, Leo and Tow, Jonathan and Abbasi, Baber and Biderman, Stella and Black, Sid and DiPofi, Anthony and Foster, Charles and Golding, Laurence and Hsu, Jeffrey and Le Noac'h, Alain and Li, Haonan and McDonell, Kyle and Muennighoff, Niklas and Ociepa, Chris and Phang, Jason and Reynolds, Laria and Schoelkopf, Hailey and Skowron, Aviya and Sutawika, Lintang and Tang, Eric and Thite, Anish and Wang, Ben and Wang, Kevin and Zou, Andy},</span>
|
||||
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a> <span class="dt">title</span> = {A framework for few-shot language model evaluation},</span>
|
||||
<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a> <span class="dt">month</span> = 07,</span>
|
||||
<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a> <span class="dt">year</span> = 2024,</span>
|
||||
<span id="cb13-6"><a href="#cb13-6" aria-hidden="true" tabindex="-1"></a> <span class="dt">publisher</span> = {Zenodo},</span>
|
||||
<span id="cb13-7"><a href="#cb13-7" aria-hidden="true" tabindex="-1"></a> <span class="dt">version</span> = {v0.4.3},</span>
|
||||
<span id="cb13-8"><a href="#cb13-8" aria-hidden="true" tabindex="-1"></a> <span class="dt">doi</span> = {10.5281/zenodo.12608602},</span>
|
||||
<span id="cb13-9"><a href="#cb13-9" aria-hidden="true" tabindex="-1"></a> <span class="dt">url</span> = {https://zenodo.org/records/12608602}</span>
|
||||
<span id="cb13-10"><a href="#cb13-10" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb19"><pre class="sourceCode bib code-with-copy"><code class="sourceCode bibtex"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a><span class="va">@misc</span>{<span class="ot">eval</span>-<span class="ot">harness</span>,</span>
|
||||
<span id="cb19-2"><a href="#cb19-2" aria-hidden="true" tabindex="-1"></a> <span class="dt">author</span> = {Gao, Leo and Tow, Jonathan and Abbasi, Baber and Biderman, Stella and Black, Sid and DiPofi, Anthony and Foster, Charles and Golding, Laurence and Hsu, Jeffrey and Le Noac'h, Alain and Li, Haonan and McDonell, Kyle and Muennighoff, Niklas and Ociepa, Chris and Phang, Jason and Reynolds, Laria and Schoelkopf, Hailey and Skowron, Aviya and Sutawika, Lintang and Tang, Eric and Thite, Anish and Wang, Ben and Wang, Kevin and Zou, Andy},</span>
|
||||
<span id="cb19-3"><a href="#cb19-3" aria-hidden="true" tabindex="-1"></a> <span class="dt">title</span> = {A framework for few-shot language model evaluation},</span>
|
||||
<span id="cb19-4"><a href="#cb19-4" aria-hidden="true" tabindex="-1"></a> <span class="dt">month</span> = 07,</span>
|
||||
<span id="cb19-5"><a href="#cb19-5" aria-hidden="true" tabindex="-1"></a> <span class="dt">year</span> = 2024,</span>
|
||||
<span id="cb19-6"><a href="#cb19-6" aria-hidden="true" tabindex="-1"></a> <span class="dt">publisher</span> = {Zenodo},</span>
|
||||
<span id="cb19-7"><a href="#cb19-7" aria-hidden="true" tabindex="-1"></a> <span class="dt">version</span> = {v0.4.3},</span>
|
||||
<span id="cb19-8"><a href="#cb19-8" aria-hidden="true" tabindex="-1"></a> <span class="dt">doi</span> = {10.5281/zenodo.12608602},</span>
|
||||
<span id="cb19-9"><a href="#cb19-9" aria-hidden="true" tabindex="-1"></a> <span class="dt">url</span> = {https://zenodo.org/records/12608602}</span>
|
||||
<span id="cb19-10"><a href="#cb19-10" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>For the reference implementation, see the <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/lm_eval">lm_eval integration source</a>.</p>
|
||||
</section>
|
||||
</section>
|
||||
@@ -839,16 +992,16 @@ sparse model before inference for even greater performance benefits.:</p>
|
||||
<p>See https://github.com/linkedin/Liger-Kernel</p>
|
||||
<section id="usage-5" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="usage-5">Usage</h3>
|
||||
<div class="sourceCode" id="cb14"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.liger.LigerPlugin</span></span>
|
||||
<span id="cb14-3"><a href="#cb14-3" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_rope</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb14-4"><a href="#cb14-4" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_rms_norm</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb14-5"><a href="#cb14-5" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_glu_activation</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb14-6"><a href="#cb14-6" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_layer_norm</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb14-7"><a href="#cb14-7" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_fused_linear_cross_entropy</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb20"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.liger.LigerPlugin</span></span>
|
||||
<span id="cb20-3"><a href="#cb20-3" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_rope</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb20-4"><a href="#cb20-4" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_rms_norm</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb20-5"><a href="#cb20-5" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_glu_activation</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb20-6"><a href="#cb20-6" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_layer_norm</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
|
||||
<span id="cb20-7"><a href="#cb20-7" aria-hidden="true" tabindex="-1"></a><span class="fu">liger_fused_linear_cross_entropy</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
</section>
|
||||
<section id="supported-models-1" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="supported-models-1">Supported Models</h3>
|
||||
<section id="supported-models-2" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="supported-models-2">Supported Models</h3>
|
||||
<ul>
|
||||
<li>deepseek_v2</li>
|
||||
<li>gemma</li>
|
||||
@@ -871,16 +1024,16 @@ sparse model before inference for even greater performance benefits.:</p>
|
||||
</section>
|
||||
<section id="citation-3" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="citation-3">Citation</h3>
|
||||
<div class="sourceCode" id="cb15"><pre class="sourceCode bib code-with-copy"><code class="sourceCode bibtex"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="va">@article</span>{<span class="ot">hsu2024ligerkernelefficienttriton</span>,</span>
|
||||
<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a> <span class="dt">title</span>={Liger Kernel: Efficient Triton Kernels for LLM Training},</span>
|
||||
<span id="cb15-3"><a href="#cb15-3" aria-hidden="true" tabindex="-1"></a> <span class="dt">author</span>={Pin-Lun Hsu and Yun Dai and Vignesh Kothapalli and Qingquan Song and Shao Tang and Siyu Zhu and Steven Shimizu and Shivam Sahni and Haowen Ning and Yanning Chen},</span>
|
||||
<span id="cb15-4"><a href="#cb15-4" aria-hidden="true" tabindex="-1"></a> <span class="dt">year</span>={2024},</span>
|
||||
<span id="cb15-5"><a href="#cb15-5" aria-hidden="true" tabindex="-1"></a> <span class="dt">eprint</span>={2410.10989},</span>
|
||||
<span id="cb15-6"><a href="#cb15-6" aria-hidden="true" tabindex="-1"></a> <span class="dt">archivePrefix</span>={arXiv},</span>
|
||||
<span id="cb15-7"><a href="#cb15-7" aria-hidden="true" tabindex="-1"></a> <span class="dt">primaryClass</span>={cs.LG},</span>
|
||||
<span id="cb15-8"><a href="#cb15-8" aria-hidden="true" tabindex="-1"></a> <span class="dt">url</span>={https://arxiv.org/abs/2410.10989},</span>
|
||||
<span id="cb15-9"><a href="#cb15-9" aria-hidden="true" tabindex="-1"></a> <span class="dt">journal</span>={arXiv preprint arXiv:2410.10989},</span>
|
||||
<span id="cb15-10"><a href="#cb15-10" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb21"><pre class="sourceCode bib code-with-copy"><code class="sourceCode bibtex"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a><span class="va">@article</span>{<span class="ot">hsu2024ligerkernelefficienttriton</span>,</span>
|
||||
<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a> <span class="dt">title</span>={Liger Kernel: Efficient Triton Kernels for LLM Training},</span>
|
||||
<span id="cb21-3"><a href="#cb21-3" aria-hidden="true" tabindex="-1"></a> <span class="dt">author</span>={Pin-Lun Hsu and Yun Dai and Vignesh Kothapalli and Qingquan Song and Shao Tang and Siyu Zhu and Steven Shimizu and Shivam Sahni and Haowen Ning and Yanning Chen},</span>
|
||||
<span id="cb21-4"><a href="#cb21-4" aria-hidden="true" tabindex="-1"></a> <span class="dt">year</span>={2024},</span>
|
||||
<span id="cb21-5"><a href="#cb21-5" aria-hidden="true" tabindex="-1"></a> <span class="dt">eprint</span>={2410.10989},</span>
|
||||
<span id="cb21-6"><a href="#cb21-6" aria-hidden="true" tabindex="-1"></a> <span class="dt">archivePrefix</span>={arXiv},</span>
|
||||
<span id="cb21-7"><a href="#cb21-7" aria-hidden="true" tabindex="-1"></a> <span class="dt">primaryClass</span>={cs.LG},</span>
|
||||
<span id="cb21-8"><a href="#cb21-8" aria-hidden="true" tabindex="-1"></a> <span class="dt">url</span>={https://arxiv.org/abs/2410.10989},</span>
|
||||
<span id="cb21-9"><a href="#cb21-9" aria-hidden="true" tabindex="-1"></a> <span class="dt">journal</span>={arXiv preprint arXiv:2410.10989},</span>
|
||||
<span id="cb21-10"><a href="#cb21-10" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>For reference, see the integration source <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/liger">here</a>.</p>
|
||||
</section>
|
||||
</section>
|
||||
@@ -889,30 +1042,30 @@ sparse model before inference for even greater performance benefits.:</p>
|
||||
<p>by Eric Hartford, Lucas Atkins, Fernando Fernandes, David Golchinfar</p>
|
||||
<p>This plugin contains code to freeze the bottom fraction of modules in a model, based on the Signal-to-Noise Ratio (SNR).</p>
|
||||
<p>See https://github.com/cognitivecomputations/spectrum</p>
|
||||
<section id="overview" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="overview">Overview</h3>
|
||||
<section id="overview-1" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="overview-1">Overview</h3>
|
||||
<p>Spectrum is a tool for scanning and evaluating the Signal-to-Noise Ratio (SNR) of layers in large language models.
|
||||
By identifying the top n% of layers with the highest SNR, you can optimize training efficiency.</p>
|
||||
</section>
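<p>The selection idea described in the overview can be sketched as follows. This is a minimal illustration, not Spectrum's or the plugin's actual code: the function name <code>select_trainable_modules</code>, the module names, and the SNR scores are all hypothetical, and it assumes per-module SNR values have already been computed.</p>

```python
import math


def select_trainable_modules(snr_by_module, top_fraction):
    """Rank modules by SNR and split them into (trainable, frozen) name lists.

    Hypothetical helper for illustration only; it is not part of axolotl
    or Spectrum.
    """
    if not 0.0 <= top_fraction <= 1.0:
        raise ValueError("top_fraction must be in [0, 1]")
    # Highest SNR first; sorted() is stable, so ties keep insertion order.
    ranked = sorted(snr_by_module, key=snr_by_module.get, reverse=True)
    keep = math.ceil(len(ranked) * top_fraction)
    return ranked[:keep], ranked[keep:]


# Made-up SNR scores for four made-up module names.
snr = {
    "layers.0.mlp": 4.2,
    "layers.1.mlp": 1.1,
    "layers.2.mlp": 3.3,
    "layers.3.mlp": 0.7,
}
trainable, frozen = select_trainable_modules(snr, top_fraction=0.5)
```

<p>With a fraction of 0.5, the two highest-SNR modules stay trainable and the remaining two would be frozen. Rounding up with <code>math.ceil</code> is a choice made for this sketch; the plugin may round differently.</p>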
|
||||
<section id="usage-6" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="usage-6">Usage</h3>
|
||||
<div class="sourceCode" id="cb16"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.spectrum.SpectrumPlugin</span></span>
|
||||
<span id="cb16-3"><a href="#cb16-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb16-4"><a href="#cb16-4" aria-hidden="true" tabindex="-1"></a><span class="fu">spectrum_top_fraction</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.5</span></span>
|
||||
<span id="cb16-5"><a href="#cb16-5" aria-hidden="true" tabindex="-1"></a><span class="fu">spectrum_model_name</span><span class="kw">:</span><span class="at"> meta-llama/Meta-Llama-3.1-8B</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb22"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb22-1"><a href="#cb22-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb22-2"><a href="#cb22-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.spectrum.SpectrumPlugin</span></span>
|
||||
<span id="cb22-3"><a href="#cb22-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb22-4"><a href="#cb22-4" aria-hidden="true" tabindex="-1"></a><span class="fu">spectrum_top_fraction</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.5</span></span>
|
||||
<span id="cb22-5"><a href="#cb22-5" aria-hidden="true" tabindex="-1"></a><span class="fu">spectrum_model_name</span><span class="kw">:</span><span class="at"> meta-llama/Meta-Llama-3.1-8B</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
</section>
|
||||
<section id="citation-4" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="citation-4">Citation</h3>
|
||||
<div class="sourceCode" id="cb17"><pre class="sourceCode bib code-with-copy"><code class="sourceCode bibtex"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="va">@misc</span>{<span class="ot">hartford2024spectrumtargetedtrainingsignal</span>,</span>
|
||||
<span id="cb17-2"><a href="#cb17-2" aria-hidden="true" tabindex="-1"></a> <span class="dt">title</span>={Spectrum: Targeted Training on Signal to Noise Ratio},</span>
|
||||
<span id="cb17-3"><a href="#cb17-3" aria-hidden="true" tabindex="-1"></a> <span class="dt">author</span>={Eric Hartford and Lucas Atkins and Fernando Fernandes Neto and David Golchinfar},</span>
|
||||
<span id="cb17-4"><a href="#cb17-4" aria-hidden="true" tabindex="-1"></a> <span class="dt">year</span>={2024},</span>
|
||||
<span id="cb17-5"><a href="#cb17-5" aria-hidden="true" tabindex="-1"></a> <span class="dt">eprint</span>={2406.06623},</span>
|
||||
<span id="cb17-6"><a href="#cb17-6" aria-hidden="true" tabindex="-1"></a> <span class="dt">archivePrefix</span>={arXiv},</span>
|
||||
<span id="cb17-7"><a href="#cb17-7" aria-hidden="true" tabindex="-1"></a> <span class="dt">primaryClass</span>={cs.LG},</span>
|
||||
<span id="cb17-8"><a href="#cb17-8" aria-hidden="true" tabindex="-1"></a> <span class="dt">url</span>={https://arxiv.org/abs/2406.06623},</span>
|
||||
<span id="cb17-9"><a href="#cb17-9" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb23"><pre class="sourceCode bib code-with-copy"><code class="sourceCode bibtex"><span id="cb23-1"><a href="#cb23-1" aria-hidden="true" tabindex="-1"></a><span class="va">@misc</span>{<span class="ot">hartford2024spectrumtargetedtrainingsignal</span>,</span>
|
||||
<span id="cb23-2"><a href="#cb23-2" aria-hidden="true" tabindex="-1"></a> <span class="dt">title</span>={Spectrum: Targeted Training on Signal to Noise Ratio},</span>
|
||||
<span id="cb23-3"><a href="#cb23-3" aria-hidden="true" tabindex="-1"></a> <span class="dt">author</span>={Eric Hartford and Lucas Atkins and Fernando Fernandes Neto and David Golchinfar},</span>
|
||||
<span id="cb23-4"><a href="#cb23-4" aria-hidden="true" tabindex="-1"></a> <span class="dt">year</span>={2024},</span>
|
||||
<span id="cb23-5"><a href="#cb23-5" aria-hidden="true" tabindex="-1"></a> <span class="dt">eprint</span>={2406.06623},</span>
|
||||
<span id="cb23-6"><a href="#cb23-6" aria-hidden="true" tabindex="-1"></a> <span class="dt">archivePrefix</span>={arXiv},</span>
|
||||
<span id="cb23-7"><a href="#cb23-7" aria-hidden="true" tabindex="-1"></a> <span class="dt">primaryClass</span>={cs.LG},</span>
|
||||
<span id="cb23-8"><a href="#cb23-8" aria-hidden="true" tabindex="-1"></a> <span class="dt">url</span>={https://arxiv.org/abs/2406.06623},</span>
|
||||
<span id="cb23-9"><a href="#cb23-9" aria-hidden="true" tabindex="-1"></a>}</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>For reference, see the integration source <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/spectrum">here</a>.</p>
|
||||
</section>
|
||||
</section>
|
||||
@@ -956,10 +1109,10 @@ Warning
|
||||
</div>
|
||||
<div class="callout-body-container callout-body">
|
||||
<p>If your integration fails to load, make sure you installed the package in editable mode</p>
|
||||
<div class="sourceCode" id="cb18"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="ex">pip</span> install <span class="at">-e</span> .</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb24"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb24-1"><a href="#cb24-1" aria-hidden="true" tabindex="-1"></a><span class="ex">pip</span> install <span class="at">-e</span> .</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>and that the integration name is spelled correctly in the config file.</p>
|
||||
<div class="sourceCode" id="cb19"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb19-2"><a href="#cb19-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.your_integration_name.YourIntegrationPlugin</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb25"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb25-1"><a href="#cb25-1" aria-hidden="true" tabindex="-1"></a><span class="fu">plugins</span><span class="kw">:</span></span>
|
||||
<span id="cb25-2"><a href="#cb25-2" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> axolotl.integrations.your_integration_name.YourIntegrationPlugin</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="callout callout-style-default callout-note callout-titled">
|
||||
|
||||
@@ -593,8 +593,8 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="configure-for-supervised-fine-tuning-sft" class="level1">
|
||||
<h1>Configure for Supervised Fine-Tuning (SFT)</h1>
|
||||
<div id="cell-9" class="cell" data-quarto-private-1="{"key":"colab","value":{"base_uri":"https://localhost:8080/","height":151,"referenced_widgets":["388f618924274d21a066f098f4f1e744","7c95f85a2b1f47a1bd846d110c47bb3c","083f9cda8d754c168beee10d2f8955a2","62e1a65582f446a78612eaa804e08a7d","487a177d020f4605834878b2fdc7afa3","7fd44cf9ca6e4726bfd7ac21846d6a14","366a343b62fa47d8985a3bd464d99f9e","a0a11e929edd4189b79723d618522c33","e87ea87fcff247b5bbcc331ba79a8dc2","5e18768f7ad6434ba8b8b8a2e853e204","bb33aec33a6447078c31bfd728942994"]}}" data-outputid="f0acdcec-4b41-4a3f-ffed-c2d2d929158e">
|
||||
<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> axolotl.utils.<span class="bu">dict</span> <span class="im">import</span> DictDefault</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> axolotl.cli.config <span class="im">import</span> load_cfg</span>
|
||||
<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> axolotl.cli.config <span class="im">import</span> load_cfg</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a><span class="im">from</span> axolotl.utils.<span class="bu">dict</span> <span class="im">import</span> DictDefault</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a></span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="co"># Axolotl provides full control and transparency over model and training configuration</span></span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a>config <span class="op">=</span> DictDefault(</span>
|
||||
|
||||
25
search.json
File diff suppressed because one or more lines are too long
396
sitemap.xml
@@ -2,794 +2,794 @@
|
||||
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/examples/colab-notebooks/colab-axolotl-example.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.683Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.393Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/src/axolotl/integrations/LICENSE.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.699Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.410Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/FAQS.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.673Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.383Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/unsloth.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.390Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/reward_modelling.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/docker.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.676Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.386Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/batch_vs_grad.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/streaming.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.390Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/nccl.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/quantize.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/nd_parallelism.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/custom_integrations.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/debugging.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.676Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.386Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/gradient_checkpointing.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.676Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.386Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/multimodal.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/dataset_loading.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/integrations.spectrum.args.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.217Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.134Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/loaders.adapter.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.373Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.280Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.builders.base.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.015Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.920Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.merge_sharded_fsdp_weights.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.214Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.118Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.cloud.modal_.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.243Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.148Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.stablelm_attn_hijack_flash.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.786Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.696Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/integrations.kd.trainer.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.207Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.124Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.delinearize_llama4.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.180Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.084Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.evaluate.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.128Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.031Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.model.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.991Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.906Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.art.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.151Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.054Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.llama3.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.554Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.463Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.chat.format.llama3x.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.063Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.968Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.kto.chatml.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.585Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.494Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.args.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.147Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.051Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_cpu.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.811Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.721Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/evaluate.html</loc>
|
||||
<lastmod>2025-09-10T02:08:00.937Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.843Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.input_output.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.517Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.426Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.vllm_serve.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.234Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.138Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.quantize.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.227Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.131Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.utils.sweeps.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.273Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.178Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.builders.rl.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.024Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.929Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chatml.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.564Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.474Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.metharme.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.528Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.437Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.passthrough.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.569Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.478Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.datasets.chat.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.069Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.974Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.distributed.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.929Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.843Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/integrations.base.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.195Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.112Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/index.html</loc>
|
||||
<lastmod>2025-09-10T02:08:00.868Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.775Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.gradient_checkpointing.offload_disk.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.836Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.747Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.quantization.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.970Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.884Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.dict.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.934Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.849Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.dpo.trainer.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.323Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.231Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.config.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.984Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.899Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.orcamini.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.532Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.441Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.collators.mamba.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.260Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.177Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/datasets.html</loc>
|
||||
<lastmod>2025-09-10T02:08:00.944Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.849Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/convert.html</loc>
|
||||
<lastmod>2025-09-10T02:08:00.957Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.862Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.kto.llama3.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.577Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.486Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.tokenization.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.843Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.754Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.base.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.296Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.204Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.llama_expand_mask.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.742Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.651Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/loaders.model.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.357Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.265Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/logging_config.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.009Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.914Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.chat.format.shared.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.064Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.969Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/models.mamba.modeling_mamba.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.236Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.153Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.enums.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.055Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.970Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.callbacks.profiler.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.315Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.233Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.bench.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.858Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.769Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.samplers.multipack.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.305Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.223Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.mamba.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.317Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.224Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_chat.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.471Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.379Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.btlm_attn_hijack_flash.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.779Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.689Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.chat.messages.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.060Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.965Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.llama_patch_multipack.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.780Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.690Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.mixtral.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.808Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.718Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.mixins.scheduler.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.399Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.307Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/loaders.patch_manager.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.382Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.290Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.optimizers.adopt.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.942Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.856Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_xformers.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.733Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.643Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.callbacks.perplexity.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.311Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.229Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/loaders.tokenizer.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.366Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.274Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.pygmalion.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.538Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.448Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.lora.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.849Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.760Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.user_defined.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.493Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.401Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.mixins.rng_state_loader.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.393Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.300Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.user_defined.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.567Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.477Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/kernels.swiglu.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.718Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.627Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.multimodal.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.032Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.947Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.main.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.112Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.015Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.ctx_managers.sequence_parallel.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.423Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.331Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.multipack.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.736Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.646Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.chat.format.chatml.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.061Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.966Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.cloud.base.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.237Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.141Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.trl.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.311Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.219Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/mixed_precision.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/installation.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.678Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/mac.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/sequence_parallelism.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.390Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/faq.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.676Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.386Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/dataset-formats/tokenized.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/dataset-formats/stepwise_supervised.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/dataset-formats/pretraining.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/dataset-formats/template_free.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/dataset-formats/index.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/dataset-formats/conversation.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/dataset-formats/inst_tune.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/lr_groups.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/inference.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.678Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.388Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/lora_optims.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.678Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/multipack.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/amd_hpc.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.trainer_fsdp_optim.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.789Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.699Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.inference.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.193Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.098Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.collators.mm_chat.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.265Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.182Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.model_shard_quant.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.855Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.766Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.llama_attn_hijack_flash.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.732Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.641Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/integrations.grokfast.optimizer.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.200Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.117Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.training.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.998Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.913Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.integrations.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.044Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.959Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.utils.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.060Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.975Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.checks.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.157Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.061Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.callbacks.qat.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.330Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.248Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.mistral_attn_hijack_flash.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.735Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.644Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.data.sft.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.950Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.864Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.chat_templates.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.844Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.755Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.utils.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.777Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.688Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.utils.fetch.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.262Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.166Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.builders.causal.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.020Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.925Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.trl.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.027Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.942Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.data.streaming.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.943Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.858Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.unsloth_.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.797Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.707Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.base.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.424Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.332Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.merge_lora.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.202Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.106Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_instruct.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.473Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.381Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/loaders.constants.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.384Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.292Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.transformers_fa_utils.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.796Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.706Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.relora.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.740Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.649Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schedulers.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.910Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.824Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/kernels.lora.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.697Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.606Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.collators.batching.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.256Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.174Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.completion.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.511Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.420Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.grpo.trainer.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.334Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.242Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/common.architectures.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.219Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.136Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.utils.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.245Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.149Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.orpo.chat_template.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.606Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.516Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.training_args.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.037Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.942Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.datasets.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.015Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.930Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.chat_template.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.458Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.366Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.alpaca_w_system.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.485Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.393Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/integrations.cut_cross_entropy.args.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.199Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.116Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/common.const.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.220Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.137Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.lora_kernels.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.770Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.680Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/kernels.quantize.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.725Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.635Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.config.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.175Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.079Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.llama2_chat.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.505Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.414Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.zephyr.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.566Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.475Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/kernels.geglu.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.708Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.617Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.freeze.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.866Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.777Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.callbacks.mlflow_.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.320Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.238Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.grpo.sampler.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.346Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.254Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.utils.train.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.285Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.189Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.utils.load.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.267Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.172Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.collators.core.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.237Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.155Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/integrations.liger.args.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.211Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.128Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.preprocess.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.222Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.126Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/kernels.utils.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.727Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.636Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.callbacks.lisa.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.316Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.234Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/monkeypatch.data.batch_dataset_fetcher.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.806Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.716Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.mixins.optimizer.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.389Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.297Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/loaders.processor.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.367Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.275Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.trainers.utils.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.348Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.255Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.dpo.chat_template.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.544Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.453Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_tokenizers.html</loc>
|
||||
<lastmod>2025-09-10T02:08:00.999Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.904Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.callbacks.comet_.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.323Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.241Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/core.datasets.transforms.chat_builder.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.078Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.982Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.trainer.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.882Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.794Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.train.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.120Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.023Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/integrations.lm_eval.args.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.214Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.131Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.messages.chat.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.542Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.452Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/utils.schemas.peft.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.024Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.939Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.bradley_terry.llama3.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.610Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.520Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/cli.utils.args.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.256Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.161Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.stepwise_supervised.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.521Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.430Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/common.datasets.html</loc>
|
||||
<lastmod>2025-09-10T02:08:02.235Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:57.152Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/prompt_strategies.kto.user_defined.html</loc>
|
||||
<lastmod>2025-09-10T02:08:01.586Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:56.496Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/api/train.html</loc>
|
||||
<lastmod>2025-09-10T02:08:00.927Z</lastmod>
|
||||
<lastmod>2025-09-11T00:30:55.833Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/multi-node.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/input_output.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.678Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.388Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/ray-integration.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/getting-started.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.676Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.386Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/optimizers.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/multi-gpu.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/dataset_preprocessing.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.386Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/torchao.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.390Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/config-reference.html</loc>
|
||||
<lastmod>2025-09-10T02:08:16.952Z</lastmod>
|
||||
<lastmod>2025-09-11T00:31:11.763Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/rlhf.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/cli.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.675Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.385Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/fsdp_qlora.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.676Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.386Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/docs/qat.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.679Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.389Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/src/axolotl/integrations/cut_cross_entropy/ACKNOWLEDGEMENTS.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.700Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.410Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://docs.axolotl.ai/index.html</loc>
|
||||
<lastmod>2025-09-10T02:03:39.695Z</lastmod>
|
||||
<lastmod>2025-09-11T00:27:09.405Z</lastmod>
|
||||
</url>
|
||||
</urlset>
|
||||
|
||||