Built site for gh-pages

Quarto GHA Workflow Runner
2026-03-17 15:50:29 +00:00
parent 6eeb2c8370
commit 8d38a13bb4
10 changed files with 9300 additions and 7827 deletions

@@ -758,6 +758,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<ul class="collapse">
<li><a href="#classes" id="toc-classes" class="nav-link" data-scroll-target="#classes">Classes</a>
<ul class="collapse">
<li><a href="#axolotl.core.trainers.grpo.trainer.AxolotlAsyncGRPOTrainer" id="toc-axolotl.core.trainers.grpo.trainer.AxolotlAsyncGRPOTrainer" class="nav-link" data-scroll-target="#axolotl.core.trainers.grpo.trainer.AxolotlAsyncGRPOTrainer">AxolotlAsyncGRPOTrainer</a></li>
<li><a href="#axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer" id="toc-axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer" class="nav-link" data-scroll-target="#axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer">AxolotlGRPOSequenceParallelTrainer</a></li>
<li><a href="#axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer" id="toc-axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer" class="nav-link" data-scroll-target="#axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer">AxolotlGRPOTrainer</a></li>
</ul></li>
@@ -786,30 +787,39 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</thead>
<tbody>
<tr class="odd">
<td><a href="#axolotl.core.trainers.grpo.trainer.AxolotlAsyncGRPOTrainer">AxolotlAsyncGRPOTrainer</a></td>
<td>Extend AsyncGRPOTrainer with axolotl helpers</td>
</tr>
<tr class="even">
<td><a href="#axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer">AxolotlGRPOSequenceParallelTrainer</a></td>
<td>Extend the base GRPOTrainer for sequence parallelism handling</td>
</tr>
<tr class="even">
<tr class="odd">
<td><a href="#axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer">AxolotlGRPOTrainer</a></td>
<td>Extend the base GRPOTrainer for axolotl helpers</td>
</tr>
</tbody>
</table>
<section id="axolotl.core.trainers.grpo.trainer.AxolotlAsyncGRPOTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.trainer.AxolotlAsyncGRPOTrainer">AxolotlAsyncGRPOTrainer</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlAsyncGRPOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Extend AsyncGRPOTrainer with axolotl helpers</p>
</section>
<section id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer">AxolotlGRPOSequenceParallelTrainer</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> reward_funcs,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> args<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> train_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> processing_class<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> reward_processing_classes<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> callbacks<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> optimizers<span class="op">=</span>(<span class="va">None</span>, <span class="va">None</span>),</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> peft_config<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> optimizer_cls_and_kwargs<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> reward_funcs,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> args<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> train_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> processing_class<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> reward_processing_classes<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> callbacks<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> optimizers<span class="op">=</span>(<span class="va">None</span>, <span class="va">None</span>),</span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a> peft_config<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a> optimizer_cls_and_kwargs<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-13"><a href="#cb2-13" aria-hidden="true" tabindex="-1"></a>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Extend the base GRPOTrainer for sequence parallelism handling</p>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
@@ -829,15 +839,15 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer.get_train_dataloader" class="level5">
<h5 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer.get_train_dataloader">get_train_dataloader</h5>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer.get_train_dataloader(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer.get_train_dataloader(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Get dataloader for training</p>
</section>
</section>
</section>
<section id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer">AxolotlGRPOTrainer</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Extend the base GRPOTrainer for axolotl helpers</p>

@@ -1144,7 +1144,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</tr>
<tr class="even">
<td><a href="../../docs/api/kernels.quantize.html#axolotl.kernels.quantize">kernels.quantize</a></td>
<td>Dequantization utilities for <code>bitsandbytes</code> integration.</td>
<td>Dequantization utilities for <code>bitsandbytes</code> and FP8 integration.</td>
</tr>
<tr class="odd">
<td><a href="../../docs/api/kernels.utils.html#axolotl.kernels.utils">kernels.utils</a></td>

@@ -1915,9 +1915,9 @@ supporting quantization and memory optimization.</p>
<h4 class="doc-section doc-section-returns anchored" data-anchor-id="returns-10">Returns</h4>
<table class="caption-top table">
<colgroup>
<col style="width: 7%">
<col style="width: 20%">
<col style="width: 72%">
<col style="width: 6%">
<col style="width: 29%">
<col style="width: 64%">
</colgroup>
<thead>
<tr class="header">
@@ -1939,7 +1939,7 @@ supporting quantization and memory optimization.</p>
</tr>
<tr class="odd">
<td></td>
<td>QuantState | None</td>
<td>QuantState | torch.Tensor | None</td>
<td><code>None</code> if not available.</td>
</tr>
</tbody>
@@ -1954,10 +1954,10 @@ supporting quantization and memory optimization.</p>
<h4 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters-11">Parameters</h4>
<table class="caption-top table">
<colgroup>
<col style="width: 9%">
<col style="width: 24%">
<col style="width: 52%">
<col style="width: 13%">
<col style="width: 8%">
<col style="width: 34%">
<col style="width: 45%">
<col style="width: 11%">
</colgroup>
<thead>
<tr class="header">
@@ -1982,7 +1982,7 @@ supporting quantization and memory optimization.</p>
</tr>
<tr class="odd">
<td>W_quant</td>
<td>QuantState | None</td>
<td>QuantState | torch.Tensor | None</td>
<td>Quantization state for W</td>
<td><em>required</em></td>
</tr>

@@ -759,6 +759,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<li><a href="#functions" id="toc-functions" class="nav-link" data-scroll-target="#functions">Functions</a>
<ul class="collapse">
<li><a href="#axolotl.kernels.quantize.dequantize" id="toc-axolotl.kernels.quantize.dequantize" class="nav-link" data-scroll-target="#axolotl.kernels.quantize.dequantize">dequantize</a></li>
<li><a href="#axolotl.kernels.quantize.dequantize_fp8" id="toc-axolotl.kernels.quantize.dequantize_fp8" class="nav-link" data-scroll-target="#axolotl.kernels.quantize.dequantize_fp8">dequantize_fp8</a></li>
</ul></li>
</ul></li>
</ul>
@@ -773,7 +774,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.kernels.quantize" class="level1">
<h1>kernels.quantize</h1>
<p><code>kernels.quantize</code></p>
<p>Dequantization utilities for <code>bitsandbytes</code> integration.</p>
<p>Dequantization utilities for <code>bitsandbytes</code> and FP8 integration.</p>
<section id="functions" class="level2">
<h2 class="anchored" data-anchor-id="functions">Functions</h2>
<table class="caption-top table">
@@ -788,6 +789,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<td><a href="#axolotl.kernels.quantize.dequantize">dequantize</a></td>
<td>Fast NF4 dequantization using <code>bitsandbytes</code> CUDA kernels.</td>
</tr>
<tr class="even">
<td><a href="#axolotl.kernels.quantize.dequantize_fp8">dequantize_fp8</a></td>
<td>Dequantize FP8 block-quantized weights: W_dequant = W_fp8 * scale_inv.</td>
</tr>
</tbody>
</table>
<section id="axolotl.kernels.quantize.dequantize" class="level3">
@@ -801,9 +806,9 @@ formats.</p>
<h4 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters">Parameters</h4>
<table class="caption-top table">
<colgroup>
<col style="width: 6%">
<col style="width: 13%">
<col style="width: 74%">
<col style="width: 5%">
<col style="width: 19%">
<col style="width: 69%">
<col style="width: 5%">
</colgroup>
<thead>
@@ -823,7 +828,7 @@ formats.</p>
</tr>
<tr class="even">
<td>quant_state</td>
<td>QuantState | list | None</td>
<td>QuantState | list | torch.Tensor | None</td>
<td>Quantization state containing metadata needed for dequantization. Can be either a <code>QuantState</code> object or legacy list format. If None, returns <code>W</code> unchanged.</td>
<td><code>None</code></td>
</tr>
@@ -893,6 +898,69 @@ formats.</p>
<h4 class="doc-section doc-section-note anchored" data-anchor-id="note">Note</h4>
<p>Uses CUDA streams for better performance when available in newer <code>bitsandbytes</code>
versions (&gt;0.43.3).</p>
</section>
</section>
<section id="axolotl.kernels.quantize.dequantize_fp8" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.kernels.quantize.dequantize_fp8">dequantize_fp8</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>kernels.quantize.dequantize_fp8(W, scale_inv, dtype<span class="op">=</span>torch.bfloat16)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Dequantize FP8 block-quantized weights: W_dequant = W_fp8 * scale_inv.</p>
<section id="parameters-1" class="level4 doc-section doc-section-parameters">
<h4 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters-1">Parameters</h4>
<table class="caption-top table">
<colgroup>
<col style="width: 8%">
<col style="width: 11%">
<col style="width: 65%">
<col style="width: 14%">
</colgroup>
<thead>
<tr class="header">
<th>Name</th>
<th>Type</th>
<th>Description</th>
<th>Default</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>W</td>
<td>torch.Tensor</td>
<td>FP8 weight tensor [out_features, in_features] in float8_e4m3fn.</td>
<td><em>required</em></td>
</tr>
<tr class="even">
<td>scale_inv</td>
<td>torch.Tensor</td>
<td>Per-block inverse scale [ceil(out/block), ceil(in/block)] or per-tensor scalar.</td>
<td><em>required</em></td>
</tr>
<tr class="odd">
<td>dtype</td>
<td>torch.dtype</td>
<td>Output dtype (default bf16).</td>
<td><code>torch.bfloat16</code></td>
</tr>
</tbody>
</table>
</section>
<section id="returns-1" class="level4 doc-section doc-section-returns">
<h4 class="doc-section doc-section-returns anchored" data-anchor-id="returns-1">Returns</h4>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Name</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td></td>
<td>torch.Tensor</td>
<td>Dequantized tensor in the specified dtype.</td>
</tr>
</tbody>
</table>
</section>
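<p>The block-wise formula above, W_dequant = W_fp8 * scale_inv, can be illustrated with a minimal sketch. It is not the library's implementation: it uses plain Python lists of floats in place of <code>float8_e4m3fn</code> tensors, and the helper name <code>dequantize_blockwise</code> and its explicit <code>block_size</code> argument are hypothetical, introduced only to show how each weight is matched to the inverse scale of the block it falls in.</p>

```python
def dequantize_blockwise(w, scale_inv, block_size):
    """Scale every element of w by the inverse scale of its block.

    w:          list of rows, shape [out_features, in_features], with the
                FP8 values already decoded to ordinary floats (assumption).
    scale_inv:  list of rows, shape [ceil(out/block), ceil(in/block)].
    block_size: (block_rows, block_cols); a hypothetical explicit argument
                used here for illustration only.
    """
    block_rows, block_cols = block_size
    return [
        [
            # Integer division maps element (i, j) to its block index.
            value * scale_inv[i // block_rows][j // block_cols]
            for j, value in enumerate(row)
        ]
        for i, row in enumerate(w)
    ]


# A 3x3 weight with 2x2 blocks needs ceil(3/2) = 2 scales per dimension.
w = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
scale_inv = [
    [0.5, 2.0],
    [1.0, 4.0],
]

result = dequantize_blockwise(w, scale_inv, block_size=(2, 2))
for row in result:
    print(row)
```

<p>The last row and column belong to partial blocks, which is why the scale grid is sized with a ceiling division. A per-tensor scalar is the degenerate case in which every element shares a single inverse scale.</p>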