Built site for gh-pages

Quarto GHA Workflow Runner
2025-08-08 01:27:51 +00:00
parent 06f481c809
commit e5ae08a364
8 changed files with 1311 additions and 1294 deletions

@@ -710,7 +710,8 @@ from the full gradient tensor.</p>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> ring_attn_func,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> heads_k_stride,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> gather_outputs,</span>
-<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> device_mesh<span class="op">=</span><span class="va">None</span>,</span>
+<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Context manager for sequence parallelism operations.</p>
<p>This class provides a context that will automatically apply sequence parallelism
during model forward passes using a pre-forward hook, and gather outputs from