Built site for gh-pages

Quarto GHA Workflow Runner
2026-02-25 04:38:55 +00:00
parent aaf47dc7ec
commit dafe30369e
6 changed files with 1024 additions and 1004 deletions

@@ -1 +1 @@
b8caf314
0c5f4db8

@@ -800,7 +800,8 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> tokenizer_path,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> token_mappings,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> output_dir,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> revision<span class="op">=</span><span class="st">'main'</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Modify tokenizer files to replace added_tokens strings, save to output directory,
and return the path to the modified tokenizer.</p>
<p>This only works with reserved tokens that were added to the tokenizer, not tokens
@@ -809,10 +810,10 @@ already part of the vocab.</p>
<h4 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters">Parameters</h4>
<table class="caption-top table">
<colgroup>
<col style="width: 16%">
<col style="width: 18%">
<col style="width: 51%">
<col style="width: 12%">
<col style="width: 15%">
<col style="width: 17%">
<col style="width: 54%">
<col style="width: 11%">
</colgroup>
<thead>
<tr class="header">
@@ -841,6 +842,12 @@ already part of the vocab.</p>
<td>Directory to save the modified tokenizer</td>
<td><em>required</em></td>
</tr>
<tr class="even">
<td>revision</td>
<td>str</td>
<td>Model revision/branch/tag/commit to load from (HF Hub)</td>
<td><code>'main'</code></td>
</tr>
</tbody>
</table>
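<p>As a rough illustration of the replacement step described above, here is a minimal sketch that operates on an in-memory dictionary shaped like a Hugging Face <code>tokenizer.json</code>. The helper name <code>replace_added_token_strings</code>, the choice to key <code>token_mappings</code> by token id, and the <code>KeyError</code> for ids outside <code>added_tokens</code> are all assumptions made for this example, not the documented behaviour. Loading by <code>revision</code> and saving to <code>output_dir</code> are left out.</p>

```python
import copy


def replace_added_token_strings(tokenizer_json, token_mappings):
    """Return a copy of tokenizer_json with added-token strings replaced.

    Hypothetical helper: token_mappings is assumed to map token id -> new string.
    """
    updated = copy.deepcopy(tokenizer_json)  # leave the caller's dict untouched
    added_by_id = {entry["id"]: entry for entry in updated.get("added_tokens", [])}
    for token_id, new_content in token_mappings.items():
        if token_id not in added_by_id:
            # Assumption: only tokens listed under added_tokens can be renamed.
            raise KeyError(f"token id {token_id} is not an added token")
        added_by_id[token_id]["content"] = new_content
    return updated


# Example data, invented for illustration.
example_tokenizer_json = {
    "added_tokens": [
        {"id": 128002, "content": "<|reserved_special_token_0|>"},
        {"id": 128003, "content": "<|reserved_special_token_1|>"},
    ]
}
updated = replace_added_token_strings(example_tokenizer_json, {128002: "<|tool_call|>"})
```

<p>Working on a deep copy keeps the original data intact, which mirrors the documented behaviour of writing the modified tokenizer to a separate output directory rather than editing it in place.</p>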
</section>

@@ -754,7 +754,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<ul class="collapse">
<li><a href="#axolotl.utils.trainer.add_pose_position_ids" id="toc-axolotl.utils.trainer.add_pose_position_ids" class="nav-link" data-scroll-target="#axolotl.utils.trainer.add_pose_position_ids">add_pose_position_ids</a></li>
<li><a href="#axolotl.utils.trainer.add_position_ids" id="toc-axolotl.utils.trainer.add_position_ids" class="nav-link" data-scroll-target="#axolotl.utils.trainer.add_position_ids">add_position_ids</a></li>
<li><a href="#axolotl.utils.trainer.drop_long_seq" id="toc-axolotl.utils.trainer.drop_long_seq" class="nav-link" data-scroll-target="#axolotl.utils.trainer.drop_long_seq">drop_long_seq</a></li>
<li><a href="#axolotl.utils.trainer.filter_sequences_by_length" id="toc-axolotl.utils.trainer.filter_sequences_by_length" class="nav-link" data-scroll-target="#axolotl.utils.trainer.filter_sequences_by_length">filter_sequences_by_length</a></li>
<li><a href="#axolotl.utils.trainer.setup_trainer" id="toc-axolotl.utils.trainer.setup_trainer" class="nav-link" data-scroll-target="#axolotl.utils.trainer.setup_trainer">setup_trainer</a></li>
</ul></li>
</ul></li>
@@ -790,8 +790,8 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<td>Handle both single-example and batched data.</td>
</tr>
<tr class="odd">
<td><a href="#axolotl.utils.trainer.drop_long_seq">drop_long_seq</a></td>
<td>Drop samples whose sequence length is either too long (&gt; sequence_len)</td>
<td><a href="#axolotl.utils.trainer.filter_sequences_by_length">filter_sequences_by_length</a></td>
<td>Filter sequences outside valid length range [min_sequence_len, sequence_len].</td>
</tr>
<tr class="even">
<td><a href="#axolotl.utils.trainer.setup_trainer">setup_trainer</a></td>
@@ -822,16 +822,16 @@ remaining in each sample.</p>
- single example: sample[input_ids] is a list[int]
- batched data: sample[input_ids] is a list[list[int]]</p>
</section>
<section id="axolotl.utils.trainer.drop_long_seq" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.trainer.drop_long_seq">drop_long_seq</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>utils.trainer.drop_long_seq(</span>
<section id="axolotl.utils.trainer.filter_sequences_by_length" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.trainer.filter_sequences_by_length">filter_sequences_by_length</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>utils.trainer.filter_sequences_by_length(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> sample,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> min_sequence_len<span class="op">=</span><span class="dv">2</span>,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> raise_on_drop<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a>)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Drop samples whose sequence length is either too long (&gt; sequence_len)
or too short (&lt; min_sequence_len).</p>
<p>Filter sequences outside valid length range [min_sequence_len, sequence_len].</p>
<p>Drops samples that are either too short (&lt; min_sequence_len) or too long (&gt; sequence_len).</p>
<p>Works for both single-example (list[int]) and batched (list[list[int]]) input.</p>
<p>If raise_on_drop is set, a ValueError is raised when a sample is
encountered that is too long and would otherwise have been dropped.</p>
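<p>The length check described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's code: the name <code>keep_by_length</code> is hypothetical, and returning a <code>bool</code> for a single example or a <code>list[bool]</code> for a batch is an assumption about the return shape. The parameter names and defaults follow the signature shown above.</p>

```python
def keep_by_length(sample, sequence_len=2048, min_sequence_len=2, raise_on_drop=False):
    """Hypothetical sketch: True means the sequence is within the valid range."""
    input_ids = sample["input_ids"]

    def _keep(ids):
        length = len(ids)
        if length > sequence_len:
            if raise_on_drop:
                # Only the too-long case raises, as the description above states.
                raise ValueError(
                    f"sequence of length {length} exceeds sequence_len={sequence_len}"
                )
            return False
        return length >= min_sequence_len

    # Batched data: input_ids is a list of token-id lists.
    if input_ids and isinstance(input_ids[0], list):
        return [_keep(ids) for ids in input_ids]
    # Single example: input_ids is a flat list of token ids.
    return _keep(input_ids)


single = keep_by_length({"input_ids": [1, 2, 3]}, sequence_len=4)
batched = keep_by_length({"input_ids": [[1, 2, 3], [1], [1, 2, 3, 4, 5]]}, sequence_len=4)
```

<p>A predicate of this shape is the kind of function that can be passed to a dataset's filter step, with or without batching.</p>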

File diff suppressed because it is too large

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large