Built site for gh-pages

Quarto GHA Workflow Runner
2025-12-25 11:03:05 +00:00
parent 3411187898
commit 5339a73a2c
9 changed files with 356 additions and 261 deletions

@@ -1 +1 @@
ef64af50
7ef13619

@@ -588,9 +588,17 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<td>Apply post-plugin, pre-model-load patches based on config.</td>
</tr>
<tr class="odd">
<td><a href="#axolotl.loaders.patch_manager.PatchManager.apply_pre_config_load_patches">apply_pre_config_load_patches</a></td>
<td>Apply patches that must be set up before config loading.</td>
</tr>
<tr class="even">
<td><a href="#axolotl.loaders.patch_manager.PatchManager.apply_pre_model_load_patches">apply_pre_model_load_patches</a></td>
<td>Apply pre-model load patches based on config.</td>
</tr>
<tr class="odd">
<td><a href="#axolotl.loaders.patch_manager.PatchManager.apply_pre_tokenizer_load_patches">apply_pre_tokenizer_load_patches</a></td>
<td>Apply patches that must be set up before tokenizer loading.</td>
</tr>
</tbody>
</table>
<section id="axolotl.loaders.patch_manager.PatchManager.apply_post_model_load_patches" class="level5">
@@ -603,10 +611,77 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>loaders.patch_manager.PatchManager.apply_post_plugin_pre_model_load_patches()</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Apply post-plugin, pre-model-load patches based on config.</p>
</section>
<section id="axolotl.loaders.patch_manager.PatchManager.apply_pre_config_load_patches" class="level5">
<h5 class="anchored" data-anchor-id="axolotl.loaders.patch_manager.PatchManager.apply_pre_config_load_patches">apply_pre_config_load_patches</h5>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>loaders.patch_manager.PatchManager.apply_pre_config_load_patches(cfg)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Apply patches that must be set up before config loading.
These patches intercept remote code loading from Hugging Face,
so they need to be in place before AutoConfig.from_pretrained() is called.</p>
<section id="parameters" class="level6 doc-section doc-section-parameters">
<h6 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters">Parameters</h6>
<table class="caption-top table">
<colgroup>
<col style="width: 8%">
<col style="width: 13%">
<col style="width: 64%">
<col style="width: 12%">
</colgroup>
<thead>
<tr class="header">
<th>Name</th>
<th>Type</th>
<th>Description</th>
<th>Default</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>cfg</td>
<td>DictDefault</td>
<td>Configuration dictionary with model and training settings.</td>
<td><em>required</em></td>
</tr>
</tbody>
</table>
</section>
</section>
<section id="axolotl.loaders.patch_manager.PatchManager.apply_pre_model_load_patches" class="level5">
<h5 class="anchored" data-anchor-id="axolotl.loaders.patch_manager.PatchManager.apply_pre_model_load_patches">apply_pre_model_load_patches</h5>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>loaders.patch_manager.PatchManager.apply_pre_model_load_patches()</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>loaders.patch_manager.PatchManager.apply_pre_model_load_patches()</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Apply pre-model load patches based on config.</p>
</section>
<section id="axolotl.loaders.patch_manager.PatchManager.apply_pre_tokenizer_load_patches" class="level5">
<h5 class="anchored" data-anchor-id="axolotl.loaders.patch_manager.PatchManager.apply_pre_tokenizer_load_patches">apply_pre_tokenizer_load_patches</h5>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>loaders.patch_manager.PatchManager.apply_pre_tokenizer_load_patches(cfg)</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Apply patches that must be set up before tokenizer loading.
These patches intercept remote code loading from Hugging Face,
so they need to be in place before AutoTokenizer.from_pretrained() is called.</p>
<section id="parameters-1" class="level6 doc-section doc-section-parameters">
<h6 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters-1">Parameters</h6>
<table class="caption-top table">
<colgroup>
<col style="width: 8%">
<col style="width: 13%">
<col style="width: 64%">
<col style="width: 12%">
</colgroup>
<thead>
<tr class="header">
<th>Name</th>
<th>Type</th>
<th>Description</th>
<th>Default</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>cfg</td>
<td>DictDefault</td>
<td>Configuration dictionary with model and training settings.</td>
<td><em>required</em></td>
</tr>
</tbody>
</table>
</section>
@@ -614,6 +689,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</section>
</section>
</section>
</section>
</main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">

@@ -521,6 +521,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<li><a href="#axolotl.utils.schemas.integrations.MLFlowConfig" id="toc-axolotl.utils.schemas.integrations.MLFlowConfig" class="nav-link" data-scroll-target="#axolotl.utils.schemas.integrations.MLFlowConfig">MLFlowConfig</a></li>
<li><a href="#axolotl.utils.schemas.integrations.OpenTelemetryConfig" id="toc-axolotl.utils.schemas.integrations.OpenTelemetryConfig" class="nav-link" data-scroll-target="#axolotl.utils.schemas.integrations.OpenTelemetryConfig">OpenTelemetryConfig</a></li>
<li><a href="#axolotl.utils.schemas.integrations.RayConfig" id="toc-axolotl.utils.schemas.integrations.RayConfig" class="nav-link" data-scroll-target="#axolotl.utils.schemas.integrations.RayConfig">RayConfig</a></li>
<li><a href="#axolotl.utils.schemas.integrations.TrackioConfig" id="toc-axolotl.utils.schemas.integrations.TrackioConfig" class="nav-link" data-scroll-target="#axolotl.utils.schemas.integrations.TrackioConfig">TrackioConfig</a></li>
<li><a href="#axolotl.utils.schemas.integrations.WandbConfig" id="toc-axolotl.utils.schemas.integrations.WandbConfig" class="nav-link" data-scroll-target="#axolotl.utils.schemas.integrations.WandbConfig">WandbConfig</a></li>
</ul></li>
</ul></li>
@@ -572,6 +573,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<td>Ray launcher configuration subset</td>
</tr>
<tr class="odd">
<td><a href="#axolotl.utils.schemas.integrations.TrackioConfig">TrackioConfig</a></td>
<td>Trackio configuration subset</td>
</tr>
<tr class="even">
<td><a href="#axolotl.utils.schemas.integrations.WandbConfig">WandbConfig</a></td>
<td>Wandb configuration subset</td>
</tr>
@@ -607,9 +612,14 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>utils.schemas.integrations.RayConfig()</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Ray launcher configuration subset</p>
</section>
<section id="axolotl.utils.schemas.integrations.TrackioConfig" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.schemas.integrations.TrackioConfig">TrackioConfig</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>utils.schemas.integrations.TrackioConfig()</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Trackio configuration subset</p>
</section>
<section id="axolotl.utils.schemas.integrations.WandbConfig" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.schemas.integrations.WandbConfig">WandbConfig</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>utils.schemas.integrations.WandbConfig()</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>utils.schemas.integrations.WandbConfig()</span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<p>Wandb configuration subset</p>

@@ -1898,55 +1898,63 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<span id="cb1-1359"><a href="#cb1-1359" aria-hidden="true" tabindex="-1"></a><span class="co"># Dictionary for additional configuration settings, see the doc for more details.</span></span>
<span id="cb1-1360"><a href="#cb1-1360" aria-hidden="true" tabindex="-1"></a><span class="fu">comet_experiment_config</span><span class="kw">:</span><span class="at"> dict[str, Any] | None</span></span>
<span id="cb1-1361"><a href="#cb1-1361" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1362"><a href="#cb1-1362" aria-hidden="true" tabindex="-1"></a><span class="co"># Enable OpenTelemetry metrics collection and Prometheus export</span></span>
<span id="cb1-1363"><a href="#cb1-1363" aria-hidden="true" tabindex="-1"></a><span class="fu">use_otel_metrics</span><span class="kw">:</span><span class="at"> bool | None = False</span></span>
<span id="cb1-1364"><a href="#cb1-1364" aria-hidden="true" tabindex="-1"></a><span class="co"># Host to bind the OpenTelemetry metrics server to</span></span>
<span id="cb1-1365"><a href="#cb1-1365" aria-hidden="true" tabindex="-1"></a><span class="fu">otel_metrics_host</span><span class="kw">:</span><span class="at"> str | None = localhost</span></span>
<span id="cb1-1366"><a href="#cb1-1366" aria-hidden="true" tabindex="-1"></a><span class="co"># Port for the Prometheus metrics HTTP server</span></span>
<span id="cb1-1367"><a href="#cb1-1367" aria-hidden="true" tabindex="-1"></a><span class="fu">otel_metrics_port</span><span class="kw">:</span><span class="at"> int | None = 8000</span></span>
<span id="cb1-1368"><a href="#cb1-1368" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1369"><a href="#cb1-1369" aria-hidden="true" tabindex="-1"></a><span class="co"># the number of active layers in LISA</span></span>
<span id="cb1-1370"><a href="#cb1-1370" aria-hidden="true" tabindex="-1"></a><span class="fu">lisa_n_layers</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1371"><a href="#cb1-1371" aria-hidden="true" tabindex="-1"></a><span class="co"># how often to switch layers in LISA</span></span>
<span id="cb1-1372"><a href="#cb1-1372" aria-hidden="true" tabindex="-1"></a><span class="fu">lisa_step_interval</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1373"><a href="#cb1-1373" aria-hidden="true" tabindex="-1"></a><span class="co"># path under the model to access the layers</span></span>
<span id="cb1-1374"><a href="#cb1-1374" aria-hidden="true" tabindex="-1"></a><span class="fu">lisa_layers_attribute</span><span class="kw">:</span><span class="at"> str | None = model.layers</span></span>
<span id="cb1-1375"><a href="#cb1-1375" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1376"><a href="#cb1-1376" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_title</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1377"><a href="#cb1-1377" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_share</span><span class="kw">:</span><span class="at"> bool | None</span></span>
<span id="cb1-1378"><a href="#cb1-1378" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_server_name</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1379"><a href="#cb1-1379" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_server_port</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1380"><a href="#cb1-1380" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_max_new_tokens</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1381"><a href="#cb1-1381" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_temperature</span><span class="kw">:</span><span class="at"> float | None</span></span>
<span id="cb1-1382"><a href="#cb1-1382" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1383"><a href="#cb1-1383" aria-hidden="true" tabindex="-1"></a><span class="fu">use_ray</span><span class="kw">:</span><span class="at"> bool = False</span></span>
<span id="cb1-1384"><a href="#cb1-1384" aria-hidden="true" tabindex="-1"></a><span class="fu">ray_run_name</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1385"><a href="#cb1-1385" aria-hidden="true" tabindex="-1"></a><span class="fu">ray_num_workers</span><span class="kw">:</span><span class="at"> int = 1</span></span>
<span id="cb1-1386"><a href="#cb1-1386" aria-hidden="true" tabindex="-1"></a><span class="fu">resources_per_worker</span><span class="kw">:</span><span class="at"> dict</span></span>
<span id="cb1-1387"><a href="#cb1-1387" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1388"><a href="#cb1-1388" aria-hidden="true" tabindex="-1"></a><span class="co"># The size of the image to resize to. It can be an integer (resized into padded-square</span></span>
<span id="cb1-1389"><a href="#cb1-1389" aria-hidden="true" tabindex="-1"></a><span class="co"># image) or a tuple (width, height). If not provided, we will attempt to load from</span></span>
<span id="cb1-1390"><a href="#cb1-1390" aria-hidden="true" tabindex="-1"></a><span class="co"># preprocessor.size, otherwise, images won't be resized.</span></span>
<span id="cb1-1391"><a href="#cb1-1391" aria-hidden="true" tabindex="-1"></a><span class="fu">image_size</span><span class="kw">:</span><span class="at"> int | tuple[int, int] | None</span></span>
<span id="cb1-1392"><a href="#cb1-1392" aria-hidden="true" tabindex="-1"></a><span class="co"># The resampling algorithm to use for image resizing. Default is bilinear. Please refer</span></span>
<span id="cb1-1393"><a href="#cb1-1393" aria-hidden="true" tabindex="-1"></a><span class="co"># to PIL.Image.Resampling for more details.</span></span>
<span id="cb1-1394"><a href="#cb1-1394" aria-hidden="true" tabindex="-1"></a><span class="fu">image_resize_algorithm</span><span class="kw">:</span><span class="at"> Literal['bilinear', 'bicubic', 'lanczos'] | Resampling | None</span></span>
<span id="cb1-1362"><a href="#cb1-1362" aria-hidden="true" tabindex="-1"></a><span class="fu">use_trackio</span><span class="kw">:</span><span class="at"> bool | None</span></span>
<span id="cb1-1363"><a href="#cb1-1363" aria-hidden="true" tabindex="-1"></a><span class="co"># Your trackio project name</span></span>
<span id="cb1-1364"><a href="#cb1-1364" aria-hidden="true" tabindex="-1"></a><span class="fu">trackio_project_name</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1365"><a href="#cb1-1365" aria-hidden="true" tabindex="-1"></a><span class="co"># Set the name of your trackio run</span></span>
<span id="cb1-1366"><a href="#cb1-1366" aria-hidden="true" tabindex="-1"></a><span class="fu">trackio_run_name</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1367"><a href="#cb1-1367" aria-hidden="true" tabindex="-1"></a><span class="co"># Hugging Face Space ID to sync dashboard to (optional, runs locally if not provided)</span></span>
<span id="cb1-1368"><a href="#cb1-1368" aria-hidden="true" tabindex="-1"></a><span class="fu">trackio_space_id</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1369"><a href="#cb1-1369" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1370"><a href="#cb1-1370" aria-hidden="true" tabindex="-1"></a><span class="co"># Enable OpenTelemetry metrics collection and Prometheus export</span></span>
<span id="cb1-1371"><a href="#cb1-1371" aria-hidden="true" tabindex="-1"></a><span class="fu">use_otel_metrics</span><span class="kw">:</span><span class="at"> bool | None = False</span></span>
<span id="cb1-1372"><a href="#cb1-1372" aria-hidden="true" tabindex="-1"></a><span class="co"># Host to bind the OpenTelemetry metrics server to</span></span>
<span id="cb1-1373"><a href="#cb1-1373" aria-hidden="true" tabindex="-1"></a><span class="fu">otel_metrics_host</span><span class="kw">:</span><span class="at"> str | None = localhost</span></span>
<span id="cb1-1374"><a href="#cb1-1374" aria-hidden="true" tabindex="-1"></a><span class="co"># Port for the Prometheus metrics HTTP server</span></span>
<span id="cb1-1375"><a href="#cb1-1375" aria-hidden="true" tabindex="-1"></a><span class="fu">otel_metrics_port</span><span class="kw">:</span><span class="at"> int | None = 8000</span></span>
<span id="cb1-1376"><a href="#cb1-1376" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1377"><a href="#cb1-1377" aria-hidden="true" tabindex="-1"></a><span class="co"># the number of active layers in LISA</span></span>
<span id="cb1-1378"><a href="#cb1-1378" aria-hidden="true" tabindex="-1"></a><span class="fu">lisa_n_layers</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1379"><a href="#cb1-1379" aria-hidden="true" tabindex="-1"></a><span class="co"># how often to switch layers in LISA</span></span>
<span id="cb1-1380"><a href="#cb1-1380" aria-hidden="true" tabindex="-1"></a><span class="fu">lisa_step_interval</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1381"><a href="#cb1-1381" aria-hidden="true" tabindex="-1"></a><span class="co"># path under the model to access the layers</span></span>
<span id="cb1-1382"><a href="#cb1-1382" aria-hidden="true" tabindex="-1"></a><span class="fu">lisa_layers_attribute</span><span class="kw">:</span><span class="at"> str | None = model.layers</span></span>
<span id="cb1-1383"><a href="#cb1-1383" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1384"><a href="#cb1-1384" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_title</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1385"><a href="#cb1-1385" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_share</span><span class="kw">:</span><span class="at"> bool | None</span></span>
<span id="cb1-1386"><a href="#cb1-1386" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_server_name</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1387"><a href="#cb1-1387" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_server_port</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1388"><a href="#cb1-1388" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_max_new_tokens</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1389"><a href="#cb1-1389" aria-hidden="true" tabindex="-1"></a><span class="fu">gradio_temperature</span><span class="kw">:</span><span class="at"> float | None</span></span>
<span id="cb1-1390"><a href="#cb1-1390" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1391"><a href="#cb1-1391" aria-hidden="true" tabindex="-1"></a><span class="fu">use_ray</span><span class="kw">:</span><span class="at"> bool = False</span></span>
<span id="cb1-1392"><a href="#cb1-1392" aria-hidden="true" tabindex="-1"></a><span class="fu">ray_run_name</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1393"><a href="#cb1-1393" aria-hidden="true" tabindex="-1"></a><span class="fu">ray_num_workers</span><span class="kw">:</span><span class="at"> int = 1</span></span>
<span id="cb1-1394"><a href="#cb1-1394" aria-hidden="true" tabindex="-1"></a><span class="fu">resources_per_worker</span><span class="kw">:</span><span class="at"> dict</span></span>
<span id="cb1-1395"><a href="#cb1-1395" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1396"><a href="#cb1-1396" aria-hidden="true" tabindex="-1"></a><span class="co"># optional overrides to the base model configuration</span></span>
<span id="cb1-1397"><a href="#cb1-1397" aria-hidden="true" tabindex="-1"></a><span class="fu">overrides_of_model_config</span><span class="kw">:</span><span class="at"> dict[str, Any] | None</span></span>
<span id="cb1-1398"><a href="#cb1-1398" aria-hidden="true" tabindex="-1"></a><span class="co"># optional overrides to the kwargs passed to from_pretrained when loading the base model</span></span>
<span id="cb1-1399"><a href="#cb1-1399" aria-hidden="true" tabindex="-1"></a><span class="fu">overrides_of_model_kwargs</span><span class="kw">:</span><span class="at"> dict[str, Any] | None</span></span>
<span id="cb1-1400"><a href="#cb1-1400" aria-hidden="true" tabindex="-1"></a><span class="co"># If you want to specify the type of model to load, AutoModelForCausalLM is a good</span></span>
<span id="cb1-1401"><a href="#cb1-1401" aria-hidden="true" tabindex="-1"></a><span class="co"># choice too</span></span>
<span id="cb1-1402"><a href="#cb1-1402" aria-hidden="true" tabindex="-1"></a><span class="fu">type_of_model</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1403"><a href="#cb1-1403" aria-hidden="true" tabindex="-1"></a><span class="co"># You can choose a specific model revision from the Hugging Face Hub</span></span>
<span id="cb1-1404"><a href="#cb1-1404" aria-hidden="true" tabindex="-1"></a><span class="fu">revision_of_model</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1405"><a href="#cb1-1405" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1406"><a href="#cb1-1406" aria-hidden="true" tabindex="-1"></a><span class="fu">max_packed_sequence_len</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1407"><a href="#cb1-1407" aria-hidden="true" tabindex="-1"></a><span class="fu">rope_scaling</span><span class="kw">:</span><span class="at"> Any | None</span></span>
<span id="cb1-1408"><a href="#cb1-1408" aria-hidden="true" tabindex="-1"></a><span class="fu">noisy_embedding_alpha</span><span class="kw">:</span><span class="at"> float | None</span></span>
<span id="cb1-1409"><a href="#cb1-1409" aria-hidden="true" tabindex="-1"></a><span class="fu">dpo_beta</span><span class="kw">:</span><span class="at"> float | None</span></span>
<span id="cb1-1410"><a href="#cb1-1410" aria-hidden="true" tabindex="-1"></a><span class="fu">evaluation_strategy</span><span class="kw">:</span><span class="at"> str | None</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<span id="cb1-1396"><a href="#cb1-1396" aria-hidden="true" tabindex="-1"></a><span class="co"># The size of the image to resize to. It can be an integer (resized into padded-square</span></span>
<span id="cb1-1397"><a href="#cb1-1397" aria-hidden="true" tabindex="-1"></a><span class="co"># image) or a tuple (width, height). If not provided, we will attempt to load from</span></span>
<span id="cb1-1398"><a href="#cb1-1398" aria-hidden="true" tabindex="-1"></a><span class="co"># preprocessor.size, otherwise, images won't be resized.</span></span>
<span id="cb1-1399"><a href="#cb1-1399" aria-hidden="true" tabindex="-1"></a><span class="fu">image_size</span><span class="kw">:</span><span class="at"> int | tuple[int, int] | None</span></span>
<span id="cb1-1400"><a href="#cb1-1400" aria-hidden="true" tabindex="-1"></a><span class="co"># The resampling algorithm to use for image resizing. Default is bilinear. Please refer</span></span>
<span id="cb1-1401"><a href="#cb1-1401" aria-hidden="true" tabindex="-1"></a><span class="co"># to PIL.Image.Resampling for more details.</span></span>
<span id="cb1-1402"><a href="#cb1-1402" aria-hidden="true" tabindex="-1"></a><span class="fu">image_resize_algorithm</span><span class="kw">:</span><span class="at"> Literal['bilinear', 'bicubic', 'lanczos'] | Resampling | None</span></span>
<span id="cb1-1403"><a href="#cb1-1403" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1404"><a href="#cb1-1404" aria-hidden="true" tabindex="-1"></a><span class="co"># optional overrides to the base model configuration</span></span>
<span id="cb1-1405"><a href="#cb1-1405" aria-hidden="true" tabindex="-1"></a><span class="fu">overrides_of_model_config</span><span class="kw">:</span><span class="at"> dict[str, Any] | None</span></span>
<span id="cb1-1406"><a href="#cb1-1406" aria-hidden="true" tabindex="-1"></a><span class="co"># optional overrides to the kwargs passed to from_pretrained when loading the base model</span></span>
<span id="cb1-1407"><a href="#cb1-1407" aria-hidden="true" tabindex="-1"></a><span class="fu">overrides_of_model_kwargs</span><span class="kw">:</span><span class="at"> dict[str, Any] | None</span></span>
<span id="cb1-1408"><a href="#cb1-1408" aria-hidden="true" tabindex="-1"></a><span class="co"># If you want to specify the type of model to load, AutoModelForCausalLM is a good</span></span>
<span id="cb1-1409"><a href="#cb1-1409" aria-hidden="true" tabindex="-1"></a><span class="co"># choice too</span></span>
<span id="cb1-1410"><a href="#cb1-1410" aria-hidden="true" tabindex="-1"></a><span class="fu">type_of_model</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1411"><a href="#cb1-1411" aria-hidden="true" tabindex="-1"></a><span class="co"># You can choose a specific model revision from the Hugging Face Hub</span></span>
<span id="cb1-1412"><a href="#cb1-1412" aria-hidden="true" tabindex="-1"></a><span class="fu">revision_of_model</span><span class="kw">:</span><span class="at"> str | None</span></span>
<span id="cb1-1413"><a href="#cb1-1413" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-1414"><a href="#cb1-1414" aria-hidden="true" tabindex="-1"></a><span class="fu">max_packed_sequence_len</span><span class="kw">:</span><span class="at"> int | None</span></span>
<span id="cb1-1415"><a href="#cb1-1415" aria-hidden="true" tabindex="-1"></a><span class="fu">rope_scaling</span><span class="kw">:</span><span class="at"> Any | None</span></span>
<span id="cb1-1416"><a href="#cb1-1416" aria-hidden="true" tabindex="-1"></a><span class="fu">noisy_embedding_alpha</span><span class="kw">:</span><span class="at"> float | None</span></span>
<span id="cb1-1417"><a href="#cb1-1417" aria-hidden="true" tabindex="-1"></a><span class="fu">dpo_beta</span><span class="kw">:</span><span class="at"> float | None</span></span>
<span id="cb1-1418"><a href="#cb1-1418" aria-hidden="true" tabindex="-1"></a><span class="fu">evaluation_strategy</span><span class="kw">:</span><span class="at"> str | None</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>

@@ -619,7 +619,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<ul>
<li>If you are installing from pip</li>
</ul>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="ex">pip3</span> uninstall <span class="at">-y</span> cut-cross-entropy <span class="kw">&amp;&amp;</span> <span class="ex">pip3</span> install <span class="st">"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@f643b88"</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="ex">pip3</span> uninstall <span class="at">-y</span> cut-cross-entropy <span class="kw">&amp;&amp;</span> <span class="ex">pip3</span> install <span class="st">"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@242b245"</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
</section>
<section id="usage" class="level3">
<h3 class="anchored" data-anchor-id="usage">Usage</h3>
@@ -652,6 +652,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<li>granitemoehybrid</li>
<li>hunyuan_v1_dense</li>
<li>hunyuan_v1_moe</li>
<li>kimi_linear</li>
<li>lfm2</li>
<li>lfm2_moe</li>
<li>lfm2_vl</li>

@@ -567,7 +567,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="op">%%</span>capture</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="co"># This step can take ~5-10 minutes to install dependencies</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="op">!</span>pip install <span class="op">--</span>no<span class="op">-</span>build<span class="op">-</span>isolation axolotl[flash<span class="op">-</span>attn]<span class="op">&gt;=</span><span class="fl">0.9.1</span></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="op">!</span>pip install <span class="st">"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@f643b88"</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="op">!</span>pip install <span class="st">"cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@242b245"</span></span></code></pre></div><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></div>
</div>
<section id="demo-talk-like-a-pirate" class="level2">
<h2 class="anchored" data-anchor-id="demo-talk-like-a-pirate">Demo: Talk Like a Pirate</h2>

@@ -564,7 +564,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="latest-updates" class="level2">
<h2 class="anchored" data-anchor-id="latest-updates">🎉 Latest Updates</h2>
<ul>
<li>2025/12: Axolotl now includes support for <a href="https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/olmo3">Olmo3</a>, <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/trinity">Trinity</a>, and <a href="https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/ministral3">Ministral3</a>.</li>
<li>2025/12: Axolotl now includes support for <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/kimi-linear">Kimi-Linear</a>, <a href="https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/olmo3">Olmo3</a>, <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/trinity">Trinity</a>, and <a href="https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/ministral3">Ministral3</a>.</li>
<li>2025/10: New model support has been added in Axolotl for: <a href="https://github.com/axolotl-ai-cloud/axolotl/blob/main/examples/qwen3-next">Qwen3 Next</a>, <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/qwen2_5-vl">Qwen2.5-vl, Qwen3-vl</a>, <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/qwen3">Qwen3, Qwen3MoE</a>, <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/granite4">Granite 4</a>, <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/hunyuan">HunYuan</a>, <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/magistral#vision">Magistral 2509</a>, <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/apertus">Apertus</a>, and <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/examples/seed-oss">Seed-OSS</a>.</li>
<li>2025/09: Axolotl now has text diffusion training. Read more <a href="https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/diffusion">here</a>.</li>
<li>2025/08: QAT has been updated to include NVFP4 support. See <a href="https://github.com/axolotl-ai-cloud/axolotl/pull/3107">PR</a>.</li>

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large