Built site for gh-pages

Quarto GHA Workflow Runner
2025-10-22 22:29:16 +00:00
parent 27d2c41079
commit 302e9406ed
6 changed files with 273 additions and 232 deletions


@@ -563,6 +563,19 @@ modules in a model.</p>
<p>In this example, the default learning rate is 2e-5 across the entire model, with a separate learning rate
of 1e-6 for the self attention <code>o_proj</code> modules in all layers, and a learning rate of 1e-5 for the 3rd layer's
self attention <code>q_proj</code> module.</p>
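<p>As a rough illustration of the idea (not axolotl's actual implementation), the sketch below resolves a learning rate per parameter name by substring matching. The names <code>LR_GROUPS</code>, <code>resolve_lr</code>, and <code>build_param_groups</code> are hypothetical, as are the example parameter names and the assumption that the 3rd layer has zero-based index 2.</p>

```python
# Hypothetical sketch: the group layout, helper names, and substring matching
# rule are assumptions for illustration, not axolotl's implementation.
DEFAULT_LR = 2e-5

# Each group lists name patterns and the learning rate applied on a match.
# Assumption: layers are zero-indexed, so the 3rd layer is "layers.2".
LR_GROUPS = [
    {"name": "o_proj", "modules": ["self_attn.o_proj.weight"], "lr": 1e-6},
    {"name": "q_proj", "modules": ["model.layers.2.self_attn.q_proj.weight"], "lr": 1e-5},
]


def resolve_lr(param_name, groups=LR_GROUPS, default_lr=DEFAULT_LR):
    """Return the lr of the first group whose pattern occurs in param_name."""
    for group in groups:
        if any(pattern in param_name for pattern in group["modules"]):
            return group["lr"]
    return default_lr


def build_param_groups(param_names, groups=LR_GROUPS, default_lr=DEFAULT_LR):
    """Bucket parameter names by resolved lr, one dict per distinct lr."""
    buckets = {}
    for name in param_names:
        lr = resolve_lr(name, groups, default_lr)
        buckets.setdefault(lr, []).append(name)
    return [{"lr": lr, "params": names} for lr, names in buckets.items()]
```

<p>Under these assumptions, every <code>o_proj</code> weight resolves to 1e-6, only the <code>q_proj</code> weight in <code>layers.2</code> resolves to 1e-5, and everything else falls back to the default of 2e-5.</p>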
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Note
</div>
</div>
<div class="callout-body-container callout-body">
<p>Only <code>lr</code> can currently be varied per group. If you're interested in adding support for other settings (such as <code>weight_decay</code>), we welcome PRs. See https://github.com/axolotl-ai-cloud/axolotl/blob/613bcf90e58f3ab81d3827e7fc572319908db9fb/src/axolotl/core/trainers/mixins/optimizer.py#L17</p>
</div>
</div>
</section>