Built site for gh-pages

This commit is contained in:
Quarto GHA Workflow Runner
2025-05-30 04:24:18 +00:00
parent dd36fe4391
commit 9304e18f4b
58 changed files with 3955 additions and 2244 deletions

View File

@@ -531,72 +531,67 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.cli.args.EvaluateCliArgs" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.cli.args.EvaluateCliArgs">EvaluateCliArgs</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>cli.args.EvaluateCliArgs(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Dataclass with CLI arguments for <code>axolotl evaluate</code> command.</p>
</section>
<section id="axolotl.cli.args.InferenceCliArgs" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.cli.args.InferenceCliArgs">InferenceCliArgs</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>cli.args.InferenceCliArgs(<span class="va">self</span>, prompter<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>cli.args.InferenceCliArgs(prompter<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Dataclass with CLI arguments for <code>axolotl inference</code> command.</p>
</section>
<section id="axolotl.cli.args.PreprocessCliArgs" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.cli.args.PreprocessCliArgs">PreprocessCliArgs</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>cli.args.PreprocessCliArgs(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> prompter<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> download<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a> iterable<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> prompter<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> download<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> iterable<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Dataclass with CLI arguments for <code>axolotl preprocess</code> command.</p>
</section>
<section id="axolotl.cli.args.QuantizeCliArgs" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.cli.args.QuantizeCliArgs">QuantizeCliArgs</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>cli.args.QuantizeCliArgs(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> base_model<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> weight_dtype<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> activation_dtype<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> quantize_embedding<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> group_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> output_dir<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> base_model<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> weight_dtype<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> activation_dtype<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> quantize_embedding<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> group_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> output_dir<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Dataclass with CLI arguments for <code>axolotl quantize</code> command.</p>
</section>
<section id="axolotl.cli.args.TrainerCliArgs" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.cli.args.TrainerCliArgs">TrainerCliArgs</h3>
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>cli.args.TrainerCliArgs(</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> merge_lora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> prompter<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> shard<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> main_process_port<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a> num_processes<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> merge_lora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> prompter<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> shard<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> main_process_port<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> num_processes<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Dataclass with CLI arguments for <code>axolotl train</code> command.</p>
</section>
<section id="axolotl.cli.args.VllmServeCliArgs" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.cli.args.VllmServeCliArgs">VllmServeCliArgs</h3>
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>cli.args.VllmServeCliArgs(</span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> tensor_parallel_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> host<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> port<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> gpu_memory_utilization<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> dtype<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> max_model_len<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> enable_prefix_caching<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a> serve_module<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> tensor_parallel_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> host<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> port<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> gpu_memory_utilization<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> dtype<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> max_model_len<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> enable_prefix_caching<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> serve_module<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Dataclass with CLI arguments for <code>axolotl vllm-serve</code> command.</p>

View File

@@ -509,7 +509,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.cli.cloud.modal_.ModalCloud" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.cli.cloud.modal_.ModalCloud">ModalCloud</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>cli.cloud.modal_.ModalCloud(<span class="va">self</span>, config, app<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>cli.cloud.modal_.ModalCloud(config, app<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Modal Cloud implementation.</p>
</section>
</section>

View File

@@ -512,11 +512,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.common.datasets.TrainDatasetMeta" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.common.datasets.TrainDatasetMeta">TrainDatasetMeta</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>common.datasets.TrainDatasetMeta(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> train_dataset,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> total_num_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> train_dataset,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> total_num_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Dataclass with fields for training and validation datasets and metadata.</p>
</section>
</section>

View File

@@ -535,7 +535,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</section>
<section id="axolotl.convert.FileWriter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.convert.FileWriter">FileWriter</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>convert.FileWriter(<span class="va">self</span>, file_path)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>convert.FileWriter(file_path)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Writes a string to a file</p>
</section>
<section id="axolotl.convert.JsonParser" class="level3">
@@ -546,12 +546,11 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.convert.JsonToJsonlConverter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.convert.JsonToJsonlConverter">JsonToJsonlConverter</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>convert.JsonToJsonlConverter(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> file_reader,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> file_writer,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> json_parser,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> jsonl_serializer,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> file_reader,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> file_writer,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> json_parser,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> jsonl_serializer,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Converts a JSON file to JSONL</p>
</section>
<section id="axolotl.convert.JsonlSerializer" class="level3">

View File

@@ -0,0 +1,943 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8">
<meta name="generator" content="quarto-1.7.31">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>core.builders.base Axolotl</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
vertical-align: middle;
}
/* CSS for syntax highlighting */
html { -webkit-text-size-adjust: 100%; }
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
}
pre.numberSource { margin-left: 3em; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
</style>
<script src="../../site_libs/quarto-nav/quarto-nav.js"></script>
<script src="../../site_libs/clipboard/clipboard.min.js"></script>
<script src="../../site_libs/quarto-search/autocomplete.umd.js"></script>
<script src="../../site_libs/quarto-search/fuse.min.js"></script>
<script src="../../site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="../../">
<link href="../../favicon.jpg" rel="icon" type="image/jpeg">
<script src="../../site_libs/quarto-html/quarto.js" type="module"></script>
<script src="../../site_libs/quarto-html/tabsets/tabsets.js" type="module"></script>
<script src="../../site_libs/quarto-html/popper.min.js"></script>
<script src="../../site_libs/quarto-html/tippy.umd.min.js"></script>
<script src="../../site_libs/quarto-html/anchor.min.js"></script>
<link href="../../site_libs/quarto-html/tippy.css" rel="stylesheet">
<link href="../../site_libs/quarto-html/quarto-syntax-highlighting-dark-8ef56b68f8fa1e9d2ba328e99e439f80.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="../../site_libs/bootstrap/bootstrap.min.js"></script>
<link href="../../site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="../../site_libs/bootstrap/bootstrap-2288ecdcbf81d2ab6432743cedd71d9a.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="dark">
<script id="quarto-search-options" type="application/json">{
"location": "navbar",
"copy-button": false,
"collapse-after": 3,
"panel-placement": "end",
"type": "overlay",
"limit": 50,
"keyboard-shortcut": [
"f",
"/",
"s"
],
"show-item-context": false,
"language": {
"search-no-results-text": "No results",
"search-matching-documents-text": "matching documents",
"search-copy-link-title": "Copy link to search",
"search-hide-matches-text": "Hide additional matches",
"search-more-match-text": "more match in this document",
"search-more-matches-text": "more matches in this document",
"search-clear-button-title": "Clear",
"search-text-placeholder": "",
"search-detached-cancel-button-title": "Cancel",
"search-submit-button-title": "Submit",
"search-label": "Search"
}
}</script>
<script async="" src="https://www.googletagmanager.com/gtag/js?id=G-9KYCVJBNMQ"></script>
<script type="text/javascript">
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</script>
<link rel="stylesheet" href="../../styles.css">
</head>
<body class="nav-sidebar docked nav-fixed quarto-light">
<div id="quarto-search-results"></div>
<header id="quarto-header" class="headroom fixed-top">
<nav class="navbar navbar-expand " data-bs-theme="dark">
<div class="navbar-container container-fluid">
<div class="navbar-brand-container mx-auto">
<a href="../../index.html" class="navbar-brand navbar-brand-logo">
<img src="../../image/axolotl_logo_digital_white.svg" alt="" class="navbar-logo">
</a>
</div>
<div class="quarto-navbar-tools tools-wide tools-end">
<a href="https://twitter.com/axolotl_ai" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-twitter"></i></a>
<a href="https://github.com/axolotl-ai-cloud/axolotl/" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-github"></i></a>
<a href="https://discord.gg/7m9sfhzaf3" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-discord"></i></a>
</div>
<div id="quarto-search" class="" title="Search"></div>
</div> <!-- /container-fluid -->
</nav>
<nav class="quarto-secondary-nav">
<div class="container-fluid d-flex">
<button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" role="button" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<i class="bi bi-layout-text-sidebar-reverse"></i>
</button>
<nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"></ol></nav>
<a class="flex-grow-1" role="navigation" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
</a>
</div>
</nav>
</header>
<!-- content -->
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article page-navbar">
<!-- sidebar -->
<nav id="quarto-sidebar" class="sidebar collapse collapse-horizontal quarto-sidebar-collapse-item sidebar-navigation docked overflow-auto">
<div class="sidebar-menu-container">
<ul class="list-unstyled mt-1">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../index.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Home</span></a>
</div>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true">
<span class="menu-text">Getting Started</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-1" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/getting-started.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Quickstart</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/installation.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Installation</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/inference.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Inference and Merging</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/cli.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Command Line Interface (CLI)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/config.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Config Reference</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/api" class="sidebar-item-text sidebar-link">
<span class="menu-text">API Reference</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/index.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Dataset Formats</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-2" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-2" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/pretraining.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Pre-training</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/inst_tune.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Instruction Tuning</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/conversation.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Conversation</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/stepwise_supervised.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Stepwise Supervised Format</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/template_free.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Template-Free</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/tokenized.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Custom Pre-Tokenized Dataset</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-3" role="navigation" aria-expanded="true">
<span class="menu-text">Deployments</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-3" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-3" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/docker.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Docker</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/multi-gpu.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Multi-GPU</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/multi-node.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Multi Node</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/ray-integration.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Ray Train</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/amd_hpc.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">AMD GPUs on HPC Systems</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/mac.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Mac M-series</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-4" role="navigation" aria-expanded="true">
<span class="menu-text">How To Guides</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-4" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-4" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/multimodal.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">MultiModal / Vision Language Models (BETA)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/rlhf.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">RLHF (Beta)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/reward_modelling.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Reward Modelling</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/lr_groups.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Learning Rate Groups</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/lora_optims.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">LoRA Optimizations</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset_loading.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Dataset Loading</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/qat.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Quantization Aware Training (QAT)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/quantize.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Quantization with torchao</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-5" role="navigation" aria-expanded="true">
<span class="menu-text">Core Concepts</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-5" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-5" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/batch_vs_grad.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Batch size vs Gradient accumulation</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset_preprocessing.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Dataset Preprocessing</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/multipack.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Multipack (Sample Packing)</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-6" role="navigation" aria-expanded="true">
<span class="menu-text">Advanced Features</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-6" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-6" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/fsdp_qlora.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">FDSP + QLoRA</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/unsloth.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Unsloth</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/torchao.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">PyTorch ao</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/custom_integrations.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Custom Integrations</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/sequence_parallelism.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Sequence Parallelism</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-7" role="navigation" aria-expanded="true">
<span class="menu-text">Troubleshooting</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-7" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-7" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/faq.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">FAQ</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/debugging.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Debugging</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/nccl.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">NCCL</span></a>
</div>
</li>
</ul>
</li>
</ul>
</div>
</nav>
<div id="quarto-sidebar-glass" class="quarto-sidebar-collapse-item" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item"></div>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">On this page</h2>
<ul>
<li><a href="#axolotl.core.builders.base" id="toc-axolotl.core.builders.base" class="nav-link active" data-scroll-target="#axolotl.core.builders.base">core.builders.base</a>
<ul class="collapse">
<li><a href="#classes" id="toc-classes" class="nav-link" data-scroll-target="#classes">Classes</a>
<ul class="collapse">
<li><a href="#axolotl.core.builders.base.TrainerBuilderBase" id="toc-axolotl.core.builders.base.TrainerBuilderBase" class="nav-link" data-scroll-target="#axolotl.core.builders.base.TrainerBuilderBase">TrainerBuilderBase</a></li>
</ul></li>
</ul></li>
</ul>
</nav>
</div>
<!-- main -->
<main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block"></header>
<section id="axolotl.core.builders.base" class="level1">
<h1>core.builders.base</h1>
<p><code>core.builders.base</code></p>
<p>Base class for trainer builder</p>
<section id="classes" class="level2">
<h2 class="anchored" data-anchor-id="classes">Classes</h2>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a href="#axolotl.core.builders.base.TrainerBuilderBase">TrainerBuilderBase</a></td>
<td>Base class for trainer builder.</td>
</tr>
</tbody>
</table>
<section id="axolotl.core.builders.base.TrainerBuilderBase" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.builders.base.TrainerBuilderBase">TrainerBuilderBase</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.builders.base.TrainerBuilderBase(cfg, model, tokenizer, processor<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Base class for trainer builder.</p>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a href="#axolotl.core.builders.base.TrainerBuilderBase.get_post_trainer_create_callbacks">get_post_trainer_create_callbacks</a></td>
<td>Callbacks added after the trainer is created, usually b/c these need access to the trainer</td>
</tr>
</tbody>
</table>
<section id="axolotl.core.builders.base.TrainerBuilderBase.get_post_trainer_create_callbacks" class="level5">
<h5 class="anchored" data-anchor-id="axolotl.core.builders.base.TrainerBuilderBase.get_post_trainer_create_callbacks">get_post_trainer_create_callbacks</h5>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.builders.base.TrainerBuilderBase.get_post_trainer_create_callbacks(trainer)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Callbacks added after the trainer is created, usually b/c these need access to the trainer</p>
</section>
</section>
</section>
</section>
</section>
</main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">
window.document.addEventListener("DOMContentLoaded", function (event) {
const icon = "";
const anchorJS = new window.AnchorJS();
anchorJS.options = {
placement: 'right',
icon: icon
};
anchorJS.add('.anchored');
const isCodeAnnotation = (el) => {
for (const clz of el.classList) {
if (clz.startsWith('code-annotation-')) {
return true;
}
}
return false;
}
const onCopySuccess = function(e) {
// button target
const button = e.trigger;
// don't keep focus
button.blur();
// flash "checked"
button.classList.add('code-copy-button-checked');
var currentTitle = button.getAttribute("title");
button.setAttribute("title", "Copied!");
let tooltip;
if (window.bootstrap) {
button.setAttribute("data-bs-toggle", "tooltip");
button.setAttribute("data-bs-placement", "left");
button.setAttribute("data-bs-title", "Copied!");
tooltip = new bootstrap.Tooltip(button,
{ trigger: "manual",
customClass: "code-copy-button-tooltip",
offset: [0, -8]});
tooltip.show();
}
setTimeout(function() {
if (tooltip) {
tooltip.hide();
button.removeAttribute("data-bs-title");
button.removeAttribute("data-bs-toggle");
button.removeAttribute("data-bs-placement");
}
button.setAttribute("title", currentTitle);
button.classList.remove('code-copy-button-checked');
}, 1000);
// clear code selection
e.clearSelection();
}
const getTextToCopy = function(trigger) {
const codeEl = trigger.previousElementSibling.cloneNode(true);
for (const childEl of codeEl.children) {
if (isCodeAnnotation(childEl)) {
childEl.remove();
}
}
return codeEl.innerText;
}
const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
text: getTextToCopy
});
clipboard.on('success', onCopySuccess);
if (window.document.getElementById('quarto-embedded-source-code-modal')) {
const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
text: getTextToCopy,
container: window.document.getElementById('quarto-embedded-source-code-modal')
});
clipboardModal.on('success', onCopySuccess);
}
var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
var mailtoRegex = new RegExp(/^mailto:/);
var filterRegex = new RegExp("https:\/\/docs\.axolotl\.ai");
var isInternal = (href) => {
return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
}
// Inspect non-navigation links and adorn them if external
var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
for (var i=0; i<links.length; i++) {
const link = links[i];
if (!isInternal(link.href)) {
// undo the damage that might have been done by quarto-nav.js in the case of
// links that we want to consider external
if (link.dataset.originalHref !== undefined) {
link.href = link.dataset.originalHref;
}
}
}
function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
const config = {
allowHTML: true,
maxWidth: 500,
delay: 100,
arrow: false,
appendTo: function(el) {
return el.parentElement;
},
interactive: true,
interactiveBorder: 10,
theme: 'quarto',
placement: 'bottom-start',
};
if (contentFn) {
config.content = contentFn;
}
if (onTriggerFn) {
config.onTrigger = onTriggerFn;
}
if (onUntriggerFn) {
config.onUntrigger = onUntriggerFn;
}
window.tippy(el, config);
}
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
for (var i=0; i<noterefs.length; i++) {
const ref = noterefs[i];
tippyHover(ref, function() {
// use id or data attribute instead here
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
if (note) {
return note.innerHTML;
} else {
return "";
}
});
}
const xrefs = window.document.querySelectorAll('a.quarto-xref');
const processXRef = (id, note) => {
// Strip column container classes
const stripColumnClz = (el) => {
el.classList.remove("page-full", "page-columns");
if (el.children) {
for (const child of el.children) {
stripColumnClz(child);
}
}
}
stripColumnClz(note)
if (id === null || id.startsWith('sec-')) {
// Special case sections, only their first couple elements
const container = document.createElement("div");
if (note.children && note.children.length > 2) {
container.appendChild(note.children[0].cloneNode(true));
for (let i = 1; i < note.children.length; i++) {
const child = note.children[i];
if (child.tagName === "P" && child.innerText === "") {
continue;
} else {
container.appendChild(child.cloneNode(true));
break;
}
}
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(container);
}
return container.innerHTML
} else {
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(note);
}
return note.innerHTML;
}
} else {
// Remove any anchor links if they are present
const anchorLink = note.querySelector('a.anchorjs-link');
if (anchorLink) {
anchorLink.remove();
}
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(note);
}
if (note.classList.contains("callout")) {
return note.outerHTML;
} else {
return note.innerHTML;
}
}
}
for (var i=0; i<xrefs.length; i++) {
const xref = xrefs[i];
tippyHover(xref, undefined, function(instance) {
instance.disable();
let url = xref.getAttribute('href');
let hash = undefined;
if (url.startsWith('#')) {
hash = url;
} else {
try { hash = new URL(url).hash; } catch {}
}
if (hash) {
const id = hash.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
if (note !== null) {
try {
const html = processXRef(id, note.cloneNode(true));
instance.setContent(html);
} finally {
instance.enable();
instance.show();
}
} else {
// See if we can fetch this
fetch(url.split('#')[0])
.then(res => res.text())
.then(html => {
const parser = new DOMParser();
const htmlDoc = parser.parseFromString(html, "text/html");
const note = htmlDoc.getElementById(id);
if (note !== null) {
const html = processXRef(id, note);
instance.setContent(html);
}
}).finally(() => {
instance.enable();
instance.show();
});
}
} else {
// See if we can fetch a full url (with no hash to target)
// This is a special case and we should probably do some content thinning / targeting
fetch(url)
.then(res => res.text())
.then(html => {
const parser = new DOMParser();
const htmlDoc = parser.parseFromString(html, "text/html");
const note = htmlDoc.querySelector('main.content');
if (note !== null) {
// This should only happen for chapter cross references
// (since there is no id in the URL)
// remove the first header
if (note.children.length > 0 && note.children[0].tagName === "HEADER") {
note.children[0].remove();
}
const html = processXRef(null, note);
instance.setContent(html);
}
}).finally(() => {
instance.enable();
instance.show();
});
}
}, function(instance) {
});
}
let selectedAnnoteEl;
const selectorForAnnotation = ( cell, annotation) => {
let cellAttr = 'data-code-cell="' + cell + '"';
let lineAttr = 'data-code-annotation="' + annotation + '"';
const selector = 'span[' + cellAttr + '][' + lineAttr + ']';
return selector;
}
const selectCodeLines = (annoteEl) => {
const doc = window.document;
const targetCell = annoteEl.getAttribute("data-target-cell");
const targetAnnotation = annoteEl.getAttribute("data-target-annotation");
const annoteSpan = window.document.querySelector(selectorForAnnotation(targetCell, targetAnnotation));
const lines = annoteSpan.getAttribute("data-code-lines").split(",");
const lineIds = lines.map((line) => {
return targetCell + "-" + line;
})
let top = null;
let height = null;
let parent = null;
if (lineIds.length > 0) {
//compute the position of the single el (top and bottom and make a div)
const el = window.document.getElementById(lineIds[0]);
top = el.offsetTop;
height = el.offsetHeight;
parent = el.parentElement.parentElement;
if (lineIds.length > 1) {
const lastEl = window.document.getElementById(lineIds[lineIds.length - 1]);
const bottom = lastEl.offsetTop + lastEl.offsetHeight;
height = bottom - top;
}
if (top !== null && height !== null && parent !== null) {
// cook up a div (if necessary) and position it
let div = window.document.getElementById("code-annotation-line-highlight");
if (div === null) {
div = window.document.createElement("div");
div.setAttribute("id", "code-annotation-line-highlight");
div.style.position = 'absolute';
parent.appendChild(div);
}
div.style.top = top - 2 + "px";
div.style.height = height + 4 + "px";
div.style.left = 0;
let gutterDiv = window.document.getElementById("code-annotation-line-highlight-gutter");
if (gutterDiv === null) {
gutterDiv = window.document.createElement("div");
gutterDiv.setAttribute("id", "code-annotation-line-highlight-gutter");
gutterDiv.style.position = 'absolute';
const codeCell = window.document.getElementById(targetCell);
const gutter = codeCell.querySelector('.code-annotation-gutter');
gutter.appendChild(gutterDiv);
}
gutterDiv.style.top = top - 2 + "px";
gutterDiv.style.height = height + 4 + "px";
}
selectedAnnoteEl = annoteEl;
}
};
const unselectCodeLines = () => {
const elementsIds = ["code-annotation-line-highlight", "code-annotation-line-highlight-gutter"];
elementsIds.forEach((elId) => {
const div = window.document.getElementById(elId);
if (div) {
div.remove();
}
});
selectedAnnoteEl = undefined;
};
// Handle positioning of the toggle
window.addEventListener(
"resize",
throttle(() => {
elRect = undefined;
if (selectedAnnoteEl) {
selectCodeLines(selectedAnnoteEl);
}
}, 10)
);
function throttle(fn, ms) {
let throttle = false;
let timer;
return (...args) => {
if(!throttle) { // first call gets through
fn.apply(this, args);
throttle = true;
} else { // all the others get throttled
if(timer) clearTimeout(timer); // cancel #2
timer = setTimeout(() => {
fn.apply(this, args);
timer = throttle = false;
}, ms);
}
};
}
// Attach click handler to the DT
const annoteDls = window.document.querySelectorAll('dt[data-target-cell]');
for (const annoteDlNode of annoteDls) {
annoteDlNode.addEventListener('click', (event) => {
const clickedEl = event.target;
if (clickedEl !== selectedAnnoteEl) {
unselectCodeLines();
const activeEl = window.document.querySelector('dt[data-target-cell].code-annotation-active');
if (activeEl) {
activeEl.classList.remove('code-annotation-active');
}
selectCodeLines(clickedEl);
clickedEl.classList.add('code-annotation-active');
} else {
// Unselect the line
unselectCodeLines();
clickedEl.classList.remove('code-annotation-active');
}
});
}
const findCites = (el) => {
const parentEl = el.parentElement;
if (parentEl) {
const cites = parentEl.dataset.cites;
if (cites) {
return {
el,
cites: cites.split(' ')
};
} else {
return findCites(el.parentElement)
}
} else {
return undefined;
}
};
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
for (var i=0; i<bibliorefs.length; i++) {
const ref = bibliorefs[i];
const citeInfo = findCites(ref);
if (citeInfo) {
tippyHover(citeInfo.el, function() {
var popup = window.document.createElement('div');
citeInfo.cites.forEach(function(cite) {
var citeDiv = window.document.createElement('div');
citeDiv.classList.add('hanging-indent');
citeDiv.classList.add('csl-entry');
var biblioDiv = window.document.getElementById('ref-' + cite);
if (biblioDiv) {
citeDiv.innerHTML = biblioDiv.innerHTML;
}
popup.appendChild(citeDiv);
});
return popup.innerHTML;
});
}
}
});
</script>
</div> <!-- /content -->
</body></html>

View File

@@ -7,7 +7,7 @@
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>core.trainer_builder Axolotl</title>
<title>core.builders.causal Axolotl</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -467,14 +467,11 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<h2 id="toc-title">On this page</h2>
<ul>
<li><a href="#axolotl.core.trainer_builder" id="toc-axolotl.core.trainer_builder" class="nav-link active" data-scroll-target="#axolotl.core.trainer_builder">core.trainer_builder</a>
<li><a href="#axolotl.core.builders.causal" id="toc-axolotl.core.builders.causal" class="nav-link active" data-scroll-target="#axolotl.core.builders.causal">core.builders.causal</a>
<ul class="collapse">
<li><a href="#classes" id="toc-classes" class="nav-link" data-scroll-target="#classes">Classes</a>
<ul class="collapse">
<li><a href="#axolotl.core.trainer_builder.HFCausalTrainerBuilder" id="toc-axolotl.core.trainer_builder.HFCausalTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.trainer_builder.HFCausalTrainerBuilder">HFCausalTrainerBuilder</a></li>
<li><a href="#axolotl.core.trainer_builder.HFPPOTrainerBuilder" id="toc-axolotl.core.trainer_builder.HFPPOTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.trainer_builder.HFPPOTrainerBuilder">HFPPOTrainerBuilder</a></li>
<li><a href="#axolotl.core.trainer_builder.HFRLTrainerBuilder" id="toc-axolotl.core.trainer_builder.HFRLTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.trainer_builder.HFRLTrainerBuilder">HFRLTrainerBuilder</a></li>
<li><a href="#axolotl.core.trainer_builder.TrainerBuilderBase" id="toc-axolotl.core.trainer_builder.TrainerBuilderBase" class="nav-link" data-scroll-target="#axolotl.core.trainer_builder.TrainerBuilderBase">TrainerBuilderBase</a></li>
<li><a href="#axolotl.core.builders.causal.HFCausalTrainerBuilder" id="toc-axolotl.core.builders.causal.HFCausalTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.builders.causal.HFCausalTrainerBuilder">HFCausalTrainerBuilder</a></li>
</ul></li>
</ul></li>
</ul>
@@ -486,10 +483,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.core.trainer_builder" class="level1">
<h1>core.trainer_builder</h1>
<p><code>core.trainer_builder</code></p>
<p>Builder for the training args and trainer</p>
<section id="axolotl.core.builders.causal" class="level1">
<h1>core.builders.causal</h1>
<p><code>core.builders.causal</code></p>
<p>Builder for causal trainers</p>
<section id="classes" class="level2">
<h2 class="anchored" data-anchor-id="classes">Classes</h2>
<table class="caption-top table">
@@ -501,93 +498,23 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</thead>
<tbody>
<tr class="odd">
<td><a href="#axolotl.core.trainer_builder.HFCausalTrainerBuilder">HFCausalTrainerBuilder</a></td>
<td><a href="#axolotl.core.builders.causal.HFCausalTrainerBuilder">HFCausalTrainerBuilder</a></td>
<td>Build the HuggingFace training args/trainer for causal models and reward modeling</td>
</tr>
<tr class="even">
<td><a href="#axolotl.core.trainer_builder.HFPPOTrainerBuilder">HFPPOTrainerBuilder</a></td>
<td>HF Factory class for PPO Trainer</td>
</tr>
<tr class="odd">
<td><a href="#axolotl.core.trainer_builder.HFRLTrainerBuilder">HFRLTrainerBuilder</a></td>
<td>Trainer factory class for TRL-based RLHF trainers (e.g.&nbsp;DPO)</td>
</tr>
<tr class="even">
<td><a href="#axolotl.core.trainer_builder.TrainerBuilderBase">TrainerBuilderBase</a></td>
<td>Base class for trainer builder.</td>
</tr>
</tbody>
</table>
<section id="axolotl.core.trainer_builder.HFCausalTrainerBuilder" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainer_builder.HFCausalTrainerBuilder">HFCausalTrainerBuilder</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.HFCausalTrainerBuilder(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<section id="axolotl.core.builders.causal.HFCausalTrainerBuilder" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.builders.causal.HFCausalTrainerBuilder">HFCausalTrainerBuilder</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.builders.causal.HFCausalTrainerBuilder(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> cfg,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Build the HuggingFace training args/trainer for causal models and reward modeling
using TRL.</p>
</section>
<section id="axolotl.core.trainer_builder.HFPPOTrainerBuilder" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainer_builder.HFPPOTrainerBuilder">HFPPOTrainerBuilder</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.HFPPOTrainerBuilder(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>HF Factory class for PPO Trainer</p>
</section>
<section id="axolotl.core.trainer_builder.HFRLTrainerBuilder" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainer_builder.HFRLTrainerBuilder">HFRLTrainerBuilder</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.HFRLTrainerBuilder(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Trainer factory class for TRL-based RLHF trainers (e.g.&nbsp;DPO)</p>
</section>
<section id="axolotl.core.trainer_builder.TrainerBuilderBase" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainer_builder.TrainerBuilderBase">TrainerBuilderBase</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.TrainerBuilderBase(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Base class for trainer builder.</p>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a href="#axolotl.core.trainer_builder.TrainerBuilderBase.get_post_trainer_create_callbacks">get_post_trainer_create_callbacks</a></td>
<td>Callbacks added after the trainer is created, usually b/c these need access to the trainer</td>
</tr>
</tbody>
</table>
<section id="axolotl.core.trainer_builder.TrainerBuilderBase.get_post_trainer_create_callbacks" class="level5">
<h5 class="anchored" data-anchor-id="axolotl.core.trainer_builder.TrainerBuilderBase.get_post_trainer_create_callbacks">get_post_trainer_create_callbacks</h5>
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.TrainerBuilderBase.get_post_trainer_create_callbacks(</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> trainer,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Callbacks added after the trainer is created, usually b/c these need access to the trainer</p>
</section>
</section>
</section>
</section>
</section>

View File

@@ -0,0 +1,931 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8">
<meta name="generator" content="quarto-1.7.31">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>core.builders.rl Axolotl</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
vertical-align: middle;
}
/* CSS for syntax highlighting */
html { -webkit-text-size-adjust: 100%; }
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
}
pre.numberSource { margin-left: 3em; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
</style>
<script src="../../site_libs/quarto-nav/quarto-nav.js"></script>
<script src="../../site_libs/clipboard/clipboard.min.js"></script>
<script src="../../site_libs/quarto-search/autocomplete.umd.js"></script>
<script src="../../site_libs/quarto-search/fuse.min.js"></script>
<script src="../../site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="../../">
<link href="../../favicon.jpg" rel="icon" type="image/jpeg">
<script src="../../site_libs/quarto-html/quarto.js" type="module"></script>
<script src="../../site_libs/quarto-html/tabsets/tabsets.js" type="module"></script>
<script src="../../site_libs/quarto-html/popper.min.js"></script>
<script src="../../site_libs/quarto-html/tippy.umd.min.js"></script>
<script src="../../site_libs/quarto-html/anchor.min.js"></script>
<link href="../../site_libs/quarto-html/tippy.css" rel="stylesheet">
<link href="../../site_libs/quarto-html/quarto-syntax-highlighting-dark-8ef56b68f8fa1e9d2ba328e99e439f80.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="../../site_libs/bootstrap/bootstrap.min.js"></script>
<link href="../../site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="../../site_libs/bootstrap/bootstrap-2288ecdcbf81d2ab6432743cedd71d9a.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="dark">
<script id="quarto-search-options" type="application/json">{
"location": "navbar",
"copy-button": false,
"collapse-after": 3,
"panel-placement": "end",
"type": "overlay",
"limit": 50,
"keyboard-shortcut": [
"f",
"/",
"s"
],
"show-item-context": false,
"language": {
"search-no-results-text": "No results",
"search-matching-documents-text": "matching documents",
"search-copy-link-title": "Copy link to search",
"search-hide-matches-text": "Hide additional matches",
"search-more-match-text": "more match in this document",
"search-more-matches-text": "more matches in this document",
"search-clear-button-title": "Clear",
"search-text-placeholder": "",
"search-detached-cancel-button-title": "Cancel",
"search-submit-button-title": "Submit",
"search-label": "Search"
}
}</script>
<script async="" src="https://www.googletagmanager.com/gtag/js?id=G-9KYCVJBNMQ"></script>
<script type="text/javascript">
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</script>
<link rel="stylesheet" href="../../styles.css">
</head>
<body class="nav-sidebar docked nav-fixed quarto-light">
<div id="quarto-search-results"></div>
<header id="quarto-header" class="headroom fixed-top">
<nav class="navbar navbar-expand " data-bs-theme="dark">
<div class="navbar-container container-fluid">
<div class="navbar-brand-container mx-auto">
<a href="../../index.html" class="navbar-brand navbar-brand-logo">
<img src="../../image/axolotl_logo_digital_white.svg" alt="" class="navbar-logo">
</a>
</div>
<div class="quarto-navbar-tools tools-wide tools-end">
<a href="https://twitter.com/axolotl_ai" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-twitter"></i></a>
<a href="https://github.com/axolotl-ai-cloud/axolotl/" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-github"></i></a>
<a href="https://discord.gg/7m9sfhzaf3" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-discord"></i></a>
</div>
<div id="quarto-search" class="" title="Search"></div>
</div> <!-- /container-fluid -->
</nav>
<nav class="quarto-secondary-nav">
<div class="container-fluid d-flex">
<button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" role="button" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<i class="bi bi-layout-text-sidebar-reverse"></i>
</button>
<nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"></ol></nav>
<a class="flex-grow-1" role="navigation" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
</a>
</div>
</nav>
</header>
<!-- content -->
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article page-navbar">
<!-- sidebar -->
<nav id="quarto-sidebar" class="sidebar collapse collapse-horizontal quarto-sidebar-collapse-item sidebar-navigation docked overflow-auto">
<div class="sidebar-menu-container">
<ul class="list-unstyled mt-1">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../index.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Home</span></a>
</div>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true">
<span class="menu-text">Getting Started</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-1" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/getting-started.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Quickstart</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/installation.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Installation</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/inference.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Inference and Merging</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/cli.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Command Line Interface (CLI)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/config.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Config Reference</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/api" class="sidebar-item-text sidebar-link">
<span class="menu-text">API Reference</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/index.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Dataset Formats</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-2" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-2" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/pretraining.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Pre-training</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/inst_tune.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Instruction Tuning</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/conversation.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Conversation</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/stepwise_supervised.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Stepwise Supervised Format</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/template_free.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Template-Free</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset-formats/tokenized.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Custom Pre-Tokenized Dataset</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-3" role="navigation" aria-expanded="true">
<span class="menu-text">Deployments</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-3" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-3" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/docker.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Docker</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/multi-gpu.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Multi-GPU</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/multi-node.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Multi Node</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/ray-integration.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Ray Train</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/amd_hpc.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">AMD GPUs on HPC Systems</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/mac.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Mac M-series</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-4" role="navigation" aria-expanded="true">
<span class="menu-text">How To Guides</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-4" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-4" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/multimodal.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">MultiModal / Vision Language Models (BETA)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/rlhf.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">RLHF (Beta)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/reward_modelling.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Reward Modelling</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/lr_groups.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Learning Rate Groups</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/lora_optims.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">LoRA Optimizations</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset_loading.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Dataset Loading</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/qat.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Quantization Aware Training (QAT)</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/quantize.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Quantization with torchao</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-5" role="navigation" aria-expanded="true">
<span class="menu-text">Core Concepts</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-5" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-5" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/batch_vs_grad.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Batch size vs Gradient accumulation</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/dataset_preprocessing.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Dataset Preprocessing</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/multipack.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Multipack (Sample Packing)</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-6" role="navigation" aria-expanded="true">
<span class="menu-text">Advanced Features</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-6" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-6" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/fsdp_qlora.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">FDSP + QLoRA</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/unsloth.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Unsloth</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/torchao.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">PyTorch ao</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/custom_integrations.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Custom Integrations</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/sequence_parallelism.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Sequence Parallelism</span></a>
</div>
</li>
</ul>
</li>
<li class="sidebar-item sidebar-item-section">
<div class="sidebar-item-container">
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-7" role="navigation" aria-expanded="true">
<span class="menu-text">Troubleshooting</span></a>
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-7" role="navigation" aria-expanded="true" aria-label="Toggle section">
<i class="bi bi-chevron-right ms-2"></i>
</a>
</div>
<ul id="quarto-sidebar-section-7" class="collapse list-unstyled sidebar-section depth1 show">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/faq.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">FAQ</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/debugging.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">Debugging</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../../docs/nccl.html" class="sidebar-item-text sidebar-link">
<span class="menu-text">NCCL</span></a>
</div>
</li>
</ul>
</li>
</ul>
</div>
</nav>
<div id="quarto-sidebar-glass" class="quarto-sidebar-collapse-item" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item"></div>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">On this page</h2>
<ul>
<li><a href="#axolotl.core.builders.rl" id="toc-axolotl.core.builders.rl" class="nav-link active" data-scroll-target="#axolotl.core.builders.rl">core.builders.rl</a>
<ul class="collapse">
<li><a href="#classes" id="toc-classes" class="nav-link" data-scroll-target="#classes">Classes</a>
<ul class="collapse">
<li><a href="#axolotl.core.builders.rl.HFPPOTrainerBuilder" id="toc-axolotl.core.builders.rl.HFPPOTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.builders.rl.HFPPOTrainerBuilder">HFPPOTrainerBuilder</a></li>
<li><a href="#axolotl.core.builders.rl.HFRLTrainerBuilder" id="toc-axolotl.core.builders.rl.HFRLTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.builders.rl.HFRLTrainerBuilder">HFRLTrainerBuilder</a></li>
</ul></li>
</ul></li>
</ul>
</nav>
</div>
<!-- main -->
<main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block"></header>
<section id="axolotl.core.builders.rl" class="level1">
<h1>core.builders.rl</h1>
<p><code>core.builders.rl</code></p>
<p>Builder for RLHF trainers</p>
<section id="classes" class="level2">
<h2 class="anchored" data-anchor-id="classes">Classes</h2>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a href="#axolotl.core.builders.rl.HFPPOTrainerBuilder">HFPPOTrainerBuilder</a></td>
<td>HF Factory class for PPO Trainer</td>
</tr>
<tr class="even">
<td><a href="#axolotl.core.builders.rl.HFRLTrainerBuilder">HFRLTrainerBuilder</a></td>
<td>Trainer factory class for TRL-based RLHF trainers (e.g.&nbsp;DPO)</td>
</tr>
</tbody>
</table>
<section id="axolotl.core.builders.rl.HFPPOTrainerBuilder" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.builders.rl.HFPPOTrainerBuilder">HFPPOTrainerBuilder</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.builders.rl.HFPPOTrainerBuilder(cfg, model, tokenizer, processor<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>HF Factory class for PPO Trainer</p>
</section>
<section id="axolotl.core.builders.rl.HFRLTrainerBuilder" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.builders.rl.HFRLTrainerBuilder">HFRLTrainerBuilder</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.builders.rl.HFRLTrainerBuilder(cfg, model, tokenizer, processor<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Trainer factory class for TRL-based RLHF trainers (e.g.&nbsp;DPO)</p>
</section>
</section>
</section>
</main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">
window.document.addEventListener("DOMContentLoaded", function (event) {
const icon = "";
const anchorJS = new window.AnchorJS();
anchorJS.options = {
placement: 'right',
icon: icon
};
anchorJS.add('.anchored');
const isCodeAnnotation = (el) => {
for (const clz of el.classList) {
if (clz.startsWith('code-annotation-')) {
return true;
}
}
return false;
}
const onCopySuccess = function(e) {
// button target
const button = e.trigger;
// don't keep focus
button.blur();
// flash "checked"
button.classList.add('code-copy-button-checked');
var currentTitle = button.getAttribute("title");
button.setAttribute("title", "Copied!");
let tooltip;
if (window.bootstrap) {
button.setAttribute("data-bs-toggle", "tooltip");
button.setAttribute("data-bs-placement", "left");
button.setAttribute("data-bs-title", "Copied!");
tooltip = new bootstrap.Tooltip(button,
{ trigger: "manual",
customClass: "code-copy-button-tooltip",
offset: [0, -8]});
tooltip.show();
}
setTimeout(function() {
if (tooltip) {
tooltip.hide();
button.removeAttribute("data-bs-title");
button.removeAttribute("data-bs-toggle");
button.removeAttribute("data-bs-placement");
}
button.setAttribute("title", currentTitle);
button.classList.remove('code-copy-button-checked');
}, 1000);
// clear code selection
e.clearSelection();
}
const getTextToCopy = function(trigger) {
const codeEl = trigger.previousElementSibling.cloneNode(true);
for (const childEl of codeEl.children) {
if (isCodeAnnotation(childEl)) {
childEl.remove();
}
}
return codeEl.innerText;
}
const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
text: getTextToCopy
});
clipboard.on('success', onCopySuccess);
if (window.document.getElementById('quarto-embedded-source-code-modal')) {
const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
text: getTextToCopy,
container: window.document.getElementById('quarto-embedded-source-code-modal')
});
clipboardModal.on('success', onCopySuccess);
}
var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
var mailtoRegex = new RegExp(/^mailto:/);
var filterRegex = new RegExp("https:\/\/docs\.axolotl\.ai");
var isInternal = (href) => {
return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
}
// Inspect non-navigation links and adorn them if external
var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
for (var i=0; i<links.length; i++) {
const link = links[i];
if (!isInternal(link.href)) {
// undo the damage that might have been done by quarto-nav.js in the case of
// links that we want to consider external
if (link.dataset.originalHref !== undefined) {
link.href = link.dataset.originalHref;
}
}
}
function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
const config = {
allowHTML: true,
maxWidth: 500,
delay: 100,
arrow: false,
appendTo: function(el) {
return el.parentElement;
},
interactive: true,
interactiveBorder: 10,
theme: 'quarto',
placement: 'bottom-start',
};
if (contentFn) {
config.content = contentFn;
}
if (onTriggerFn) {
config.onTrigger = onTriggerFn;
}
if (onUntriggerFn) {
config.onUntrigger = onUntriggerFn;
}
window.tippy(el, config);
}
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
for (var i=0; i<noterefs.length; i++) {
const ref = noterefs[i];
tippyHover(ref, function() {
// use id or data attribute instead here
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
if (note) {
return note.innerHTML;
} else {
return "";
}
});
}
const xrefs = window.document.querySelectorAll('a.quarto-xref');
const processXRef = (id, note) => {
// Strip column container classes
const stripColumnClz = (el) => {
el.classList.remove("page-full", "page-columns");
if (el.children) {
for (const child of el.children) {
stripColumnClz(child);
}
}
}
stripColumnClz(note)
if (id === null || id.startsWith('sec-')) {
// Special case sections, only their first couple elements
const container = document.createElement("div");
if (note.children && note.children.length > 2) {
container.appendChild(note.children[0].cloneNode(true));
for (let i = 1; i < note.children.length; i++) {
const child = note.children[i];
if (child.tagName === "P" && child.innerText === "") {
continue;
} else {
container.appendChild(child.cloneNode(true));
break;
}
}
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(container);
}
return container.innerHTML
} else {
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(note);
}
return note.innerHTML;
}
} else {
// Remove any anchor links if they are present
const anchorLink = note.querySelector('a.anchorjs-link');
if (anchorLink) {
anchorLink.remove();
}
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(note);
}
if (note.classList.contains("callout")) {
return note.outerHTML;
} else {
return note.innerHTML;
}
}
}
for (var i=0; i<xrefs.length; i++) {
const xref = xrefs[i];
tippyHover(xref, undefined, function(instance) {
instance.disable();
let url = xref.getAttribute('href');
let hash = undefined;
if (url.startsWith('#')) {
hash = url;
} else {
try { hash = new URL(url).hash; } catch {}
}
if (hash) {
const id = hash.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
if (note !== null) {
try {
const html = processXRef(id, note.cloneNode(true));
instance.setContent(html);
} finally {
instance.enable();
instance.show();
}
} else {
// See if we can fetch this
fetch(url.split('#')[0])
.then(res => res.text())
.then(html => {
const parser = new DOMParser();
const htmlDoc = parser.parseFromString(html, "text/html");
const note = htmlDoc.getElementById(id);
if (note !== null) {
const html = processXRef(id, note);
instance.setContent(html);
}
}).finally(() => {
instance.enable();
instance.show();
});
}
} else {
// See if we can fetch a full url (with no hash to target)
// This is a special case and we should probably do some content thinning / targeting
fetch(url)
.then(res => res.text())
.then(html => {
const parser = new DOMParser();
const htmlDoc = parser.parseFromString(html, "text/html");
const note = htmlDoc.querySelector('main.content');
if (note !== null) {
// This should only happen for chapter cross references
// (since there is no id in the URL)
// remove the first header
if (note.children.length > 0 && note.children[0].tagName === "HEADER") {
note.children[0].remove();
}
const html = processXRef(null, note);
instance.setContent(html);
}
}).finally(() => {
instance.enable();
instance.show();
});
}
}, function(instance) {
});
}
let selectedAnnoteEl;
const selectorForAnnotation = ( cell, annotation) => {
let cellAttr = 'data-code-cell="' + cell + '"';
let lineAttr = 'data-code-annotation="' + annotation + '"';
const selector = 'span[' + cellAttr + '][' + lineAttr + ']';
return selector;
}
const selectCodeLines = (annoteEl) => {
const doc = window.document;
const targetCell = annoteEl.getAttribute("data-target-cell");
const targetAnnotation = annoteEl.getAttribute("data-target-annotation");
const annoteSpan = window.document.querySelector(selectorForAnnotation(targetCell, targetAnnotation));
const lines = annoteSpan.getAttribute("data-code-lines").split(",");
const lineIds = lines.map((line) => {
return targetCell + "-" + line;
})
let top = null;
let height = null;
let parent = null;
if (lineIds.length > 0) {
//compute the position of the single el (top and bottom and make a div)
const el = window.document.getElementById(lineIds[0]);
top = el.offsetTop;
height = el.offsetHeight;
parent = el.parentElement.parentElement;
if (lineIds.length > 1) {
const lastEl = window.document.getElementById(lineIds[lineIds.length - 1]);
const bottom = lastEl.offsetTop + lastEl.offsetHeight;
height = bottom - top;
}
if (top !== null && height !== null && parent !== null) {
// cook up a div (if necessary) and position it
let div = window.document.getElementById("code-annotation-line-highlight");
if (div === null) {
div = window.document.createElement("div");
div.setAttribute("id", "code-annotation-line-highlight");
div.style.position = 'absolute';
parent.appendChild(div);
}
div.style.top = top - 2 + "px";
div.style.height = height + 4 + "px";
div.style.left = 0;
let gutterDiv = window.document.getElementById("code-annotation-line-highlight-gutter");
if (gutterDiv === null) {
gutterDiv = window.document.createElement("div");
gutterDiv.setAttribute("id", "code-annotation-line-highlight-gutter");
gutterDiv.style.position = 'absolute';
const codeCell = window.document.getElementById(targetCell);
const gutter = codeCell.querySelector('.code-annotation-gutter');
gutter.appendChild(gutterDiv);
}
gutterDiv.style.top = top - 2 + "px";
gutterDiv.style.height = height + 4 + "px";
}
selectedAnnoteEl = annoteEl;
}
};
const unselectCodeLines = () => {
const elementsIds = ["code-annotation-line-highlight", "code-annotation-line-highlight-gutter"];
elementsIds.forEach((elId) => {
const div = window.document.getElementById(elId);
if (div) {
div.remove();
}
});
selectedAnnoteEl = undefined;
};
// Handle positioning of the toggle
window.addEventListener(
"resize",
throttle(() => {
elRect = undefined;
if (selectedAnnoteEl) {
selectCodeLines(selectedAnnoteEl);
}
}, 10)
);
function throttle(fn, ms) {
let throttle = false;
let timer;
return (...args) => {
if(!throttle) { // first call gets through
fn.apply(this, args);
throttle = true;
} else { // all the others get throttled
if(timer) clearTimeout(timer); // cancel #2
timer = setTimeout(() => {
fn.apply(this, args);
timer = throttle = false;
}, ms);
}
};
}
// Attach click handler to the DT
const annoteDls = window.document.querySelectorAll('dt[data-target-cell]');
for (const annoteDlNode of annoteDls) {
annoteDlNode.addEventListener('click', (event) => {
const clickedEl = event.target;
if (clickedEl !== selectedAnnoteEl) {
unselectCodeLines();
const activeEl = window.document.querySelector('dt[data-target-cell].code-annotation-active');
if (activeEl) {
activeEl.classList.remove('code-annotation-active');
}
selectCodeLines(clickedEl);
clickedEl.classList.add('code-annotation-active');
} else {
// Unselect the line
unselectCodeLines();
clickedEl.classList.remove('code-annotation-active');
}
});
}
const findCites = (el) => {
const parentEl = el.parentElement;
if (parentEl) {
const cites = parentEl.dataset.cites;
if (cites) {
return {
el,
cites: cites.split(' ')
};
} else {
return findCites(el.parentElement)
}
} else {
return undefined;
}
};
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
for (var i=0; i<bibliorefs.length; i++) {
const ref = bibliorefs[i];
const citeInfo = findCites(ref);
if (citeInfo) {
tippyHover(citeInfo.el, function() {
var popup = window.document.createElement('div');
citeInfo.cites.forEach(function(cite) {
var citeDiv = window.document.createElement('div');
citeDiv.classList.add('hanging-indent');
citeDiv.classList.add('csl-entry');
var biblioDiv = window.document.getElementById('ref-' + cite);
if (biblioDiv) {
citeDiv.innerHTML = biblioDiv.innerHTML;
}
popup.appendChild(citeDiv);
});
return popup.innerHTML;
});
}
}
});
</script>
</div> <!-- /content -->
</body></html>

View File

@@ -506,16 +506,15 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.core.datasets.chat.TokenizedChatDataset" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.datasets.chat.TokenizedChatDataset">TokenizedChatDataset</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.datasets.chat.TokenizedChatDataset(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> data,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> model_transform,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> message_transform<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> formatter<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> process_count<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> keep_in_memory<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> data,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model_transform,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> message_transform<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> formatter<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> process_count<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> keep_in_memory<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenized chat dataset</p>

View File

@@ -506,13 +506,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.core.trainers.base.AxolotlTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.base.AxolotlTrainer">AxolotlTrainer</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.base.AxolotlTrainer(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Extend the base Trainer for axolotl helpers</p>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods">Methods</h4>

View File

@@ -505,12 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.core.trainers.dpo.trainer.AxolotlDPOTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.dpo.trainer.AxolotlDPOTrainer">AxolotlDPOTrainer</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.dpo.trainer.AxolotlDPOTrainer(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.dpo.trainer.AxolotlDPOTrainer(<span class="op">*</span>args, dataset_tags<span class="op">=</span><span class="va">None</span>, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Extend the base DPOTrainer for axolotl helpers.</p>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods">Methods</h4>

View File

@@ -509,18 +509,17 @@ sequence parallel group.</p>
<section id="axolotl.core.trainers.grpo.sampler.SequenceParallelRepeatRandomSampler" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.sampler.SequenceParallelRepeatRandomSampler">SequenceParallelRepeatRandomSampler</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.sampler.SequenceParallelRepeatRandomSampler(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> dataset,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> mini_repeat_count,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> world_size,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> rank,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> batch_size<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> repeat_count<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> sequence_parallel_degree<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> shuffle<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> seed<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> drop_last<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> dataset,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> mini_repeat_count,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> world_size,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> rank,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> batch_size<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> repeat_count<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> sequence_parallel_degree<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> shuffle<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> seed<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> drop_last<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Sampler for GRPO training with sequence parallelism.</p>
<p>This sampler ensures:
- Ranks in the same sequence parallel (SP) group receive identical data.

View File

@@ -511,17 +511,17 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer">AxolotlGRPOSequenceParallelTrainer</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> reward_funcs,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> args<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> train_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> processing_class<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> reward_processing_classes<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> callbacks<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> optimizers<span class="op">=</span>(<span class="va">None</span>, <span class="va">None</span>),</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> peft_config<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> reward_funcs,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> args<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> train_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> processing_class<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> reward_processing_classes<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> callbacks<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> optimizers<span class="op">=</span>(<span class="va">None</span>, <span class="va">None</span>),</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> peft_config<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> optimizer_cls_and_kwargs<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Extend the base GRPOTrainer for sequence parallelism handling</p>
<section id="methods" class="level4">
@@ -550,7 +550,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</section>
<section id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer">AxolotlGRPOTrainer</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Extend the base GRPOTrainer for axolotl helpers</p>

View File

@@ -506,13 +506,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.core.trainers.mamba.AxolotlMambaTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.mamba.AxolotlMambaTrainer">AxolotlMambaTrainer</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.mamba.AxolotlMambaTrainer(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Mamba specific trainer to handle loss calculation</p>

View File

@@ -471,6 +471,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<ul class="collapse">
<li><a href="#classes" id="toc-classes" class="nav-link" data-scroll-target="#classes">Classes</a>
<ul class="collapse">
<li><a href="#axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin" id="toc-axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin" class="nav-link" data-scroll-target="#axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin">OptimizerInitMixin</a></li>
<li><a href="#axolotl.core.trainers.mixins.optimizer.OptimizerMixin" id="toc-axolotl.core.trainers.mixins.optimizer.OptimizerMixin" class="nav-link" data-scroll-target="#axolotl.core.trainers.mixins.optimizer.OptimizerMixin">OptimizerMixin</a></li>
</ul></li>
</ul></li>
@@ -498,14 +499,24 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</thead>
<tbody>
<tr class="odd">
<td><a href="#axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin">OptimizerInitMixin</a></td>
<td>Mixin to handle common optimizer initialization logic for Trainers (mostly TRL) that do not</td>
</tr>
<tr class="even">
<td><a href="#axolotl.core.trainers.mixins.optimizer.OptimizerMixin">OptimizerMixin</a></td>
<td>Mixin class for shared handling of building custom optimizers</td>
</tr>
</tbody>
</table>
<section id="axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin">OptimizerInitMixin</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.mixins.optimizer.OptimizerInitMixin(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Mixin to handle common optimizer initialization logic for Trainers (mostly TRL) that do not
accept optimizer_cls_and_kwargs as kwarg in constructor.</p>
</section>
<section id="axolotl.core.trainers.mixins.optimizer.OptimizerMixin" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.mixins.optimizer.OptimizerMixin">OptimizerMixin</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.mixins.optimizer.OptimizerMixin()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainers.mixins.optimizer.OptimizerMixin()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Mixin class for shared handling of building custom optimizers</p>

View File

@@ -505,7 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.core.trainers.relora.ReLoRATrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.relora.ReLoRATrainer">ReLoRATrainer</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.relora.ReLoRATrainer(<span class="va">self</span>, <span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.relora.ReLoRATrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Trainer subclass that uses the <code>OneCycleLR</code> scheduler</p>

View File

@@ -530,84 +530,32 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.core.trainers.trl.AxolotlCPOTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlCPOTrainer">AxolotlCPOTrainer</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlCPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlCPOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Extend the base CPOTrainer for axolotl helpers</p>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a href="#axolotl.core.trainers.trl.AxolotlCPOTrainer.get_batch_loss_metrics">get_batch_loss_metrics</a></td>
<td>Compute the CPO loss and other metrics for the given batch of inputs for train or test.</td>
</tr>
</tbody>
</table>
<section id="axolotl.core.trainers.trl.AxolotlCPOTrainer.get_batch_loss_metrics" class="level5">
<h5 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlCPOTrainer.get_batch_loss_metrics">get_batch_loss_metrics</h5>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlCPOTrainer.get_batch_loss_metrics(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> batch,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_eval<span class="op">=</span><span class="st">'train'</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Compute the CPO loss and other metrics for the given batch of inputs for train or test.</p>
</section>
</section>
</section>
<section id="axolotl.core.trainers.trl.AxolotlKTOTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlKTOTrainer">AxolotlKTOTrainer</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlKTOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlKTOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Extend the base KTOTrainer for axolotl helpers</p>
</section>
<section id="axolotl.core.trainers.trl.AxolotlORPOTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlORPOTrainer">AxolotlORPOTrainer</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlORPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlORPOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Extend the base ORPOTrainer for axolotl helpers</p>
<section id="methods-1" class="level4">
<h4 class="anchored" data-anchor-id="methods-1">Methods</h4>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><a href="#axolotl.core.trainers.trl.AxolotlORPOTrainer.get_batch_loss_metrics">get_batch_loss_metrics</a></td>
<td>Compute the ORPO loss and other metrics for the given batch of inputs for train or test.</td>
</tr>
</tbody>
</table>
<section id="axolotl.core.trainers.trl.AxolotlORPOTrainer.get_batch_loss_metrics" class="level5">
<h5 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlORPOTrainer.get_batch_loss_metrics">get_batch_loss_metrics</h5>
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlORPOTrainer.get_batch_loss_metrics(</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> model,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> batch,</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> train_eval<span class="op">=</span><span class="st">'train'</span>,</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Compute the ORPO loss and other metrics for the given batch of inputs for train or test.</p>
</section>
</section>
</section>
<section id="axolotl.core.trainers.trl.AxolotlPRMTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlPRMTrainer">AxolotlPRMTrainer</h3>
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlPRMTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlPRMTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Extend the base trl.PRMTrainer for axolotl helpers</p>
</section>
<section id="axolotl.core.trainers.trl.AxolotlRewardTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlRewardTrainer">AxolotlRewardTrainer</h3>
<div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlRewardTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlRewardTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Extend the base RewardTrainer for axolotl helpers</p>
</section>
<section id="axolotl.core.trainers.trl.TRLPPOTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.TRLPPOTrainer">TRLPPOTrainer</h3>
<div class="sourceCode" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.TRLPPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.TRLPPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Wrapper for TRL PPO trainer to handle customizations</p>

View File

@@ -536,326 +536,314 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.core.training_args.AxolotlCPOConfig" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlCPOConfig">AxolotlCPOConfig</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlCPOConfig(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-24"><a href="#cb1-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-25"><a href="#cb1-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-26"><a href="#cb1-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-27"><a href="#cb1-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-28"><a href="#cb1-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb1-29"><a href="#cb1-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-30"><a href="#cb1-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-31"><a href="#cb1-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-32"><a href="#cb1-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-33"><a href="#cb1-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-34"><a href="#cb1-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-35"><a href="#cb1-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-36"><a href="#cb1-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-37"><a href="#cb1-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-38"><a href="#cb1-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-39"><a href="#cb1-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-40"><a href="#cb1-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-41"><a href="#cb1-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-42"><a href="#cb1-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb1-43"><a href="#cb1-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb1-44"><a href="#cb1-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-45"><a href="#cb1-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-46"><a href="#cb1-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-47"><a href="#cb1-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-48"><a href="#cb1-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-49"><a href="#cb1-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-50"><a href="#cb1-50" aria-hidden="true" tabindex="-1"></a> simpo_gamma<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-51"><a href="#cb1-51" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-24"><a href="#cb1-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-25"><a href="#cb1-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-26"><a href="#cb1-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-27"><a href="#cb1-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb1-28"><a href="#cb1-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-29"><a href="#cb1-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-30"><a href="#cb1-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-31"><a href="#cb1-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-32"><a href="#cb1-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-33"><a href="#cb1-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-34"><a href="#cb1-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-35"><a href="#cb1-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-36"><a href="#cb1-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-37"><a href="#cb1-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-38"><a href="#cb1-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-39"><a href="#cb1-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-40"><a href="#cb1-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb1-41"><a href="#cb1-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb1-42"><a href="#cb1-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-43"><a href="#cb1-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-44"><a href="#cb1-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-45"><a href="#cb1-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-46"><a href="#cb1-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-47"><a href="#cb1-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-48"><a href="#cb1-48" aria-hidden="true" tabindex="-1"></a> simpo_gamma<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-49"><a href="#cb1-49" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>CPO config for CPO training</p>
</section>
<section id="axolotl.core.training_args.AxolotlKTOConfig" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlKTOConfig">AxolotlKTOConfig</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlKTOConfig(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb2-13"><a href="#cb2-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-14"><a href="#cb2-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-15"><a href="#cb2-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-16"><a href="#cb2-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-17"><a href="#cb2-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb2-18"><a href="#cb2-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb2-19"><a href="#cb2-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb2-20"><a href="#cb2-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-21"><a href="#cb2-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-22"><a href="#cb2-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-23"><a href="#cb2-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-24"><a href="#cb2-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-25"><a href="#cb2-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-26"><a href="#cb2-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-27"><a href="#cb2-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-28"><a href="#cb2-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb2-29"><a href="#cb2-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-30"><a href="#cb2-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-31"><a href="#cb2-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-32"><a href="#cb2-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-33"><a href="#cb2-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-34"><a href="#cb2-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-35"><a href="#cb2-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-36"><a href="#cb2-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-37"><a href="#cb2-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-38"><a href="#cb2-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-39"><a href="#cb2-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-40"><a href="#cb2-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-41"><a href="#cb2-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-42"><a href="#cb2-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb2-43"><a href="#cb2-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb2-44"><a href="#cb2-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-45"><a href="#cb2-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-46"><a href="#cb2-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-47"><a href="#cb2-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-48"><a href="#cb2-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-49"><a href="#cb2-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-50"><a href="#cb2-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-13"><a href="#cb2-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-14"><a href="#cb2-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-15"><a href="#cb2-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-16"><a href="#cb2-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb2-17"><a href="#cb2-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb2-18"><a href="#cb2-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb2-19"><a href="#cb2-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-20"><a href="#cb2-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-21"><a href="#cb2-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-22"><a href="#cb2-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-23"><a href="#cb2-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-24"><a href="#cb2-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-25"><a href="#cb2-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-26"><a href="#cb2-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-27"><a href="#cb2-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb2-28"><a href="#cb2-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-29"><a href="#cb2-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-30"><a href="#cb2-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-31"><a href="#cb2-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-32"><a href="#cb2-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-33"><a href="#cb2-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-34"><a href="#cb2-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-35"><a href="#cb2-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-36"><a href="#cb2-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-37"><a href="#cb2-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-38"><a href="#cb2-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-39"><a href="#cb2-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-40"><a href="#cb2-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb2-41"><a href="#cb2-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb2-42"><a href="#cb2-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-43"><a href="#cb2-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-44"><a href="#cb2-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-45"><a href="#cb2-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-46"><a href="#cb2-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-47"><a href="#cb2-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-48"><a href="#cb2-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>KTO config for KTO training</p>
</section>
<section id="axolotl.core.training_args.AxolotlORPOConfig" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlORPOConfig">AxolotlORPOConfig</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlORPOConfig(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb3-13"><a href="#cb3-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb3-14"><a href="#cb3-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-15"><a href="#cb3-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-16"><a href="#cb3-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-17"><a href="#cb3-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb3-18"><a href="#cb3-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb3-19"><a href="#cb3-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb3-20"><a href="#cb3-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-21"><a href="#cb3-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-22"><a href="#cb3-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-23"><a href="#cb3-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb3-24"><a href="#cb3-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-25"><a href="#cb3-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-26"><a href="#cb3-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-27"><a href="#cb3-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-28"><a href="#cb3-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb3-29"><a href="#cb3-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-30"><a href="#cb3-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-31"><a href="#cb3-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-32"><a href="#cb3-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-33"><a href="#cb3-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-34"><a href="#cb3-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-35"><a href="#cb3-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-36"><a href="#cb3-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-37"><a href="#cb3-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-38"><a href="#cb3-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-39"><a href="#cb3-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-40"><a href="#cb3-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-41"><a href="#cb3-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-42"><a href="#cb3-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb3-43"><a href="#cb3-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb3-44"><a href="#cb3-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-45"><a href="#cb3-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-46"><a href="#cb3-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-47"><a href="#cb3-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-48"><a href="#cb3-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-49"><a href="#cb3-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-50"><a href="#cb3-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb3-13"><a href="#cb3-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-14"><a href="#cb3-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-15"><a href="#cb3-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-16"><a href="#cb3-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb3-17"><a href="#cb3-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb3-18"><a href="#cb3-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb3-19"><a href="#cb3-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-20"><a href="#cb3-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-21"><a href="#cb3-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-22"><a href="#cb3-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb3-23"><a href="#cb3-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-24"><a href="#cb3-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-25"><a href="#cb3-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-26"><a href="#cb3-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-27"><a href="#cb3-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb3-28"><a href="#cb3-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-29"><a href="#cb3-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-30"><a href="#cb3-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-31"><a href="#cb3-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-32"><a href="#cb3-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-33"><a href="#cb3-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-34"><a href="#cb3-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-35"><a href="#cb3-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-36"><a href="#cb3-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-37"><a href="#cb3-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-38"><a href="#cb3-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-39"><a href="#cb3-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-40"><a href="#cb3-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb3-41"><a href="#cb3-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb3-42"><a href="#cb3-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-43"><a href="#cb3-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-44"><a href="#cb3-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-45"><a href="#cb3-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-46"><a href="#cb3-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-47"><a href="#cb3-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb3-48"><a href="#cb3-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>ORPO config for ORPO training</p>
</section>
<section id="axolotl.core.training_args.AxolotlPRMConfig" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlPRMConfig">AxolotlPRMConfig</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlPRMConfig(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-17"><a href="#cb4-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb4-18"><a href="#cb4-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb4-19"><a href="#cb4-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb4-20"><a href="#cb4-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-21"><a href="#cb4-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-22"><a href="#cb4-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-23"><a href="#cb4-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb4-24"><a href="#cb4-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-25"><a href="#cb4-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-26"><a href="#cb4-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-27"><a href="#cb4-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-28"><a href="#cb4-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb4-29"><a href="#cb4-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-30"><a href="#cb4-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-31"><a href="#cb4-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-32"><a href="#cb4-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-33"><a href="#cb4-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-34"><a href="#cb4-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-35"><a href="#cb4-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-36"><a href="#cb4-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-37"><a href="#cb4-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-38"><a href="#cb4-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-39"><a href="#cb4-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-40"><a href="#cb4-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-41"><a href="#cb4-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-42"><a href="#cb4-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb4-43"><a href="#cb4-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb4-44"><a href="#cb4-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-45"><a href="#cb4-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-46"><a href="#cb4-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-47"><a href="#cb4-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-48"><a href="#cb4-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-49"><a href="#cb4-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-50"><a href="#cb4-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb4-17"><a href="#cb4-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb4-18"><a href="#cb4-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb4-19"><a href="#cb4-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-20"><a href="#cb4-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-21"><a href="#cb4-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-22"><a href="#cb4-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb4-23"><a href="#cb4-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-24"><a href="#cb4-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-25"><a href="#cb4-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-26"><a href="#cb4-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-27"><a href="#cb4-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb4-28"><a href="#cb4-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-29"><a href="#cb4-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-30"><a href="#cb4-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-31"><a href="#cb4-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-32"><a href="#cb4-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-33"><a href="#cb4-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-34"><a href="#cb4-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-35"><a href="#cb4-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-36"><a href="#cb4-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-37"><a href="#cb4-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-38"><a href="#cb4-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-39"><a href="#cb4-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-40"><a href="#cb4-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb4-41"><a href="#cb4-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb4-42"><a href="#cb4-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-43"><a href="#cb4-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-44"><a href="#cb4-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-45"><a href="#cb4-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-46"><a href="#cb4-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-47"><a href="#cb4-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-48"><a href="#cb4-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>PRM config for PRM training</p>
</section>
<section id="axolotl.core.training_args.AxolotlRewardConfig" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlRewardConfig">AxolotlRewardConfig</h3>
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlRewardConfig(</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-17"><a href="#cb5-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb5-18"><a href="#cb5-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb5-19"><a href="#cb5-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb5-20"><a href="#cb5-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-21"><a href="#cb5-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-22"><a href="#cb5-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-23"><a href="#cb5-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb5-24"><a href="#cb5-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-25"><a href="#cb5-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-26"><a href="#cb5-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-27"><a href="#cb5-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-28"><a href="#cb5-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb5-29"><a href="#cb5-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-30"><a href="#cb5-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-31"><a href="#cb5-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-32"><a href="#cb5-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-33"><a href="#cb5-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-34"><a href="#cb5-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-35"><a href="#cb5-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-36"><a href="#cb5-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-37"><a href="#cb5-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-38"><a href="#cb5-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-39"><a href="#cb5-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-40"><a href="#cb5-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-41"><a href="#cb5-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-42"><a href="#cb5-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb5-43"><a href="#cb5-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb5-44"><a href="#cb5-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-45"><a href="#cb5-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-46"><a href="#cb5-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-47"><a href="#cb5-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-48"><a href="#cb5-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-49"><a href="#cb5-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-50"><a href="#cb5-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb5-17"><a href="#cb5-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb5-18"><a href="#cb5-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb5-19"><a href="#cb5-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-20"><a href="#cb5-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-21"><a href="#cb5-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-22"><a href="#cb5-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb5-23"><a href="#cb5-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-24"><a href="#cb5-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-25"><a href="#cb5-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-26"><a href="#cb5-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-27"><a href="#cb5-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb5-28"><a href="#cb5-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-29"><a href="#cb5-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-30"><a href="#cb5-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-31"><a href="#cb5-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-32"><a href="#cb5-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-33"><a href="#cb5-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-34"><a href="#cb5-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-35"><a href="#cb5-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-36"><a href="#cb5-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-37"><a href="#cb5-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-38"><a href="#cb5-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-39"><a href="#cb5-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-40"><a href="#cb5-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb5-41"><a href="#cb5-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb5-42"><a href="#cb5-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-43"><a href="#cb5-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-44"><a href="#cb5-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-45"><a href="#cb5-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-46"><a href="#cb5-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-47"><a href="#cb5-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb5-48"><a href="#cb5-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Reward config for Reward training</p>
</section>
<section id="axolotl.core.training_args.AxolotlTrainingArguments" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlTrainingArguments">AxolotlTrainingArguments</h3>
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlTrainingArguments(</span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-16"><a href="#cb6-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-17"><a href="#cb6-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb6-18"><a href="#cb6-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb6-19"><a href="#cb6-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb6-20"><a href="#cb6-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-21"><a href="#cb6-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-22"><a href="#cb6-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-23"><a href="#cb6-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb6-24"><a href="#cb6-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-25"><a href="#cb6-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-26"><a href="#cb6-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-27"><a href="#cb6-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-28"><a href="#cb6-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb6-29"><a href="#cb6-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-30"><a href="#cb6-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-31"><a href="#cb6-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-32"><a href="#cb6-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-33"><a href="#cb6-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-34"><a href="#cb6-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-35"><a href="#cb6-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-36"><a href="#cb6-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-37"><a href="#cb6-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-38"><a href="#cb6-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-39"><a href="#cb6-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-40"><a href="#cb6-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-41"><a href="#cb6-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-42"><a href="#cb6-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb6-43"><a href="#cb6-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb6-44"><a href="#cb6-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-45"><a href="#cb6-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-46"><a href="#cb6-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-47"><a href="#cb6-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-48"><a href="#cb6-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-49"><a href="#cb6-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-50"><a href="#cb6-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-16"><a href="#cb6-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb6-17"><a href="#cb6-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb6-18"><a href="#cb6-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb6-19"><a href="#cb6-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-20"><a href="#cb6-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-21"><a href="#cb6-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-22"><a href="#cb6-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb6-23"><a href="#cb6-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-24"><a href="#cb6-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-25"><a href="#cb6-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-26"><a href="#cb6-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-27"><a href="#cb6-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb6-28"><a href="#cb6-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-29"><a href="#cb6-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-30"><a href="#cb6-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-31"><a href="#cb6-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-32"><a href="#cb6-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-33"><a href="#cb6-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-34"><a href="#cb6-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-35"><a href="#cb6-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-36"><a href="#cb6-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-37"><a href="#cb6-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-38"><a href="#cb6-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-39"><a href="#cb6-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-40"><a href="#cb6-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb6-41"><a href="#cb6-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb6-42"><a href="#cb6-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-43"><a href="#cb6-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-44"><a href="#cb6-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-45"><a href="#cb6-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-46"><a href="#cb6-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-47"><a href="#cb6-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb6-48"><a href="#cb6-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Training arguments for Causal trainer</p>
<p>This code is duplicated due to HF TrainingArguments not setting output_dir with a
default value so it cant be used as a mixin.</p>
@@ -863,55 +851,53 @@ default value so it cant be used as a mixin.</p>
<section id="axolotl.core.training_args.AxolotlTrainingMixins" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlTrainingMixins">AxolotlTrainingMixins</h3>
<div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlTrainingMixins(</span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-16"><a href="#cb7-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-17"><a href="#cb7-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb7-18"><a href="#cb7-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb7-19"><a href="#cb7-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb7-20"><a href="#cb7-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-21"><a href="#cb7-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-22"><a href="#cb7-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-23"><a href="#cb7-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb7-24"><a href="#cb7-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-25"><a href="#cb7-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-26"><a href="#cb7-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-27"><a href="#cb7-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-28"><a href="#cb7-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb7-29"><a href="#cb7-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-30"><a href="#cb7-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-31"><a href="#cb7-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-32"><a href="#cb7-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-33"><a href="#cb7-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-34"><a href="#cb7-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-35"><a href="#cb7-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-36"><a href="#cb7-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-37"><a href="#cb7-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-38"><a href="#cb7-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-39"><a href="#cb7-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-40"><a href="#cb7-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-41"><a href="#cb7-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-42"><a href="#cb7-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb7-43"><a href="#cb7-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb7-44"><a href="#cb7-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-45"><a href="#cb7-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-46"><a href="#cb7-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-47"><a href="#cb7-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-48"><a href="#cb7-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-49"><a href="#cb7-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-50"><a href="#cb7-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-16"><a href="#cb7-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
<span id="cb7-17"><a href="#cb7-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
<span id="cb7-18"><a href="#cb7-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
<span id="cb7-19"><a href="#cb7-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-20"><a href="#cb7-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-21"><a href="#cb7-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-22"><a href="#cb7-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb7-23"><a href="#cb7-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-24"><a href="#cb7-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-25"><a href="#cb7-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-26"><a href="#cb7-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-27"><a href="#cb7-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
<span id="cb7-28"><a href="#cb7-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-29"><a href="#cb7-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-30"><a href="#cb7-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-31"><a href="#cb7-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb7-32"><a href="#cb7-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-33"><a href="#cb7-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-34"><a href="#cb7-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-35"><a href="#cb7-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-36"><a href="#cb7-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-37"><a href="#cb7-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-38"><a href="#cb7-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-39"><a href="#cb7-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-40"><a href="#cb7-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb7-41"><a href="#cb7-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb7-42"><a href="#cb7-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-43"><a href="#cb7-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-44"><a href="#cb7-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-45"><a href="#cb7-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-46"><a href="#cb7-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-47"><a href="#cb7-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb7-48"><a href="#cb7-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Mixin class for the Axolotl training args.</p>

View File

@@ -510,7 +510,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.datasets.ConstantLengthDataset" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.datasets.ConstantLengthDataset">ConstantLengthDataset</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>datasets.ConstantLengthDataset(<span class="va">self</span>, tokenizer, datasets, seq_length<span class="op">=</span><span class="dv">2048</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>datasets.ConstantLengthDataset(tokenizer, datasets, seq_length<span class="op">=</span><span class="dv">2048</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Iterable dataset that returns constant length chunks of tokens from stream of text files.
Args:
tokenizer (Tokenizer): The processor used for processing the data.
@@ -520,13 +520,12 @@ seq_length (int): Length of token sequences to return.</p>
<section id="axolotl.datasets.TokenizedPromptDataset" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.datasets.TokenizedPromptDataset">TokenizedPromptDataset</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>datasets.TokenizedPromptDataset(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompt_tokenizer,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> dataset,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> process_count<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> keep_in_memory<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompt_tokenizer,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> dataset,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> process_count<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> keep_in_memory<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Dataset that returns tokenized prompts from a stream of text files.
Args:
prompt_tokenizer (PromptTokenizingStrategy): The prompt tokenizing method for processing the data.

View File

@@ -492,8 +492,16 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<td>Common logging module for axolotl</td>
</tr>
<tr class="odd">
<td><a href="../../docs/api/core.trainer_builder.html#axolotl.core.trainer_builder">core.trainer_builder</a></td>
<td>Builder for the training args and trainer</td>
<td><a href="../../docs/api/core.builders.base.html#axolotl.core.builders.base">core.builders.base</a></td>
<td>Base class for trainer builder</td>
</tr>
<tr class="even">
<td><a href="../../docs/api/core.builders.causal.html#axolotl.core.builders.causal">core.builders.causal</a></td>
<td>Builder for causal trainers</td>
</tr>
<tr class="odd">
<td><a href="../../docs/api/core.builders.rl.html#axolotl.core.builders.rl">core.builders.rl</a></td>
<td>Builder for RLHF trainers</td>
</tr>
<tr class="even">
<td><a href="../../docs/api/core.training_args.html#axolotl.core.training_args">core.training_args</a></td>

View File

@@ -527,7 +527,7 @@ Plugins can be used to integrate third-party models, modify the training process
</section>
<section id="axolotl.integrations.base.BasePlugin" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.integrations.base.BasePlugin">BasePlugin</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>integrations.base.BasePlugin(<span class="va">self</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>integrations.base.BasePlugin()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Base class for all plugins. Defines the interface for plugin methods.</p>
<p>A plugin is a reusable, modular, and self-contained piece of code that extends
the functionality of Axolotl. Plugins can be used to integrate third-party models,

View File

@@ -506,13 +506,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.integrations.kd.trainer.AxolotlKDTrainer" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.integrations.kd.trainer.AxolotlKDTrainer">AxolotlKDTrainer</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>integrations.kd.trainer.AxolotlKDTrainer(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Custom trainer subclass for Knowledge Distillation (KD)</p>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods">Methods</h4>

View File

@@ -507,14 +507,13 @@ models.</p>
<section id="axolotl.loaders.model.ModelLoader" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.loaders.model.ModelLoader">ModelLoader</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>loaders.model.ModelLoader(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> inference<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> reference_model<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> cfg,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> inference<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> reference_model<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Manages model configuration, initialization and application of patches during
model loading.</p>
<p>This class orchestrates the entire process of loading a model from configuration to

View File

@@ -506,7 +506,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.loaders.patch_manager.PatchManager" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.loaders.patch_manager.PatchManager">PatchManager</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>loaders.patch_manager.PatchManager(<span class="va">self</span>, cfg, model_config, inference<span class="op">=</span><span class="va">False</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>loaders.patch_manager.PatchManager(cfg, model_config, inference<span class="op">=</span><span class="va">False</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Manages the application of patches during the model loading process.</p>
<section id="attributes" class="level4">
<h4 class="anchored" data-anchor-id="attributes">Attributes</h4>

View File

@@ -510,23 +510,18 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.monkeypatch.attention.mllama.MllamaTextCrossFlashAttention2" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.attention.mllama.MllamaTextCrossFlashAttention2">MllamaTextCrossFlashAttention2</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.attention.mllama.MllamaTextCrossFlashAttention2(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.attention.mllama.MllamaTextCrossFlashAttention2(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Mllama flash cross-attention module. This module inherits from <code>MllamaTextCrossAttention</code> and
implements the forward pass using Flash Attention for improved performance.</p>
</section>
<section id="axolotl.monkeypatch.attention.mllama.MllamaTextSelfFlashAttention2" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.attention.mllama.MllamaTextSelfFlashAttention2">MllamaTextSelfFlashAttention2</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.attention.mllama.MllamaTextSelfFlashAttention2(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> config,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> layer_idx,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> config,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> layer_idx,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Mllama flash self-attention module. This module inherits from <code>MllamaTextSelfAttention</code> and
implements the forward pass using Flash Attention for improved performance.</p>

View File

@@ -572,11 +572,10 @@ Advanced disk-based gradient checkpointer with prefetching.</p>
<section id="axolotl.monkeypatch.gradient_checkpointing.offload_disk.DiskOffloadManager" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.gradient_checkpointing.offload_disk.DiskOffloadManager">DiskOffloadManager</h3>
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.gradient_checkpointing.offload_disk.DiskOffloadManager(</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> prefetch_size<span class="op">=</span><span class="dv">3</span>,</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> prefetch_to_gpu<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> save_workers<span class="op">=</span><span class="dv">4</span>,</span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> prefetch_size<span class="op">=</span><span class="dv">3</span>,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> prefetch_to_gpu<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> save_workers<span class="op">=</span><span class="dv">4</span>,</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Manages offloaded tensors and handles prefetching in a separate thread.
Includes synchronization to prevent race conditions.</p>
<section id="methods-1" class="level4">

View File

@@ -516,7 +516,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.monkeypatch.llama_attn_hijack_flash.FusedAttention" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.llama_attn_hijack_flash.FusedAttention">FusedAttention</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.llama_attn_hijack_flash.FusedAttention(<span class="va">self</span>, config, q, k, v, o)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.llama_attn_hijack_flash.FusedAttention(config, q, k, v, o)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Fused QKV Attention layer for incrementally improved training efficiency</p>
</section>
<section id="axolotl.monkeypatch.llama_attn_hijack_flash.LlamaDecoderLayer" class="level3">

View File

@@ -513,7 +513,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.monkeypatch.lora_kernels.FakeMLP" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.lora_kernels.FakeMLP">FakeMLP</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.lora_kernels.FakeMLP(<span class="va">self</span>, gate_proj, up_proj, down_proj)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.lora_kernels.FakeMLP(gate_proj, up_proj, down_proj)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>placeholder MLP for triton patching</p>
</section>
</section>

View File

@@ -510,20 +510,19 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.monkeypatch.relora.ReLoRACallback" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.relora.ReLoRACallback">ReLoRACallback</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.relora.ReLoRACallback(<span class="va">self</span>, cfg)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.relora.ReLoRACallback(cfg)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Callback to merge LoRA weights into the base model and save full-weight checkpoints</p>
</section>
<section id="axolotl.monkeypatch.relora.ReLoRAScheduler" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.relora.ReLoRAScheduler">ReLoRAScheduler</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.relora.ReLoRAScheduler(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> inner_schedule,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> relora_steps,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> warmup_steps,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> anneal_steps<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> min_lr_scale<span class="op">=</span><span class="fl">0.001</span>,</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> inner_schedule,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> relora_steps,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> warmup_steps,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> anneal_steps<span class="op">=</span><span class="dv">1</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> min_lr_scale<span class="op">=</span><span class="fl">0.001</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Wraps another scheduler to apply per-lora-restart learning rate warmups.</p>

View File

@@ -525,42 +525,39 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.prompt_strategies.alpaca_chat.AlpacaChatPrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.AlpacaChatPrompter">AlpacaChatPrompter</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.AlpacaChatPrompter(<span class="va">self</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.AlpacaChatPrompter()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Alpaca Chat Prompter extending the system prompt to for chat-instruct answers</p>
</section>
<section id="axolotl.prompt_strategies.alpaca_chat.AlpacaConcisePrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.AlpacaConcisePrompter">AlpacaConcisePrompter</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.AlpacaConcisePrompter(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Alpaca Prompter extending the system prompt to ask for concise chat-instruct answers</p>
</section>
<section id="axolotl.prompt_strategies.alpaca_chat.AlpacaQAPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.AlpacaQAPromptTokenizingStrategy">AlpacaQAPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.AlpacaQAPromptTokenizingStrategy(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for AlpacaQA</p>
</section>
<section id="axolotl.prompt_strategies.alpaca_chat.CamelAIPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.CamelAIPromptTokenizingStrategy">CamelAIPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.CamelAIPromptTokenizingStrategy(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for CamelAI datasets</p>
</section>
<section id="axolotl.prompt_strategies.alpaca_chat.NoSystemPrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.NoSystemPrompter">NoSystemPrompter</h3>
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.NoSystemPrompter(<span class="va">self</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.NoSystemPrompter()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Null Prompter with no system prompts</p>

View File

@@ -521,39 +521,35 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_strategies.alpaca_w_system.InstructionWSystemPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_w_system.InstructionWSystemPromptTokenizingStrategy">InstructionWSystemPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_w_system.InstructionWSystemPromptTokenizingStrategy(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for instruction-based prompts.</p>
</section>
<section id="axolotl.prompt_strategies.alpaca_w_system.OpenOrcaPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_w_system.OpenOrcaPromptTokenizingStrategy">OpenOrcaPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_w_system.OpenOrcaPromptTokenizingStrategy(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for OpenOrca datasets</p>
</section>
<section id="axolotl.prompt_strategies.alpaca_w_system.OpenOrcaSystemDataPrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_w_system.OpenOrcaSystemDataPrompter">OpenOrcaSystemDataPrompter</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_w_system.OpenOrcaSystemDataPrompter(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Alpaca Style Prompter that uses system prompts from the dataset, with OpenOrca prompts</p>
</section>
<section id="axolotl.prompt_strategies.alpaca_w_system.SystemDataPrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_w_system.SystemDataPrompter">SystemDataPrompter</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_w_system.SystemDataPrompter(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Alpaca Style Prompter that uses system prompts from the dataset</p>

View File

@@ -516,35 +516,33 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_strategies.chat_template.ChatTemplatePrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.chat_template.ChatTemplatePrompter">ChatTemplatePrompter</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.chat_template.ChatTemplatePrompter(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> chat_template,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> message_property_mappings<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> message_field_training<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> message_field_training_detail<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> field_messages<span class="op">=</span><span class="st">'messages'</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> field_system<span class="op">=</span><span class="st">'system'</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> roles<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> drop_system_message<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> chat_template,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> message_property_mappings<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> message_field_training<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> message_field_training_detail<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> field_messages<span class="op">=</span><span class="st">'messages'</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> field_system<span class="op">=</span><span class="st">'system'</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> roles<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> drop_system_message<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Prompter for HF chat templates</p>
</section>
<section id="axolotl.prompt_strategies.chat_template.ChatTemplateStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.chat_template.ChatTemplateStrategy">ChatTemplateStrategy</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.chat_template.ChatTemplateStrategy(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sequence_len,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> roles_to_train<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> train_on_eos<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> train_on_eot<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> eot_tokens<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a> split_thinking<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sequence_len,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> roles_to_train<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> train_on_eos<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> train_on_eot<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> eot_tokens<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> split_thinking<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for instruction-based prompts.</p>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods">Methods</h4>

View File

@@ -511,11 +511,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_strategies.completion.CompletionPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.completion.CompletionPromptTokenizingStrategy">CompletionPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.completion.CompletionPromptTokenizingStrategy(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for Completion prompts.</p>
</section>
<section id="axolotl.prompt_strategies.completion.CompletionPrompter" class="level3">

View File

@@ -516,11 +516,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_strategies.input_output.RawInputOutputStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.input_output.RawInputOutputStrategy">RawInputOutputStrategy</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.input_output.RawInputOutputStrategy(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> eos_token<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> eos_token<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Prompt Strategy class for input/output pairs</p>

View File

@@ -528,24 +528,19 @@ For a custom system message, the first “from” can be “system” (followed
</table>
<section id="axolotl.prompt_strategies.llama2_chat.LLama2ChatTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.llama2_chat.LLama2ChatTokenizingStrategy">LLama2ChatTokenizingStrategy</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.llama2_chat.LLama2ChatTokenizingStrategy(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.llama2_chat.LLama2ChatTokenizingStrategy(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for Llama2 prompts.
adapted from https://github.com/lm-sys/FastChat/blob/main/fastchat/train/train.py</p>
</section>
<section id="axolotl.prompt_strategies.llama2_chat.Llama2ChatConversation" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.llama2_chat.Llama2ChatConversation">Llama2ChatConversation</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.llama2_chat.Llama2ChatConversation(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> name<span class="op">=</span><span class="st">'llama2'</span>,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> system<span class="op">=</span><span class="st">"[INST] &lt;&lt;SYS&gt;&gt;</span><span class="ch">\n</span><span class="st">You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.</span><span class="ch">\n\n</span><span class="st">If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.</span><span class="ch">\n</span><span class="st">&lt;&lt;/SYS&gt;&gt;</span><span class="ch">\n\n</span><span class="st">"</span>,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> roles<span class="op">=</span>(<span class="st">'[INST]'</span>, <span class="st">'[/INST]'</span>),</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> messages<span class="op">=</span><span class="bu">list</span>(),</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> offset<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> name<span class="op">=</span><span class="st">'llama2'</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> system<span class="op">=</span><span class="st">"[INST] &lt;&lt;SYS&gt;&gt;</span><span class="ch">\n</span><span class="st">You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.</span><span class="ch">\n\n</span><span class="st">If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.</span><span class="ch">\n</span><span class="st">&lt;&lt;/SYS&gt;&gt;</span><span class="ch">\n\n</span><span class="st">"</span>,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> roles<span class="op">=</span>(<span class="st">'[INST]'</span>, <span class="st">'[/INST]'</span>),</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> messages<span class="op">=</span><span class="bu">list</span>(),</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> offset<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>A class that manages prompt templates and keeps all conversation history.
copied from https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py</p>
<section id="methods" class="level4">

View File

@@ -506,12 +506,11 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_strategies.messages.chat.ChatMessageDatasetWrappingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.messages.chat.ChatMessageDatasetWrappingStrategy">ChatMessageDatasetWrappingStrategy</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.messages.chat.ChatMessageDatasetWrappingStrategy(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> processor,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> message_transform<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> formatter<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> processor,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> message_transform<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> formatter<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Chat dataset wrapping strategy for new internal messages representations</p>

View File

@@ -511,17 +511,16 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_strategies.metharme.MetharmePromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.metharme.MetharmePromptTokenizingStrategy">MetharmePromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.metharme.MetharmePromptTokenizingStrategy(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for the Metharme models</p>
</section>
<section id="axolotl.prompt_strategies.metharme.MetharmePrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.metharme.MetharmePrompter">MetharmePrompter</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.metharme.MetharmePrompter(<span class="va">self</span>, <span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.metharme.MetharmePrompter(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Prompter for the Metharme models.</p>

View File

@@ -511,9 +511,8 @@ this one specifies the system prompt with “### System:”.</p>
<section id="axolotl.prompt_strategies.orcamini.OrcaMiniPrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.orcamini.OrcaMiniPrompter">OrcaMiniPrompter</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.orcamini.OrcaMiniPrompter(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Adjusted Prompter for Orca Mini (v2) datasets</p>

View File

@@ -590,21 +590,16 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</section>
<section id="axolotl.prompt_strategies.orpo.chat_template.ORPOPrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.orpo.chat_template.ORPOPrompter">ORPOPrompter</h3>
<div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.orpo.chat_template.ORPOPrompter(</span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> chat_template,</span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.orpo.chat_template.ORPOPrompter(chat_template, tokenizer)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Single Turn prompter for ORPO</p>
</section>
<section id="axolotl.prompt_strategies.orpo.chat_template.ORPOTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.orpo.chat_template.ORPOTokenizingStrategy">ORPOTokenizingStrategy</h3>
<div class="sourceCode" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.orpo.chat_template.ORPOTokenizingStrategy(</span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> dataset_parser<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> dataset_parser<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>rejected_input_ids
input_ids
rejected_attention_mask

View File

@@ -511,17 +511,16 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_strategies.pygmalion.PygmalionPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.pygmalion.PygmalionPromptTokenizingStrategy">PygmalionPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.pygmalion.PygmalionPromptTokenizingStrategy(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for Pygmalion.</p>
</section>
<section id="axolotl.prompt_strategies.pygmalion.PygmalionPrompter" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.pygmalion.PygmalionPrompter">PygmalionPrompter</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.pygmalion.PygmalionPrompter(<span class="va">self</span>, <span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.pygmalion.PygmalionPrompter(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Prompter for Pygmalion.</p>

View File

@@ -507,13 +507,12 @@ and (optionally) per-step, or per-prompt-trace labels for reward modelling.</p>
<section id="axolotl.prompt_strategies.stepwise_supervised.StepwiseSupervisedPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.stepwise_supervised.StepwiseSupervisedPromptTokenizingStrategy">StepwiseSupervisedPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.stepwise_supervised.StepwiseSupervisedPromptTokenizingStrategy(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> step_separator<span class="op">=</span><span class="st">'</span><span class="ch">\n</span><span class="st">'</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> max_completion_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> train_on_last_step_only<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> step_separator<span class="op">=</span><span class="st">'</span><span class="ch">\n</span><span class="st">'</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> max_completion_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> train_on_last_step_only<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for supervised stepwise datasets, typically used for COT-reasoning.
These datasets should include the following columns:
- prompt: the prompt text

View File

@@ -511,27 +511,25 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_strategies.user_defined.UserDefinedDatasetConfig" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.user_defined.UserDefinedDatasetConfig">UserDefinedDatasetConfig</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.user_defined.UserDefinedDatasetConfig(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> system_prompt<span class="op">=</span><span class="st">''</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> field_system<span class="op">=</span><span class="st">'system'</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> field_instruction<span class="op">=</span><span class="st">'instruction'</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> field_input<span class="op">=</span><span class="st">'input'</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> field_output<span class="op">=</span><span class="st">'output'</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> <span class="bu">format</span><span class="op">=</span><span class="st">'</span><span class="sc">{instruction}</span><span class="st"> </span><span class="sc">{input}</span><span class="st"> '</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> no_input_format<span class="op">=</span><span class="st">'</span><span class="sc">{instruction}</span><span class="st"> '</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> system_format<span class="op">=</span><span class="st">'</span><span class="sc">{system}</span><span class="st">'</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> system_prompt<span class="op">=</span><span class="st">''</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> field_system<span class="op">=</span><span class="st">'system'</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> field_instruction<span class="op">=</span><span class="st">'instruction'</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> field_input<span class="op">=</span><span class="st">'input'</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> field_output<span class="op">=</span><span class="st">'output'</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="bu">format</span><span class="op">=</span><span class="st">'</span><span class="sc">{instruction}</span><span class="st"> </span><span class="sc">{input}</span><span class="st"> '</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> no_input_format<span class="op">=</span><span class="st">'</span><span class="sc">{instruction}</span><span class="st"> '</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> system_format<span class="op">=</span><span class="st">'</span><span class="sc">{system}</span><span class="st">'</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>dataclass configuration representing a userdefined dataset type</p>
</section>
<section id="axolotl.prompt_strategies.user_defined.UserDefinedPromptTokenizationStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.user_defined.UserDefinedPromptTokenizationStrategy">UserDefinedPromptTokenizationStrategy</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.user_defined.UserDefinedPromptTokenizationStrategy(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Prompt Tokenization Strategy for user defined prompts</p>

View File

@@ -571,34 +571,31 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy">AlpacaMultipleChoicePromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for Alpaca Multiple Choice prompts.</p>
</section>
<section id="axolotl.prompt_tokenizers.AlpacaPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.AlpacaPromptTokenizingStrategy">AlpacaPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.AlpacaPromptTokenizingStrategy(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for Alpaca prompts.</p>
</section>
<section id="axolotl.prompt_tokenizers.AlpacaReflectionPTStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.AlpacaReflectionPTStrategy">AlpacaReflectionPTStrategy</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.AlpacaReflectionPTStrategy(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for Alpaca Reflection prompts.</p>
</section>
<section id="axolotl.prompt_tokenizers.DatasetWrappingStrategy" class="level3">
@@ -609,23 +606,21 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_tokenizers.GPTeacherPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.GPTeacherPromptTokenizingStrategy">GPTeacherPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.GPTeacherPromptTokenizingStrategy(</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for GPTeacher prompts.</p>
</section>
<section id="axolotl.prompt_tokenizers.InstructionPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.InstructionPromptTokenizingStrategy">InstructionPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.InstructionPromptTokenizingStrategy(</span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for instruction-based prompts.</p>
</section>
<section id="axolotl.prompt_tokenizers.InvalidDataException" class="level3">
@@ -636,67 +631,61 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.prompt_tokenizers.JeopardyPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.JeopardyPromptTokenizingStrategy">JeopardyPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.JeopardyPromptTokenizingStrategy(</span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb8-7"><a href="#cb8-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for Jeopardy prompts.</p>
</section>
<section id="axolotl.prompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy">NomicGPT4AllPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb9"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy(</span>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for NomicGPT4All prompts.</p>
</section>
<section id="axolotl.prompt_tokenizers.OpenAssistantPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.OpenAssistantPromptTokenizingStrategy">OpenAssistantPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb10"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.OpenAssistantPromptTokenizingStrategy(</span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb10-6"><a href="#cb10-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb10-7"><a href="#cb10-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb10-6"><a href="#cb10-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for OpenAssistant prompts.</p>
</section>
<section id="axolotl.prompt_tokenizers.PromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.PromptTokenizingStrategy">PromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb11"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.PromptTokenizingStrategy(</span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Abstract class for tokenizing strategies</p>
</section>
<section id="axolotl.prompt_tokenizers.ReflectionPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.ReflectionPromptTokenizingStrategy">ReflectionPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb12"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.ReflectionPromptTokenizingStrategy(</span>
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb12-5"><a href="#cb12-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb12-6"><a href="#cb12-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb12-7"><a href="#cb12-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb12-5"><a href="#cb12-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb12-6"><a href="#cb12-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for Reflection prompts.</p>
</section>
<section id="axolotl.prompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy">SummarizeTLDRPromptTokenizingStrategy</h3>
<div class="sourceCode" id="cb13"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy(</span>
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb13-6"><a href="#cb13-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb13-7"><a href="#cb13-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
<span id="cb13-6"><a href="#cb13-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Tokenizing strategy for SummarizeTLDR prompts.</p>
</section>
</section>

View File

@@ -505,10 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.utils.callbacks.comet_.SaveAxolotlConfigtoCometCallback" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.comet_.SaveAxolotlConfigtoCometCallback">SaveAxolotlConfigtoCometCallback</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.comet_.SaveAxolotlConfigtoCometCallback(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> axolotl_config_path,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.comet_.SaveAxolotlConfigtoCometCallback(axolotl_config_path)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Callback to save axolotl config to comet</p>

View File

@@ -505,10 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.utils.callbacks.mlflow_.SaveAxolotlConfigtoMlflowCallback" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.mlflow_.SaveAxolotlConfigtoMlflowCallback">SaveAxolotlConfigtoMlflowCallback</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.mlflow_.SaveAxolotlConfigtoMlflowCallback(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> axolotl_config_path,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.mlflow_.SaveAxolotlConfigtoMlflowCallback(axolotl_config_path)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Callback to save axolotl config to mlflow</p>

View File

@@ -505,7 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.utils.callbacks.perplexity.Perplexity" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.perplexity.Perplexity">Perplexity</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.perplexity.Perplexity(<span class="va">self</span>, tokenizer, max_seq_len, stride<span class="op">=</span><span class="dv">512</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.perplexity.Perplexity(tokenizer, max_seq_len, stride<span class="op">=</span><span class="dv">512</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Calculate perplexity as defined in https://huggingface.co/docs/transformers/en/perplexity.
This is a custom variant that doesnt re-tokenize the input or re-load the model.</p>
<section id="methods" class="level4">

View File

@@ -505,7 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.utils.callbacks.profiler.PytorchProfilerCallback" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.profiler.PytorchProfilerCallback">PytorchProfilerCallback</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.profiler.PytorchProfilerCallback(<span class="va">self</span>, steps_to_profile<span class="op">=</span><span class="dv">5</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.profiler.PytorchProfilerCallback(steps_to_profile<span class="op">=</span><span class="dv">5</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>PyTorch Profiler callback to create snapshots of GPU memory usage at specified steps.</p>

View File

@@ -509,7 +509,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.utils.callbacks.qat.QATCallback" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.qat.QATCallback">QATCallback</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.qat.QATCallback(<span class="va">self</span>, cfg)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.qat.QATCallback(cfg)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Callback to toggle fake quantization for the model.</p>
</section>
</section>

View File

@@ -521,31 +521,29 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.utils.collators.batching.BatchSamplerDataCollatorForSeq2Seq" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.batching.BatchSamplerDataCollatorForSeq2Seq">BatchSamplerDataCollatorForSeq2Seq</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.collators.batching.BatchSamplerDataCollatorForSeq2Seq(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Collator for multipack specific to the using the BatchSampler</p>
</section>
<section id="axolotl.utils.collators.batching.DataCollatorForSeq2Seq" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.batching.DataCollatorForSeq2Seq">DataCollatorForSeq2Seq</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>utils.collators.batching.DataCollatorForSeq2Seq(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Data collator that will dynamically pad the inputs received, as well as the labels and position_ids</p>
<section id="parameters" class="level4 doc-section doc-section-parameters">
<h4 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters">Parameters</h4>
@@ -614,26 +612,24 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.utils.collators.batching.PretrainingBatchSamplerDataCollatorForSeq2Seq" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.batching.PretrainingBatchSamplerDataCollatorForSeq2Seq">PretrainingBatchSamplerDataCollatorForSeq2Seq</h3>
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>utils.collators.batching.PretrainingBatchSamplerDataCollatorForSeq2Seq(</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> multipack_attn<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> multipack_attn<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Collator for multipack specific to the using the BatchSampler</p>
</section>
<section id="axolotl.utils.collators.batching.V2BatchSamplerDataCollatorForSeq2Seq" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.batching.V2BatchSamplerDataCollatorForSeq2Seq">V2BatchSamplerDataCollatorForSeq2Seq</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>utils.collators.batching.V2BatchSamplerDataCollatorForSeq2Seq(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Collator for multipack specific to the using the BatchSampler</p>

View File

@@ -505,7 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
</table>
<section id="axolotl.utils.collators.mamba.MambaDataCollator" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.mamba.MambaDataCollator">MambaDataCollator</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.collators.mamba.MambaDataCollator(<span class="va">self</span>, tokenizer)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.collators.mamba.MambaDataCollator(tokenizer)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Collator for State Space Models (Mamba)</p>

View File

@@ -506,14 +506,13 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.utils.collators.mm_chat.MultiModalChatDataCollator" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.mm_chat.MultiModalChatDataCollator">MultiModalChatDataCollator</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.collators.mm_chat.MultiModalChatDataCollator(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> processing_strategy,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> processing_strategy,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> packing<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Collator for multi-modal chat messages</p>

View File

@@ -680,13 +680,12 @@ from the full gradient tensor.</p>
<section id="axolotl.utils.ctx_managers.sequence_parallel.SequenceParallelContextManager" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.ctx_managers.sequence_parallel.SequenceParallelContextManager">SequenceParallelContextManager</h3>
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>utils.ctx_managers.sequence_parallel.SequenceParallelContextManager(</span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> models,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> sequence_parallel_degree,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> gradient_accumulation_steps,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> ring_attn_func,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> heads_k_stride,</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> models,</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> sequence_parallel_degree,</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> gradient_accumulation_steps,</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> ring_attn_func,</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> heads_k_stride,</span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Context manager for sequence parallelism operations.</p>
<p>This class provides a context that will automatically apply sequence parallelism
during model forward passes using a pre-forward hook, and gather outputs from

View File

@@ -538,7 +538,7 @@ window.Quarto = {
</table>
<section id="axolotl.utils.freeze.LayerNamePattern" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.freeze.LayerNamePattern">LayerNamePattern</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.freeze.LayerNamePattern(<span class="va">self</span>, pattern)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.freeze.LayerNamePattern(pattern)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Represents a regex pattern for layer names, potentially including a parameter index range.</p>
<section id="methods" class="level4">
<h4 class="anchored" data-anchor-id="methods">Methods</h4>

View File

@@ -514,21 +514,20 @@ into fixed-capacity batches to optimize memory usage and training throughput.</p
<section id="axolotl.utils.samplers.multipack.MultipackBatchSampler" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.samplers.multipack.MultipackBatchSampler">MultipackBatchSampler</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.samplers.multipack.MultipackBatchSampler(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> sampler,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> batch_size,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> batch_max_len,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> lengths,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> packing_efficiency_estimate<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> drop_last<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> num_count_samples<span class="op">=</span><span class="dv">16</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> sequential<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> num_processes<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a> safe_mode<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> sampler,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> batch_size,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> batch_max_len,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> lengths,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> packing_efficiency_estimate<span class="op">=</span><span class="fl">1.0</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> drop_last<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> num_count_samples<span class="op">=</span><span class="dv">16</span>,</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> sequential<span class="op">=</span><span class="va">False</span>,</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> group_size<span class="op">=</span><span class="dv">100000</span>,</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> bin_size<span class="op">=</span><span class="dv">200</span>,</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> num_processes<span class="op">=</span><span class="va">None</span>,</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> safe_mode<span class="op">=</span><span class="va">True</span>,</span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Batch sampler class for efficient packing of variable-length sequences</p>
<p>This sampler packs sequences into fixed-capacity bins (batches) to maximize
GPU memory utilization and training throughput by reducing padding.</p>

View File

@@ -517,26 +517,24 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
<section id="axolotl.utils.schedulers.InterpolatingLogScheduler" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.schedulers.InterpolatingLogScheduler">InterpolatingLogScheduler</h3>
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.schedulers.InterpolatingLogScheduler(</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> num_steps,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> min_lr,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> max_lr,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> last_epoch<span class="op">=-</span><span class="dv">1</span>,</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> num_steps,</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> min_lr,</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> max_lr,</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> last_epoch<span class="op">=-</span><span class="dv">1</span>,</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>A scheduler that interpolates learning rates in a logarithmic fashion</p>
</section>
<section id="axolotl.utils.schedulers.RexLR" class="level3">
<h3 class="anchored" data-anchor-id="axolotl.utils.schedulers.RexLR">RexLR</h3>
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>utils.schedulers.RexLR(</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> max_lr,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> min_lr,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> total_steps<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> num_warmup_steps<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> last_step<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> max_lr,</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> min_lr,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> total_steps<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> num_warmup_steps<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> last_step<span class="op">=</span><span class="dv">0</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Reflected Exponential (REX) learning rate scheduler.</p>
<ul>
<li>Original implementation: https://github.com/IvanVassi/REX_LR</li>