Built site for gh-pages
This commit is contained in:
@@ -531,72 +531,67 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.cli.args.EvaluateCliArgs" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.cli.args.EvaluateCliArgs">EvaluateCliArgs</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>cli.args.EvaluateCliArgs(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Dataclass with CLI arguments for <code>axolotl evaluate</code> command.</p>
|
||||
</section>
|
||||
<section id="axolotl.cli.args.InferenceCliArgs" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.cli.args.InferenceCliArgs">InferenceCliArgs</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>cli.args.InferenceCliArgs(<span class="va">self</span>, prompter<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>cli.args.InferenceCliArgs(prompter<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Dataclass with CLI arguments for <code>axolotl inference</code> command.</p>
|
||||
</section>
|
||||
<section id="axolotl.cli.args.PreprocessCliArgs" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.cli.args.PreprocessCliArgs">PreprocessCliArgs</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>cli.args.PreprocessCliArgs(</span>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> prompter<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> download<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a> iterable<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> prompter<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> download<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> iterable<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Dataclass with CLI arguments for <code>axolotl preprocess</code> command.</p>
|
||||
</section>
|
||||
<section id="axolotl.cli.args.QuantizeCliArgs" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.cli.args.QuantizeCliArgs">QuantizeCliArgs</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>cli.args.QuantizeCliArgs(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> base_model<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> weight_dtype<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> activation_dtype<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> quantize_embedding<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> group_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> output_dir<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> base_model<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> weight_dtype<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> activation_dtype<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> quantize_embedding<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> group_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> output_dir<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Dataclass with CLI arguments for <code>axolotl quantize</code> command.</p>
|
||||
</section>
|
||||
<section id="axolotl.cli.args.TrainerCliArgs" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.cli.args.TrainerCliArgs">TrainerCliArgs</h3>
|
||||
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>cli.args.TrainerCliArgs(</span>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> merge_lora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> prompter<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> shard<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> main_process_port<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a> num_processes<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> debug<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> debug_text_only<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> debug_num_examples<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> merge_lora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> prompter<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> shard<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> main_process_port<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> num_processes<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Dataclass with CLI arguments for <code>axolotl train</code> command.</p>
|
||||
</section>
|
||||
<section id="axolotl.cli.args.VllmServeCliArgs" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.cli.args.VllmServeCliArgs">VllmServeCliArgs</h3>
|
||||
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>cli.args.VllmServeCliArgs(</span>
|
||||
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> tensor_parallel_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> host<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> port<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> gpu_memory_utilization<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> dtype<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> max_model_len<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> enable_prefix_caching<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a> serve_module<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> tensor_parallel_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> host<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> port<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> gpu_memory_utilization<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> dtype<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> max_model_len<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> enable_prefix_caching<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> serve_module<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Dataclass with CLI arguments for <code>axolotl vllm-serve</code> command.</p>
|
||||
|
||||
|
||||
|
||||
@@ -509,7 +509,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.cli.cloud.modal_.ModalCloud" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.cli.cloud.modal_.ModalCloud">ModalCloud</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>cli.cloud.modal_.ModalCloud(<span class="va">self</span>, config, app<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>cli.cloud.modal_.ModalCloud(config, app<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Modal Cloud implementation.</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
@@ -512,11 +512,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.common.datasets.TrainDatasetMeta" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.common.datasets.TrainDatasetMeta">TrainDatasetMeta</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>common.datasets.TrainDatasetMeta(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> train_dataset,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> total_num_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> train_dataset,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> total_num_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Dataclass with fields for training and validation datasets and metadata.</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
@@ -535,7 +535,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</section>
|
||||
<section id="axolotl.convert.FileWriter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.convert.FileWriter">FileWriter</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>convert.FileWriter(<span class="va">self</span>, file_path)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>convert.FileWriter(file_path)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Writes a string to a file</p>
|
||||
</section>
|
||||
<section id="axolotl.convert.JsonParser" class="level3">
|
||||
@@ -546,12 +546,11 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.convert.JsonToJsonlConverter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.convert.JsonToJsonlConverter">JsonToJsonlConverter</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>convert.JsonToJsonlConverter(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> file_reader,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> file_writer,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> json_parser,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> jsonl_serializer,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> file_reader,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> file_writer,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> json_parser,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> jsonl_serializer,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Converts a JSON file to JSONL</p>
|
||||
</section>
|
||||
<section id="axolotl.convert.JsonlSerializer" class="level3">
|
||||
|
||||
943
docs/api/core.builders.base.html
Normal file
943
docs/api/core.builders.base.html
Normal file
@@ -0,0 +1,943 @@
|
||||
<!DOCTYPE html>
|
||||
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
|
||||
|
||||
<meta charset="utf-8">
|
||||
<meta name="generator" content="quarto-1.7.31">
|
||||
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
|
||||
|
||||
|
||||
<title>core.builders.base – Axolotl</title>
|
||||
<style>
|
||||
code{white-space: pre-wrap;}
|
||||
span.smallcaps{font-variant: small-caps;}
|
||||
div.columns{display: flex; gap: min(4vw, 1.5em);}
|
||||
div.column{flex: auto; overflow-x: auto;}
|
||||
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
|
||||
ul.task-list{list-style: none;}
|
||||
ul.task-list li input[type="checkbox"] {
|
||||
width: 0.8em;
|
||||
margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
|
||||
vertical-align: middle;
|
||||
}
|
||||
/* CSS for syntax highlighting */
|
||||
html { -webkit-text-size-adjust: 100%; }
|
||||
pre > code.sourceCode { white-space: pre; position: relative; }
|
||||
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
|
||||
pre > code.sourceCode > span:empty { height: 1.2em; }
|
||||
.sourceCode { overflow: visible; }
|
||||
code.sourceCode > span { color: inherit; text-decoration: inherit; }
|
||||
div.sourceCode { margin: 1em 0; }
|
||||
pre.sourceCode { margin: 0; }
|
||||
@media screen {
|
||||
div.sourceCode { overflow: auto; }
|
||||
}
|
||||
@media print {
|
||||
pre > code.sourceCode { white-space: pre-wrap; }
|
||||
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
|
||||
}
|
||||
pre.numberSource code
|
||||
{ counter-reset: source-line 0; }
|
||||
pre.numberSource code > span
|
||||
{ position: relative; left: -4em; counter-increment: source-line; }
|
||||
pre.numberSource code > span > a:first-child::before
|
||||
{ content: counter(source-line);
|
||||
position: relative; left: -1em; text-align: right; vertical-align: baseline;
|
||||
border: none; display: inline-block;
|
||||
-webkit-touch-callout: none; -webkit-user-select: none;
|
||||
-khtml-user-select: none; -moz-user-select: none;
|
||||
-ms-user-select: none; user-select: none;
|
||||
padding: 0 4px; width: 4em;
|
||||
}
|
||||
pre.numberSource { margin-left: 3em; padding-left: 4px; }
|
||||
div.sourceCode
|
||||
{ }
|
||||
@media screen {
|
||||
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
|
||||
}
|
||||
</style>
|
||||
|
||||
|
||||
<script src="../../site_libs/quarto-nav/quarto-nav.js"></script>
|
||||
<script src="../../site_libs/clipboard/clipboard.min.js"></script>
|
||||
<script src="../../site_libs/quarto-search/autocomplete.umd.js"></script>
|
||||
<script src="../../site_libs/quarto-search/fuse.min.js"></script>
|
||||
<script src="../../site_libs/quarto-search/quarto-search.js"></script>
|
||||
<meta name="quarto:offset" content="../../">
|
||||
<link href="../../favicon.jpg" rel="icon" type="image/jpeg">
|
||||
<script src="../../site_libs/quarto-html/quarto.js" type="module"></script>
|
||||
<script src="../../site_libs/quarto-html/tabsets/tabsets.js" type="module"></script>
|
||||
<script src="../../site_libs/quarto-html/popper.min.js"></script>
|
||||
<script src="../../site_libs/quarto-html/tippy.umd.min.js"></script>
|
||||
<script src="../../site_libs/quarto-html/anchor.min.js"></script>
|
||||
<link href="../../site_libs/quarto-html/tippy.css" rel="stylesheet">
|
||||
<link href="../../site_libs/quarto-html/quarto-syntax-highlighting-dark-8ef56b68f8fa1e9d2ba328e99e439f80.css" rel="stylesheet" id="quarto-text-highlighting-styles">
|
||||
<script src="../../site_libs/bootstrap/bootstrap.min.js"></script>
|
||||
<link href="../../site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
|
||||
<link href="../../site_libs/bootstrap/bootstrap-2288ecdcbf81d2ab6432743cedd71d9a.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="dark">
|
||||
<script id="quarto-search-options" type="application/json">{
|
||||
"location": "navbar",
|
||||
"copy-button": false,
|
||||
"collapse-after": 3,
|
||||
"panel-placement": "end",
|
||||
"type": "overlay",
|
||||
"limit": 50,
|
||||
"keyboard-shortcut": [
|
||||
"f",
|
||||
"/",
|
||||
"s"
|
||||
],
|
||||
"show-item-context": false,
|
||||
"language": {
|
||||
"search-no-results-text": "No results",
|
||||
"search-matching-documents-text": "matching documents",
|
||||
"search-copy-link-title": "Copy link to search",
|
||||
"search-hide-matches-text": "Hide additional matches",
|
||||
"search-more-match-text": "more match in this document",
|
||||
"search-more-matches-text": "more matches in this document",
|
||||
"search-clear-button-title": "Clear",
|
||||
"search-text-placeholder": "",
|
||||
"search-detached-cancel-button-title": "Cancel",
|
||||
"search-submit-button-title": "Submit",
|
||||
"search-label": "Search"
|
||||
}
|
||||
}</script>
|
||||
<script async="" src="https://www.googletagmanager.com/gtag/js?id=G-9KYCVJBNMQ"></script>
|
||||
|
||||
<script type="text/javascript">
|
||||
|
||||
window.dataLayer = window.dataLayer || [];
|
||||
function gtag(){dataLayer.push(arguments);}
|
||||
gtag('js', new Date());
|
||||
gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</script>
|
||||
|
||||
|
||||
<link rel="stylesheet" href="../../styles.css">
|
||||
</head>
|
||||
|
||||
<body class="nav-sidebar docked nav-fixed quarto-light">
|
||||
|
||||
<div id="quarto-search-results"></div>
|
||||
<header id="quarto-header" class="headroom fixed-top">
|
||||
<nav class="navbar navbar-expand " data-bs-theme="dark">
|
||||
<div class="navbar-container container-fluid">
|
||||
<div class="navbar-brand-container mx-auto">
|
||||
<a href="../../index.html" class="navbar-brand navbar-brand-logo">
|
||||
<img src="../../image/axolotl_logo_digital_white.svg" alt="" class="navbar-logo">
|
||||
</a>
|
||||
</div>
|
||||
<div class="quarto-navbar-tools tools-wide tools-end">
|
||||
<a href="https://twitter.com/axolotl_ai" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-twitter"></i></a>
|
||||
<a href="https://github.com/axolotl-ai-cloud/axolotl/" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-github"></i></a>
|
||||
<a href="https://discord.gg/7m9sfhzaf3" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-discord"></i></a>
|
||||
</div>
|
||||
<div id="quarto-search" class="" title="Search"></div>
|
||||
</div> <!-- /container-fluid -->
|
||||
</nav>
|
||||
<nav class="quarto-secondary-nav">
|
||||
<div class="container-fluid d-flex">
|
||||
<button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" role="button" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
|
||||
<i class="bi bi-layout-text-sidebar-reverse"></i>
|
||||
</button>
|
||||
<nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"></ol></nav>
|
||||
<a class="flex-grow-1" role="navigation" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
|
||||
</a>
|
||||
</div>
|
||||
</nav>
|
||||
</header>
|
||||
<!-- content -->
|
||||
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article page-navbar">
|
||||
<!-- sidebar -->
|
||||
<nav id="quarto-sidebar" class="sidebar collapse collapse-horizontal quarto-sidebar-collapse-item sidebar-navigation docked overflow-auto">
|
||||
<div class="sidebar-menu-container">
|
||||
<ul class="list-unstyled mt-1">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../index.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Home</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Getting Started</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-1" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/getting-started.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Quickstart</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/installation.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Installation</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/inference.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Inference and Merging</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/cli.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Command Line Interface (CLI)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/config.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Config Reference</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/api" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">API Reference</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/index.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Dataset Formats</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-2" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-2" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/pretraining.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Pre-training</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/inst_tune.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Instruction Tuning</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/conversation.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Conversation</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/stepwise_supervised.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Stepwise Supervised Format</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/template_free.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Template-Free</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/tokenized.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Custom Pre-Tokenized Dataset</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-3" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Deployments</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-3" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-3" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/docker.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Docker</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/multi-gpu.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Multi-GPU</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/multi-node.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Multi Node</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/ray-integration.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Ray Train</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/amd_hpc.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">AMD GPUs on HPC Systems</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/mac.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Mac M-series</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-4" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">How To Guides</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-4" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-4" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/multimodal.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">MultiModal / Vision Language Models (BETA)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/rlhf.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">RLHF (Beta)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/reward_modelling.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Reward Modelling</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/lr_groups.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Learning Rate Groups</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/lora_optims.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">LoRA Optimizations</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset_loading.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Dataset Loading</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/qat.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Quantization Aware Training (QAT)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/quantize.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Quantization with torchao</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-5" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Core Concepts</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-5" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-5" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/batch_vs_grad.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Batch size vs Gradient accumulation</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset_preprocessing.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Dataset Preprocessing</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/multipack.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Multipack (Sample Packing)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-6" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Advanced Features</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-6" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-6" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/fsdp_qlora.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">FDSP + QLoRA</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/unsloth.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Unsloth</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/torchao.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">PyTorch ao</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/custom_integrations.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Custom Integrations</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/sequence_parallelism.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Sequence Parallelism</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-7" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Troubleshooting</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-7" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-7" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/faq.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">FAQ</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/debugging.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Debugging</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/nccl.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">NCCL</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
</nav>
|
||||
<div id="quarto-sidebar-glass" class="quarto-sidebar-collapse-item" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item"></div>
|
||||
<!-- margin-sidebar -->
|
||||
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
|
||||
<nav id="TOC" role="doc-toc" class="toc-active">
|
||||
<h2 id="toc-title">On this page</h2>
|
||||
|
||||
<ul>
|
||||
<li><a href="#axolotl.core.builders.base" id="toc-axolotl.core.builders.base" class="nav-link active" data-scroll-target="#axolotl.core.builders.base">core.builders.base</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#classes" id="toc-classes" class="nav-link" data-scroll-target="#classes">Classes</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#axolotl.core.builders.base.TrainerBuilderBase" id="toc-axolotl.core.builders.base.TrainerBuilderBase" class="nav-link" data-scroll-target="#axolotl.core.builders.base.TrainerBuilderBase">TrainerBuilderBase</a></li>
|
||||
</ul></li>
|
||||
</ul></li>
|
||||
</ul>
|
||||
</nav>
|
||||
</div>
|
||||
<!-- main -->
|
||||
<main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block"></header>
|
||||
|
||||
|
||||
|
||||
|
||||
<section id="axolotl.core.builders.base" class="level1">
|
||||
<h1>core.builders.base</h1>
|
||||
<p><code>core.builders.base</code></p>
|
||||
<p>Base class for trainer builder</p>
|
||||
<section id="classes" class="level2">
|
||||
<h2 class="anchored" data-anchor-id="classes">Classes</h2>
|
||||
<table class="caption-top table">
|
||||
<thead>
|
||||
<tr class="header">
|
||||
<th>Name</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.builders.base.TrainerBuilderBase">TrainerBuilderBase</a></td>
|
||||
<td>Base class for trainer builder.</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<section id="axolotl.core.builders.base.TrainerBuilderBase" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.builders.base.TrainerBuilderBase">TrainerBuilderBase</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.builders.base.TrainerBuilderBase(cfg, model, tokenizer, processor<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Base class for trainer builder.</p>
|
||||
<section id="methods" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
|
||||
<table class="caption-top table">
|
||||
<thead>
|
||||
<tr class="header">
|
||||
<th>Name</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.builders.base.TrainerBuilderBase.get_post_trainer_create_callbacks">get_post_trainer_create_callbacks</a></td>
|
||||
<td>Callbacks added after the trainer is created, usually b/c these need access to the trainer</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<section id="axolotl.core.builders.base.TrainerBuilderBase.get_post_trainer_create_callbacks" class="level5">
|
||||
<h5 class="anchored" data-anchor-id="axolotl.core.builders.base.TrainerBuilderBase.get_post_trainer_create_callbacks">get_post_trainer_create_callbacks</h5>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.builders.base.TrainerBuilderBase.get_post_trainer_create_callbacks(trainer)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Callbacks added after the trainer is created, usually b/c these need access to the trainer</p>
|
||||
|
||||
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
</main> <!-- /main -->
|
||||
<script id="quarto-html-after-body" type="application/javascript">
|
||||
window.document.addEventListener("DOMContentLoaded", function (event) {
|
||||
const icon = "";
|
||||
const anchorJS = new window.AnchorJS();
|
||||
anchorJS.options = {
|
||||
placement: 'right',
|
||||
icon: icon
|
||||
};
|
||||
anchorJS.add('.anchored');
|
||||
const isCodeAnnotation = (el) => {
|
||||
for (const clz of el.classList) {
|
||||
if (clz.startsWith('code-annotation-')) {
|
||||
return true;
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
const onCopySuccess = function(e) {
|
||||
// button target
|
||||
const button = e.trigger;
|
||||
// don't keep focus
|
||||
button.blur();
|
||||
// flash "checked"
|
||||
button.classList.add('code-copy-button-checked');
|
||||
var currentTitle = button.getAttribute("title");
|
||||
button.setAttribute("title", "Copied!");
|
||||
let tooltip;
|
||||
if (window.bootstrap) {
|
||||
button.setAttribute("data-bs-toggle", "tooltip");
|
||||
button.setAttribute("data-bs-placement", "left");
|
||||
button.setAttribute("data-bs-title", "Copied!");
|
||||
tooltip = new bootstrap.Tooltip(button,
|
||||
{ trigger: "manual",
|
||||
customClass: "code-copy-button-tooltip",
|
||||
offset: [0, -8]});
|
||||
tooltip.show();
|
||||
}
|
||||
setTimeout(function() {
|
||||
if (tooltip) {
|
||||
tooltip.hide();
|
||||
button.removeAttribute("data-bs-title");
|
||||
button.removeAttribute("data-bs-toggle");
|
||||
button.removeAttribute("data-bs-placement");
|
||||
}
|
||||
button.setAttribute("title", currentTitle);
|
||||
button.classList.remove('code-copy-button-checked');
|
||||
}, 1000);
|
||||
// clear code selection
|
||||
e.clearSelection();
|
||||
}
|
||||
const getTextToCopy = function(trigger) {
|
||||
const codeEl = trigger.previousElementSibling.cloneNode(true);
|
||||
for (const childEl of codeEl.children) {
|
||||
if (isCodeAnnotation(childEl)) {
|
||||
childEl.remove();
|
||||
}
|
||||
}
|
||||
return codeEl.innerText;
|
||||
}
|
||||
const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
|
||||
text: getTextToCopy
|
||||
});
|
||||
clipboard.on('success', onCopySuccess);
|
||||
if (window.document.getElementById('quarto-embedded-source-code-modal')) {
|
||||
const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
|
||||
text: getTextToCopy,
|
||||
container: window.document.getElementById('quarto-embedded-source-code-modal')
|
||||
});
|
||||
clipboardModal.on('success', onCopySuccess);
|
||||
}
|
||||
var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
|
||||
var mailtoRegex = new RegExp(/^mailto:/);
|
||||
var filterRegex = new RegExp("https:\/\/docs\.axolotl\.ai");
|
||||
var isInternal = (href) => {
|
||||
return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
|
||||
}
|
||||
// Inspect non-navigation links and adorn them if external
|
||||
var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
|
||||
for (var i=0; i<links.length; i++) {
|
||||
const link = links[i];
|
||||
if (!isInternal(link.href)) {
|
||||
// undo the damage that might have been done by quarto-nav.js in the case of
|
||||
// links that we want to consider external
|
||||
if (link.dataset.originalHref !== undefined) {
|
||||
link.href = link.dataset.originalHref;
|
||||
}
|
||||
}
|
||||
}
|
||||
function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
|
||||
const config = {
|
||||
allowHTML: true,
|
||||
maxWidth: 500,
|
||||
delay: 100,
|
||||
arrow: false,
|
||||
appendTo: function(el) {
|
||||
return el.parentElement;
|
||||
},
|
||||
interactive: true,
|
||||
interactiveBorder: 10,
|
||||
theme: 'quarto',
|
||||
placement: 'bottom-start',
|
||||
};
|
||||
if (contentFn) {
|
||||
config.content = contentFn;
|
||||
}
|
||||
if (onTriggerFn) {
|
||||
config.onTrigger = onTriggerFn;
|
||||
}
|
||||
if (onUntriggerFn) {
|
||||
config.onUntrigger = onUntriggerFn;
|
||||
}
|
||||
window.tippy(el, config);
|
||||
}
|
||||
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
|
||||
for (var i=0; i<noterefs.length; i++) {
|
||||
const ref = noterefs[i];
|
||||
tippyHover(ref, function() {
|
||||
// use id or data attribute instead here
|
||||
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
|
||||
try { href = new URL(href).hash; } catch {}
|
||||
const id = href.replace(/^#\/?/, "");
|
||||
const note = window.document.getElementById(id);
|
||||
if (note) {
|
||||
return note.innerHTML;
|
||||
} else {
|
||||
return "";
|
||||
}
|
||||
});
|
||||
}
|
||||
const xrefs = window.document.querySelectorAll('a.quarto-xref');
|
||||
const processXRef = (id, note) => {
|
||||
// Strip column container classes
|
||||
const stripColumnClz = (el) => {
|
||||
el.classList.remove("page-full", "page-columns");
|
||||
if (el.children) {
|
||||
for (const child of el.children) {
|
||||
stripColumnClz(child);
|
||||
}
|
||||
}
|
||||
}
|
||||
stripColumnClz(note)
|
||||
if (id === null || id.startsWith('sec-')) {
|
||||
// Special case sections, only their first couple elements
|
||||
const container = document.createElement("div");
|
||||
if (note.children && note.children.length > 2) {
|
||||
container.appendChild(note.children[0].cloneNode(true));
|
||||
for (let i = 1; i < note.children.length; i++) {
|
||||
const child = note.children[i];
|
||||
if (child.tagName === "P" && child.innerText === "") {
|
||||
continue;
|
||||
} else {
|
||||
container.appendChild(child.cloneNode(true));
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (window.Quarto?.typesetMath) {
|
||||
window.Quarto.typesetMath(container);
|
||||
}
|
||||
return container.innerHTML
|
||||
} else {
|
||||
if (window.Quarto?.typesetMath) {
|
||||
window.Quarto.typesetMath(note);
|
||||
}
|
||||
return note.innerHTML;
|
||||
}
|
||||
} else {
|
||||
// Remove any anchor links if they are present
|
||||
const anchorLink = note.querySelector('a.anchorjs-link');
|
||||
if (anchorLink) {
|
||||
anchorLink.remove();
|
||||
}
|
||||
if (window.Quarto?.typesetMath) {
|
||||
window.Quarto.typesetMath(note);
|
||||
}
|
||||
if (note.classList.contains("callout")) {
|
||||
return note.outerHTML;
|
||||
} else {
|
||||
return note.innerHTML;
|
||||
}
|
||||
}
|
||||
}
|
||||
for (var i=0; i<xrefs.length; i++) {
|
||||
const xref = xrefs[i];
|
||||
tippyHover(xref, undefined, function(instance) {
|
||||
instance.disable();
|
||||
let url = xref.getAttribute('href');
|
||||
let hash = undefined;
|
||||
if (url.startsWith('#')) {
|
||||
hash = url;
|
||||
} else {
|
||||
try { hash = new URL(url).hash; } catch {}
|
||||
}
|
||||
if (hash) {
|
||||
const id = hash.replace(/^#\/?/, "");
|
||||
const note = window.document.getElementById(id);
|
||||
if (note !== null) {
|
||||
try {
|
||||
const html = processXRef(id, note.cloneNode(true));
|
||||
instance.setContent(html);
|
||||
} finally {
|
||||
instance.enable();
|
||||
instance.show();
|
||||
}
|
||||
} else {
|
||||
// See if we can fetch this
|
||||
fetch(url.split('#')[0])
|
||||
.then(res => res.text())
|
||||
.then(html => {
|
||||
const parser = new DOMParser();
|
||||
const htmlDoc = parser.parseFromString(html, "text/html");
|
||||
const note = htmlDoc.getElementById(id);
|
||||
if (note !== null) {
|
||||
const html = processXRef(id, note);
|
||||
instance.setContent(html);
|
||||
}
|
||||
}).finally(() => {
|
||||
instance.enable();
|
||||
instance.show();
|
||||
});
|
||||
}
|
||||
} else {
|
||||
// See if we can fetch a full url (with no hash to target)
|
||||
// This is a special case and we should probably do some content thinning / targeting
|
||||
fetch(url)
|
||||
.then(res => res.text())
|
||||
.then(html => {
|
||||
const parser = new DOMParser();
|
||||
const htmlDoc = parser.parseFromString(html, "text/html");
|
||||
const note = htmlDoc.querySelector('main.content');
|
||||
if (note !== null) {
|
||||
// This should only happen for chapter cross references
|
||||
// (since there is no id in the URL)
|
||||
// remove the first header
|
||||
if (note.children.length > 0 && note.children[0].tagName === "HEADER") {
|
||||
note.children[0].remove();
|
||||
}
|
||||
const html = processXRef(null, note);
|
||||
instance.setContent(html);
|
||||
}
|
||||
}).finally(() => {
|
||||
instance.enable();
|
||||
instance.show();
|
||||
});
|
||||
}
|
||||
}, function(instance) {
|
||||
});
|
||||
}
|
||||
let selectedAnnoteEl;
|
||||
const selectorForAnnotation = ( cell, annotation) => {
|
||||
let cellAttr = 'data-code-cell="' + cell + '"';
|
||||
let lineAttr = 'data-code-annotation="' + annotation + '"';
|
||||
const selector = 'span[' + cellAttr + '][' + lineAttr + ']';
|
||||
return selector;
|
||||
}
|
||||
const selectCodeLines = (annoteEl) => {
|
||||
const doc = window.document;
|
||||
const targetCell = annoteEl.getAttribute("data-target-cell");
|
||||
const targetAnnotation = annoteEl.getAttribute("data-target-annotation");
|
||||
const annoteSpan = window.document.querySelector(selectorForAnnotation(targetCell, targetAnnotation));
|
||||
const lines = annoteSpan.getAttribute("data-code-lines").split(",");
|
||||
const lineIds = lines.map((line) => {
|
||||
return targetCell + "-" + line;
|
||||
})
|
||||
let top = null;
|
||||
let height = null;
|
||||
let parent = null;
|
||||
if (lineIds.length > 0) {
|
||||
//compute the position of the single el (top and bottom and make a div)
|
||||
const el = window.document.getElementById(lineIds[0]);
|
||||
top = el.offsetTop;
|
||||
height = el.offsetHeight;
|
||||
parent = el.parentElement.parentElement;
|
||||
if (lineIds.length > 1) {
|
||||
const lastEl = window.document.getElementById(lineIds[lineIds.length - 1]);
|
||||
const bottom = lastEl.offsetTop + lastEl.offsetHeight;
|
||||
height = bottom - top;
|
||||
}
|
||||
if (top !== null && height !== null && parent !== null) {
|
||||
// cook up a div (if necessary) and position it
|
||||
let div = window.document.getElementById("code-annotation-line-highlight");
|
||||
if (div === null) {
|
||||
div = window.document.createElement("div");
|
||||
div.setAttribute("id", "code-annotation-line-highlight");
|
||||
div.style.position = 'absolute';
|
||||
parent.appendChild(div);
|
||||
}
|
||||
div.style.top = top - 2 + "px";
|
||||
div.style.height = height + 4 + "px";
|
||||
div.style.left = 0;
|
||||
let gutterDiv = window.document.getElementById("code-annotation-line-highlight-gutter");
|
||||
if (gutterDiv === null) {
|
||||
gutterDiv = window.document.createElement("div");
|
||||
gutterDiv.setAttribute("id", "code-annotation-line-highlight-gutter");
|
||||
gutterDiv.style.position = 'absolute';
|
||||
const codeCell = window.document.getElementById(targetCell);
|
||||
const gutter = codeCell.querySelector('.code-annotation-gutter');
|
||||
gutter.appendChild(gutterDiv);
|
||||
}
|
||||
gutterDiv.style.top = top - 2 + "px";
|
||||
gutterDiv.style.height = height + 4 + "px";
|
||||
}
|
||||
selectedAnnoteEl = annoteEl;
|
||||
}
|
||||
};
|
||||
const unselectCodeLines = () => {
|
||||
const elementsIds = ["code-annotation-line-highlight", "code-annotation-line-highlight-gutter"];
|
||||
elementsIds.forEach((elId) => {
|
||||
const div = window.document.getElementById(elId);
|
||||
if (div) {
|
||||
div.remove();
|
||||
}
|
||||
});
|
||||
selectedAnnoteEl = undefined;
|
||||
};
|
||||
// Handle positioning of the toggle
|
||||
window.addEventListener(
|
||||
"resize",
|
||||
throttle(() => {
|
||||
elRect = undefined;
|
||||
if (selectedAnnoteEl) {
|
||||
selectCodeLines(selectedAnnoteEl);
|
||||
}
|
||||
}, 10)
|
||||
);
|
||||
function throttle(fn, ms) {
|
||||
let throttle = false;
|
||||
let timer;
|
||||
return (...args) => {
|
||||
if(!throttle) { // first call gets through
|
||||
fn.apply(this, args);
|
||||
throttle = true;
|
||||
} else { // all the others get throttled
|
||||
if(timer) clearTimeout(timer); // cancel #2
|
||||
timer = setTimeout(() => {
|
||||
fn.apply(this, args);
|
||||
timer = throttle = false;
|
||||
}, ms);
|
||||
}
|
||||
};
|
||||
}
|
||||
// Attach click handler to the DT
|
||||
const annoteDls = window.document.querySelectorAll('dt[data-target-cell]');
|
||||
for (const annoteDlNode of annoteDls) {
|
||||
annoteDlNode.addEventListener('click', (event) => {
|
||||
const clickedEl = event.target;
|
||||
if (clickedEl !== selectedAnnoteEl) {
|
||||
unselectCodeLines();
|
||||
const activeEl = window.document.querySelector('dt[data-target-cell].code-annotation-active');
|
||||
if (activeEl) {
|
||||
activeEl.classList.remove('code-annotation-active');
|
||||
}
|
||||
selectCodeLines(clickedEl);
|
||||
clickedEl.classList.add('code-annotation-active');
|
||||
} else {
|
||||
// Unselect the line
|
||||
unselectCodeLines();
|
||||
clickedEl.classList.remove('code-annotation-active');
|
||||
}
|
||||
});
|
||||
}
|
||||
const findCites = (el) => {
|
||||
const parentEl = el.parentElement;
|
||||
if (parentEl) {
|
||||
const cites = parentEl.dataset.cites;
|
||||
if (cites) {
|
||||
return {
|
||||
el,
|
||||
cites: cites.split(' ')
|
||||
};
|
||||
} else {
|
||||
return findCites(el.parentElement)
|
||||
}
|
||||
} else {
|
||||
return undefined;
|
||||
}
|
||||
};
|
||||
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
|
||||
for (var i=0; i<bibliorefs.length; i++) {
|
||||
const ref = bibliorefs[i];
|
||||
const citeInfo = findCites(ref);
|
||||
if (citeInfo) {
|
||||
tippyHover(citeInfo.el, function() {
|
||||
var popup = window.document.createElement('div');
|
||||
citeInfo.cites.forEach(function(cite) {
|
||||
var citeDiv = window.document.createElement('div');
|
||||
citeDiv.classList.add('hanging-indent');
|
||||
citeDiv.classList.add('csl-entry');
|
||||
var biblioDiv = window.document.getElementById('ref-' + cite);
|
||||
if (biblioDiv) {
|
||||
citeDiv.innerHTML = biblioDiv.innerHTML;
|
||||
}
|
||||
popup.appendChild(citeDiv);
|
||||
});
|
||||
return popup.innerHTML;
|
||||
});
|
||||
}
|
||||
}
|
||||
});
|
||||
</script>
|
||||
</div> <!-- /content -->
|
||||
|
||||
|
||||
|
||||
|
||||
</body></html>
|
||||
@@ -7,7 +7,7 @@
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
|
||||
|
||||
|
||||
<title>core.trainer_builder – Axolotl</title>
|
||||
<title>core.builders.causal – Axolotl</title>
|
||||
<style>
|
||||
code{white-space: pre-wrap;}
|
||||
span.smallcaps{font-variant: small-caps;}
|
||||
@@ -467,14 +467,11 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<h2 id="toc-title">On this page</h2>
|
||||
|
||||
<ul>
|
||||
<li><a href="#axolotl.core.trainer_builder" id="toc-axolotl.core.trainer_builder" class="nav-link active" data-scroll-target="#axolotl.core.trainer_builder">core.trainer_builder</a>
|
||||
<li><a href="#axolotl.core.builders.causal" id="toc-axolotl.core.builders.causal" class="nav-link active" data-scroll-target="#axolotl.core.builders.causal">core.builders.causal</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#classes" id="toc-classes" class="nav-link" data-scroll-target="#classes">Classes</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#axolotl.core.trainer_builder.HFCausalTrainerBuilder" id="toc-axolotl.core.trainer_builder.HFCausalTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.trainer_builder.HFCausalTrainerBuilder">HFCausalTrainerBuilder</a></li>
|
||||
<li><a href="#axolotl.core.trainer_builder.HFPPOTrainerBuilder" id="toc-axolotl.core.trainer_builder.HFPPOTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.trainer_builder.HFPPOTrainerBuilder">HFPPOTrainerBuilder</a></li>
|
||||
<li><a href="#axolotl.core.trainer_builder.HFRLTrainerBuilder" id="toc-axolotl.core.trainer_builder.HFRLTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.trainer_builder.HFRLTrainerBuilder">HFRLTrainerBuilder</a></li>
|
||||
<li><a href="#axolotl.core.trainer_builder.TrainerBuilderBase" id="toc-axolotl.core.trainer_builder.TrainerBuilderBase" class="nav-link" data-scroll-target="#axolotl.core.trainer_builder.TrainerBuilderBase">TrainerBuilderBase</a></li>
|
||||
<li><a href="#axolotl.core.builders.causal.HFCausalTrainerBuilder" id="toc-axolotl.core.builders.causal.HFCausalTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.builders.causal.HFCausalTrainerBuilder">HFCausalTrainerBuilder</a></li>
|
||||
</ul></li>
|
||||
</ul></li>
|
||||
</ul>
|
||||
@@ -486,10 +483,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
|
||||
|
||||
|
||||
<section id="axolotl.core.trainer_builder" class="level1">
|
||||
<h1>core.trainer_builder</h1>
|
||||
<p><code>core.trainer_builder</code></p>
|
||||
<p>Builder for the training args and trainer</p>
|
||||
<section id="axolotl.core.builders.causal" class="level1">
|
||||
<h1>core.builders.causal</h1>
|
||||
<p><code>core.builders.causal</code></p>
|
||||
<p>Builder for causal trainers</p>
|
||||
<section id="classes" class="level2">
|
||||
<h2 class="anchored" data-anchor-id="classes">Classes</h2>
|
||||
<table class="caption-top table">
|
||||
@@ -501,93 +498,23 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.trainer_builder.HFCausalTrainerBuilder">HFCausalTrainerBuilder</a></td>
|
||||
<td><a href="#axolotl.core.builders.causal.HFCausalTrainerBuilder">HFCausalTrainerBuilder</a></td>
|
||||
<td>Build the HuggingFace training args/trainer for causal models and reward modeling</td>
|
||||
</tr>
|
||||
<tr class="even">
|
||||
<td><a href="#axolotl.core.trainer_builder.HFPPOTrainerBuilder">HFPPOTrainerBuilder</a></td>
|
||||
<td>HF Factory class for PPO Trainer</td>
|
||||
</tr>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.trainer_builder.HFRLTrainerBuilder">HFRLTrainerBuilder</a></td>
|
||||
<td>Trainer factory class for TRL-based RLHF trainers (e.g. DPO)</td>
|
||||
</tr>
|
||||
<tr class="even">
|
||||
<td><a href="#axolotl.core.trainer_builder.TrainerBuilderBase">TrainerBuilderBase</a></td>
|
||||
<td>Base class for trainer builder.</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<section id="axolotl.core.trainer_builder.HFCausalTrainerBuilder" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainer_builder.HFCausalTrainerBuilder">HFCausalTrainerBuilder</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.HFCausalTrainerBuilder(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> model,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<section id="axolotl.core.builders.causal.HFCausalTrainerBuilder" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.builders.causal.HFCausalTrainerBuilder">HFCausalTrainerBuilder</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.builders.causal.HFCausalTrainerBuilder(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> cfg,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Build the HuggingFace training args/trainer for causal models and reward modeling
|
||||
using TRL.</p>
|
||||
</section>
|
||||
<section id="axolotl.core.trainer_builder.HFPPOTrainerBuilder" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainer_builder.HFPPOTrainerBuilder">HFPPOTrainerBuilder</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.HFPPOTrainerBuilder(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> model,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>HF Factory class for PPO Trainer</p>
|
||||
</section>
|
||||
<section id="axolotl.core.trainer_builder.HFRLTrainerBuilder" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainer_builder.HFRLTrainerBuilder">HFRLTrainerBuilder</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.HFRLTrainerBuilder(</span>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> model,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Trainer factory class for TRL-based RLHF trainers (e.g. DPO)</p>
|
||||
</section>
|
||||
<section id="axolotl.core.trainer_builder.TrainerBuilderBase" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainer_builder.TrainerBuilderBase">TrainerBuilderBase</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.TrainerBuilderBase(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> model,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Base class for trainer builder.</p>
|
||||
<section id="methods" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
|
||||
<table class="caption-top table">
|
||||
<thead>
|
||||
<tr class="header">
|
||||
<th>Name</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.trainer_builder.TrainerBuilderBase.get_post_trainer_create_callbacks">get_post_trainer_create_callbacks</a></td>
|
||||
<td>Callbacks added after the trainer is created, usually b/c these need access to the trainer</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<section id="axolotl.core.trainer_builder.TrainerBuilderBase.get_post_trainer_create_callbacks" class="level5">
|
||||
<h5 class="anchored" data-anchor-id="axolotl.core.trainer_builder.TrainerBuilderBase.get_post_trainer_create_callbacks">get_post_trainer_create_callbacks</h5>
|
||||
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>core.trainer_builder.TrainerBuilderBase.get_post_trainer_create_callbacks(</span>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> trainer,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Callbacks added after the trainer is created, usually b/c these need access to the trainer</p>
|
||||
|
||||
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
931
docs/api/core.builders.rl.html
Normal file
931
docs/api/core.builders.rl.html
Normal file
@@ -0,0 +1,931 @@
|
||||
<!DOCTYPE html>
|
||||
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
|
||||
|
||||
<meta charset="utf-8">
|
||||
<meta name="generator" content="quarto-1.7.31">
|
||||
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
|
||||
|
||||
|
||||
<title>core.builders.rl – Axolotl</title>
|
||||
<style>
|
||||
code{white-space: pre-wrap;}
|
||||
span.smallcaps{font-variant: small-caps;}
|
||||
div.columns{display: flex; gap: min(4vw, 1.5em);}
|
||||
div.column{flex: auto; overflow-x: auto;}
|
||||
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
|
||||
ul.task-list{list-style: none;}
|
||||
ul.task-list li input[type="checkbox"] {
|
||||
width: 0.8em;
|
||||
margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */
|
||||
vertical-align: middle;
|
||||
}
|
||||
/* CSS for syntax highlighting */
|
||||
html { -webkit-text-size-adjust: 100%; }
|
||||
pre > code.sourceCode { white-space: pre; position: relative; }
|
||||
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
|
||||
pre > code.sourceCode > span:empty { height: 1.2em; }
|
||||
.sourceCode { overflow: visible; }
|
||||
code.sourceCode > span { color: inherit; text-decoration: inherit; }
|
||||
div.sourceCode { margin: 1em 0; }
|
||||
pre.sourceCode { margin: 0; }
|
||||
@media screen {
|
||||
div.sourceCode { overflow: auto; }
|
||||
}
|
||||
@media print {
|
||||
pre > code.sourceCode { white-space: pre-wrap; }
|
||||
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
|
||||
}
|
||||
pre.numberSource code
|
||||
{ counter-reset: source-line 0; }
|
||||
pre.numberSource code > span
|
||||
{ position: relative; left: -4em; counter-increment: source-line; }
|
||||
pre.numberSource code > span > a:first-child::before
|
||||
{ content: counter(source-line);
|
||||
position: relative; left: -1em; text-align: right; vertical-align: baseline;
|
||||
border: none; display: inline-block;
|
||||
-webkit-touch-callout: none; -webkit-user-select: none;
|
||||
-khtml-user-select: none; -moz-user-select: none;
|
||||
-ms-user-select: none; user-select: none;
|
||||
padding: 0 4px; width: 4em;
|
||||
}
|
||||
pre.numberSource { margin-left: 3em; padding-left: 4px; }
|
||||
div.sourceCode
|
||||
{ }
|
||||
@media screen {
|
||||
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
|
||||
}
|
||||
</style>
|
||||
|
||||
|
||||
<script src="../../site_libs/quarto-nav/quarto-nav.js"></script>
|
||||
<script src="../../site_libs/clipboard/clipboard.min.js"></script>
|
||||
<script src="../../site_libs/quarto-search/autocomplete.umd.js"></script>
|
||||
<script src="../../site_libs/quarto-search/fuse.min.js"></script>
|
||||
<script src="../../site_libs/quarto-search/quarto-search.js"></script>
|
||||
<meta name="quarto:offset" content="../../">
|
||||
<link href="../../favicon.jpg" rel="icon" type="image/jpeg">
|
||||
<script src="../../site_libs/quarto-html/quarto.js" type="module"></script>
|
||||
<script src="../../site_libs/quarto-html/tabsets/tabsets.js" type="module"></script>
|
||||
<script src="../../site_libs/quarto-html/popper.min.js"></script>
|
||||
<script src="../../site_libs/quarto-html/tippy.umd.min.js"></script>
|
||||
<script src="../../site_libs/quarto-html/anchor.min.js"></script>
|
||||
<link href="../../site_libs/quarto-html/tippy.css" rel="stylesheet">
|
||||
<link href="../../site_libs/quarto-html/quarto-syntax-highlighting-dark-8ef56b68f8fa1e9d2ba328e99e439f80.css" rel="stylesheet" id="quarto-text-highlighting-styles">
|
||||
<script src="../../site_libs/bootstrap/bootstrap.min.js"></script>
|
||||
<link href="../../site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
|
||||
<link href="../../site_libs/bootstrap/bootstrap-2288ecdcbf81d2ab6432743cedd71d9a.min.css" rel="stylesheet" append-hash="true" id="quarto-bootstrap" data-mode="dark">
|
||||
<script id="quarto-search-options" type="application/json">{
|
||||
"location": "navbar",
|
||||
"copy-button": false,
|
||||
"collapse-after": 3,
|
||||
"panel-placement": "end",
|
||||
"type": "overlay",
|
||||
"limit": 50,
|
||||
"keyboard-shortcut": [
|
||||
"f",
|
||||
"/",
|
||||
"s"
|
||||
],
|
||||
"show-item-context": false,
|
||||
"language": {
|
||||
"search-no-results-text": "No results",
|
||||
"search-matching-documents-text": "matching documents",
|
||||
"search-copy-link-title": "Copy link to search",
|
||||
"search-hide-matches-text": "Hide additional matches",
|
||||
"search-more-match-text": "more match in this document",
|
||||
"search-more-matches-text": "more matches in this document",
|
||||
"search-clear-button-title": "Clear",
|
||||
"search-text-placeholder": "",
|
||||
"search-detached-cancel-button-title": "Cancel",
|
||||
"search-submit-button-title": "Submit",
|
||||
"search-label": "Search"
|
||||
}
|
||||
}</script>
|
||||
<script async="" src="https://www.googletagmanager.com/gtag/js?id=G-9KYCVJBNMQ"></script>
|
||||
|
||||
<script type="text/javascript">
|
||||
|
||||
window.dataLayer = window.dataLayer || [];
|
||||
function gtag(){dataLayer.push(arguments);}
|
||||
gtag('js', new Date());
|
||||
gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</script>
|
||||
|
||||
|
||||
<link rel="stylesheet" href="../../styles.css">
|
||||
</head>
|
||||
|
||||
<body class="nav-sidebar docked nav-fixed quarto-light">
|
||||
|
||||
<div id="quarto-search-results"></div>
|
||||
<header id="quarto-header" class="headroom fixed-top">
|
||||
<nav class="navbar navbar-expand " data-bs-theme="dark">
|
||||
<div class="navbar-container container-fluid">
|
||||
<div class="navbar-brand-container mx-auto">
|
||||
<a href="../../index.html" class="navbar-brand navbar-brand-logo">
|
||||
<img src="../../image/axolotl_logo_digital_white.svg" alt="" class="navbar-logo">
|
||||
</a>
|
||||
</div>
|
||||
<div class="quarto-navbar-tools tools-wide tools-end">
|
||||
<a href="https://twitter.com/axolotl_ai" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-twitter"></i></a>
|
||||
<a href="https://github.com/axolotl-ai-cloud/axolotl/" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-github"></i></a>
|
||||
<a href="https://discord.gg/7m9sfhzaf3" title="" class="quarto-navigation-tool px-1" aria-label=""><i class="bi bi-discord"></i></a>
|
||||
</div>
|
||||
<div id="quarto-search" class="" title="Search"></div>
|
||||
</div> <!-- /container-fluid -->
|
||||
</nav>
|
||||
<nav class="quarto-secondary-nav">
|
||||
<div class="container-fluid d-flex">
|
||||
<button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" role="button" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
|
||||
<i class="bi bi-layout-text-sidebar-reverse"></i>
|
||||
</button>
|
||||
<nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"></ol></nav>
|
||||
<a class="flex-grow-1" role="navigation" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
|
||||
</a>
|
||||
</div>
|
||||
</nav>
|
||||
</header>
|
||||
<!-- content -->
|
||||
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article page-navbar">
|
||||
<!-- sidebar -->
|
||||
<nav id="quarto-sidebar" class="sidebar collapse collapse-horizontal quarto-sidebar-collapse-item sidebar-navigation docked overflow-auto">
|
||||
<div class="sidebar-menu-container">
|
||||
<ul class="list-unstyled mt-1">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../index.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Home</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Getting Started</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-1" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-1" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/getting-started.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Quickstart</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/installation.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Installation</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/inference.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Inference and Merging</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/cli.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Command Line Interface (CLI)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/config.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Config Reference</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/api" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">API Reference</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/index.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Dataset Formats</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-2" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-2" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/pretraining.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Pre-training</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/inst_tune.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Instruction Tuning</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/conversation.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Conversation</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/stepwise_supervised.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Stepwise Supervised Format</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/template_free.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Template-Free</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset-formats/tokenized.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Custom Pre-Tokenized Dataset</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-3" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Deployments</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-3" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-3" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/docker.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Docker</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/multi-gpu.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Multi-GPU</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/multi-node.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Multi Node</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/ray-integration.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Ray Train</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/amd_hpc.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">AMD GPUs on HPC Systems</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/mac.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Mac M-series</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-4" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">How To Guides</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-4" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-4" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/multimodal.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">MultiModal / Vision Language Models (BETA)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/rlhf.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">RLHF (Beta)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/reward_modelling.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Reward Modelling</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/lr_groups.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Learning Rate Groups</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/lora_optims.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">LoRA Optimizations</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset_loading.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Dataset Loading</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/qat.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Quantization Aware Training (QAT)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/quantize.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Quantization with torchao</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-5" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Core Concepts</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-5" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-5" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/batch_vs_grad.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Batch size vs Gradient accumulation</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/dataset_preprocessing.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Dataset Preprocessing</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/multipack.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Multipack (Sample Packing)</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-6" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Advanced Features</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-6" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-6" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/fsdp_qlora.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">FDSP + QLoRA</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/unsloth.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Unsloth</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/torchao.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">PyTorch ao</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/custom_integrations.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Custom Integrations</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/sequence_parallelism.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Sequence Parallelism</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li class="sidebar-item sidebar-item-section">
|
||||
<div class="sidebar-item-container">
|
||||
<a class="sidebar-item-text sidebar-link text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-7" role="navigation" aria-expanded="true">
|
||||
<span class="menu-text">Troubleshooting</span></a>
|
||||
<a class="sidebar-item-toggle text-start" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar-section-7" role="navigation" aria-expanded="true" aria-label="Toggle section">
|
||||
<i class="bi bi-chevron-right ms-2"></i>
|
||||
</a>
|
||||
</div>
|
||||
<ul id="quarto-sidebar-section-7" class="collapse list-unstyled sidebar-section depth1 show">
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/faq.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">FAQ</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/debugging.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">Debugging</span></a>
|
||||
</div>
|
||||
</li>
|
||||
<li class="sidebar-item">
|
||||
<div class="sidebar-item-container">
|
||||
<a href="../../docs/nccl.html" class="sidebar-item-text sidebar-link">
|
||||
<span class="menu-text">NCCL</span></a>
|
||||
</div>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
</div>
|
||||
</nav>
|
||||
<div id="quarto-sidebar-glass" class="quarto-sidebar-collapse-item" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item"></div>
|
||||
<!-- margin-sidebar -->
|
||||
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
|
||||
<nav id="TOC" role="doc-toc" class="toc-active">
|
||||
<h2 id="toc-title">On this page</h2>
|
||||
|
||||
<ul>
|
||||
<li><a href="#axolotl.core.builders.rl" id="toc-axolotl.core.builders.rl" class="nav-link active" data-scroll-target="#axolotl.core.builders.rl">core.builders.rl</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#classes" id="toc-classes" class="nav-link" data-scroll-target="#classes">Classes</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#axolotl.core.builders.rl.HFPPOTrainerBuilder" id="toc-axolotl.core.builders.rl.HFPPOTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.builders.rl.HFPPOTrainerBuilder">HFPPOTrainerBuilder</a></li>
|
||||
<li><a href="#axolotl.core.builders.rl.HFRLTrainerBuilder" id="toc-axolotl.core.builders.rl.HFRLTrainerBuilder" class="nav-link" data-scroll-target="#axolotl.core.builders.rl.HFRLTrainerBuilder">HFRLTrainerBuilder</a></li>
|
||||
</ul></li>
|
||||
</ul></li>
|
||||
</ul>
|
||||
</nav>
|
||||
</div>
|
||||
<!-- main -->
|
||||
<main class="content" id="quarto-document-content"><header id="title-block-header" class="quarto-title-block"></header>
|
||||
|
||||
|
||||
|
||||
|
||||
<section id="axolotl.core.builders.rl" class="level1">
|
||||
<h1>core.builders.rl</h1>
|
||||
<p><code>core.builders.rl</code></p>
|
||||
<p>Builder for RLHF trainers</p>
|
||||
<section id="classes" class="level2">
|
||||
<h2 class="anchored" data-anchor-id="classes">Classes</h2>
|
||||
<table class="caption-top table">
|
||||
<thead>
|
||||
<tr class="header">
|
||||
<th>Name</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.builders.rl.HFPPOTrainerBuilder">HFPPOTrainerBuilder</a></td>
|
||||
<td>HF Factory class for PPO Trainer</td>
|
||||
</tr>
|
||||
<tr class="even">
|
||||
<td><a href="#axolotl.core.builders.rl.HFRLTrainerBuilder">HFRLTrainerBuilder</a></td>
|
||||
<td>Trainer factory class for TRL-based RLHF trainers (e.g. DPO)</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<section id="axolotl.core.builders.rl.HFPPOTrainerBuilder" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.builders.rl.HFPPOTrainerBuilder">HFPPOTrainerBuilder</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.builders.rl.HFPPOTrainerBuilder(cfg, model, tokenizer, processor<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>HF Factory class for PPO Trainer</p>
|
||||
</section>
|
||||
<section id="axolotl.core.builders.rl.HFRLTrainerBuilder" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.builders.rl.HFRLTrainerBuilder">HFRLTrainerBuilder</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.builders.rl.HFRLTrainerBuilder(cfg, model, tokenizer, processor<span class="op">=</span><span class="va">None</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Trainer factory class for TRL-based RLHF trainers (e.g. DPO)</p>
|
||||
|
||||
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
</main> <!-- /main -->
|
||||
<script id="quarto-html-after-body" type="application/javascript">
|
||||
window.document.addEventListener("DOMContentLoaded", function (event) {
|
||||
const icon = "";
|
||||
const anchorJS = new window.AnchorJS();
|
||||
anchorJS.options = {
|
||||
placement: 'right',
|
||||
icon: icon
|
||||
};
|
||||
anchorJS.add('.anchored');
|
||||
const isCodeAnnotation = (el) => {
|
||||
for (const clz of el.classList) {
|
||||
if (clz.startsWith('code-annotation-')) {
|
||||
return true;
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
const onCopySuccess = function(e) {
|
||||
// button target
|
||||
const button = e.trigger;
|
||||
// don't keep focus
|
||||
button.blur();
|
||||
// flash "checked"
|
||||
button.classList.add('code-copy-button-checked');
|
||||
var currentTitle = button.getAttribute("title");
|
||||
button.setAttribute("title", "Copied!");
|
||||
let tooltip;
|
||||
if (window.bootstrap) {
|
||||
button.setAttribute("data-bs-toggle", "tooltip");
|
||||
button.setAttribute("data-bs-placement", "left");
|
||||
button.setAttribute("data-bs-title", "Copied!");
|
||||
tooltip = new bootstrap.Tooltip(button,
|
||||
{ trigger: "manual",
|
||||
customClass: "code-copy-button-tooltip",
|
||||
offset: [0, -8]});
|
||||
tooltip.show();
|
||||
}
|
||||
setTimeout(function() {
|
||||
if (tooltip) {
|
||||
tooltip.hide();
|
||||
button.removeAttribute("data-bs-title");
|
||||
button.removeAttribute("data-bs-toggle");
|
||||
button.removeAttribute("data-bs-placement");
|
||||
}
|
||||
button.setAttribute("title", currentTitle);
|
||||
button.classList.remove('code-copy-button-checked');
|
||||
}, 1000);
|
||||
// clear code selection
|
||||
e.clearSelection();
|
||||
}
|
||||
const getTextToCopy = function(trigger) {
|
||||
const codeEl = trigger.previousElementSibling.cloneNode(true);
|
||||
for (const childEl of codeEl.children) {
|
||||
if (isCodeAnnotation(childEl)) {
|
||||
childEl.remove();
|
||||
}
|
||||
}
|
||||
return codeEl.innerText;
|
||||
}
|
||||
const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
|
||||
text: getTextToCopy
|
||||
});
|
||||
clipboard.on('success', onCopySuccess);
|
||||
if (window.document.getElementById('quarto-embedded-source-code-modal')) {
|
||||
const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
|
||||
text: getTextToCopy,
|
||||
container: window.document.getElementById('quarto-embedded-source-code-modal')
|
||||
});
|
||||
clipboardModal.on('success', onCopySuccess);
|
||||
}
|
||||
var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
|
||||
var mailtoRegex = new RegExp(/^mailto:/);
|
||||
var filterRegex = new RegExp("https:\/\/docs\.axolotl\.ai");
|
||||
var isInternal = (href) => {
|
||||
return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
|
||||
}
|
||||
// Inspect non-navigation links and adorn them if external
|
||||
var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
|
||||
for (var i=0; i<links.length; i++) {
|
||||
const link = links[i];
|
||||
if (!isInternal(link.href)) {
|
||||
// undo the damage that might have been done by quarto-nav.js in the case of
|
||||
// links that we want to consider external
|
||||
if (link.dataset.originalHref !== undefined) {
|
||||
link.href = link.dataset.originalHref;
|
||||
}
|
||||
}
|
||||
}
|
||||
function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
|
||||
const config = {
|
||||
allowHTML: true,
|
||||
maxWidth: 500,
|
||||
delay: 100,
|
||||
arrow: false,
|
||||
appendTo: function(el) {
|
||||
return el.parentElement;
|
||||
},
|
||||
interactive: true,
|
||||
interactiveBorder: 10,
|
||||
theme: 'quarto',
|
||||
placement: 'bottom-start',
|
||||
};
|
||||
if (contentFn) {
|
||||
config.content = contentFn;
|
||||
}
|
||||
if (onTriggerFn) {
|
||||
config.onTrigger = onTriggerFn;
|
||||
}
|
||||
if (onUntriggerFn) {
|
||||
config.onUntrigger = onUntriggerFn;
|
||||
}
|
||||
window.tippy(el, config);
|
||||
}
|
||||
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
|
||||
for (var i=0; i<noterefs.length; i++) {
|
||||
const ref = noterefs[i];
|
||||
tippyHover(ref, function() {
|
||||
// use id or data attribute instead here
|
||||
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
|
||||
try { href = new URL(href).hash; } catch {}
|
||||
const id = href.replace(/^#\/?/, "");
|
||||
const note = window.document.getElementById(id);
|
||||
if (note) {
|
||||
return note.innerHTML;
|
||||
} else {
|
||||
return "";
|
||||
}
|
||||
});
|
||||
}
|
||||
const xrefs = window.document.querySelectorAll('a.quarto-xref');
|
||||
const processXRef = (id, note) => {
|
||||
// Strip column container classes
|
||||
const stripColumnClz = (el) => {
|
||||
el.classList.remove("page-full", "page-columns");
|
||||
if (el.children) {
|
||||
for (const child of el.children) {
|
||||
stripColumnClz(child);
|
||||
}
|
||||
}
|
||||
}
|
||||
stripColumnClz(note)
|
||||
if (id === null || id.startsWith('sec-')) {
|
||||
// Special case sections, only their first couple elements
|
||||
const container = document.createElement("div");
|
||||
if (note.children && note.children.length > 2) {
|
||||
container.appendChild(note.children[0].cloneNode(true));
|
||||
for (let i = 1; i < note.children.length; i++) {
|
||||
const child = note.children[i];
|
||||
if (child.tagName === "P" && child.innerText === "") {
|
||||
continue;
|
||||
} else {
|
||||
container.appendChild(child.cloneNode(true));
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (window.Quarto?.typesetMath) {
|
||||
window.Quarto.typesetMath(container);
|
||||
}
|
||||
return container.innerHTML
|
||||
} else {
|
||||
if (window.Quarto?.typesetMath) {
|
||||
window.Quarto.typesetMath(note);
|
||||
}
|
||||
return note.innerHTML;
|
||||
}
|
||||
} else {
|
||||
// Remove any anchor links if they are present
|
||||
const anchorLink = note.querySelector('a.anchorjs-link');
|
||||
if (anchorLink) {
|
||||
anchorLink.remove();
|
||||
}
|
||||
if (window.Quarto?.typesetMath) {
|
||||
window.Quarto.typesetMath(note);
|
||||
}
|
||||
if (note.classList.contains("callout")) {
|
||||
return note.outerHTML;
|
||||
} else {
|
||||
return note.innerHTML;
|
||||
}
|
||||
}
|
||||
}
|
||||
for (var i=0; i<xrefs.length; i++) {
|
||||
const xref = xrefs[i];
|
||||
tippyHover(xref, undefined, function(instance) {
|
||||
instance.disable();
|
||||
let url = xref.getAttribute('href');
|
||||
let hash = undefined;
|
||||
if (url.startsWith('#')) {
|
||||
hash = url;
|
||||
} else {
|
||||
try { hash = new URL(url).hash; } catch {}
|
||||
}
|
||||
if (hash) {
|
||||
const id = hash.replace(/^#\/?/, "");
|
||||
const note = window.document.getElementById(id);
|
||||
if (note !== null) {
|
||||
try {
|
||||
const html = processXRef(id, note.cloneNode(true));
|
||||
instance.setContent(html);
|
||||
} finally {
|
||||
instance.enable();
|
||||
instance.show();
|
||||
}
|
||||
} else {
|
||||
// See if we can fetch this
|
||||
fetch(url.split('#')[0])
|
||||
.then(res => res.text())
|
||||
.then(html => {
|
||||
const parser = new DOMParser();
|
||||
const htmlDoc = parser.parseFromString(html, "text/html");
|
||||
const note = htmlDoc.getElementById(id);
|
||||
if (note !== null) {
|
||||
const html = processXRef(id, note);
|
||||
instance.setContent(html);
|
||||
}
|
||||
}).finally(() => {
|
||||
instance.enable();
|
||||
instance.show();
|
||||
});
|
||||
}
|
||||
} else {
|
||||
// See if we can fetch a full url (with no hash to target)
|
||||
// This is a special case and we should probably do some content thinning / targeting
|
||||
fetch(url)
|
||||
.then(res => res.text())
|
||||
.then(html => {
|
||||
const parser = new DOMParser();
|
||||
const htmlDoc = parser.parseFromString(html, "text/html");
|
||||
const note = htmlDoc.querySelector('main.content');
|
||||
if (note !== null) {
|
||||
// This should only happen for chapter cross references
|
||||
// (since there is no id in the URL)
|
||||
// remove the first header
|
||||
if (note.children.length > 0 && note.children[0].tagName === "HEADER") {
|
||||
note.children[0].remove();
|
||||
}
|
||||
const html = processXRef(null, note);
|
||||
instance.setContent(html);
|
||||
}
|
||||
}).finally(() => {
|
||||
instance.enable();
|
||||
instance.show();
|
||||
});
|
||||
}
|
||||
}, function(instance) {
|
||||
});
|
||||
}
|
||||
let selectedAnnoteEl;
|
||||
const selectorForAnnotation = ( cell, annotation) => {
|
||||
let cellAttr = 'data-code-cell="' + cell + '"';
|
||||
let lineAttr = 'data-code-annotation="' + annotation + '"';
|
||||
const selector = 'span[' + cellAttr + '][' + lineAttr + ']';
|
||||
return selector;
|
||||
}
|
||||
const selectCodeLines = (annoteEl) => {
|
||||
const doc = window.document;
|
||||
const targetCell = annoteEl.getAttribute("data-target-cell");
|
||||
const targetAnnotation = annoteEl.getAttribute("data-target-annotation");
|
||||
const annoteSpan = window.document.querySelector(selectorForAnnotation(targetCell, targetAnnotation));
|
||||
const lines = annoteSpan.getAttribute("data-code-lines").split(",");
|
||||
const lineIds = lines.map((line) => {
|
||||
return targetCell + "-" + line;
|
||||
})
|
||||
let top = null;
|
||||
let height = null;
|
||||
let parent = null;
|
||||
if (lineIds.length > 0) {
|
||||
//compute the position of the single el (top and bottom and make a div)
|
||||
const el = window.document.getElementById(lineIds[0]);
|
||||
top = el.offsetTop;
|
||||
height = el.offsetHeight;
|
||||
parent = el.parentElement.parentElement;
|
||||
if (lineIds.length > 1) {
|
||||
const lastEl = window.document.getElementById(lineIds[lineIds.length - 1]);
|
||||
const bottom = lastEl.offsetTop + lastEl.offsetHeight;
|
||||
height = bottom - top;
|
||||
}
|
||||
if (top !== null && height !== null && parent !== null) {
|
||||
// cook up a div (if necessary) and position it
|
||||
let div = window.document.getElementById("code-annotation-line-highlight");
|
||||
if (div === null) {
|
||||
div = window.document.createElement("div");
|
||||
div.setAttribute("id", "code-annotation-line-highlight");
|
||||
div.style.position = 'absolute';
|
||||
parent.appendChild(div);
|
||||
}
|
||||
div.style.top = top - 2 + "px";
|
||||
div.style.height = height + 4 + "px";
|
||||
div.style.left = 0;
|
||||
let gutterDiv = window.document.getElementById("code-annotation-line-highlight-gutter");
|
||||
if (gutterDiv === null) {
|
||||
gutterDiv = window.document.createElement("div");
|
||||
gutterDiv.setAttribute("id", "code-annotation-line-highlight-gutter");
|
||||
gutterDiv.style.position = 'absolute';
|
||||
const codeCell = window.document.getElementById(targetCell);
|
||||
const gutter = codeCell.querySelector('.code-annotation-gutter');
|
||||
gutter.appendChild(gutterDiv);
|
||||
}
|
||||
gutterDiv.style.top = top - 2 + "px";
|
||||
gutterDiv.style.height = height + 4 + "px";
|
||||
}
|
||||
selectedAnnoteEl = annoteEl;
|
||||
}
|
||||
};
|
||||
const unselectCodeLines = () => {
|
||||
const elementsIds = ["code-annotation-line-highlight", "code-annotation-line-highlight-gutter"];
|
||||
elementsIds.forEach((elId) => {
|
||||
const div = window.document.getElementById(elId);
|
||||
if (div) {
|
||||
div.remove();
|
||||
}
|
||||
});
|
||||
selectedAnnoteEl = undefined;
|
||||
};
|
||||
// Handle positioning of the toggle
|
||||
window.addEventListener(
|
||||
"resize",
|
||||
throttle(() => {
|
||||
elRect = undefined;
|
||||
if (selectedAnnoteEl) {
|
||||
selectCodeLines(selectedAnnoteEl);
|
||||
}
|
||||
}, 10)
|
||||
);
|
||||
function throttle(fn, ms) {
|
||||
let throttle = false;
|
||||
let timer;
|
||||
return (...args) => {
|
||||
if(!throttle) { // first call gets through
|
||||
fn.apply(this, args);
|
||||
throttle = true;
|
||||
} else { // all the others get throttled
|
||||
if(timer) clearTimeout(timer); // cancel #2
|
||||
timer = setTimeout(() => {
|
||||
fn.apply(this, args);
|
||||
timer = throttle = false;
|
||||
}, ms);
|
||||
}
|
||||
};
|
||||
}
|
||||
// Attach click handler to the DT
|
||||
const annoteDls = window.document.querySelectorAll('dt[data-target-cell]');
|
||||
for (const annoteDlNode of annoteDls) {
|
||||
annoteDlNode.addEventListener('click', (event) => {
|
||||
const clickedEl = event.target;
|
||||
if (clickedEl !== selectedAnnoteEl) {
|
||||
unselectCodeLines();
|
||||
const activeEl = window.document.querySelector('dt[data-target-cell].code-annotation-active');
|
||||
if (activeEl) {
|
||||
activeEl.classList.remove('code-annotation-active');
|
||||
}
|
||||
selectCodeLines(clickedEl);
|
||||
clickedEl.classList.add('code-annotation-active');
|
||||
} else {
|
||||
// Unselect the line
|
||||
unselectCodeLines();
|
||||
clickedEl.classList.remove('code-annotation-active');
|
||||
}
|
||||
});
|
||||
}
|
||||
const findCites = (el) => {
|
||||
const parentEl = el.parentElement;
|
||||
if (parentEl) {
|
||||
const cites = parentEl.dataset.cites;
|
||||
if (cites) {
|
||||
return {
|
||||
el,
|
||||
cites: cites.split(' ')
|
||||
};
|
||||
} else {
|
||||
return findCites(el.parentElement)
|
||||
}
|
||||
} else {
|
||||
return undefined;
|
||||
}
|
||||
};
|
||||
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
|
||||
for (var i=0; i<bibliorefs.length; i++) {
|
||||
const ref = bibliorefs[i];
|
||||
const citeInfo = findCites(ref);
|
||||
if (citeInfo) {
|
||||
tippyHover(citeInfo.el, function() {
|
||||
var popup = window.document.createElement('div');
|
||||
citeInfo.cites.forEach(function(cite) {
|
||||
var citeDiv = window.document.createElement('div');
|
||||
citeDiv.classList.add('hanging-indent');
|
||||
citeDiv.classList.add('csl-entry');
|
||||
var biblioDiv = window.document.getElementById('ref-' + cite);
|
||||
if (biblioDiv) {
|
||||
citeDiv.innerHTML = biblioDiv.innerHTML;
|
||||
}
|
||||
popup.appendChild(citeDiv);
|
||||
});
|
||||
return popup.innerHTML;
|
||||
});
|
||||
}
|
||||
}
|
||||
});
|
||||
</script>
|
||||
</div> <!-- /content -->
|
||||
|
||||
|
||||
|
||||
|
||||
</body></html>
|
||||
@@ -506,16 +506,15 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.core.datasets.chat.TokenizedChatDataset" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.datasets.chat.TokenizedChatDataset">TokenizedChatDataset</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.datasets.chat.TokenizedChatDataset(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> data,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> model_transform,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> message_transform<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> formatter<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> process_count<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> keep_in_memory<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> data,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model_transform,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> message_transform<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> formatter<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> process_count<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> keep_in_memory<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenized chat dataset</p>
|
||||
|
||||
|
||||
|
||||
@@ -506,13 +506,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.core.trainers.base.AxolotlTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.base.AxolotlTrainer">AxolotlTrainer</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.base.AxolotlTrainer(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Extend the base Trainer for axolotl helpers</p>
|
||||
<section id="methods" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
|
||||
|
||||
@@ -505,12 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.core.trainers.dpo.trainer.AxolotlDPOTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.dpo.trainer.AxolotlDPOTrainer">AxolotlDPOTrainer</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.dpo.trainer.AxolotlDPOTrainer(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.dpo.trainer.AxolotlDPOTrainer(<span class="op">*</span>args, dataset_tags<span class="op">=</span><span class="va">None</span>, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Extend the base DPOTrainer for axolotl helpers.</p>
|
||||
<section id="methods" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
|
||||
|
||||
@@ -509,18 +509,17 @@ sequence parallel group.</p>
|
||||
<section id="axolotl.core.trainers.grpo.sampler.SequenceParallelRepeatRandomSampler" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.sampler.SequenceParallelRepeatRandomSampler">SequenceParallelRepeatRandomSampler</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.sampler.SequenceParallelRepeatRandomSampler(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> dataset,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> mini_repeat_count,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> world_size,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> rank,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> batch_size<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> repeat_count<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> sequence_parallel_degree<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> shuffle<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> seed<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> drop_last<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> dataset,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> mini_repeat_count,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> world_size,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> rank,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> batch_size<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> repeat_count<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> sequence_parallel_degree<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> shuffle<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> seed<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> drop_last<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Sampler for GRPO training with sequence parallelism.</p>
|
||||
<p>This sampler ensures:
|
||||
- Ranks in the same sequence parallel (SP) group receive identical data.
|
||||
|
||||
@@ -511,17 +511,17 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer">AxolotlGRPOSequenceParallelTrainer</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOSequenceParallelTrainer(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> reward_funcs,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> args<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> train_dataset<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> processing_class<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> reward_processing_classes<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> callbacks<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> optimizers<span class="op">=</span>(<span class="va">None</span>, <span class="va">None</span>),</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> peft_config<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> model,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> reward_funcs,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> args<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> train_dataset<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> eval_dataset<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> processing_class<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> reward_processing_classes<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> callbacks<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> optimizers<span class="op">=</span>(<span class="va">None</span>, <span class="va">None</span>),</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> peft_config<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> optimizer_cls_and_kwargs<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Extend the base GRPOTrainer for sequence parallelism handling</p>
|
||||
<section id="methods" class="level4">
|
||||
@@ -550,7 +550,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</section>
|
||||
<section id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.grpo.trainer.AxolotlGRPOTrainer">AxolotlGRPOTrainer</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.grpo.trainer.AxolotlGRPOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Extend the base GRPOTrainer for axolotl helpers</p>
|
||||
|
||||
|
||||
|
||||
@@ -506,13 +506,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.core.trainers.mamba.AxolotlMambaTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.mamba.AxolotlMambaTrainer">AxolotlMambaTrainer</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.mamba.AxolotlMambaTrainer(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Mamba specific trainer to handle loss calculation</p>
|
||||
|
||||
|
||||
|
||||
@@ -471,6 +471,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<ul class="collapse">
|
||||
<li><a href="#classes" id="toc-classes" class="nav-link" data-scroll-target="#classes">Classes</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin" id="toc-axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin" class="nav-link" data-scroll-target="#axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin">OptimizerInitMixin</a></li>
|
||||
<li><a href="#axolotl.core.trainers.mixins.optimizer.OptimizerMixin" id="toc-axolotl.core.trainers.mixins.optimizer.OptimizerMixin" class="nav-link" data-scroll-target="#axolotl.core.trainers.mixins.optimizer.OptimizerMixin">OptimizerMixin</a></li>
|
||||
</ul></li>
|
||||
</ul></li>
|
||||
@@ -498,14 +499,24 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin">OptimizerInitMixin</a></td>
|
||||
<td>Mixin to handle common optimizer initialization logic for Trainers (mostly TRL) that do not</td>
|
||||
</tr>
|
||||
<tr class="even">
|
||||
<td><a href="#axolotl.core.trainers.mixins.optimizer.OptimizerMixin">OptimizerMixin</a></td>
|
||||
<td>Mixin class for shared handling of building custom optimizers</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<section id="axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.mixins.optimizer.OptimizerInitMixin">OptimizerInitMixin</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.mixins.optimizer.OptimizerInitMixin(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Mixin to handle common optimizer initialization logic for Trainers (mostly TRL) that do not
|
||||
accept optimizer_cls_and_kwargs as kwarg in constructor.</p>
|
||||
</section>
|
||||
<section id="axolotl.core.trainers.mixins.optimizer.OptimizerMixin" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.mixins.optimizer.OptimizerMixin">OptimizerMixin</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.mixins.optimizer.OptimizerMixin()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainers.mixins.optimizer.OptimizerMixin()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Mixin class for shared handling of building custom optimizers</p>
|
||||
|
||||
|
||||
|
||||
@@ -505,7 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.core.trainers.relora.ReLoRATrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.relora.ReLoRATrainer">ReLoRATrainer</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.relora.ReLoRATrainer(<span class="va">self</span>, <span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.relora.ReLoRATrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Trainer subclass that uses the <code>OneCycleLR</code> scheduler</p>
|
||||
|
||||
|
||||
|
||||
@@ -530,84 +530,32 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.core.trainers.trl.AxolotlCPOTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlCPOTrainer">AxolotlCPOTrainer</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlCPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlCPOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Extend the base CPOTrainer for axolotl helpers</p>
|
||||
<section id="methods" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
|
||||
<table class="caption-top table">
|
||||
<thead>
|
||||
<tr class="header">
|
||||
<th>Name</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.trainers.trl.AxolotlCPOTrainer.get_batch_loss_metrics">get_batch_loss_metrics</a></td>
|
||||
<td>Compute the CPO loss and other metrics for the given batch of inputs for train or test.</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<section id="axolotl.core.trainers.trl.AxolotlCPOTrainer.get_batch_loss_metrics" class="level5">
|
||||
<h5 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlCPOTrainer.get_batch_loss_metrics">get_batch_loss_metrics</h5>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlCPOTrainer.get_batch_loss_metrics(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> model,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> batch,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_eval<span class="op">=</span><span class="st">'train'</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Compute the CPO loss and other metrics for the given batch of inputs for train or test.</p>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
<section id="axolotl.core.trainers.trl.AxolotlKTOTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlKTOTrainer">AxolotlKTOTrainer</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlKTOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlKTOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Extend the base KTOTrainer for axolotl helpers</p>
|
||||
</section>
|
||||
<section id="axolotl.core.trainers.trl.AxolotlORPOTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlORPOTrainer">AxolotlORPOTrainer</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlORPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlORPOTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Extend the base ORPOTrainer for axolotl helpers</p>
|
||||
<section id="methods-1" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="methods-1">Methods</h4>
|
||||
<table class="caption-top table">
|
||||
<thead>
|
||||
<tr class="header">
|
||||
<th>Name</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td><a href="#axolotl.core.trainers.trl.AxolotlORPOTrainer.get_batch_loss_metrics">get_batch_loss_metrics</a></td>
|
||||
<td>Compute the ORPO loss and other metrics for the given batch of inputs for train or test.</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<section id="axolotl.core.trainers.trl.AxolotlORPOTrainer.get_batch_loss_metrics" class="level5">
|
||||
<h5 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlORPOTrainer.get_batch_loss_metrics">get_batch_loss_metrics</h5>
|
||||
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlORPOTrainer.get_batch_loss_metrics(</span>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> model,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> batch,</span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> train_eval<span class="op">=</span><span class="st">'train'</span>,</span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Compute the ORPO loss and other metrics for the given batch of inputs for train or test.</p>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
<section id="axolotl.core.trainers.trl.AxolotlPRMTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlPRMTrainer">AxolotlPRMTrainer</h3>
|
||||
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlPRMTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlPRMTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Extend the base trl.PRMTrainer for axolotl helpers</p>
|
||||
</section>
|
||||
<section id="axolotl.core.trainers.trl.AxolotlRewardTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.AxolotlRewardTrainer">AxolotlRewardTrainer</h3>
|
||||
<div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlRewardTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.AxolotlRewardTrainer(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Extend the base RewardTrainer for axolotl helpers</p>
|
||||
</section>
|
||||
<section id="axolotl.core.trainers.trl.TRLPPOTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.trainers.trl.TRLPPOTrainer">TRLPPOTrainer</h3>
|
||||
<div class="sourceCode" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.TRLPPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>core.trainers.trl.TRLPPOTrainer()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Wrapper for TRL PPO trainer to handle customizations</p>
|
||||
|
||||
|
||||
|
||||
@@ -536,326 +536,314 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.core.training_args.AxolotlCPOConfig" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlCPOConfig">AxolotlCPOConfig</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlCPOConfig(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-24"><a href="#cb1-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-25"><a href="#cb1-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-26"><a href="#cb1-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-27"><a href="#cb1-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-28"><a href="#cb1-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb1-29"><a href="#cb1-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-30"><a href="#cb1-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-31"><a href="#cb1-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-32"><a href="#cb1-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-33"><a href="#cb1-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-34"><a href="#cb1-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-35"><a href="#cb1-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-36"><a href="#cb1-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-37"><a href="#cb1-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-38"><a href="#cb1-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-39"><a href="#cb1-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-40"><a href="#cb1-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-41"><a href="#cb1-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-42"><a href="#cb1-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb1-43"><a href="#cb1-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb1-44"><a href="#cb1-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-45"><a href="#cb1-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-46"><a href="#cb1-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-47"><a href="#cb1-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-48"><a href="#cb1-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-49"><a href="#cb1-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-50"><a href="#cb1-50" aria-hidden="true" tabindex="-1"></a> simpo_gamma<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-51"><a href="#cb1-51" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-24"><a href="#cb1-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-25"><a href="#cb1-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-26"><a href="#cb1-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-27"><a href="#cb1-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb1-28"><a href="#cb1-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-29"><a href="#cb1-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-30"><a href="#cb1-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-31"><a href="#cb1-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-32"><a href="#cb1-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-33"><a href="#cb1-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-34"><a href="#cb1-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-35"><a href="#cb1-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-36"><a href="#cb1-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-37"><a href="#cb1-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-38"><a href="#cb1-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-39"><a href="#cb1-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-40"><a href="#cb1-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb1-41"><a href="#cb1-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb1-42"><a href="#cb1-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-43"><a href="#cb1-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-44"><a href="#cb1-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-45"><a href="#cb1-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-46"><a href="#cb1-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-47"><a href="#cb1-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-48"><a href="#cb1-48" aria-hidden="true" tabindex="-1"></a> simpo_gamma<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-49"><a href="#cb1-49" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>CPO config for CPO training</p>
|
||||
</section>
|
||||
<section id="axolotl.core.training_args.AxolotlKTOConfig" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlKTOConfig">AxolotlKTOConfig</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlKTOConfig(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb2-13"><a href="#cb2-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-14"><a href="#cb2-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-15"><a href="#cb2-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-16"><a href="#cb2-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-17"><a href="#cb2-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb2-18"><a href="#cb2-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb2-19"><a href="#cb2-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb2-20"><a href="#cb2-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-21"><a href="#cb2-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-22"><a href="#cb2-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-23"><a href="#cb2-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-24"><a href="#cb2-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-25"><a href="#cb2-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-26"><a href="#cb2-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-27"><a href="#cb2-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-28"><a href="#cb2-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb2-29"><a href="#cb2-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-30"><a href="#cb2-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-31"><a href="#cb2-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-32"><a href="#cb2-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-33"><a href="#cb2-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-34"><a href="#cb2-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-35"><a href="#cb2-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-36"><a href="#cb2-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-37"><a href="#cb2-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-38"><a href="#cb2-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-39"><a href="#cb2-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-40"><a href="#cb2-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-41"><a href="#cb2-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-42"><a href="#cb2-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb2-43"><a href="#cb2-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb2-44"><a href="#cb2-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-45"><a href="#cb2-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-46"><a href="#cb2-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-47"><a href="#cb2-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-48"><a href="#cb2-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-49"><a href="#cb2-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-50"><a href="#cb2-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-13"><a href="#cb2-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-14"><a href="#cb2-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-15"><a href="#cb2-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-16"><a href="#cb2-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb2-17"><a href="#cb2-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb2-18"><a href="#cb2-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb2-19"><a href="#cb2-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-20"><a href="#cb2-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-21"><a href="#cb2-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-22"><a href="#cb2-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-23"><a href="#cb2-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-24"><a href="#cb2-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-25"><a href="#cb2-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-26"><a href="#cb2-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-27"><a href="#cb2-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb2-28"><a href="#cb2-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-29"><a href="#cb2-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-30"><a href="#cb2-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-31"><a href="#cb2-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-32"><a href="#cb2-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-33"><a href="#cb2-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-34"><a href="#cb2-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-35"><a href="#cb2-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-36"><a href="#cb2-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-37"><a href="#cb2-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-38"><a href="#cb2-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-39"><a href="#cb2-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-40"><a href="#cb2-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb2-41"><a href="#cb2-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb2-42"><a href="#cb2-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-43"><a href="#cb2-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-44"><a href="#cb2-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-45"><a href="#cb2-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-46"><a href="#cb2-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-47"><a href="#cb2-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-48"><a href="#cb2-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>KTO config for KTO training</p>
|
||||
</section>
|
||||
<section id="axolotl.core.training_args.AxolotlORPOConfig" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlORPOConfig">AxolotlORPOConfig</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlORPOConfig(</span>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb3-13"><a href="#cb3-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb3-14"><a href="#cb3-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-15"><a href="#cb3-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-16"><a href="#cb3-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-17"><a href="#cb3-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb3-18"><a href="#cb3-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb3-19"><a href="#cb3-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb3-20"><a href="#cb3-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-21"><a href="#cb3-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-22"><a href="#cb3-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-23"><a href="#cb3-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb3-24"><a href="#cb3-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-25"><a href="#cb3-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-26"><a href="#cb3-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-27"><a href="#cb3-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-28"><a href="#cb3-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb3-29"><a href="#cb3-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-30"><a href="#cb3-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-31"><a href="#cb3-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-32"><a href="#cb3-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-33"><a href="#cb3-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-34"><a href="#cb3-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-35"><a href="#cb3-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-36"><a href="#cb3-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-37"><a href="#cb3-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-38"><a href="#cb3-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-39"><a href="#cb3-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-40"><a href="#cb3-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-41"><a href="#cb3-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-42"><a href="#cb3-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb3-43"><a href="#cb3-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb3-44"><a href="#cb3-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-45"><a href="#cb3-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-46"><a href="#cb3-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-47"><a href="#cb3-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-48"><a href="#cb3-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-49"><a href="#cb3-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-50"><a href="#cb3-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb3-13"><a href="#cb3-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-14"><a href="#cb3-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-15"><a href="#cb3-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-16"><a href="#cb3-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb3-17"><a href="#cb3-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb3-18"><a href="#cb3-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb3-19"><a href="#cb3-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-20"><a href="#cb3-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-21"><a href="#cb3-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-22"><a href="#cb3-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb3-23"><a href="#cb3-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-24"><a href="#cb3-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-25"><a href="#cb3-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-26"><a href="#cb3-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-27"><a href="#cb3-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb3-28"><a href="#cb3-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-29"><a href="#cb3-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-30"><a href="#cb3-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-31"><a href="#cb3-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-32"><a href="#cb3-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-33"><a href="#cb3-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-34"><a href="#cb3-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-35"><a href="#cb3-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-36"><a href="#cb3-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-37"><a href="#cb3-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-38"><a href="#cb3-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-39"><a href="#cb3-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-40"><a href="#cb3-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb3-41"><a href="#cb3-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb3-42"><a href="#cb3-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-43"><a href="#cb3-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-44"><a href="#cb3-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-45"><a href="#cb3-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-46"><a href="#cb3-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-47"><a href="#cb3-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb3-48"><a href="#cb3-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>ORPO config for ORPO training</p>
|
||||
</section>
|
||||
<section id="axolotl.core.training_args.AxolotlPRMConfig" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlPRMConfig">AxolotlPRMConfig</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlPRMConfig(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-17"><a href="#cb4-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb4-18"><a href="#cb4-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb4-19"><a href="#cb4-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb4-20"><a href="#cb4-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-21"><a href="#cb4-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-22"><a href="#cb4-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-23"><a href="#cb4-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb4-24"><a href="#cb4-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-25"><a href="#cb4-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-26"><a href="#cb4-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-27"><a href="#cb4-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-28"><a href="#cb4-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb4-29"><a href="#cb4-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-30"><a href="#cb4-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-31"><a href="#cb4-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-32"><a href="#cb4-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-33"><a href="#cb4-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-34"><a href="#cb4-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-35"><a href="#cb4-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-36"><a href="#cb4-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-37"><a href="#cb4-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-38"><a href="#cb4-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-39"><a href="#cb4-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-40"><a href="#cb4-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-41"><a href="#cb4-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-42"><a href="#cb4-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb4-43"><a href="#cb4-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb4-44"><a href="#cb4-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-45"><a href="#cb4-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-46"><a href="#cb4-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-47"><a href="#cb4-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-48"><a href="#cb4-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-49"><a href="#cb4-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-50"><a href="#cb4-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb4-17"><a href="#cb4-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb4-18"><a href="#cb4-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb4-19"><a href="#cb4-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-20"><a href="#cb4-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-21"><a href="#cb4-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-22"><a href="#cb4-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb4-23"><a href="#cb4-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-24"><a href="#cb4-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-25"><a href="#cb4-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-26"><a href="#cb4-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-27"><a href="#cb4-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb4-28"><a href="#cb4-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-29"><a href="#cb4-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-30"><a href="#cb4-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-31"><a href="#cb4-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-32"><a href="#cb4-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-33"><a href="#cb4-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-34"><a href="#cb4-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-35"><a href="#cb4-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-36"><a href="#cb4-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-37"><a href="#cb4-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-38"><a href="#cb4-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-39"><a href="#cb4-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-40"><a href="#cb4-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb4-41"><a href="#cb4-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb4-42"><a href="#cb4-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-43"><a href="#cb4-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-44"><a href="#cb4-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-45"><a href="#cb4-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-46"><a href="#cb4-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-47"><a href="#cb4-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-48"><a href="#cb4-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>PRM config for PRM training</p>
|
||||
</section>
|
||||
<section id="axolotl.core.training_args.AxolotlRewardConfig" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlRewardConfig">AxolotlRewardConfig</h3>
|
||||
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlRewardConfig(</span>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-17"><a href="#cb5-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb5-18"><a href="#cb5-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb5-19"><a href="#cb5-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb5-20"><a href="#cb5-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-21"><a href="#cb5-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-22"><a href="#cb5-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-23"><a href="#cb5-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb5-24"><a href="#cb5-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-25"><a href="#cb5-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-26"><a href="#cb5-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-27"><a href="#cb5-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-28"><a href="#cb5-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb5-29"><a href="#cb5-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-30"><a href="#cb5-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-31"><a href="#cb5-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-32"><a href="#cb5-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-33"><a href="#cb5-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-34"><a href="#cb5-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-35"><a href="#cb5-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-36"><a href="#cb5-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-37"><a href="#cb5-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-38"><a href="#cb5-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-39"><a href="#cb5-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-40"><a href="#cb5-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-41"><a href="#cb5-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-42"><a href="#cb5-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb5-43"><a href="#cb5-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb5-44"><a href="#cb5-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-45"><a href="#cb5-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-46"><a href="#cb5-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-47"><a href="#cb5-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-48"><a href="#cb5-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-49"><a href="#cb5-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-50"><a href="#cb5-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb5-17"><a href="#cb5-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb5-18"><a href="#cb5-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb5-19"><a href="#cb5-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-20"><a href="#cb5-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-21"><a href="#cb5-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-22"><a href="#cb5-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb5-23"><a href="#cb5-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-24"><a href="#cb5-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-25"><a href="#cb5-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-26"><a href="#cb5-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-27"><a href="#cb5-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb5-28"><a href="#cb5-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-29"><a href="#cb5-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-30"><a href="#cb5-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-31"><a href="#cb5-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-32"><a href="#cb5-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-33"><a href="#cb5-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-34"><a href="#cb5-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-35"><a href="#cb5-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-36"><a href="#cb5-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-37"><a href="#cb5-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-38"><a href="#cb5-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-39"><a href="#cb5-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-40"><a href="#cb5-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb5-41"><a href="#cb5-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb5-42"><a href="#cb5-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-43"><a href="#cb5-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-44"><a href="#cb5-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-45"><a href="#cb5-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-46"><a href="#cb5-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-47"><a href="#cb5-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb5-48"><a href="#cb5-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Reward config for Reward training</p>
|
||||
</section>
|
||||
<section id="axolotl.core.training_args.AxolotlTrainingArguments" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlTrainingArguments">AxolotlTrainingArguments</h3>
|
||||
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlTrainingArguments(</span>
|
||||
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-16"><a href="#cb6-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-17"><a href="#cb6-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb6-18"><a href="#cb6-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb6-19"><a href="#cb6-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb6-20"><a href="#cb6-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-21"><a href="#cb6-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-22"><a href="#cb6-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-23"><a href="#cb6-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb6-24"><a href="#cb6-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-25"><a href="#cb6-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-26"><a href="#cb6-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-27"><a href="#cb6-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-28"><a href="#cb6-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb6-29"><a href="#cb6-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-30"><a href="#cb6-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-31"><a href="#cb6-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-32"><a href="#cb6-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-33"><a href="#cb6-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-34"><a href="#cb6-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-35"><a href="#cb6-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-36"><a href="#cb6-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-37"><a href="#cb6-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-38"><a href="#cb6-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-39"><a href="#cb6-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-40"><a href="#cb6-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-41"><a href="#cb6-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-42"><a href="#cb6-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb6-43"><a href="#cb6-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb6-44"><a href="#cb6-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-45"><a href="#cb6-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-46"><a href="#cb6-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-47"><a href="#cb6-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-48"><a href="#cb6-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-49"><a href="#cb6-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-50"><a href="#cb6-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-16"><a href="#cb6-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb6-17"><a href="#cb6-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb6-18"><a href="#cb6-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb6-19"><a href="#cb6-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-20"><a href="#cb6-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-21"><a href="#cb6-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-22"><a href="#cb6-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb6-23"><a href="#cb6-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-24"><a href="#cb6-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-25"><a href="#cb6-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-26"><a href="#cb6-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-27"><a href="#cb6-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb6-28"><a href="#cb6-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-29"><a href="#cb6-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-30"><a href="#cb6-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-31"><a href="#cb6-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-32"><a href="#cb6-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-33"><a href="#cb6-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-34"><a href="#cb6-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-35"><a href="#cb6-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-36"><a href="#cb6-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-37"><a href="#cb6-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-38"><a href="#cb6-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-39"><a href="#cb6-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-40"><a href="#cb6-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb6-41"><a href="#cb6-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb6-42"><a href="#cb6-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-43"><a href="#cb6-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-44"><a href="#cb6-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-45"><a href="#cb6-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-46"><a href="#cb6-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-47"><a href="#cb6-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb6-48"><a href="#cb6-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Training arguments for Causal trainer</p>
|
||||
<p>This code is duplicated due to HF TrainingArguments not setting output_dir with a
|
||||
default value so it can’t be used as a mixin.</p>
|
||||
@@ -863,55 +851,53 @@ default value so it can’t be used as a mixin.</p>
|
||||
<section id="axolotl.core.training_args.AxolotlTrainingMixins" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.core.training_args.AxolotlTrainingMixins">AxolotlTrainingMixins</h3>
|
||||
<div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>core.training_args.AxolotlTrainingMixins(</span>
|
||||
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-16"><a href="#cb7-16" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-17"><a href="#cb7-17" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb7-18"><a href="#cb7-18" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb7-19"><a href="#cb7-19" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb7-20"><a href="#cb7-20" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-21"><a href="#cb7-21" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-22"><a href="#cb7-22" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-23"><a href="#cb7-23" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb7-24"><a href="#cb7-24" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-25"><a href="#cb7-25" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-26"><a href="#cb7-26" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-27"><a href="#cb7-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-28"><a href="#cb7-28" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb7-29"><a href="#cb7-29" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-30"><a href="#cb7-30" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-31"><a href="#cb7-31" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-32"><a href="#cb7-32" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-33"><a href="#cb7-33" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-34"><a href="#cb7-34" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-35"><a href="#cb7-35" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-36"><a href="#cb7-36" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-37"><a href="#cb7-37" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-38"><a href="#cb7-38" aria-hidden="true" tabindex="-1"></a> alternate_optimizer<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-39"><a href="#cb7-39" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-40"><a href="#cb7-40" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-41"><a href="#cb7-41" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-42"><a href="#cb7-42" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb7-43"><a href="#cb7-43" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb7-44"><a href="#cb7-44" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-45"><a href="#cb7-45" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-46"><a href="#cb7-46" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-47"><a href="#cb7-47" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-48"><a href="#cb7-48" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-49"><a href="#cb7-49" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-50"><a href="#cb7-50" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> model_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> lr_quadratic_warmup<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> pretraining<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a> sample_packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a> sample_packing_sequentially<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a> multipack_real_batches<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a> eval_sample_packing<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a> sample_packing_efficiency<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a> sample_packing_bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a> sample_packing_group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a> max_seq_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a> relora_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a> relora_warmup_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a> relora_anneal_steps<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-16"><a href="#cb7-16" aria-hidden="true" tabindex="-1"></a> relora_prune_ratio<span class="op">=</span><span class="fl">0.9</span>,</span>
|
||||
<span id="cb7-17"><a href="#cb7-17" aria-hidden="true" tabindex="-1"></a> bench_split<span class="op">=</span><span class="st">'eval'</span>,</span>
|
||||
<span id="cb7-18"><a href="#cb7-18" aria-hidden="true" tabindex="-1"></a> bench_dataset<span class="op">=</span><span class="st">'pharaouk/dharma-1/dharma_1_mini.json'</span>,</span>
|
||||
<span id="cb7-19"><a href="#cb7-19" aria-hidden="true" tabindex="-1"></a> do_bench_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-20"><a href="#cb7-20" aria-hidden="true" tabindex="-1"></a> do_causal_lm_eval<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-21"><a href="#cb7-21" aria-hidden="true" tabindex="-1"></a> max_bench_samples<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-22"><a href="#cb7-22" aria-hidden="true" tabindex="-1"></a> bench_source_max_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb7-23"><a href="#cb7-23" aria-hidden="true" tabindex="-1"></a> dataloader_prefetch_factor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-24"><a href="#cb7-24" aria-hidden="true" tabindex="-1"></a> cosine_min_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-25"><a href="#cb7-25" aria-hidden="true" tabindex="-1"></a> cosine_constant_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-26"><a href="#cb7-26" aria-hidden="true" tabindex="-1"></a> loraplus_lr_ratio<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-27"><a href="#cb7-27" aria-hidden="true" tabindex="-1"></a> loraplus_lr_embedding<span class="op">=</span><span class="fl">1e-06</span>,</span>
|
||||
<span id="cb7-28"><a href="#cb7-28" aria-hidden="true" tabindex="-1"></a> embedding_lr_scale<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-29"><a href="#cb7-29" aria-hidden="true" tabindex="-1"></a> lr_groups<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-30"><a href="#cb7-30" aria-hidden="true" tabindex="-1"></a> embedding_lr<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-31"><a href="#cb7-31" aria-hidden="true" tabindex="-1"></a> qlora<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb7-32"><a href="#cb7-32" aria-hidden="true" tabindex="-1"></a> orpo_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-33"><a href="#cb7-33" aria-hidden="true" tabindex="-1"></a> lisa_n_layers<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-34"><a href="#cb7-34" aria-hidden="true" tabindex="-1"></a> lisa_step_interval<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-35"><a href="#cb7-35" aria-hidden="true" tabindex="-1"></a> lisa_layers_attribute<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-36"><a href="#cb7-36" aria-hidden="true" tabindex="-1"></a> curriculum_sampling<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-37"><a href="#cb7-37" aria-hidden="true" tabindex="-1"></a> alternate_lr_scheduler_type<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-38"><a href="#cb7-38" aria-hidden="true" tabindex="-1"></a> chat_template<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-39"><a href="#cb7-39" aria-hidden="true" tabindex="-1"></a> kd_ce_alpha<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-40"><a href="#cb7-40" aria-hidden="true" tabindex="-1"></a> kd_alpha<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb7-41"><a href="#cb7-41" aria-hidden="true" tabindex="-1"></a> kd_temperature<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb7-42"><a href="#cb7-42" aria-hidden="true" tabindex="-1"></a> kd_zscore_base_temp<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-43"><a href="#cb7-43" aria-hidden="true" tabindex="-1"></a> kd_top_k_before_softmax<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-44"><a href="#cb7-44" aria-hidden="true" tabindex="-1"></a> adam_beta3<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-45"><a href="#cb7-45" aria-hidden="true" tabindex="-1"></a> adam_epsilon2<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-46"><a href="#cb7-46" aria-hidden="true" tabindex="-1"></a> image_size<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-47"><a href="#cb7-47" aria-hidden="true" tabindex="-1"></a> image_resize_algorithm<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb7-48"><a href="#cb7-48" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Mixin class for the Axolotl training args.</p>
|
||||
|
||||
|
||||
|
||||
@@ -510,7 +510,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.datasets.ConstantLengthDataset" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.datasets.ConstantLengthDataset">ConstantLengthDataset</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>datasets.ConstantLengthDataset(<span class="va">self</span>, tokenizer, datasets, seq_length<span class="op">=</span><span class="dv">2048</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>datasets.ConstantLengthDataset(tokenizer, datasets, seq_length<span class="op">=</span><span class="dv">2048</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Iterable dataset that returns constant length chunks of tokens from stream of text files.
|
||||
Args:
|
||||
tokenizer (Tokenizer): The processor used for processing the data.
|
||||
@@ -520,13 +520,12 @@ seq_length (int): Length of token sequences to return.</p>
|
||||
<section id="axolotl.datasets.TokenizedPromptDataset" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.datasets.TokenizedPromptDataset">TokenizedPromptDataset</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>datasets.TokenizedPromptDataset(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompt_tokenizer,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> dataset,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> process_count<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> keep_in_memory<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompt_tokenizer,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> dataset,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> process_count<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> keep_in_memory<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Dataset that returns tokenized prompts from a stream of text files.
|
||||
Args:
|
||||
prompt_tokenizer (PromptTokenizingStrategy): The prompt tokenizing method for processing the data.
|
||||
|
||||
@@ -492,8 +492,16 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<td>Common logging module for axolotl</td>
|
||||
</tr>
|
||||
<tr class="odd">
|
||||
<td><a href="../../docs/api/core.trainer_builder.html#axolotl.core.trainer_builder">core.trainer_builder</a></td>
|
||||
<td>Builder for the training args and trainer</td>
|
||||
<td><a href="../../docs/api/core.builders.base.html#axolotl.core.builders.base">core.builders.base</a></td>
|
||||
<td>Base class for trainer builder</td>
|
||||
</tr>
|
||||
<tr class="even">
|
||||
<td><a href="../../docs/api/core.builders.causal.html#axolotl.core.builders.causal">core.builders.causal</a></td>
|
||||
<td>Builder for causal trainers</td>
|
||||
</tr>
|
||||
<tr class="odd">
|
||||
<td><a href="../../docs/api/core.builders.rl.html#axolotl.core.builders.rl">core.builders.rl</a></td>
|
||||
<td>Builder for RLHF trainers</td>
|
||||
</tr>
|
||||
<tr class="even">
|
||||
<td><a href="../../docs/api/core.training_args.html#axolotl.core.training_args">core.training_args</a></td>
|
||||
|
||||
@@ -527,7 +527,7 @@ Plugins can be used to integrate third-party models, modify the training process
|
||||
</section>
|
||||
<section id="axolotl.integrations.base.BasePlugin" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.integrations.base.BasePlugin">BasePlugin</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>integrations.base.BasePlugin(<span class="va">self</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>integrations.base.BasePlugin()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Base class for all plugins. Defines the interface for plugin methods.</p>
|
||||
<p>A plugin is a reusable, modular, and self-contained piece of code that extends
|
||||
the functionality of Axolotl. Plugins can be used to integrate third-party models,
|
||||
|
||||
@@ -506,13 +506,12 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.integrations.kd.trainer.AxolotlKDTrainer" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.integrations.kd.trainer.AxolotlKDTrainer">AxolotlKDTrainer</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>integrations.kd.trainer.AxolotlKDTrainer(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>_args,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> bench_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> eval_data_collator<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> dataset_tags<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Custom trainer subclass for Knowledge Distillation (KD)</p>
|
||||
<section id="methods" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
|
||||
|
||||
@@ -507,14 +507,13 @@ models.</p>
|
||||
<section id="axolotl.loaders.model.ModelLoader" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.loaders.model.ModelLoader">ModelLoader</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>loaders.model.ModelLoader(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> cfg,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> inference<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> reference_model<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> cfg,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> inference<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> reference_model<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Manages model configuration, initialization and application of patches during
|
||||
model loading.</p>
|
||||
<p>This class orchestrates the entire process of loading a model from configuration to
|
||||
|
||||
@@ -506,7 +506,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.loaders.patch_manager.PatchManager" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.loaders.patch_manager.PatchManager">PatchManager</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>loaders.patch_manager.PatchManager(<span class="va">self</span>, cfg, model_config, inference<span class="op">=</span><span class="va">False</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>loaders.patch_manager.PatchManager(cfg, model_config, inference<span class="op">=</span><span class="va">False</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Manages the application of patches during the model loading process.</p>
|
||||
<section id="attributes" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="attributes">Attributes</h4>
|
||||
|
||||
@@ -510,23 +510,18 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.monkeypatch.attention.mllama.MllamaTextCrossFlashAttention2" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.attention.mllama.MllamaTextCrossFlashAttention2">MllamaTextCrossFlashAttention2</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.attention.mllama.MllamaTextCrossFlashAttention2(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.attention.mllama.MllamaTextCrossFlashAttention2(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Mllama flash cross-attention module. This module inherits from <code>MllamaTextCrossAttention</code> and
|
||||
implements the forward pass using Flash Attention for improved performance.</p>
|
||||
</section>
|
||||
<section id="axolotl.monkeypatch.attention.mllama.MllamaTextSelfFlashAttention2" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.attention.mllama.MllamaTextSelfFlashAttention2">MllamaTextSelfFlashAttention2</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.attention.mllama.MllamaTextSelfFlashAttention2(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> config,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> layer_idx,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> config,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> layer_idx,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Mllama flash self-attention module. This module inherits from <code>MllamaTextSelfAttention</code> and
|
||||
implements the forward pass using Flash Attention for improved performance.</p>
|
||||
|
||||
|
||||
@@ -572,11 +572,10 @@ Advanced disk-based gradient checkpointer with prefetching.</p>
|
||||
<section id="axolotl.monkeypatch.gradient_checkpointing.offload_disk.DiskOffloadManager" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.gradient_checkpointing.offload_disk.DiskOffloadManager">DiskOffloadManager</h3>
|
||||
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.gradient_checkpointing.offload_disk.DiskOffloadManager(</span>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> prefetch_size<span class="op">=</span><span class="dv">3</span>,</span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> prefetch_to_gpu<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> save_workers<span class="op">=</span><span class="dv">4</span>,</span>
|
||||
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> prefetch_size<span class="op">=</span><span class="dv">3</span>,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> prefetch_to_gpu<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> save_workers<span class="op">=</span><span class="dv">4</span>,</span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Manages offloaded tensors and handles prefetching in a separate thread.
|
||||
Includes synchronization to prevent race conditions.</p>
|
||||
<section id="methods-1" class="level4">
|
||||
|
||||
@@ -516,7 +516,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.monkeypatch.llama_attn_hijack_flash.FusedAttention" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.llama_attn_hijack_flash.FusedAttention">FusedAttention</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.llama_attn_hijack_flash.FusedAttention(<span class="va">self</span>, config, q, k, v, o)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.llama_attn_hijack_flash.FusedAttention(config, q, k, v, o)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Fused QKV Attention layer for incrementally improved training efficiency</p>
|
||||
</section>
|
||||
<section id="axolotl.monkeypatch.llama_attn_hijack_flash.LlamaDecoderLayer" class="level3">
|
||||
|
||||
@@ -513,7 +513,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.monkeypatch.lora_kernels.FakeMLP" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.lora_kernels.FakeMLP">FakeMLP</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.lora_kernels.FakeMLP(<span class="va">self</span>, gate_proj, up_proj, down_proj)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.lora_kernels.FakeMLP(gate_proj, up_proj, down_proj)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>placeholder MLP for triton patching</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
@@ -510,20 +510,19 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.monkeypatch.relora.ReLoRACallback" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.relora.ReLoRACallback">ReLoRACallback</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.relora.ReLoRACallback(<span class="va">self</span>, cfg)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.relora.ReLoRACallback(cfg)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Callback to merge LoRA weights into the base model and save full-weight checkpoints</p>
|
||||
</section>
|
||||
<section id="axolotl.monkeypatch.relora.ReLoRAScheduler" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.monkeypatch.relora.ReLoRAScheduler">ReLoRAScheduler</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>monkeypatch.relora.ReLoRAScheduler(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> inner_schedule,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> relora_steps,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> warmup_steps,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> anneal_steps<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> min_lr_scale<span class="op">=</span><span class="fl">0.001</span>,</span>
|
||||
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> inner_schedule,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> relora_steps,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> warmup_steps,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> anneal_steps<span class="op">=</span><span class="dv">1</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> min_lr_scale<span class="op">=</span><span class="fl">0.001</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Wraps another scheduler to apply per-lora-restart learning rate warmups.</p>
|
||||
|
||||
|
||||
|
||||
@@ -525,42 +525,39 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.prompt_strategies.alpaca_chat.AlpacaChatPrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.AlpacaChatPrompter">AlpacaChatPrompter</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.AlpacaChatPrompter(<span class="va">self</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.AlpacaChatPrompter()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Alpaca Chat Prompter extending the system prompt to for chat-instruct answers</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.alpaca_chat.AlpacaConcisePrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.AlpacaConcisePrompter">AlpacaConcisePrompter</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.AlpacaConcisePrompter(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Alpaca Prompter extending the system prompt to ask for concise chat-instruct answers</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.alpaca_chat.AlpacaQAPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.AlpacaQAPromptTokenizingStrategy">AlpacaQAPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.AlpacaQAPromptTokenizingStrategy(</span>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for AlpacaQA</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.alpaca_chat.CamelAIPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.CamelAIPromptTokenizingStrategy">CamelAIPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.CamelAIPromptTokenizingStrategy(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for CamelAI datasets</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.alpaca_chat.NoSystemPrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_chat.NoSystemPrompter">NoSystemPrompter</h3>
|
||||
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.NoSystemPrompter(<span class="va">self</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_chat.NoSystemPrompter()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Null Prompter with no system prompts</p>
|
||||
|
||||
|
||||
|
||||
@@ -521,39 +521,35 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_strategies.alpaca_w_system.InstructionWSystemPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_w_system.InstructionWSystemPromptTokenizingStrategy">InstructionWSystemPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_w_system.InstructionWSystemPromptTokenizingStrategy(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for instruction-based prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.alpaca_w_system.OpenOrcaPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_w_system.OpenOrcaPromptTokenizingStrategy">OpenOrcaPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_w_system.OpenOrcaPromptTokenizingStrategy(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for OpenOrca datasets</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.alpaca_w_system.OpenOrcaSystemDataPrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_w_system.OpenOrcaSystemDataPrompter">OpenOrcaSystemDataPrompter</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_w_system.OpenOrcaSystemDataPrompter(</span>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Alpaca Style Prompter that uses system prompts from the dataset, with OpenOrca prompts</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.alpaca_w_system.SystemDataPrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.alpaca_w_system.SystemDataPrompter">SystemDataPrompter</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.alpaca_w_system.SystemDataPrompter(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Alpaca Style Prompter that uses system prompts from the dataset</p>
|
||||
|
||||
|
||||
|
||||
@@ -516,35 +516,33 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_strategies.chat_template.ChatTemplatePrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.chat_template.ChatTemplatePrompter">ChatTemplatePrompter</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.chat_template.ChatTemplatePrompter(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> chat_template,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> message_property_mappings<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> message_field_training<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> message_field_training_detail<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> field_messages<span class="op">=</span><span class="st">'messages'</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> field_system<span class="op">=</span><span class="st">'system'</span>,</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> roles<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> drop_system_message<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> chat_template,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> processor<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> message_property_mappings<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> message_field_training<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> message_field_training_detail<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> field_messages<span class="op">=</span><span class="st">'messages'</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> field_system<span class="op">=</span><span class="st">'system'</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> roles<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> drop_system_message<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Prompter for HF chat templates</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.chat_template.ChatTemplateStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.chat_template.ChatTemplateStrategy">ChatTemplateStrategy</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.chat_template.ChatTemplateStrategy(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sequence_len,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> roles_to_train<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> train_on_eos<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> train_on_eot<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> eot_tokens<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a> split_thinking<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sequence_len,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> roles_to_train<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> train_on_eos<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> train_on_eot<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> eot_tokens<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> split_thinking<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for instruction-based prompts.</p>
|
||||
<section id="methods" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
|
||||
|
||||
@@ -511,11 +511,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_strategies.completion.CompletionPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.completion.CompletionPromptTokenizingStrategy">CompletionPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.completion.CompletionPromptTokenizingStrategy(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for Completion prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.completion.CompletionPrompter" class="level3">
|
||||
|
||||
@@ -516,11 +516,10 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_strategies.input_output.RawInputOutputStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.input_output.RawInputOutputStrategy">RawInputOutputStrategy</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.input_output.RawInputOutputStrategy(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> eos_token<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> eos_token<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Prompt Strategy class for input/output pairs</p>
|
||||
|
||||
|
||||
|
||||
@@ -528,24 +528,19 @@ For a custom system message, the first “from” can be “system” (followed
|
||||
</table>
|
||||
<section id="axolotl.prompt_strategies.llama2_chat.LLama2ChatTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.llama2_chat.LLama2ChatTokenizingStrategy">LLama2ChatTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.llama2_chat.LLama2ChatTokenizingStrategy(</span>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.llama2_chat.LLama2ChatTokenizingStrategy(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for Llama2 prompts.
|
||||
adapted from https://github.com/lm-sys/FastChat/blob/main/fastchat/train/train.py</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.llama2_chat.Llama2ChatConversation" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.llama2_chat.Llama2ChatConversation">Llama2ChatConversation</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.llama2_chat.Llama2ChatConversation(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> name<span class="op">=</span><span class="st">'llama2'</span>,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> system<span class="op">=</span><span class="st">"[INST] <<SYS>></span><span class="ch">\n</span><span class="st">You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.</span><span class="ch">\n\n</span><span class="st">If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.</span><span class="ch">\n</span><span class="st"><</SYS>></span><span class="ch">\n\n</span><span class="st">"</span>,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> roles<span class="op">=</span>(<span class="st">'[INST]'</span>, <span class="st">'[/INST]'</span>),</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> messages<span class="op">=</span><span class="bu">list</span>(),</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> offset<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> name<span class="op">=</span><span class="st">'llama2'</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> system<span class="op">=</span><span class="st">"[INST] <<SYS>></span><span class="ch">\n</span><span class="st">You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.</span><span class="ch">\n\n</span><span class="st">If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.</span><span class="ch">\n</span><span class="st"><</SYS>></span><span class="ch">\n\n</span><span class="st">"</span>,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> roles<span class="op">=</span>(<span class="st">'[INST]'</span>, <span class="st">'[/INST]'</span>),</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> messages<span class="op">=</span><span class="bu">list</span>(),</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> offset<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>A class that manages prompt templates and keeps all conversation history.
|
||||
copied from https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py</p>
|
||||
<section id="methods" class="level4">
|
||||
|
||||
@@ -506,12 +506,11 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_strategies.messages.chat.ChatMessageDatasetWrappingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.messages.chat.ChatMessageDatasetWrappingStrategy">ChatMessageDatasetWrappingStrategy</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.messages.chat.ChatMessageDatasetWrappingStrategy(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> processor,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> message_transform<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> formatter<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> processor,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> message_transform<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> formatter<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Chat dataset wrapping strategy for new internal messages representations</p>
|
||||
|
||||
|
||||
|
||||
@@ -511,17 +511,16 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_strategies.metharme.MetharmePromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.metharme.MetharmePromptTokenizingStrategy">MetharmePromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.metharme.MetharmePromptTokenizingStrategy(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for the Metharme models</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.metharme.MetharmePrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.metharme.MetharmePrompter">MetharmePrompter</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.metharme.MetharmePrompter(<span class="va">self</span>, <span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.metharme.MetharmePrompter(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Prompter for the Metharme models.</p>
|
||||
|
||||
|
||||
|
||||
@@ -511,9 +511,8 @@ this one specifies the system prompt with “### System:”.</p>
|
||||
<section id="axolotl.prompt_strategies.orcamini.OrcaMiniPrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.orcamini.OrcaMiniPrompter">OrcaMiniPrompter</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.orcamini.OrcaMiniPrompter(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompt_style<span class="op">=</span>PromptStyle.INSTRUCT.value,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Adjusted Prompter for Orca Mini (v2) datasets</p>
|
||||
|
||||
|
||||
|
||||
@@ -590,21 +590,16 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.orpo.chat_template.ORPOPrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.orpo.chat_template.ORPOPrompter">ORPOPrompter</h3>
|
||||
<div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.orpo.chat_template.ORPOPrompter(</span>
|
||||
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> chat_template,</span>
|
||||
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb7"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.orpo.chat_template.ORPOPrompter(chat_template, tokenizer)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Single Turn prompter for ORPO</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.orpo.chat_template.ORPOTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.orpo.chat_template.ORPOTokenizingStrategy">ORPOTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.orpo.chat_template.ORPOTokenizingStrategy(</span>
|
||||
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> dataset_parser<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> dataset_parser<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>rejected_input_ids
|
||||
input_ids
|
||||
rejected_attention_mask
|
||||
|
||||
@@ -511,17 +511,16 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_strategies.pygmalion.PygmalionPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.pygmalion.PygmalionPromptTokenizingStrategy">PygmalionPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.pygmalion.PygmalionPromptTokenizingStrategy(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for Pygmalion.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.pygmalion.PygmalionPrompter" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.pygmalion.PygmalionPrompter">PygmalionPrompter</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.pygmalion.PygmalionPrompter(<span class="va">self</span>, <span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.pygmalion.PygmalionPrompter(<span class="op">*</span>args, <span class="op">**</span>kwargs)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Prompter for Pygmalion.</p>
|
||||
|
||||
|
||||
|
||||
@@ -507,13 +507,12 @@ and (optionally) per-step, or per-prompt-trace labels for reward modelling.</p>
|
||||
<section id="axolotl.prompt_strategies.stepwise_supervised.StepwiseSupervisedPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.stepwise_supervised.StepwiseSupervisedPromptTokenizingStrategy">StepwiseSupervisedPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.stepwise_supervised.StepwiseSupervisedPromptTokenizingStrategy(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> step_separator<span class="op">=</span><span class="st">'</span><span class="ch">\n</span><span class="st">'</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> max_completion_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> train_on_last_step_only<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> step_separator<span class="op">=</span><span class="st">'</span><span class="ch">\n</span><span class="st">'</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> max_completion_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> train_on_last_step_only<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for supervised stepwise datasets, typically used for COT-reasoning.
|
||||
These datasets should include the following columns:
|
||||
- prompt: the prompt text
|
||||
|
||||
@@ -511,27 +511,25 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_strategies.user_defined.UserDefinedDatasetConfig" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.user_defined.UserDefinedDatasetConfig">UserDefinedDatasetConfig</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.user_defined.UserDefinedDatasetConfig(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> system_prompt<span class="op">=</span><span class="st">''</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> field_system<span class="op">=</span><span class="st">'system'</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> field_instruction<span class="op">=</span><span class="st">'instruction'</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> field_input<span class="op">=</span><span class="st">'input'</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> field_output<span class="op">=</span><span class="st">'output'</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> <span class="bu">format</span><span class="op">=</span><span class="st">'</span><span class="sc">{instruction}</span><span class="st"> </span><span class="sc">{input}</span><span class="st"> '</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> no_input_format<span class="op">=</span><span class="st">'</span><span class="sc">{instruction}</span><span class="st"> '</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> system_format<span class="op">=</span><span class="st">'</span><span class="sc">{system}</span><span class="st">'</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> system_prompt<span class="op">=</span><span class="st">''</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> field_system<span class="op">=</span><span class="st">'system'</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> field_instruction<span class="op">=</span><span class="st">'instruction'</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> field_input<span class="op">=</span><span class="st">'input'</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> field_output<span class="op">=</span><span class="st">'output'</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> <span class="bu">format</span><span class="op">=</span><span class="st">'</span><span class="sc">{instruction}</span><span class="st"> </span><span class="sc">{input}</span><span class="st"> '</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> no_input_format<span class="op">=</span><span class="st">'</span><span class="sc">{instruction}</span><span class="st"> '</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> system_format<span class="op">=</span><span class="st">'</span><span class="sc">{system}</span><span class="st">'</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>dataclass configuration representing a userdefined dataset type</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_strategies.user_defined.UserDefinedPromptTokenizationStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_strategies.user_defined.UserDefinedPromptTokenizationStrategy">UserDefinedPromptTokenizationStrategy</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_strategies.user_defined.UserDefinedPromptTokenizationStrategy(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Prompt Tokenization Strategy for user defined prompts</p>
|
||||
|
||||
|
||||
|
||||
@@ -571,34 +571,31 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy">AlpacaMultipleChoicePromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.AlpacaMultipleChoicePromptTokenizingStrategy(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for Alpaca Multiple Choice prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.AlpacaPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.AlpacaPromptTokenizingStrategy">AlpacaPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.AlpacaPromptTokenizingStrategy(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for Alpaca prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.AlpacaReflectionPTStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.AlpacaReflectionPTStrategy">AlpacaReflectionPTStrategy</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.AlpacaReflectionPTStrategy(</span>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for Alpaca Reflection prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.DatasetWrappingStrategy" class="level3">
|
||||
@@ -609,23 +606,21 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_tokenizers.GPTeacherPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.GPTeacherPromptTokenizingStrategy">GPTeacherPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.GPTeacherPromptTokenizingStrategy(</span>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for GPTeacher prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.InstructionPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.InstructionPromptTokenizingStrategy">InstructionPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.InstructionPromptTokenizingStrategy(</span>
|
||||
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for instruction-based prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.InvalidDataException" class="level3">
|
||||
@@ -636,67 +631,61 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.prompt_tokenizers.JeopardyPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.JeopardyPromptTokenizingStrategy">JeopardyPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.JeopardyPromptTokenizingStrategy(</span>
|
||||
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb8-7"><a href="#cb8-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for Jeopardy prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy">NomicGPT4AllPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb9"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.NomicGPT4AllPromptTokenizingStrategy(</span>
|
||||
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for NomicGPT4All prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.OpenAssistantPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.OpenAssistantPromptTokenizingStrategy">OpenAssistantPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb10"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.OpenAssistantPromptTokenizingStrategy(</span>
|
||||
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb10-6"><a href="#cb10-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb10-7"><a href="#cb10-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb10-6"><a href="#cb10-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for OpenAssistant prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.PromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.PromptTokenizingStrategy">PromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb11"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.PromptTokenizingStrategy(</span>
|
||||
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Abstract class for tokenizing strategies</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.ReflectionPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.ReflectionPromptTokenizingStrategy">ReflectionPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb12"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.ReflectionPromptTokenizingStrategy(</span>
|
||||
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb12-5"><a href="#cb12-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb12-6"><a href="#cb12-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb12-7"><a href="#cb12-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb12-5"><a href="#cb12-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb12-6"><a href="#cb12-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for Reflection prompts.</p>
|
||||
</section>
|
||||
<section id="axolotl.prompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.prompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy">SummarizeTLDRPromptTokenizingStrategy</h3>
|
||||
<div class="sourceCode" id="cb13"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>prompt_tokenizers.SummarizeTLDRPromptTokenizingStrategy(</span>
|
||||
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb13-6"><a href="#cb13-6" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb13-7"><a href="#cb13-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a> prompter,</span>
|
||||
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a> train_on_inputs<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a> sequence_len<span class="op">=</span><span class="dv">2048</span>,</span>
|
||||
<span id="cb13-6"><a href="#cb13-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Tokenizing strategy for SummarizeTLDR prompts.</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
@@ -505,10 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.utils.callbacks.comet_.SaveAxolotlConfigtoCometCallback" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.comet_.SaveAxolotlConfigtoCometCallback">SaveAxolotlConfigtoCometCallback</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.comet_.SaveAxolotlConfigtoCometCallback(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> axolotl_config_path,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.comet_.SaveAxolotlConfigtoCometCallback(axolotl_config_path)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Callback to save axolotl config to comet</p>
|
||||
|
||||
|
||||
|
||||
@@ -505,10 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.utils.callbacks.mlflow_.SaveAxolotlConfigtoMlflowCallback" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.mlflow_.SaveAxolotlConfigtoMlflowCallback">SaveAxolotlConfigtoMlflowCallback</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.mlflow_.SaveAxolotlConfigtoMlflowCallback(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> axolotl_config_path,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.mlflow_.SaveAxolotlConfigtoMlflowCallback(axolotl_config_path)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Callback to save axolotl config to mlflow</p>
|
||||
|
||||
|
||||
|
||||
@@ -505,7 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.utils.callbacks.perplexity.Perplexity" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.perplexity.Perplexity">Perplexity</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.perplexity.Perplexity(<span class="va">self</span>, tokenizer, max_seq_len, stride<span class="op">=</span><span class="dv">512</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.perplexity.Perplexity(tokenizer, max_seq_len, stride<span class="op">=</span><span class="dv">512</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Calculate perplexity as defined in https://huggingface.co/docs/transformers/en/perplexity.
|
||||
This is a custom variant that doesn’t re-tokenize the input or re-load the model.</p>
|
||||
<section id="methods" class="level4">
|
||||
|
||||
@@ -505,7 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.utils.callbacks.profiler.PytorchProfilerCallback" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.profiler.PytorchProfilerCallback">PytorchProfilerCallback</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.profiler.PytorchProfilerCallback(<span class="va">self</span>, steps_to_profile<span class="op">=</span><span class="dv">5</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.profiler.PytorchProfilerCallback(steps_to_profile<span class="op">=</span><span class="dv">5</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>PyTorch Profiler callback to create snapshots of GPU memory usage at specified steps.</p>
|
||||
|
||||
|
||||
|
||||
@@ -509,7 +509,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.utils.callbacks.qat.QATCallback" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.callbacks.qat.QATCallback">QATCallback</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.qat.QATCallback(<span class="va">self</span>, cfg)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.callbacks.qat.QATCallback(cfg)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Callback to toggle fake quantization for the model.</p>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
@@ -521,31 +521,29 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.utils.collators.batching.BatchSamplerDataCollatorForSeq2Seq" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.batching.BatchSamplerDataCollatorForSeq2Seq">BatchSamplerDataCollatorForSeq2Seq</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.collators.batching.BatchSamplerDataCollatorForSeq2Seq(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Collator for multipack specific to the using the BatchSampler</p>
|
||||
</section>
|
||||
<section id="axolotl.utils.collators.batching.DataCollatorForSeq2Seq" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.batching.DataCollatorForSeq2Seq">DataCollatorForSeq2Seq</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>utils.collators.batching.DataCollatorForSeq2Seq(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
|
||||
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
|
||||
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
|
||||
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Data collator that will dynamically pad the inputs received, as well as the labels and position_ids</p>
|
||||
<section id="parameters" class="level4 doc-section doc-section-parameters">
|
||||
<h4 class="doc-section doc-section-parameters anchored" data-anchor-id="parameters">Parameters</h4>
|
||||
@@ -614,26 +612,24 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.utils.collators.batching.PretrainingBatchSamplerDataCollatorForSeq2Seq" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.batching.PretrainingBatchSamplerDataCollatorForSeq2Seq">PretrainingBatchSamplerDataCollatorForSeq2Seq</h3>
|
||||
<div class="sourceCode" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>utils.collators.batching.PretrainingBatchSamplerDataCollatorForSeq2Seq(</span>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> multipack_attn<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a> <span class="op">*</span>args,</span>
|
||||
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> multipack_attn<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Collator for multipack specific to the using the BatchSampler</p>
|
||||
</section>
|
||||
<section id="axolotl.utils.collators.batching.V2BatchSamplerDataCollatorForSeq2Seq" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.batching.V2BatchSamplerDataCollatorForSeq2Seq">V2BatchSamplerDataCollatorForSeq2Seq</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>utils.collators.batching.V2BatchSamplerDataCollatorForSeq2Seq(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
|
||||
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
|
||||
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> model<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> max_length<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> label_pad_token_id<span class="op">=-</span><span class="dv">100</span>,</span>
|
||||
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> position_pad_token_id<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
|
||||
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Collator for multipack specific to the using the BatchSampler</p>
|
||||
|
||||
|
||||
|
||||
@@ -505,7 +505,7 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
</table>
|
||||
<section id="axolotl.utils.collators.mamba.MambaDataCollator" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.mamba.MambaDataCollator">MambaDataCollator</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.collators.mamba.MambaDataCollator(<span class="va">self</span>, tokenizer)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.collators.mamba.MambaDataCollator(tokenizer)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Collator for State Space Models (Mamba)</p>
|
||||
|
||||
|
||||
|
||||
@@ -506,14 +506,13 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.utils.collators.mm_chat.MultiModalChatDataCollator" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.collators.mm_chat.MultiModalChatDataCollator">MultiModalChatDataCollator</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.collators.mm_chat.MultiModalChatDataCollator(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> processing_strategy,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> tokenizer,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> processing_strategy,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> packing<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> return_tensors<span class="op">=</span><span class="st">'pt'</span>,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> padding<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> pad_to_multiple_of<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Collator for multi-modal chat messages</p>
|
||||
|
||||
|
||||
|
||||
@@ -680,13 +680,12 @@ from the full gradient tensor.</p>
|
||||
<section id="axolotl.utils.ctx_managers.sequence_parallel.SequenceParallelContextManager" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.ctx_managers.sequence_parallel.SequenceParallelContextManager">SequenceParallelContextManager</h3>
|
||||
<div class="sourceCode" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>utils.ctx_managers.sequence_parallel.SequenceParallelContextManager(</span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> models,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> sequence_parallel_degree,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> gradient_accumulation_steps,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> ring_attn_func,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> heads_k_stride,</span>
|
||||
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> models,</span>
|
||||
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> sequence_parallel_degree,</span>
|
||||
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> gradient_accumulation_steps,</span>
|
||||
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> ring_attn_func,</span>
|
||||
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> heads_k_stride,</span>
|
||||
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Context manager for sequence parallelism operations.</p>
|
||||
<p>This class provides a context that will automatically apply sequence parallelism
|
||||
during model forward passes using a pre-forward hook, and gather outputs from
|
||||
|
||||
@@ -538,7 +538,7 @@ window.Quarto = {
|
||||
</table>
|
||||
<section id="axolotl.utils.freeze.LayerNamePattern" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.freeze.LayerNamePattern">LayerNamePattern</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.freeze.LayerNamePattern(<span class="va">self</span>, pattern)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.freeze.LayerNamePattern(pattern)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Represents a regex pattern for layer names, potentially including a parameter index range.</p>
|
||||
<section id="methods" class="level4">
|
||||
<h4 class="anchored" data-anchor-id="methods">Methods</h4>
|
||||
|
||||
@@ -514,21 +514,20 @@ into fixed-capacity batches to optimize memory usage and training throughput.</p
|
||||
<section id="axolotl.utils.samplers.multipack.MultipackBatchSampler" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.samplers.multipack.MultipackBatchSampler">MultipackBatchSampler</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.samplers.multipack.MultipackBatchSampler(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> sampler,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> batch_size,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> batch_max_len,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> lengths,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> packing_efficiency_estimate<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> drop_last<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> num_count_samples<span class="op">=</span><span class="dv">16</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> sequential<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> num_processes<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a> safe_mode<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> sampler,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> batch_size,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> batch_max_len,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> lengths,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> packing_efficiency_estimate<span class="op">=</span><span class="fl">1.0</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> drop_last<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> num_count_samples<span class="op">=</span><span class="dv">16</span>,</span>
|
||||
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> sequential<span class="op">=</span><span class="va">False</span>,</span>
|
||||
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> group_size<span class="op">=</span><span class="dv">100000</span>,</span>
|
||||
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a> bin_size<span class="op">=</span><span class="dv">200</span>,</span>
|
||||
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a> num_processes<span class="op">=</span><span class="va">None</span>,</span>
|
||||
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a> safe_mode<span class="op">=</span><span class="va">True</span>,</span>
|
||||
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a> <span class="op">**</span>kwargs,</span>
|
||||
<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Batch sampler class for efficient packing of variable-length sequences</p>
|
||||
<p>This sampler packs sequences into fixed-capacity bins (batches) to maximize
|
||||
GPU memory utilization and training throughput by reducing padding.</p>
|
||||
|
||||
@@ -517,26 +517,24 @@ gtag('config', 'G-9KYCVJBNMQ', { 'anonymize_ip': true});
|
||||
<section id="axolotl.utils.schedulers.InterpolatingLogScheduler" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.schedulers.InterpolatingLogScheduler">InterpolatingLogScheduler</h3>
|
||||
<div class="sourceCode" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a>utils.schedulers.InterpolatingLogScheduler(</span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> num_steps,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> min_lr,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> max_lr,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> last_epoch<span class="op">=-</span><span class="dv">1</span>,</span>
|
||||
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
|
||||
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> num_steps,</span>
|
||||
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> min_lr,</span>
|
||||
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> max_lr,</span>
|
||||
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a> last_epoch<span class="op">=-</span><span class="dv">1</span>,</span>
|
||||
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>A scheduler that interpolates learning rates in a logarithmic fashion</p>
|
||||
</section>
|
||||
<section id="axolotl.utils.schedulers.RexLR" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="axolotl.utils.schedulers.RexLR">RexLR</h3>
|
||||
<div class="sourceCode" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>utils.schedulers.RexLR(</span>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="va">self</span>,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> max_lr,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> min_lr,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> total_steps<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> num_warmup_steps<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> last_step<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> optimizer,</span>
|
||||
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> max_lr,</span>
|
||||
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> min_lr,</span>
|
||||
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> total_steps<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> num_warmup_steps<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> last_step<span class="op">=</span><span class="dv">0</span>,</span>
|
||||
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<p>Reflected Exponential (REX) learning rate scheduler.</p>
|
||||
<ul>
|
||||
<li>Original implementation: https://github.com/IvanVassi/REX_LR</li>
|
||||
|
||||
Reference in New Issue
Block a user