<p>The stepwise supervised format is designed for chain-of-thought (CoT) reasoning datasets where each example contains multiple completion steps and a preference label for each step.</p>
<p>One of the most popular features of <a href="https://github.com/axolotl-ai-cloud/axolotl">axolotl</a> is setting the following configuration value:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">train_on_inputs</span><span class="kw">:</span><span class="at"> </span><span class="ch">false</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>If you declare one of the <a href="https://github.com/axolotl-ai-cloud/axolotl?tab=readme-ov-file#dataset">dataset formats</a>, such as <code>alpaca</code> or <code>chatml</code>, axolotl knows which text is an input (i.e. from the human) and which is an output (i.e. from the assistant), and it masks the input labels so that your model can focus on predicting the outputs only.</p>
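<p>To make the idea of masking concrete, here is a minimal sketch (not axolotl’s actual implementation) of what masking input labels amounts to. The helper <code>mask_inputs</code> and the token ids are hypothetical; the only fact relied on is that a label of <code>-100</code> is skipped by PyTorch’s cross-entropy loss by default.</p>

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def mask_inputs(input_ids, is_input):
    """Copy input_ids into labels, replacing input (human) positions with IGNORE_INDEX."""
    return [IGNORE_INDEX if masked else tok for tok, masked in zip(input_ids, is_input)]


# Hypothetical token ids: the first three belong to the prompt, the last two to the reply.
input_ids = [1, 15043, 29901, 7251, 2]
is_input = [True, True, True, False, False]

labels = mask_inputs(input_ids, is_input)
print(labels)  # [-100, -100, -100, 7251, 2]
```

<p>The model still sees every token as context; only the positions whose label is not <code>-100</code> contribute to the loss.</p>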
<h3 class="anchored" data-anchor-id="sec-you-may-not-want-prompt-templates">You may not want prompt templates</h3>
<p>However, there are many situations where you don’t want to use one of these formats or templates. This is because they can:</p>
<ul>
<li>Add unnecessary boilerplate to your prompts.</li>
<li>Create artifacts like special delimiters <code>&lt;|im_start|&gt;</code> that can quickly become footguns if you don’t include them correctly at inference time.</li>
<li>Enforce a <em>chat</em> interface when you do not want one. Sometimes you just want to fine-tune a model to a very specific task and do NOT want multi-turn conversations, roles, etc.</li>
<li>Limit you to only certain roles that the template allows.</li>
</ul>
<p>You can construct your prompts without a template by using the <code>input_output</code> format. To do so, set <code>type: input_output</code> in your configuration file like this:</p>
<p><strong>config.yml</strong></p>
<div class="sourceCode" id="cb2"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="fu">train_on_inputs</span><span class="kw">:</span><span class="at"> </span><span class="ch">false</span><span class="co"> # Mask segments of your data</span></span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a><span class="at"></span><span class="fu">type</span><span class="kw">:</span><span class="at"> input_output</span><span class="co"> # use template free prompt construction</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Unlike <code>type: completion</code>, which is also template-free, <code>type: input_output</code> allows you to mask segments of your text. More details on how this works are described below.</p>
<p>To use the <code>input_output</code> format, collect your data into a JSONL file in the following format (below is the first row of the file <code>output.jsonl</code>, pretty-printed):</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="ex">$</span> head <span class="at">-n1</span> output.jsonl <span class="kw">|</span> <span class="ex">python</span> <span class="at">-m</span> json.tool</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
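<p>As a rough sketch of what such a row could look like, the snippet below builds one in Python. It assumes each row holds a <code>segments</code> list whose items carry a boolean <code>label</code> and a <code>text</code> string. Those key names are an assumption here and should be checked against the axolotl documentation. The example text is illustrative only.</p>

```python
import json

# Assumed schema: {"segments": [{"label": bool, "text": str}, ...]}
row = {
    "segments": [
        {"label": True, "text": "<s>Hello\n"},      # BOS added by hand
        {"label": True, "text": "hi there!. "},
        {"label": False, "text": "goodbye "},       # masked: not trained on
        {"label": True, "text": "farewell</s>"},    # EOS added by hand
    ]
}

# One JSON object per line is what a .jsonl file expects.
jsonl_line = json.dumps(row)

# The segments are joined as-is, so this is the full text the model would see.
prompt = "".join(segment["text"] for segment in row["segments"])
print(repr(prompt))
```

<p>Appending <code>jsonl_line</code> plus a newline to <code>output.jsonl</code> for every example would produce a file in this shape.</p>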
<p>Set <code>label:false</code> when you want to mask a segment of text so that the model isn’t trained on it. Some things to keep in mind:</p>
<blockquote class="blockquote">
<p>[!IMPORTANT]</p>
<ol>
<li><strong>EOS, BOS, spaces, newlines etc. are entirely up to you. Axolotl concatenates all the segments as-is.</strong> The tokenizer doesn’t add anything additional. Notice how I added spaces, newlines, <code>&lt;s&gt;</code> (BOS), and <code>&lt;/s&gt;</code> (EOS) myself.</li>
<li>Make sure you check the materialized output to validate that the prompt is getting assembled how you like.</li>
</ol>
</blockquote>
<span id="cb5-24"><a href="#cb5-24" aria-hidden="true" tabindex="-1"></a><span class="at"></span><span class="fu">unk_token</span><span class="kw">:</span><span class="at"> </span><span class="st">"&lt;unk&gt;"</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>You can use the following command to materialize your data. The <code>--debug</code> flag will print the tokens, along with the labels, so you can verify that the correct items are being ignored:</p>
<p>The format is <code>decoded_token</code>(<code>label</code>, <code>token_id</code>). For example, <code>&lt;s&gt;(1, 1)</code> means that the token is <code>&lt;s&gt;</code>, the label is <code>1</code>, and the token_id is <code>1</code>. When the label is <code>-100</code>, that token is ignored for training.</p>
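<p>If you would rather check this format programmatically than by eye, a small parser along these lines may help. It is only a sketch: it assumes the debug entries are separated by whitespace, and both the <code>parse_debug</code> helper and the token ids in the sample string are made up for illustration.</p>

```python
import re

# Matches entries shaped like decoded_token(label, token_id)
ENTRY_RE = re.compile(r"(\S+?)\((-?\d+), (\d+)\)")


def parse_debug(line):
    """Return (token, label, token_id) tuples found in a debug line."""
    return [(tok, int(label), int(tid)) for tok, label, tid in ENTRY_RE.findall(line)]


# Hypothetical debug output; real token ids depend on your tokenizer.
sample = "<s>(1, 1) Hello(15043, 15043) good(-100, 1781) bye(-100, 26966)"

parsed = parse_debug(sample)
ignored = [tok for tok, label, _ in parsed if label == -100]
print(ignored)  # ['good', 'bye']
```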
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> hi there<span class="op">!</span>. goodbye farewell<span class="op">&lt;/</span>s<span class="op">&gt;</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>We can check that the right tokens are ignored by comparing the labels to each token:</p>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a><span class="bu">zip</span>(row[<span class="st">'input_ids'</span>], row[<span class="st">'labels'</span>])])</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
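<p>The same comparison can be sketched end to end without a real tokenizer. In the snippet below, <code>id_to_token</code> is a hypothetical stand-in for a tokenizer’s decode step, and the row is hand-written rather than loaded from a prepared dataset. Only the <code>input_ids</code> and <code>labels</code> keys and the <code>-100</code> convention come from the text above.</p>

```python
IGNORE_INDEX = -100

# Hypothetical stand-in for a tokenizer vocabulary (id -> decoded token).
id_to_token = {1: "<s>", 15043: "Hello", 1781: "good", 26966: "bye", 2: "</s>"}

# Hand-written row; a real one would come from the prepared dataset.
row = {
    "input_ids": [1, 15043, 1781, 26966, 2],
    "labels": [1, 15043, IGNORE_INDEX, IGNORE_INDEX, 2],
}


def split_by_label(row, id_to_token):
    """Split decoded tokens into those trained on and those masked out."""
    trained, masked = [], []
    for token_id, label in zip(row["input_ids"], row["labels"]):
        target = masked if label == IGNORE_INDEX else trained
        target.append(id_to_token[token_id])
    return trained, masked


trained, masked = split_by_label(row, id_to_token)
print("trained on:", trained)
print("masked:", masked)
```

<p>With a real tokenizer, the dictionary lookup would be replaced by whatever decode call that tokenizer provides.</p>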