Built site for gh-pages

This commit is contained in:
Quarto GHA Workflow Runner
2025-03-10 09:26:51 +00:00
parent 754817c8c6
commit 089c1c2c18
7 changed files with 82 additions and 57 deletions


@@ -1 +1 @@
e905cd86
5da99456


@@ -552,6 +552,19 @@ Important
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">type</span><span class="kw">:</span><span class="at"> chat_template</span></span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">roles_to_train</span><span class="kw">:</span></span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">train_on_eos</span><span class="kw">:</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Tip
</div>
</div>
<div class="callout-body-container callout-body">
<p>If you receive an error like “<code>chat_template</code> choice is <code>tokenizer_default</code> but tokenizers <code>chat_template</code> is null.”, it means the tokenizer does not have a default <code>chat_template</code>. Follow the examples below instead to set a custom <code>chat_template</code>.</p>
</div>
</div>
<ol start="2" type="1">
<li>Using the <code>gemma</code> chat template to override the <code>tokenizer_config.json</code>'s chat template on OpenAI messages format, training on all assistant messages.</li>
</ol>


@@ -486,6 +486,10 @@ ul.task-list li input[type="checkbox"] {
<blockquote class="blockquote">
<p>A: This is caused by a mismatch between <code>tokenizer.eos_token</code> and the EOS/EOT token in the template. Set <code>eos_token</code> under <code>special_tokens</code> to the same EOS/EOT token the template uses.</p>
</blockquote>
<p><strong>Q: “<code>chat_template</code> choice is <code>tokenizer_default</code> but tokenizers <code>chat_template</code> is null. Please add a <code>chat_template</code> in tokenizer config”</strong></p>
<blockquote class="blockquote">
<p>A: The tokenizer does not define a chat template. Either add one to the tokenizer config, or set a custom <code>chat_template</code> in your config. See <a href="../docs/dataset-formats/conversation.html#chat-template">chat_template</a> for more details.</p>
</blockquote>
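<p>As a sketch, one way to resolve this is to pick a named template explicitly in the config instead of relying on the tokenizer's built-in one (the <code>chatml</code> choice below is illustrative; any supported template works):</p>

```yaml
# assumption: chatml suits your model; substitute any supported template
chat_template: chatml
datasets:
  - path: ...
    type: chat_template
```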
</section>


@@ -491,22 +491,30 @@ pre > code.sourceCode > span > a:first-child::before { text-decoration: underlin
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a><span class="fu">val_set_size</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.1</span></span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a><span class="fu">eval_steps</span><span class="kw">:</span><span class="at"> </span><span class="dv">100</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Bradley-Terry chat templates expect single-turn conversations in the following format:</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode json code-with-copy"><code class="sourceCode json"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="fu">{</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="dt">"system"</span><span class="fu">:</span> <span class="st">"..."</span><span class="fu">,</span> <span class="er">//</span> <span class="er">optional</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> <span class="dt">"input"</span><span class="fu">:</span> <span class="st">"..."</span><span class="fu">,</span></span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> <span class="dt">"chosen"</span><span class="fu">:</span> <span class="st">"..."</span><span class="fu">,</span></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> <span class="dt">"rejected"</span><span class="fu">:</span> <span class="st">"..."</span></span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a><span class="fu">}</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</section>
<section id="process-reward-models-prm" class="level3">
<h3 class="anchored" data-anchor-id="process-reward-models-prm">Process Reward Models (PRM)</h3>
<p>Process reward models are trained using data which contains preference annotations for each step in a series of interactions. Typically, PRMs are trained to provide reward signals over each step of a reasoning trace and are used for downstream reinforcement learning.</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="fu">base_model</span><span class="kw">:</span><span class="at"> Qwen/Qwen2.5-3B</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="fu">model_type</span><span class="kw">:</span><span class="at"> AutoModelForTokenClassification</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="fu">num_labels</span><span class="kw">:</span><span class="at"> </span><span class="dv">2</span></span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a><span class="fu">process_reward_model</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a><span class="fu">datasets</span><span class="kw">:</span></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> </span><span class="fu">path</span><span class="kw">:</span><span class="at"> trl-lib/math_shepherd</span></span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">type</span><span class="kw">:</span><span class="at"> stepwise_supervised</span></span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">split</span><span class="kw">:</span><span class="at"> train</span></span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a><span class="fu">val_set_size</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.1</span></span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a><span class="fu">eval_steps</span><span class="kw">:</span><span class="at"> </span><span class="dv">100</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode" id="cb3"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="fu">base_model</span><span class="kw">:</span><span class="at"> Qwen/Qwen2.5-3B</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="fu">model_type</span><span class="kw">:</span><span class="at"> AutoModelForTokenClassification</span></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a><span class="fu">num_labels</span><span class="kw">:</span><span class="at"> </span><span class="dv">2</span></span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a><span class="fu">process_reward_model</span><span class="kw">:</span><span class="at"> </span><span class="ch">true</span></span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a><span class="fu">datasets</span><span class="kw">:</span></span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="kw">-</span><span class="at"> </span><span class="fu">path</span><span class="kw">:</span><span class="at"> trl-lib/math_shepherd</span></span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">type</span><span class="kw">:</span><span class="at"> stepwise_supervised</span></span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a><span class="at"> </span><span class="fu">split</span><span class="kw">:</span><span class="at"> train</span></span>
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a><span class="fu">val_set_size</span><span class="kw">:</span><span class="at"> </span><span class="fl">0.1</span></span>
<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a><span class="fu">eval_steps</span><span class="kw">:</span><span class="at"> </span><span class="dv">100</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Please see <a href="../docs/dataset-formats/stepwise_supervised.html">stepwise_supervised</a> for more details on the dataset format.</p>
</section>


@@ -317,7 +317,7 @@
"href": "docs/reward_modelling.html",
"title": "Reward Modelling",
"section": "",
"text": "Overview\nReward modelling is a technique used to train models to predict the reward or value of a given input. This is particularly useful in reinforcement learning scenarios where the model needs to evaluate the quality of its actions or predictions. We support the reward modelling techniques supported by trl.\n\n\n(Outcome) Reward Models\nOutcome reward models are trained using data which contains preference annotations for an entire interaction between the user and model (e.g. rather than per-turn or per-step).\nbase_model: google/gemma-2-2b\nmodel_type: AutoModelForSequenceClassification\nnum_labels: 1\ntokenizer_type: AutoTokenizer\n\nreward_model: true\nchat_template: gemma\ndatasets:\n - path: argilla/distilabel-intel-orca-dpo-pairs\n type: bradley_terry.chat_template\n\nval_set_size: 0.1\neval_steps: 100\n\n\nProcess Reward Models (PRM)\nProcess reward models are trained using data which contains preference annotations for each step in a series of interactions. Typically, PRMs are trained to provide reward signals over each step of a reasoning trace and are used for downstream reinforcement learning.\nbase_model: Qwen/Qwen2.5-3B\nmodel_type: AutoModelForTokenClassification\nnum_labels: 2\n\nprocess_reward_model: true\ndatasets:\n - path: trl-lib/math_shepherd\n type: stepwise_supervised\n split: train\n\nval_set_size: 0.1\neval_steps: 100",
"text": "Overview\nReward modelling is a technique used to train models to predict the reward or value of a given input. This is particularly useful in reinforcement learning scenarios where the model needs to evaluate the quality of its actions or predictions. We support the reward modelling techniques supported by trl.\n\n\n(Outcome) Reward Models\nOutcome reward models are trained using data which contains preference annotations for an entire interaction between the user and model (e.g. rather than per-turn or per-step).\nbase_model: google/gemma-2-2b\nmodel_type: AutoModelForSequenceClassification\nnum_labels: 1\ntokenizer_type: AutoTokenizer\n\nreward_model: true\nchat_template: gemma\ndatasets:\n - path: argilla/distilabel-intel-orca-dpo-pairs\n type: bradley_terry.chat_template\n\nval_set_size: 0.1\neval_steps: 100\nBradley-Terry chat templates expect single-turn conversations in the following format:\n{\n \"system\": \"...\", // optional\n \"input\": \"...\",\n \"chosen\": \"...\",\n \"rejected\": \"...\"\n}\n\n\nProcess Reward Models (PRM)\nProcess reward models are trained using data which contains preference annotations for each step in a series of interactions. Typically, PRMs are trained to provide reward signals over each step of a reasoning trace and are used for downstream reinforcement learning.\nbase_model: Qwen/Qwen2.5-3B\nmodel_type: AutoModelForTokenClassification\nnum_labels: 2\n\nprocess_reward_model: true\ndatasets:\n - path: trl-lib/math_shepherd\n type: stepwise_supervised\n split: train\n\nval_set_size: 0.1\neval_steps: 100\nPlease see stepwise_supervised for more details on the dataset format.",
"crumbs": [
"How To Guides",
"Reward Modelling"
@@ -779,7 +779,7 @@
"href": "docs/faq.html",
"title": "FAQ",
"section": "",
"text": "General\nQ: The trainer stopped and hasnt progressed in several minutes.\n\nA: Usually an issue with the GPUs communicating with each other. See the NCCL doc\n\nQ: Exitcode -9\n\nA: This usually happens when you run out of system RAM.\n\nQ: Exitcode -7 while using deepspeed\n\nA: Try upgrading deepspeed w: pip install -U deepspeed\n\nQ: AttributeError: DummyOptim object has no attribute step\nQ: ModuleNotFoundError: No module named mpi4py using single GPU with deepspeed\n\nA: You may be using deepspeed with single gpu. Please remove the deepspeed: section in the yaml file or --deepspeed CLI flag.\n\nQ: The codes is stuck on saving preprocessed datasets.\n\nA: This is usually an issue with the GPU. This can be resolved through setting the os environment variable CUDA_VISIBLE_DEVICES=0. If you are on runpod, this is usually a pod issue. Starting a new pod should take care of it.\n\n\n\nChat templates\nQ: jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'content' / 'role' / ____\n\nA: This means that the property mapping for the stated attribute does not exist when building chat_template prompt. For example, if no attribute 'content', please check you have added the correct mapping for content under message_property_mappings.\n\nQ: Empty template generated for turn ___\n\nA: The content is empty for that turn.\n\nQ: Could not find content start/end boundary for turn __\n\nA: The specific turns start/end could not be detected. Please ensure you have set the eos_token following your chat_template. Otherwise, this could be a chat_template which doesnt use proper boundaries for each turn (like system). On the rare occurrence, make sure your content is not [[dummy_message]]. Please let us know about this.\n\nQ: Content end boundary is before start boundary for turn ___\n\nA: This is an edge case which should not occur. Please create an Issue if this happens.\n\nQ: Content end boundary is the same as start boundary for turn ___. 
This is likely an empty turn.\n\nA: This is likely an empty turn.\n\nQ: The EOS/EOT token is incorrectly being masked or not being masked.\n\nA: This is because of the mismatch between tokenizer.eos_token and EOS/EOT token in template. Please make sure to set eos_token under special_tokens to the same EOS/EOT token as in template.",
"text": "General\nQ: The trainer stopped and hasnt progressed in several minutes.\n\nA: Usually an issue with the GPUs communicating with each other. See the NCCL doc\n\nQ: Exitcode -9\n\nA: This usually happens when you run out of system RAM.\n\nQ: Exitcode -7 while using deepspeed\n\nA: Try upgrading deepspeed w: pip install -U deepspeed\n\nQ: AttributeError: DummyOptim object has no attribute step\nQ: ModuleNotFoundError: No module named mpi4py using single GPU with deepspeed\n\nA: You may be using deepspeed with single gpu. Please remove the deepspeed: section in the yaml file or --deepspeed CLI flag.\n\nQ: The codes is stuck on saving preprocessed datasets.\n\nA: This is usually an issue with the GPU. This can be resolved through setting the os environment variable CUDA_VISIBLE_DEVICES=0. If you are on runpod, this is usually a pod issue. Starting a new pod should take care of it.\n\n\n\nChat templates\nQ: jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'content' / 'role' / ____\n\nA: This means that the property mapping for the stated attribute does not exist when building chat_template prompt. For example, if no attribute 'content', please check you have added the correct mapping for content under message_property_mappings.\n\nQ: Empty template generated for turn ___\n\nA: The content is empty for that turn.\n\nQ: Could not find content start/end boundary for turn __\n\nA: The specific turns start/end could not be detected. Please ensure you have set the eos_token following your chat_template. Otherwise, this could be a chat_template which doesnt use proper boundaries for each turn (like system). On the rare occurrence, make sure your content is not [[dummy_message]]. Please let us know about this.\n\nQ: Content end boundary is before start boundary for turn ___\n\nA: This is an edge case which should not occur. Please create an Issue if this happens.\n\nQ: Content end boundary is the same as start boundary for turn ___. 
This is likely an empty turn.\n\nA: This is likely an empty turn.\n\nQ: The EOS/EOT token is incorrectly being masked or not being masked.\n\nA: This is because of the mismatch between tokenizer.eos_token and EOS/EOT token in template. Please make sure to set eos_token under special_tokens to the same EOS/EOT token as in template.\n\nQ: “chat_template choice is tokenizer_default but tokenizers chat_template is null. Please add a chat_template in tokenizer config”\n\nA: This is because the tokenizer does not have a chat template. Please add a chat template in the tokenizer config. See chat_template for more details.",
"crumbs": [
"Troubleshooting",
"FAQ"
@@ -1314,7 +1314,7 @@
"href": "docs/dataset-formats/conversation.html#chat_template",
"title": "Conversation",
"section": "chat_template",
"text": "chat_template\nChat Template strategy uses a jinja2 template that converts a list of messages into a prompt. Support using tokenizers template, a supported template, or custom jinja2.\n\n\ndata.jsonl\n\n{\"conversations\": [{\"role\": \"...\", \"content\": \"...\"}]}\n\nSee configs for full configs and supported templates.\n\nMigrating from sharegpt\nMost configs can be adapted as follows:\n# old\nchat_template: chatml\ndatasets:\n - path: ...\n type: sharegpt\n conversation: chatml\n\n# new (if using tokenizer's chat_template)\ndatasets:\n - path: ...\n type: chat_template\n\n field_messages: conversations\n message_property_mappings:\n role: from\n content: value\n\n# new (if setting a new chat_template like chatml, gemma, etc)\nchat_template: chatml\ndatasets:\n - path: ...\n type: chat_template\n\n field_messages: conversations\n message_property_mappings:\n role: from\n content: value\nWe recommend checking the below examples for other usecases.\n\n\nExamples\n\nUsing the default chat template in the tokenizer_config.json on OpenAI messages format, training on only last message.\n\ndatasets:\n - path: ...\n type: chat_template\n roles_to_train:\n train_on_eos:\n\nUsing the gemma chat template to override the tokenizer_config.jsons chat template on OpenAI messages format, training on all assistant messages.\n\nchat_template: gemma # this overwrites the tokenizer's chat_template\ndatasets:\n - path: ...\n type: chat_template\n roles_to_train: [\"assistant\"] # default value\n\nUsing the tokenizer_config.jsons chat template or chatml as fallback if the formers chat template does not exist, on OpenAI messages format, training on all assistant messages.\n\nchat_template: tokenizer_default_fallback_chatml # this overwrites the tokenizer's chat_template\ndatasets:\n - path: ...\n type: chat_template\n\nUsing a custom jinja template on OpenAI messages format, training on all assistant messages.\n\n# chat_template: jinja # `jinja` will be implied if the 
`chat_template_jinja` is set and this field is empty\nchat_template_jinja: \"{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'system') %}{{'&lt;|system|&gt;' + '\\n' + message['content'] + '&lt;|end|&gt;' + '\\n'}}{% elif (message['role'] == 'user') %}{{'&lt;|user|&gt;' + '\\n' + message['content'] + '&lt;|end|&gt;' + '\\n' + '&lt;|assistant|&gt;' + '\\n'}}{% elif message['role'] == 'assistant' %}{{message['content'] + '&lt;|end|&gt;' + '\\n'}}{% endif %}{% endfor %}\"\n\ndatasets:\n - path: ...\n type: chat_template\n\n\n\n\n\n\nImportant\n\n\n\nPlease make sure that your tokenizer.eos_token is same as EOS/EOT token in template. Otherwise, set eos_token under special_tokens.\n\n\n\n(Advanced) Using fine-grained control over tokens and turns to train in a conversation\n\nFor a data sample that looks like:\n\n\ndata.jsonl\n\n{\n \"conversations\": [\n {\"from\": \"system\", \"value\": \"You are an AI assistant.\", \"train\": false},\n {\"from\": \"human\", \"value\": \"Hello\", \"train\": false},\n {\"from\": \"assistant\", \"value\": \"Hello\", \"train\": true},\n {\"from\": \"human\", \"value\": \"How are you?\", \"train\": true},\n {\n \"from\": \"assistant\",\n \"value\": \"I'm doing very well, thank you!\",\n \"train_detail\": [\n {\"begin_offset\": 0, \"end_offset\": 8, \"train\": false},\n {\"begin_offset\": 9, \"end_offset\": 18, \"train\": true},\n {\"begin_offset\": 19, \"end_offset\": 30, \"train\": false},\n ],\n },\n {\n \"from\": \"human\",\n \"value\": \"I'm doing very well, thank you!\",\n \"train\": true,\n },\n {\"from\": \"assistant\", \"value\": \"Hi there!\", \"train\": true}\n ]\n}\n\nThe configuration would look like:\ndatasets:\n - path: ...\n type: chat_template\n chat_template: tokenizer_default\n field_messages: conversations\n message_property_mappings:\n role: from\n content: value\n roles_to_train: []\n train_on_eos: turn\n message_field_training: train\n message_field_training_detail: 
train_detail\n\n\n\n\n\n\nTip\n\n\n\nIt is not necessary to set both message_field_training and message_field_training_detail at once.",
"text": "chat_template\nChat Template strategy uses a jinja2 template that converts a list of messages into a prompt. Support using tokenizers template, a supported template, or custom jinja2.\n\n\ndata.jsonl\n\n{\"conversations\": [{\"role\": \"...\", \"content\": \"...\"}]}\n\nSee configs for full configs and supported templates.\n\nMigrating from sharegpt\nMost configs can be adapted as follows:\n# old\nchat_template: chatml\ndatasets:\n - path: ...\n type: sharegpt\n conversation: chatml\n\n# new (if using tokenizer's chat_template)\ndatasets:\n - path: ...\n type: chat_template\n\n field_messages: conversations\n message_property_mappings:\n role: from\n content: value\n\n# new (if setting a new chat_template like chatml, gemma, etc)\nchat_template: chatml\ndatasets:\n - path: ...\n type: chat_template\n\n field_messages: conversations\n message_property_mappings:\n role: from\n content: value\nWe recommend checking the below examples for other usecases.\n\n\nExamples\n\nUsing the default chat template in the tokenizer_config.json on OpenAI messages format, training on only last message.\n\ndatasets:\n - path: ...\n type: chat_template\n roles_to_train:\n train_on_eos:\n\n\n\n\n\n\nTip\n\n\n\nIf you receive an error like “chat_template choice is tokenizer_default but tokenizers chat_template is null.”, it means the tokenizer does not have a default chat_template. 
Follow the examples below instead to set a custom chat_template.\n\n\n\nUsing the gemma chat template to override the tokenizer_config.jsons chat template on OpenAI messages format, training on all assistant messages.\n\nchat_template: gemma # this overwrites the tokenizer's chat_template\ndatasets:\n - path: ...\n type: chat_template\n roles_to_train: [\"assistant\"] # default value\n\nUsing the tokenizer_config.jsons chat template or chatml as fallback if the formers chat template does not exist, on OpenAI messages format, training on all assistant messages.\n\nchat_template: tokenizer_default_fallback_chatml # this overwrites the tokenizer's chat_template\ndatasets:\n - path: ...\n type: chat_template\n\nUsing a custom jinja template on OpenAI messages format, training on all assistant messages.\n\n# chat_template: jinja # `jinja` will be implied if the `chat_template_jinja` is set and this field is empty\nchat_template_jinja: \"{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'system') %}{{'&lt;|system|&gt;' + '\\n' + message['content'] + '&lt;|end|&gt;' + '\\n'}}{% elif (message['role'] == 'user') %}{{'&lt;|user|&gt;' + '\\n' + message['content'] + '&lt;|end|&gt;' + '\\n' + '&lt;|assistant|&gt;' + '\\n'}}{% elif message['role'] == 'assistant' %}{{message['content'] + '&lt;|end|&gt;' + '\\n'}}{% endif %}{% endfor %}\"\n\ndatasets:\n - path: ...\n type: chat_template\n\n\n\n\n\n\nImportant\n\n\n\nPlease make sure that your tokenizer.eos_token is same as EOS/EOT token in template. 
Otherwise, set eos_token under special_tokens.\n\n\n\n(Advanced) Using fine-grained control over tokens and turns to train in a conversation\n\nFor a data sample that looks like:\n\n\ndata.jsonl\n\n{\n \"conversations\": [\n {\"from\": \"system\", \"value\": \"You are an AI assistant.\", \"train\": false},\n {\"from\": \"human\", \"value\": \"Hello\", \"train\": false},\n {\"from\": \"assistant\", \"value\": \"Hello\", \"train\": true},\n {\"from\": \"human\", \"value\": \"How are you?\", \"train\": true},\n {\n \"from\": \"assistant\",\n \"value\": \"I'm doing very well, thank you!\",\n \"train_detail\": [\n {\"begin_offset\": 0, \"end_offset\": 8, \"train\": false},\n {\"begin_offset\": 9, \"end_offset\": 18, \"train\": true},\n {\"begin_offset\": 19, \"end_offset\": 30, \"train\": false},\n ],\n },\n {\n \"from\": \"human\",\n \"value\": \"I'm doing very well, thank you!\",\n \"train\": true,\n },\n {\"from\": \"assistant\", \"value\": \"Hi there!\", \"train\": true}\n ]\n}\n\nThe configuration would look like:\ndatasets:\n - path: ...\n type: chat_template\n chat_template: tokenizer_default\n field_messages: conversations\n message_property_mappings:\n role: from\n content: value\n roles_to_train: []\n train_on_eos: turn\n message_field_training: train\n message_field_training_detail: train_detail\n\n\n\n\n\n\nTip\n\n\n\nIt is not necessary to set both message_field_training and message_field_training_detail at once.",
"crumbs": [
"Dataset Formats",
"Conversation"


@@ -2,162 +2,162 @@
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/examples/colab-notebooks/colab-axolotl-example.html</loc>
<lastmod>2025-03-07T13:59:04.910Z</lastmod>
<lastmod>2025-03-10T09:26:02.164Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/stepwise_supervised.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/template_free.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/tokenized.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/nccl.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/amd_hpc.html</loc>
<lastmod>2025-03-07T13:59:04.905Z</lastmod>
<lastmod>2025-03-10T09:26:02.159Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/config.html</loc>
<lastmod>2025-03-07T13:59:04.905Z</lastmod>
<lastmod>2025-03-10T09:26:02.159Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/multi-gpu.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/installation.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/torchao.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/reward_modelling.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/input_output.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/multimodal.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/getting-started.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/inference.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/multipack.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/debugging.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/lr_groups.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/TODO.html</loc>
<lastmod>2025-03-07T13:59:04.904Z</lastmod>
<lastmod>2025-03-10T09:26:02.158Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/src/axolotl/integrations/LICENSE.html</loc>
<lastmod>2025-03-07T13:59:04.924Z</lastmod>
<lastmod>2025-03-10T09:26:02.178Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/index.html</loc>
<lastmod>2025-03-07T13:59:04.921Z</lastmod>
<lastmod>2025-03-10T09:26:02.175Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/src/axolotl/integrations/cut_cross_entropy/ACKNOWLEDGEMENTS.html</loc>
<lastmod>2025-03-07T13:59:04.924Z</lastmod>
<lastmod>2025-03-10T09:26:02.178Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/FAQS.html</loc>
<lastmod>2025-03-07T13:59:04.904Z</lastmod>
<lastmod>2025-03-10T09:26:02.158Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/multi-node.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/faq.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/batch_vs_grad.html</loc>
<lastmod>2025-03-07T13:59:04.905Z</lastmod>
<lastmod>2025-03-10T09:26:02.159Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/lora_optims.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/rlhf.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/cli.html</loc>
<lastmod>2025-03-07T13:59:04.905Z</lastmod>
<lastmod>2025-03-10T09:26:02.159Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/unsloth.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/fsdp_qlora.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/dataset_preprocessing.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/custom_integrations.html</loc>
<lastmod>2025-03-07T13:59:04.905Z</lastmod>
<lastmod>2025-03-10T09:26:02.159Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/mac.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/docker.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/ray-integration.html</loc>
<lastmod>2025-03-07T13:59:04.909Z</lastmod>
<lastmod>2025-03-10T09:26:02.163Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/index.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/conversation.html</loc>
<lastmod>2025-03-07T13:59:04.905Z</lastmod>
<lastmod>2025-03-10T09:26:02.159Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/pretraining.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
<url>
<loc>https://axolotl-ai-cloud.github.io/axolotl/docs/dataset-formats/inst_tune.html</loc>
<lastmod>2025-03-07T13:59:04.906Z</lastmod>
<lastmod>2025-03-10T09:26:02.160Z</lastmod>
</url>
</urlset>


@@ -14,7 +14,7 @@
h1 {
font-family: var(--font-title);
font-weight: 400;
font-size: 6rem;
font-size: 5rem;
line-height: 1.1;
letter-spacing: -0.05em;
font-feature-settings: "ss01" on;