```python
find_turn(
    turns,
    turn_idx,
    tools=None,
    content_only=False,
    reasoning_only=False,
)
```
Locate the starting and ending indices of the specified turn in a conversation.
| Name | Type | Description | Default |
|---|---|---|---|
| `content_only` | bool | If True and the turn has `reasoning_content` (`template_thinking_key`), preserve `reasoning_content` in the dummy turn so the diff only captures the `content` field boundaries. This is needed for correct `training_detail` alignment when `reasoning_content` is present. | `False` |
| `reasoning_only` | bool | If True, preserve `content` in the dummy turn and replace `reasoning_content` with a dummy, so the diff only captures the `reasoning_content` field boundaries. | `False` |
The handling of the EOT tokens listed in `eot_tokens` follows `train_on_eos`, which defaults to `turn`; it can also be set explicitly, e.g. `train_on_eos: last`.

Instead of passing tools via the system prompt, an alternative is to keep the tools in a separate column and load them via `chat_template`, letting the template build the tool prompt dynamically:
```json
{
+ "tools": [
+ {
+ "type": "...",
+ "function": {
+ "name": "...",
+ "description": "...",
+ "parameters": {
+ "type": "...",
+ "properties": {
+ // ...
+ },
+ "required": ["..."],
+ },
+ },
+ },
+ ],
+ "messages": [
+ // ...
+ {
+ "role": "assistant", // call the function via assistant
+ "tool_calls": [
+ {
+ "id": "...", // required only for mistral
+ "type": "function",
+ "function": {
+ "name": "...",
+ "arguments": {
+ "...": "...",
+ }
+ }
+ }
+ ]
+ },
+ {
+ "role": "tool",
+ "tool_call_id": "...", // required only for mistral
+ "name": "...",
+ "content": "..."
+ },
+ ],
+}Example config for Llama4:
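A minimal sketch, assuming the template is registered under the name `llama4` and the conversations live in a standard `messages` column:

```yaml
chat_template: llama4
datasets:
  - path: data.jsonl
    type: chat_template
    field_messages: messages
```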
data.jsonl

```json
{
+ "conversations": [
+ {"from": "system", "value": "You are an AI assistant.", "train": false},
+ {"from": "human", "value": "Hello", "train": false},
+ {"from": "assistant", "value": "Hello", "train": true},
+ {"from": "human", "value": "How are you?", "train": true},
+ {
+ "from": "assistant",
+ "value": "I'm doing very well, thank you!",
+ "train_detail": [
+ {"begin_offset": 0, "end_offset": 8, "train": false},
+ {"begin_offset": 9, "end_offset": 18, "train": true},
+ {"begin_offset": 19, "end_offset": 30, "train": false},
+ ],
+ },
+ {
+ "from": "human",
+ "value": "I'm doing very well, thank you!",
+ "train": true,
+ },
+ {"from": "assistant", "value": "Hi there!", "train": true}
+ ]
+}The configuration would look like:
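A sketch of the relevant dataset options, assuming the field names from the example above (`conversations`, `from`, `value`, `train`, `train_detail`):

```yaml
datasets:
  - path: data.jsonl
    type: chat_template
    field_messages: conversations
    message_field_role: from
    message_field_content: value
    roles_to_train: []
    train_on_eos: turn
    message_field_training: train
    message_field_training_detail: train_detail
```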
Instead of using character offsets with `train_detail`, you can split a message's content into a list of parts, each with its own training flag. This is useful when you want to mask specific sections of a response (e.g., mask reasoning but train on the answer).
data.jsonl
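A sketch with assumed values, masking the first part of the response and training on the answer:

```json
{
  "messages": [
    {"role": "user", "content": "What is 2+2?"},
    {
      "role": "assistant",
      "content": [
        {"type": "text", "text": "Let me think.", "train": false},
        {"type": "text", "text": " The answer is 4.", "train": true}
      ]
    }
  ]
}
```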
The configuration is the same as for a standard `chat_template` dataset — no extra fields are needed:
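For instance, a minimal sketch:

```yaml
datasets:
  - path: data.jsonl
    type: chat_template
```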
Each content part supports:
- `type`: `"text"` (required)
- `text`: the text value (also accepts `content` or `value` as the key)
- `train`: true/false (optional) — whether to train on this part
- `weight`: 0/1 (optional) — alternative to `train`

If a part has no `train` or `weight` flag, it inherits the turn-level training decision (from `roles_to_train`, `message_field_training`, or `train_on_inputs`).
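A sketch combining these options: the first part carries no flag and inherits the turn-level decision, while the second uses `weight` instead of `train`:

```json
"content": [
  {"type": "text", "text": "Sure."},
  {"type": "text", "text": " Here is the answer.", "weight": 1}
]
```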
BPE tokenizers (used by Llama, Qwen, Mistral, GPT, etc.) prepend spaces to word tokens. For example, " answer" is a single token — the space is part of it. This means where you place whitespace between content parts matters:
Split BEFORE spaces (space goes with the next part):
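An illustrative sketch (the strings are assumptions):

```json
// good: the leading space travels with the trained part
"content": [
  {"type": "text", "text": "Let me think.", "train": false},
  {"type": "text", "text": " The answer is 4.", "train": true}
]
```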
DON’T put trailing spaces on a part (the space merges with the next word into one token that straddles the boundary, and straddling tokens are masked):
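The same sketch with the space misplaced:

```json
// bad: the trailing space on the masked part merges with "The"
"content": [
  {"type": "text", "text": "Let me think. ", "train": false},
  {"type": "text", "text": "The answer is 4.", "train": true}
]
```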
+In the bad example, " The" becomes a single token that spans both parts. Because it straddles the boundary, it is conservatively masked (not trained) — even though the second part has train: true.
Newlines typically merge with preceding punctuation (e.g., ":\n" is one token). Keep newlines with the preceding part:
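For instance (values assumed):

```json
// the ":\n" stays at the end of the first part
"content": [
  {"type": "text", "text": "Reasoning:\n", "train": false},
  {"type": "text", "text": "The answer is 4.", "train": true}
]
```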
Axolotl will log a warning if it detects trailing whitespace at a boundary between parts with different training flags.
When all content parts in a message are strings, they are concatenated before being passed to the chat template. This means content parts work with any Jinja template — the template sees a plain string, and the per-part training flags are applied during tokenization.
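So for the earlier sketch, the template would effectively receive:

```json
{"role": "assistant", "content": "Let me think. The answer is 4."}
```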
For templates that support a separate `reasoning_content` field (e.g., qwen3), the same content-parts format works on `reasoning_content`. This is useful for masking incorrect reasoning steps while training on self-corrections:
data.jsonl
```json
{
  "messages": [
    {"role": "user", "content": [{"type": "text", "text": "What is 2+2?"}]},
    {
      "role": "assistant",
      "reasoning_content": [
        {"type": "text", "text": "Hmm maybe 2+2=5.", "train": false},
        {"type": "text", "text": " Wait no, 2+2=4.", "train": true}
      ],
      "content": [
        {"type": "text", "text": "The answer is 4.", "train": true}
      ]
    }
  ]
}
```

The `reasoning_content` and `content` fields are handled independently — each has its own token boundaries and per-part masking. No additional configuration is needed beyond what the template already requires.
When reasoning_content is provided as a separate field, split_thinking is not needed — the reasoning is already separated from the content in the data.
The same whitespace rules apply to reasoning_content parts as to content parts — split before spaces, keep newlines with the preceding part.
`split_thinking`: (for the Qwen3 template only) enables reasoning split, where the reasoning is split from the content and passed into the template as a separate field.
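A sketch of enabling it, assuming the option sits at the dataset level:

```yaml
datasets:
  - path: data.jsonl
    type: chat_template
    split_thinking: true
```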
For example, the content can look like:
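```json
// an illustrative sketch, assuming Qwen3-style <think> delimiters
{"role": "assistant", "content": "<think>The user asks for 2+2, which is 4.</think>The answer is 4."}
```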
After the split, it will look like:
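```json
// the same turn after splitting (sketch)
{"role": "assistant", "reasoning_content": "The user asks for 2+2, which is 4.", "content": "The answer is 4."}
```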
data.jsonl