feat: support training with JSON string tool arguments (#3136)
* feat: support training with JSON string tool arguments; fix PyArrow data type inconsistent error * feat: raise error for tool call arguments decode * Add test_chat_templates_tool_call_string_arguments.py Add test for string arguments * fix: change to correct qwen3 tokenizer * fix: update docs to clarify arguments json * chore: lint * fix: duplicate * chore: revert * feat: add error to faq * fix: remove duplicate fixture --------- Co-authored-by: caoqinping <caoqinping@lixiang.com> Co-authored-by: gamersover-blog <1611885128@qq.com> Co-authored-by: NanoCode012 <nano@axolotl.ai>
This commit is contained in:
@@ -212,6 +212,14 @@ Instead of passing `tools` via the system prompt, an alternative method would be
|
||||
Tools need to follow [JSON schema](https://json-schema.org/learn/getting-started-step-by-step).
|
||||
:::
|
||||
|
||||
::: {.callout-warning}
|
||||
If you have tool arguments with same name but different dtypes (like `"time": string` and `"time": number`), please save `arguments: ` as JSON string to prevent `datasets` from having casting issues.
|
||||
|
||||
```
|
||||
"arguments": "{\"...\": \"...\"}"
|
||||
```
|
||||
:::
|
||||
|
||||
Example config for Llama4:
|
||||
```yaml
|
||||
chat_template: llama4
|
||||
|
||||
Reference in New Issue
Block a user