From 46a045e52807a4108b21043b6041b88be2f19912 Mon Sep 17 00:00:00 2001
From: NanoCode012
Date: Mon, 10 Mar 2025 16:25:50 +0700
Subject: [PATCH] chore(doc): add faq when having no default chat_template
 (#2398)

* chore(doc): add faq when having no default chat_template

* Update docs/dataset-formats/conversation.qmd

Co-authored-by: salman

* Update docs/faq.qmd

Co-authored-by: salman

---------

Co-authored-by: salman
---
 docs/dataset-formats/conversation.qmd | 4 ++++
 docs/faq.qmd                          | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/docs/dataset-formats/conversation.qmd b/docs/dataset-formats/conversation.qmd
index 8ce95b7b0..81c902afd 100644
--- a/docs/dataset-formats/conversation.qmd
+++ b/docs/dataset-formats/conversation.qmd
@@ -74,6 +74,10 @@ datasets:
     train_on_eos:
 ```

+::: {.callout-tip}
+If you receive an error like "`chat_template` choice is `tokenizer_default` but tokenizer's `chat_template` is null.", it means the tokenizer does not have a default `chat_template`. Follow the examples below instead to set a custom `chat_template`.
+:::
+
 2. Using the `gemma` chat template to override the tokenizer_config.json's chat template on OpenAI messages format, training on all assistant messages.

 ```yaml
diff --git a/docs/faq.qmd b/docs/faq.qmd
index 7e5dd3d74..ba7ac1265 100644
--- a/docs/faq.qmd
+++ b/docs/faq.qmd
@@ -52,3 +52,7 @@ description: Frequently asked questions
 **Q: The EOS/EOT token is incorrectly being masked or not being masked.**

 > A: This is because of the mismatch between `tokenizer.eos_token` and EOS/EOT token in template. Please make sure to set `eos_token` under `special_tokens` to the same EOS/EOT token as in template. 

+**Q: "`chat_template` choice is `tokenizer_default` but tokenizer's `chat_template` is null. Please add a `chat_template` in tokenizer config"**
+
+> A: This is because the tokenizer does not have a chat template. Please add a chat template in the tokenizer config. See [chat_template](dataset-formats/conversation.qmd#chat-template) for more details.
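
The fix the patch points readers toward can be sketched as a config change: instead of relying on the tokenizer's (missing) default template, name an explicit one. The snippet below is a minimal, hypothetical sketch in the style of the `conversation.qmd` examples the patch references; the dataset path and field names are illustrative assumptions, not part of this patch.

```yaml
# Sketch only: select an explicit chat template (here `gemma`, as in the
# docs example) so Axolotl does not fall back to a null tokenizer default.
chat_template: gemma

datasets:
  - path: org/example-messages-dataset   # hypothetical dataset path
    type: chat_template
    field_messages: messages             # key holding the OpenAI-style turns
```

With `chat_template` set explicitly, the "`chat_template` choice is `tokenizer_default` but tokenizer's `chat_template` is null" error described in the FAQ entry should no longer be triggered.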