* Allow using tokenizer's default chat template with fallbacks
Summary of changes:
1. Adds `tokenizer_default` as option for `chat_template` in
`chat_template` prompt strategy that allows using the chat template
from tokenizer's config.json
2. Allows falling back to chat templates available in axolotl if
tokenizer does not have a chat template
3. Adds a mistral chat template which supports system message - taken
from https://github.com/chujiezheng/chat_templates/blob/main/chat_templates/mistral-instruct.jinja
---
Why?
Many popular models are not trained with chatml format. As a result for
the model to correctly learn chatml we have to turn on train_on_inputs
which requires more compute and time. If we can use the model's already
learned chat template we can just learn the output tokens
---
Todo:
- Write tests
* Add tests
* Fix lint and bug post merge from main
* Add option `chat_template_jinja` to provide a jinja template
* remove custom mistral template
* Address review comments and add docs
* Update docs/dataset-formats/conversation.qmd
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* fix: set default to tokenizer template
* Merge branch 'main' into cj_tokenizer_default_prompt_template
* chore: remove redundant function
* fix: re-arrange enum declaration position
* fix: refactor artifact left from main merge
* feat(doc): updated config with chat template options and clarified examples
* chore: clarify doc
* chore: added example for non-default template
* chore: refactor
* fix: test
* fix: config being dropped and unittest to catch that
* chore: lint
* chore: skip duplicate
* fix: rename var after merge
* feat: add test for levy's dpo case
* fix: remove default setting on edge case where chat template overriden in dataset section
* feat: handle sharegpt deprecation better in docs
* feat: add example using fallback
* feat: handles chat_template requiring specific user/assistant order
* fix: update test based on new defaults
* fix: imported name incorrectly updated on merge
* chore: lint
* fix: update dummy message to prevent potential overlap with real content
* fix(doc): formatting
* fix: update bradleyterry to use new chat_template
---------
Co-authored-by: Chirag Jain <jain.chirag925@gmail.com>