NanoCode012
|
a098df527b
|
feat: add Mistral Small 4 (#3502)
* feat: add mistral small 4
* fix: update mistral common
* fix: deepcopy when passing in tokenizer
* feat: add doc on reasoning and thinking section
* fix: don't use custom tokenizer and quantize experts
* chore: update docs and configs
* chore: update doc to follow official name
* feat: update CCE to include mistral4
* chore: move
* fix: naming
* fix: test mock breaking get_text_config check
* fix: enable CCE and add expert block targeting to configs
* chore: docs
* fix: use act checkpointing
* chore: doc
* chore: docs
* chore: docs
|
2026-03-17 09:39:05 +07:00 |
|