NanoCode012
a098df527b
feat: add Mistral Small 4 (#3502)
* feat: add mistral small 4
* fix: update mistral common
* fix: deepcopy when passing in tokenizer
* feat: add doc on reasoning and thinking section
* fix: don't use custom tokenizer and quantize experts
* chore: update docs and configs
* chore: update doc to follow official name
* feat: update cce to include mistral4
* chore: move
* fix: naming
* fix: test mock breaking get_text_config check
* fix: enable CCE and add expert block targeting to configs
* chore: docs
* fix: use act checkpointing
* chore: doc
* chore: docs
* chore: docs
2026-03-17 09:39:05 +07:00