Feat: add Magistral Small 2509 and native mistral3 tokenizer support (#3165)

* feat: update mistral common

* feat: add mistral3processor

* fix: loading

* fix: cast pixel_values to fp32

* fix: image tensor conversion

* feat: add FA2 support for pixtral based models

* fix: update mistral small 3.1 to use native tokenizer

* fix: install tips

* fix: improve info on sample dataset files

* chore: move mistral configs into subfolders

* fix: remove unneeded patch

* fix: indent

* feat: add integration tests

* chore: move

* feat: add magistral 2509 docs and example

* fix: convert tensor to bool

* feat: expand tests

* chore: move tests
This commit is contained in:
NanoCode012
2025-09-18 15:42:20 +07:00
committed by GitHub
parent 4065bc14c6
commit 09959fac70
32 changed files with 757 additions and 39 deletions

View File

@@ -45,8 +45,7 @@ tf32: true
gradient_checkpointing: true
logging_steps: 1
# flash_attention: # PixtralVisionModel does not support Flash Attention 2.0 yet
sdp_attention: true
flash_attention: true
warmup_ratio: 0.1
evals_per_epoch: 1