axolotl

Author	SHA1	Message	Date
NanoCode012	8c6a6ea6eb	Feat: add devstral model support (#2880) [skip ci] * fix: do not add training and training_detail block by default * fixed: magistral docs * fix: address pad adding new fields and use built-in from_openai * feat: try enable multiprocessing * fix: check for keys before deleting attn_mask * feat: add mistral pad test * feat: add tool calling test * feat: add devstral tokenizer tests * fix: comma format * chore: remove unused support_preprocessing as tokenizer is pickable now * chore: update magistral doc * feat: add devstral readme and example * chore: refactor error handling	2025-07-08 11:01:19 -04:00
NanoCode012	80d5b066ec	Fix: adding magistral fsdp config, fixing not eval with test_datasets, handle mllama attention (#2789) [skip ci] * feat: add fsdp config for magistral * fix: add mllama self attention handling for lora kernels * fix: no eval if val_set_size 0 despite having test_datasets * fix: add note for cce for vlm in newer model	2025-06-14 11:53:43 -07:00
NanoCode012	eac4a61f55	Feat: Add Magistral and mistral-common tokenizer support (#2780)	2025-06-12 19:18:33 -04:00
Dan Saunders	52a0452acb	magistral small placeholder (#2777)	2025-06-10 13:03:41 -04:00