fix: force train split for json,csv,txt for test_datasets and misc doc changes (#3226)

* fix: force train split for json,csv,txt for test_datasets

* feat(doc): add info on mixing datasets for VLM

* feat(doc): max memory

* fix(doc): clarify lr groups

* fix: add info on vision not being dropped

* feat: add qwen3-vl to multimodal docs

* fix: add moe blocks to arch list

* feat(doc): improve mistral docs

* chore: add helpful link [skip-e2e]

* fix: add vram usage for mistral small

* Update link in docs/faq.qmd

Co-authored-by: salman <salman.mohammadi@outlook.com>

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
Co-authored-by: salman <salman.mohammadi@outlook.com>
This commit is contained in:
NanoCode012
2025-10-23 05:23:20 +07:00
committed by GitHub
parent 3750fdcf79
commit 243620394a
9 changed files with 88 additions and 4 deletions

View File

@@ -12,7 +12,7 @@ Before starting, ensure you have:
Run the thinking model fine-tuning:
```bash
axolotl train magistral-small-think-qlora.yaml
axolotl train examples/magistral/think/magistral-small-think-qlora.yaml
```
This config uses about 19.1 GiB VRAM.

View File

@@ -21,7 +21,7 @@ Before starting, ensure you have:
3. Run the fine-tuning:
```bash
axolotl train magistral-small-vision-24B-qlora.yml
axolotl train examples/magistral/vision/magistral-small-vision-24B-qlora.yml
```
This config uses about 17GiB VRAM.