fix: force train split for json,csv,txt for test_datasets and misc doc changes (#3226)

* fix: force train split for json,csv,txt for test_datasets
* feat(doc): add info on mixing datasets for VLM
* feat(doc): max memory
* fix(doc): clarify lr groups
* fix: add info on vision not being dropped
* feat: add qwen3-vl to multimodal docs
* fix: add moe blocks to arch list
* feat(doc): improve mistral docs
* chore: add helpful link [skip-e2e]
* fix: add vram usage for mistral small
* Update link in docs/faq.qmd

Co-authored-by: salman <salman.mohammadi@outlook.com>

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>
Co-authored-by: salman <salman.mohammadi@outlook.com>
2025-10-23 05:23:20 +07:00
parent 3750fdcf79
commit 243620394a
9 changed files with 88 additions and 4 deletions
--- a/examples/magistral/think/README.md
+++ b/examples/magistral/think/README.md
@@ -12,7 +12,7 @@ Before starting, ensure you have:
 Run the thinking model fine-tuning:

 ```bash
-axolotl train magistral-small-think-qlora.yaml
+axolotl train examples/magistral/think/magistral-small-think-qlora.yaml
 ```

 This config uses about 19.1 GiB VRAM.
--- a/examples/magistral/vision/README.md
+++ b/examples/magistral/vision/README.md
@@ -21,7 +21,7 @@ Before starting, ensure you have:

 3. Run the fine-tuning:
   ```bash
-   axolotl train magistral-small-vision-24B-qlora.yml
+   axolotl train examples/magistral/vision/magistral-small-vision-24B-qlora.yml
   ```

 This config uses about 17GiB VRAM.