Fix: adding magistral fsdp config, fixing not eval with test_datasets, handle mllama attention (#2789) [skip ci]

* feat: add fsdp config for magistral

* fix: add mllama self attention handling for lora kernels

* fix: no eval if val_set_size 0 despite having test_datasets

* fix: add note for cce for vlm in newer model
This commit is contained in:
NanoCode012
2025-06-14 11:53:43 -07:00
committed by GitHub
parent a3c82e8cbb
commit 80d5b066ec
4 changed files with 87 additions and 2 deletions

View File

@@ -24,6 +24,14 @@ pip3 uninstall -y cut-cross-entropy && pip3 install "cut-cross-entropy[transform
## Usage
**NOTE**: If you are training a VLM model, please use older version of Axolotl as upstream has applied a major VLM refactor, and our patches have not been updated yet.
```bash
git checkout 787880215b3ab32ccaf81c1b2e9588c6f3e6e764
pip3 install --no-build-isolation -e .
```
```yaml
plugins:
- axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin