fix token state json and mistral tokenizer issue (#3522) [skip ci]

* fix token state json and mistral tokenizer issue

* centralize constants

* forgot to commit constants file

* Fix weakref in pickling relora state dict

* make curl a bit quieter so it doesn't log 2K lines

* fix path traversal for olmoe test

* more test fixes that weren't flagged previously

* chore: lint

* skip tests that fail b/c of OutOfResources

* scattermoe as slow tests

* update fbgemm-genai for torch 2.10
This commit is contained in:
Wing Lian
2026-03-21 22:46:10 -04:00
committed by GitHub
parent 2c05847a5f
commit 0ee98a0309
22 changed files with 249 additions and 57 deletions

View File

@@ -14,6 +14,9 @@ from axolotl.utils.dict import DictDefault
from tests.hf_offline_utils import enable_hf_offline
@pytest.mark.skip(
reason="DeepSeek-V3-11M remote model code needs _supports_flash_attn=True for newer transformers"
)
class TestDeepseekV3:
"""
Test case for DeepseekV3 models