llama4 support (#2493)

* llama4 support

* add xet support [skip ci]

* be flexible on transformers version and skip test on version

* don't use deepspeed for the fix_untrained_tokens test

* reordering to trigger torch 2.6.0 tests first

* slightly smaller train set

* use 4.51.0 for now

* remove stray print, add llama4 chat template to schema, bump peft to 0.15.1

* patches to make llama4 performant

* add preliminary fp8 support
Wing Lian
2025-04-07 10:49:15 -04:00
committed by GitHub
parent 5f4af3665d
commit 8bbad21bfd
17 changed files with 409 additions and 34 deletions

@@ -13,6 +13,7 @@ from axolotl.monkeypatch.utils import get_unpad_data
 SUPPORTED_MULTIPACK_MODEL_TYPES = [
     "mllama_text_model",
     "llama",
+    "llama4",
     "mistral",
     "mixtral",
     "qwen2",