llama4 support (#2493)

* llama4 support * add xet support [skip ci] * be flexible on transformers version and skip test on version * don't use deepspeed for the fix_untrained_tokens test * reordering to trigger torch 2.6.0 tests first * slightly smaller train set * use 4.51.0 for now * remove stray print, add llama4 chat template to schema, bump peft to 0.15.1 * patches to make llama4 performant * add preliminary fp8 support
2025-04-07 10:49:15 -04:00
parent 5f4af3665d
commit 8bbad21bfd
17 changed files with 409 additions and 34 deletions
--- a/src/axolotl/monkeypatch/multipack.py
+++ b/src/axolotl/monkeypatch/multipack.py
@@ -13,6 +13,7 @@ from axolotl.monkeypatch.utils import get_unpad_data
 SUPPORTED_MULTIPACK_MODEL_TYPES = [
    "mllama_text_model",
    "llama",
+    "llama4",
    "mistral",
    "mixtral",
    "qwen2",