# Llama 4 by Meta AI

## Available Examples

### Llama 4 Scout 17Bx16Experts (109B)

Our single-H100 implementation for Llama 4 Scout uses only 68.5 GB of VRAM for post-training with a 4k context length at 546 tokens/second. WandB logs here.
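As a rough usage sketch, the Scout recipe can be launched with the axolotl CLI; the filename `scout-qlora-single-h100.yaml` below is a placeholder for whichever Scout config ships in this directory:

```bash
# Single-GPU launch (placeholder config name; substitute the Scout YAML from this directory)
axolotl train examples/llama-4/scout-qlora-single-h100.yaml

# Equivalent module-style invocation via accelerate
accelerate launch -m axolotl.cli.train examples/llama-4/scout-qlora-single-h100.yaml
```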

### Llama 4 Maverick 17Bx128Experts (400B)

Our 4xH100 implementation for Llama 4 Maverick uses 79.5 GB of VRAM per GPU for post-training with a 4k context length at 206 tokens/second. WandB logs here.
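For the 4-GPU Maverick recipe, a distributed launch sketch along the same lines (again, the config filename is a placeholder for the Maverick YAML in this directory):

```bash
# Multi-GPU launch across 4 H100s (placeholder config name)
accelerate launch --num_processes 4 -m axolotl.cli.train examples/llama-4/maverick-qlora.yaml
```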