Llama 4 by Meta AI
Available Examples
Llama 4 Scout 17B x 16 Experts (109B)
Our single-H100 implementation for Llama 4 Scout uses only 68.5 GB of VRAM for post-training with a 4k context length, at 546 tokens/second. WandB logs here.
Llama 4 Maverick 17B x 128 Experts (400B)
Our 4x H100 implementation for Llama 4 Maverick uses 79.5 GB of VRAM per GPU for post-training with a 4k context length, at 206 tokens/second. WandB logs here.
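As a rough planning aid (a sketch using only the throughput and VRAM figures quoted above, not part of the implementation itself), one can estimate total VRAM and wall-clock time for a given token budget:

```python
# Figures quoted above for the two example configurations.
configs = {
    "Llama 4 Scout (1x H100)":    {"vram_gb_per_gpu": 68.5, "gpus": 1, "tok_per_s": 546},
    "Llama 4 Maverick (4x H100)": {"vram_gb_per_gpu": 79.5, "gpus": 4, "tok_per_s": 206},
}

def hours_for_tokens(tok_per_s: float, n_tokens: int) -> float:
    """Wall-clock hours to push n_tokens through at the quoted throughput."""
    return n_tokens / tok_per_s / 3600

for name, c in configs.items():
    total_vram = c["vram_gb_per_gpu"] * c["gpus"]
    hrs = hours_for_tokens(c["tok_per_s"], 10_000_000)  # e.g. a 10M-token run
    print(f"{name}: {total_vram:.1f} GB total VRAM, ~{hrs:.1f} h per 10M tokens")
```

Actual wall-clock time will vary with batch shape and logging overhead; the quoted tokens/second figures are the only inputs assumed here.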