update doc snippets + reject gemma4-hybrid with non-FA2 backend

Wing Lian
2026-04-23 22:18:02 +00:00
parent 39226623d2
commit 434a484fe9
10 changed files with 47 additions and 27 deletions

@@ -55,7 +55,7 @@ To use sequence parallelism, you need:
## Limitations
-- Flash attention must be enabled for this to work (`flash_attention: true` in config YAML)
+- Flash attention must be enabled for this to work (`attn_implementation: flash_attention_2` in config YAML)
- May have a small performance overhead due to communication between GPUs
## Example