5 Commits

Author SHA1 Message Date
Wing Lian
434a484fe9 update doc snippets + reject gemma4-hybrid with non-FA2 backend 2026-04-23 22:27:01 +00:00
Wing Lian
2d64d009d8 expand attention tests + rewrite docs 2026-04-23 22:27:01 +00:00
Wing Lian
2579c496d5 make attn_implementation the single source of truth 2026-04-23 22:27:01 +00:00
Wing Lian
ff5d6393c8 replace legacy attention boolean flags with capability properties
Replace legacy boolean-flag checks with capability-based properties derived from attn_implementation

This separates three concerns that were conflated under flash_attention:
1. Backend selection -> attn_implementation enum
2. Packing capability -> attn_supports_packing property
3. Flash-attn library dependency -> attn_uses_flash_lib property
2026-04-23 22:27:01 +00:00
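The three-way split described in ff5d6393c8 could look roughly like the sketch below. Only the names attn_implementation, attn_supports_packing, and attn_uses_flash_lib come from the commit message. The class names, the enum members, and the choice of which backends count as packing-capable are assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass
from enum import Enum


class AttnImplementation(str, Enum):
    # Hypothetical members; the commit does not list the real enum values.
    EAGER = "eager"
    SDPA = "sdpa"
    FLASH_ATTENTION_2 = "flash_attention_2"
    FLEX_ATTENTION = "flex_attention"


# Assumed capability tables, for illustration only.
_SUPPORTS_PACKING = {
    AttnImplementation.FLASH_ATTENTION_2,
    AttnImplementation.FLEX_ATTENTION,
}
_USES_FLASH_LIB = {AttnImplementation.FLASH_ATTENTION_2}


@dataclass
class AttentionConfig:
    # 1. Backend selection: a single enum field is the source of truth.
    attn_implementation: AttnImplementation = AttnImplementation.SDPA

    # 2. Packing capability: derived from the backend, never set directly.
    @property
    def attn_supports_packing(self) -> bool:
        return self.attn_implementation in _SUPPORTS_PACKING

    # 3. Flash-attn library dependency: also derived from the backend.
    @property
    def attn_uses_flash_lib(self) -> bool:
        return self.attn_implementation in _USES_FLASH_LIB
```

Because both properties are computed from the one field, callers can ask "does this backend support packing?" without knowing which backend is selected, and no boolean flag can drift out of sync with the chosen implementation.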
Wing Lian
aee8c75d64 refactor attention handling 2026-04-23 22:27:01 +00:00