feat: add sageattention (#2823) [skip ci]

* feat: add sageattention

* feat: call path on pre model load

* fix: patch to use register to correct var

* fix: add strict check import at start

* chore: fix comments

* chore: refactor

* feat: add capability check

* fix: missed underscore

* fix: let sageattention use FA backend in transformers

* feat: update sage attention for attention mask and position ids

* feat: allow sample packing but add warning without packing

* fix: loss hitting 0 with packing and attention mask note

* feat: downcast embeds if sage attention too

* feat: add config validation

* feat: add attention docs

* chore: docs
NanoCode012
2026-02-10 17:49:21 +07:00
committed by GitHub
parent 97a4f28511
commit fcc4cfdb63
7 changed files with 416 additions and 3 deletions

@@ -320,6 +320,7 @@ website:
 - docs/multipack.qmd
 - docs/mixed_precision.qmd
 - docs/optimizers.qmd
+- docs/attention.qmd
 - section: "Advanced Features"
 contents: