axolotl/docs
NanoCode012 fcc4cfdb63 feat: add sageattention (#2823) [skip ci]
* feat: add sageattention

* feat: call path on pre model load

* fix: patch to use register to correct var

* fix: add strict check import at start

* chore: fix comments

* chore: refactor

* feat: add capability check

* fix: missed underscore

* fix: let sageattention use FA backend in transformers

* feat: update sage attention for attention mask and position ids

* feat: allow sample packing but add warning without packing

* fix: loss hitting 0 with packing and attention mask note

* feat: downcast embeds if sage attention too

* feat: add config validation

* feat: add attention docs

* chore: docs
2026-02-10 17:49:21 +07:00