Commit Graph

4 Commits

Author | SHA1 | Message | Date
Wing Lian | fec0c3a99e | chore: lint | 2026-03-19 07:27:23 +00:00
Wing Lian | 31d8d068bb | handle base+lora split kernel for older moe models | 2026-03-19 07:11:30 +00:00
Wing Lian | 66fea258c7 | add correctness unit tests and benchmarks for scattermoe + lora | 2026-03-19 06:40:04 +00:00
Wing Lian | 163bd4dd5a | use custom triton kernels for entropy from logits and selective softmax (#3510) | 2026-03-19 02:02:43 -04:00
    * use custom triton kernels for entropy from logits and selective softmax
    * PR comments fixes
    * fix out of bounds, include tests, include benchmarks
    * chore: lint