NanoCode012
|
372f664c63
|
feat: cleanup old flex mask patch, suppress Matmul bnb warn, and misc (#3330) [skip-ci]
* feat: add pos id to flex attention for packing part 1
* feat: update to include sliding window mask patch
* fix: suppress MatMul8bitLt: inputs will be cast from warnings
* fix: remove redundant flex attention patch
* chore: update olmo docs
* feat: add validator patch for cross entropy
|
2025-12-25 17:56:20 +07:00 |
|