NanoCode012
|
97f1b1758d
|
Feat: add kimi linear support (#3257)
* feat: add custom kimi linear patch [skip ci]
* feat: add configuration file and fix import [skip ci]
* fix: hijack tokenizer temporarily [skip ci]
* chore: remove accidental commit
* fix: attempt patch kimi remote
* fix: kwargs passsed
* fix: device for tensor
* fix: aux loss calculation
* feat: cleaned up patches order
* fix: remove duplicate tokenizer patch
* chore: add debug logs
* chore: add debug logs
* chore: debug
* Revert "chore: add debug logs"
This reverts commit da372a5f67.
* Revert "chore: add debug logs"
This reverts commit 97d1de1d7c.
* fix: KeyError: 'tokenization_kimi'
* fix: support remote_model_id in cce patch
* feat: add config preload patch
* fix: use standard aux loss calc and updated modeling
* fix: import
* feat: add kimi-linear docs and example
* chore: add note about moe kernels
* feat: update cce to include kimi-linear
* chore: lint
* chore: update main readme
* fix: patch mechanism to address comments
* chore: lint
* fix: tests
* chore: cleanup comment
|
2025-12-25 17:53:52 +07:00 |
|