|
bb622b83de
|
super nemo support (#3508)
* nemo support
* config
* rename, config
* nemotron packing
* config fix
* readme + configs
* gc compat bug
* config changes for qwen and pad token nemo
* patch nemotron_h weight renaming so it doesn't get reversed to embedding (singular noun) on checkpoint save
* lint
* revert qwen3.5 config changes, not needed in this pr
* lint
* Update examples/nemotron-h/120b-a12b-qlora.yaml
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* Update examples/nemotron-h/nano-30b-a3b-qlora.yaml
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* readme + validation
* lazy load comment
* Update examples/nemotron-h/120b-a12b-qlora.yaml
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
* val fix
* add nemo to multi packing
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>
Co-authored-by: NanoCode012 <kevinvong@rocketmail.com>
|
2026-03-30 18:12:50 -04:00 |
|