Wing Lian
|
50f2b94d50
|
add 120b and deepspeed zero3 examples (#3035) [skip ci]
* add 120b and deepspeed zero3 examples
* add a bit of flavor and cleanup gpt oss readme
* fix: remove expert vram usage
* fix: remove redundant EOS token from eot_tokens
* feat: add 120B to docs
---------
Co-authored-by: NanoCode012 <nano@axolotl.ai>
|
2025-08-08 08:04:56 -04:00 |
|