add kernels for gpt oss models (#3020)

* add kernels for gpt oss models

* add support for gpt-oss

* typo incorrect package

* fix: layout for configs and added wandb/epochs

* add gptoss example w offload and set moe leaf for z3

* add support for Mxfp4Config from yaml

* update yaml to use official model

* fix lora and don't allow triton to go above 3.3.1

* fix lr and tweak vram use

* fix range for triton since pinned wasn't compatible with torch 2.6.0

* update cce with gpt oss patches

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
Wing Lian
2025-08-06 09:47:55 -04:00
committed by GitHub
parent 97e86c6d47
commit ba3dba3e4f
15 changed files with 257 additions and 11 deletions

@@ -2,7 +2,8 @@
# START section of dependencies that don't install on Darwin/MacOS
bitsandbytes==0.46.1
-triton>=3.0.0
+# triton 3.4.0 is not compatible with CCE
+triton>=3.0.0,<3.4.0
mamba-ssm==1.2.0.post1
xformers>=0.0.23.post1
autoawq==0.2.7.post3
@@ -20,6 +21,7 @@ datasets==4.0.0
deepspeed>=0.17.0
trl==0.20.0
hf_xet==1.1.5
+kernels==0.9.0
optimum==1.16.2
hf_transfer