add kernels for gpt oss models (#3020)

* add kernels for gpt oss models
* add support for gpt-oss
* typo incorrect package
* fix: layout for configs and added wandb/epochs
* add gptoss example w/ offload and set moe leaf for z3
* add support for Mxfp4Config from yaml
* update yaml to use official model
* fix lora and don't allow triton to go above 3.3.1
* fix lr and tweak vram use
* fix range for triton since pinned wasn't compatible with torch 2.6.0
* update cce with gpt oss patches

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
2025-08-06 09:47:55 -04:00
parent 97e86c6d47
commit ba3dba3e4f
15 changed files with 257 additions and 11 deletions
--- a/requirements.txt
+++ b/requirements.txt
@@ -2,7 +2,8 @@

 # START section of dependencies that don't install on Darwin/MacOS
 bitsandbytes==0.46.1
-triton>=3.0.0
+# triton 3.4.0 is not compatible with CCE
+triton>=3.0.0,<3.4.0
 mamba-ssm==1.2.0.post1
 xformers>=0.0.23.post1
 autoawq==0.2.7.post3
@@ -20,6 +21,7 @@ datasets==4.0.0
 deepspeed>=0.17.0
 trl==0.20.0
 hf_xet==1.1.5
+kernels==0.9.0

 optimum==1.16.2
 hf_transfer
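
Reviewer note on the triton range above: a minimal sketch of which versions the new specifier `>=3.0.0,<3.4.0` admits. The helper names `parse_version` and `in_triton_range` are hypothetical and not part of this commit. The parsing is deliberately simplified: it handles plain numeric release segments only, not full PEP 440 versions with pre-release or local tags.

```python
def parse_version(text):
    """Turn 'X.Y.Z' into a 3-tuple of ints (simplified, numeric segments only)."""
    parts = [int(piece) for piece in text.split(".")]
    # Pad short versions such as '3.0' so tuple comparison behaves as expected.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def in_triton_range(version, lower="3.0.0", upper="3.4.0"):
    """Return True if version satisfies lower <= version < upper."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


if __name__ == "__main__":
    # 3.3.1 is the highest version the commit message intends to allow;
    # 3.4.0 is excluded by the exclusive upper bound.
    for candidate in ("2.3.1", "3.0.0", "3.3.1", "3.4.0"):
        print(candidate, in_triton_range(candidate))
```

An exclusive upper bound rather than an exact pin matches the commit message: an exact pin was incompatible with torch 2.6.0, while the range leaves the resolver free to pick any triton release below 3.4.0.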