# OpenAI's GPT-OSS
GPT-OSS is a 20-billion-parameter mixture-of-experts (MoE) model trained by OpenAI, released in August 2025.
- 20B Full Parameter SFT can be trained on 8x48GB GPUs (peak reserved memory @ ~36GiB/GPU) - YAML
- 20B LoRA SFT (all linear layers, and the experts in the last two layers) can be trained on a single GPU (peak reserved memory @ ~47GiB); see the sketch after this list
  - removing the experts from `lora_target_parameters` will allow the model to fit in around ~44GiB of VRAM - YAML
- 20B Full Parameter SFT with FSDP2 offloading can be trained on 2x24GB GPUs (peak reserved memory @ ~21GiB/GPU) - YAML (offload settings sketched below)
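
For the single-GPU LoRA run, the interesting knobs are the adapter settings and `lora_target_parameters`, which extends LoRA beyond the linear layers to the MoE expert weights. Below is a minimal sketch assuming Axolotl's config schema: the base model ID and the `lora_target_linear` / `lora_target_parameters` keys match the options referenced above, but the rank, alpha, batch settings, and exact expert parameter paths are illustrative (gpt-oss-20b has 24 decoder layers, so the last two are indices 22 and 23), not copied from the shipped YAML.

```yaml
# Minimal single-GPU LoRA sketch (illustrative values, not the shipped config)
base_model: openai/gpt-oss-20b

adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.0
lora_target_linear: true  # apply LoRA to all linear layers

# Also adapt the MoE expert weights in the last two decoder layers
# (indices 22 and 23 of 24). Removing this block is the ~44GiB variant
# mentioned above. Parameter paths are assumptions for illustration.
lora_target_parameters:
  - model.layers.22.mlp.experts.gate_up_proj
  - model.layers.22.mlp.experts.down_proj
  - model.layers.23.mlp.experts.gate_up_proj
  - model.layers.23.mlp.experts.down_proj

micro_batch_size: 1
gradient_accumulation_steps: 4
sequence_len: 4096
bf16: true
```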
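
The 2x24GB full-parameter run depends on FSDP2 with parameters offloaded to CPU. Here is a sketch of the relevant section, assuming Axolotl's FSDP2 schema (`fsdp_version: 2` with a nested `fsdp_config`); the layer class name `GptOssDecoderLayer` follows the transformers gpt-oss implementation, and the remaining keys and values should be treated as illustrative rather than verbatim.

```yaml
# FSDP2 offload sketch for 2x24GB GPUs (illustrative, not the shipped config)
fsdp_version: 2
fsdp_config:
  offload_params: true              # push parameters to CPU so 24GB cards fit
  cpu_ram_efficient_loading: true   # load weights on one rank, then shard
  auto_wrap_policy: TRANSFORMER_BASED_WRAP
  transformer_layer_cls_to_wrap: GptOssDecoderLayer
  reshard_after_forward: true       # free full params between forward/backward
  state_dict_type: FULL_STATE_DICT
```

With a config like this, a two-GPU run would launch via `accelerate launch -m axolotl.cli.train gpt-oss-20b-fsdp2.yaml` (filename hypothetical); the offload trades step time for the ~21GiB/GPU peak quoted above.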