fix ddp/fsdp w gemma4 (#3584)

* fix ddp/fsdp w gemma4

* address pr comments

* activation offloading fix and update agent docs for gemma4
Wing Lian
2026-04-09 20:02:36 -07:00
committed by GitHub
parent 7daf7d96f1
commit 4ef608dda3
9 changed files with 398 additions and 2 deletions

@@ -38,6 +38,8 @@ Agent-specific references:
- [docs/agents/grpo.md](docs/agents/grpo.md) — GRPO online RL with reward functions
- [docs/agents/reward_modelling.md](docs/agents/reward_modelling.md) — outcome and process reward models
- [docs/agents/pretraining.md](docs/agents/pretraining.md) — continual pretraining
- [docs/agents/model_architectures.md](docs/agents/model_architectures.md) — model-specific quirks (Gemma4, Qwen3.5 MoE, etc.)
- [docs/agents/new_model_support.md](docs/agents/new_model_support.md) — debugging and adding support for new model architectures

## Config Pattern