From a3c82e8cbbe4f8285a7ebb40891cd9289b087ece Mon Sep 17 00:00:00 2001
From: NanoCode012 <nano@axolotl.ai>
Date: Fri, 13 Jun 2025 12:03:47 -0700
Subject: [PATCH] fix: grpo doc link (#2788) [skip ci]

---
 docs/rlhf.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/rlhf.qmd b/docs/rlhf.qmd
index b2687a8f9..0a189e3c1 100644
--- a/docs/rlhf.qmd
+++ b/docs/rlhf.qmd
@@ -500,7 +500,7 @@ The input format is a simple JSON input with customizable fields based on the ab
 ### GRPO
 
 ::: {.callout-tip}
-Check out our [GRPO cookbook](https://github.com/axolotl-ai-cloud/axolotl-cookbook/tree/main/grpo#training-an-r1-style-large-language-model-using-grpo).
+Check out our [GRPO cookbook](https://github.com/axolotl-ai-cloud/grpo_code).
 :::
 
 In the latest GRPO implementation, `vLLM` is used to significantly speedup trajectory generation during training. In this example, we're using 4 GPUs - 2 for training, and 2 for vLLM: