From 740d5a1d31e100833974f8ad2a7891b6c4dc1f9c Mon Sep 17 00:00:00 2001
From: Dan Saunders
Date: Fri, 26 Sep 2025 09:55:15 -0400
Subject: [PATCH] doc fix (#3187)

---
 docs/lora_optims.qmd | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/lora_optims.qmd b/docs/lora_optims.qmd
index 7cdf53975..40893387b 100644
--- a/docs/lora_optims.qmd
+++ b/docs/lora_optims.qmd
@@ -5,10 +5,11 @@ description: "Custom autograd functions and Triton kernels in Axolotl for optimi
 
 Inspired by [Unsloth](https://github.com/unslothai/unsloth), we've implemented two
 optimizations for LoRA and QLoRA fine-tuning, supporting both single GPU and multi-GPU
-(in the DDP and DeepSpeed settings) training. These include (1) SwiGLU and GEGLU activation function
-Triton kernels, and (2) LoRA MLP and attention custom autograd functions. Our goal was
-to leverage operator fusion and tensor re-use in order to improve speed and reduce
-memory usage during the forward and backward passes of these calculations.
+(including the DDP, DeepSpeed, and FSDP2 settings) training. These include (1) SwiGLU
+and GEGLU activation function Triton kernels, and (2) LoRA MLP and attention custom
+autograd functions. Our goal was to leverage operator fusion and tensor re-use in order
+to improve speed and reduce memory usage during the forward and backward passes of
+these calculations.
 
 We currently support several common model architectures, including (but not limited
 to):
@@ -131,6 +132,5 @@ computation path.
 ## Future Work
 
 - Support for additional model architectures
-- Support for the FSDP setting
 - Support for dropout and bias
 - Additional operator fusions
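For context, the page this patch edits (docs/lora_optims.qmd) exposes these optimizations as config flags. Below is a minimal sketch of a config fragment that turns them on; the flag names `lora_mlp_kernel`, `lora_qkv_kernel`, and `lora_o_kernel` are assumptions taken from that page rather than part of this patch, so check the rendered doc for the authoritative list.

```yaml
# Sketch of an Axolotl config fragment enabling the fused LoRA kernels
# alongside a QLoRA adapter. Flag names are assumptions based on the
# page this patch edits (docs/lora_optims.qmd).
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.0   # the patched doc lists dropout/bias support as future work
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj

# Custom autograd + Triton kernel optimizations
lora_mlp_kernel: true   # fused LoRA MLP path (SwiGLU/GEGLU Triton kernels)
lora_qkv_kernel: true   # fused LoRA attention QKV projections
lora_o_kernel: true     # fused LoRA attention output projection
```

Per the wording change in this patch, the same flags apply under DDP, DeepSpeed, and FSDP2 as well as single-GPU training.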