From 756a0559c1bdadb8c833d9de13e728b616972a56 Mon Sep 17 00:00:00 2001
From: NanoCode012
Date: Fri, 11 Apr 2025 20:52:43 +0700
Subject: [PATCH] feat(doc): explain deepspeed configs (#2514) [skip ci]

* feat(doc): explain deepspeed configs

* fix: add fetch configs
---
 docs/multi-gpu.qmd | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/docs/multi-gpu.qmd b/docs/multi-gpu.qmd
index 5aec89763..55eaca6c3 100644
--- a/docs/multi-gpu.qmd
+++ b/docs/multi-gpu.qmd
@@ -36,6 +36,9 @@ deepspeed: deepspeed_configs/zero1.json
 ### Usage {#sec-deepspeed-usage}
 
 ```{.bash}
+# Fetch deepspeed configs (if not already present)
+axolotl fetch deepspeed_configs
+
 # Passing arg via config
 axolotl train config.yml
 
@@ -48,10 +51,20 @@ axolotl train config.yml --deepspeed deepspeed_configs/zero1.json
 We provide default configurations for:
 
 - ZeRO Stage 1 (`zero1.json`)
+- ZeRO Stage 1 with torch compile (`zero1_torch_compile.json`)
 - ZeRO Stage 2 (`zero2.json`)
 - ZeRO Stage 3 (`zero3.json`)
+- ZeRO Stage 3 with bf16 (`zero3_bf16.json`)
+- ZeRO Stage 3 with bf16 and CPU offload params(`zero3_bf16_cpuoffload_params.json`)
+- ZeRO Stage 3 with bf16 and CPU offload params and optimizer (`zero3_bf16_cpuoffload_all.json`)
 
-Choose based on your memory requirements and performance needs.
+::: {.callout-tip}
+
+Choose the configuration that offloads the least amount to memory while still being able to fit on VRAM for best performance.
+
+Start from Stage 1 -> Stage 2 -> Stage 3.
+
+:::
 
 ## FSDP {#sec-fsdp}
 