removing deepspeed guard for LoRA Triton kernels

fix(example): align example to correct adapter (#2478)

* fix(example): align example to correct adapter
* fix: add missing load in 4 bit
2025-04-03 16:44:45 +00:00 · 2025-04-03 08:48:14 -04:00 · 2025-04-03 08:47:52 -04:00
4 changed files with 6 additions and 6 deletions
--- a/examples/gemma3/gemma-3-4b-qlora.yml
+++ b/examples/gemma3/gemma-3-4b-qlora.yml
@@ -1,6 +1,8 @@
 base_model: google/gemma-3-4b-it
 strict: false

+load_in_4bit: true
+
 # gemma3 doesn't seem to play nice with ddp
 ddp_find_unused_parameters: true

@@ -17,7 +19,7 @@ dataset_prepared_path: last_run_prepared
 val_set_size: 0.01
 output_dir: ./outputs/out

-adapter: lora
+adapter: qlora
 lora_model_dir:

 sequence_len: 2048
--- a/examples/gemma3/gemma-3-4b-vision-qlora.yml
+++ b/examples/gemma3/gemma-3-4b-vision-qlora.yml
@@ -2,6 +2,8 @@ base_model: google/gemma-3-4b-it
 processor_type: AutoProcessor
 strict: false

+load_in_4bit: true
+
 # these 3 lines are needed for now to handle vision chat templates w images
 skip_prepare_dataset: true
 remove_unused_columns: false
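Review note: the two example changes above pair `adapter: qlora` with `load_in_4bit: true`. A minimal sketch of a consistency check for that pairing follows, assuming the YAML has already been parsed into a plain dict. `check_qlora_config` is a hypothetical helper and not part of axolotl, and the file-name rule is only an illustrative convention inferred from the example names.

```python
from __future__ import annotations


def check_qlora_config(cfg: dict, filename: str = "") -> list[str]:
    """Return a list of problems; an empty list means the keys agree."""
    problems = []
    adapter = cfg.get("adapter")

    # QLoRA trains adapters on top of a 4-bit quantized base model,
    # so the quantized load flag is expected alongside the adapter type.
    if adapter == "qlora" and not cfg.get("load_in_4bit"):
        problems.append("adapter is 'qlora' but load_in_4bit is not true")

    # Assumed convention: an example named '*-qlora.yml' should use qlora.
    if filename.endswith("-qlora.yml") and adapter != "qlora":
        problems.append(f"{filename} is named qlora but adapter is {adapter!r}")

    return problems


# State of the first example before and after the change in this diff.
before = {"base_model": "google/gemma-3-4b-it", "adapter": "lora"}
after = {**before, "adapter": "qlora", "load_in_4bit": True}
print(check_qlora_config(before, "gemma-3-4b-qlora.yml"))
print(check_qlora_config(after, "gemma-3-4b-qlora.yml"))
```

Returning a list instead of raising keeps the sketch usable as a lint step over a whole examples directory.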
--- a/src/axolotl/utils/config/__init__.py
+++ b/src/axolotl/utils/config/__init__.py
@@ -78,6 +78,7 @@ def resolve_dtype(cfg):
        cfg.bf16 = False
    else:
        torch.backends.cuda.matmul.allow_tf32 = cfg.tf32 or False
+        torch.backends.cudnn.allow_tf32 = cfg.tf32 or False
        if cfg.bf16:
            cfg.fp16 = False

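Review note: the hunk above mirrors the `tf32` setting onto the cuDNN flag as well as the matmul flag. In PyTorch the two flags are independent, and `torch.backends.cudnn.allow_tf32` defaults to `True`, so setting only the matmul flag leaves convolutions free to use TF32. The sketch below shows the same assignment pattern in isolation. `apply_tf32` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in with the same attribute layout as `torch.backends` so the example runs without PyTorch installed.

```python
from types import SimpleNamespace


def apply_tf32(backends, tf32):
    """Set both TF32 flags from one config value; None counts as False."""
    enabled = tf32 or False  # same `cfg.tf32 or False` idiom as the diff
    backends.cuda.matmul.allow_tf32 = enabled
    backends.cudnn.allow_tf32 = enabled
    return enabled


# Stand-in for torch.backends (assumption: pass torch.backends in real use).
# The initial values mirror PyTorch's defaults for the two flags.
fake_backends = SimpleNamespace(
    cuda=SimpleNamespace(matmul=SimpleNamespace(allow_tf32=False)),
    cudnn=SimpleNamespace(allow_tf32=True),
)

apply_tf32(fake_backends, None)  # tf32 left unset in the config
print(fake_backends.cuda.matmul.allow_tf32, fake_backends.cudnn.allow_tf32)
```

With `tf32` unset, both flags end up `False`, which is the behavior the added line is meant to guarantee for cuDNN.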
--- a/src/axolotl/utils/schemas/config.py
+++ b/src/axolotl/utils/schemas/config.py
@@ -1224,17 +1224,12 @@ class AxolotlConfigWCapabilities(AxolotlInputConfig):
        ):
            capabilities = data.get("capabilities")
            is_fsdp = data.get("fsdp") is not None
-            is_deepspeed = data.get("deepspeed") is not None

            if capabilities and capabilities.get("n_gpu", 0) > 1:
                if is_fsdp:
                    raise ValueError(
                        "lora_mlp_kernel, lora_qkv_kernel, and lora_o_kernel are not compatible with FSDP."
                    )
-                if is_deepspeed:
-                    raise ValueError(
-                        "lora_mlp_kernel, lora_qkv_kernel, and lora_o_kernel are not compatible with DeepSpeed."
-                    )
        return data

    @model_validator(mode="before")
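Review note: after this hunk, only the FSDP branch of the multi-GPU check remains, so a multi-GPU DeepSpeed config with the LoRA kernels enabled is no longer rejected. The standalone sketch below approximates the resulting behavior on plain dicts. The function name `validate_lora_kernels` is hypothetical, the outer condition (any kernel flag set) is an assumption because that part of the validator is outside the hunk, and the `deepspeed` and `fsdp` values are placeholders.

```python
LORA_KERNEL_KEYS = ("lora_mlp_kernel", "lora_qkv_kernel", "lora_o_kernel")


def validate_lora_kernels(data: dict) -> dict:
    """Approximate the post-change check: FSDP is rejected, DeepSpeed is not."""
    # Assumption: the elided condition tests whether any kernel flag is set.
    if any(data.get(key) for key in LORA_KERNEL_KEYS):
        capabilities = data.get("capabilities")
        is_fsdp = data.get("fsdp") is not None

        if capabilities and capabilities.get("n_gpu", 0) > 1:
            if is_fsdp:
                raise ValueError(
                    "lora_mlp_kernel, lora_qkv_kernel, and lora_o_kernel "
                    "are not compatible with FSDP."
                )
    return data


multi_gpu = {"lora_mlp_kernel": True, "capabilities": {"n_gpu": 2}}

# Multi-GPU with DeepSpeed now passes through unchanged.
validate_lora_kernels({**multi_gpu, "deepspeed": "ds_config.json"})

# Multi-GPU with FSDP is still rejected.
try:
    validate_lora_kernels({**multi_gpu, "fsdp": ["full_shard"]})
except ValueError as err:
    print(f"rejected: {err}")
```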
Author	SHA1	Message	Date
Dan Saunders	700409be6f	removing deepspeed guard for LoRA Triton kernels	2025-04-03 16:44:45 +00:00
NanoCode012	64d8035f50	fix(example): align example to correct adapter (#2478) * fix(example): align example to correct adapter * fix: add missing load in 4 bit	2025-04-03 08:48:14 -04:00
Wing Lian	5249e98058	add additional tf32 opt for cudnn (#2477) [skip ci]	2025-04-03 08:47:52 -04:00