From 5b66b8e86c2a91af1f5efdde27e63fea88957dab Mon Sep 17 00:00:00 2001
From: Quarto GHA Workflow Runner
Get the appropriate attention class by inspecting the model config.
+
+get_layers
+Get the layers of the model. Handles text-only and multimodal models.
+
-original_apply_o
Original implementation of output projection without optimizations.
+
-original_apply_qkv
Original implementation of QKV projection without optimizations.
+
@@ -740,13 +745,58 @@ the standard transformers naming convention.
patch_self_attn_lora
Given an axolotl config, this method patches the inferred attention class's forward pass.
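For orientation, a minimal call sketch follows. The diff does not show the function's signature, so the single parsed-config argument, the `axolotl.` package prefix on the import, and the specific config keys are assumptions for illustration only.

```python
# Hedged sketch: assumes patch_self_attn_lora accepts the parsed axolotl config
# (a DictDefault); the config keys below are illustrative, not prescriptive.
from axolotl.utils.dict import DictDefault
from axolotl.monkeypatch.lora_kernels import patch_self_attn_lora

cfg = DictDefault(
    {
        "base_model": "HuggingFaceTB/SmolLM2-135M",  # illustrative base model
        "lora_qkv_kernel": True,
        "lora_o_kernel": True,
    }
)
patch_self_attn_lora(cfg)  # swaps the inferred attention class's forward for the patched version
```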
monkeypatch.lora_kernels.original_apply_o(self, hidden_states)
Original implementation of output projection without optimizations.
+monkeypatch.lora_kernels.get_layers(model)
+Get the layers of the model. Handles text-only and multimodal models.
+| Name | Type | Description | Default |
+|---|---|---|---|
+| model | PeftModelForCausalLM | A PEFT model. | required |
+
+| Name | Type | Description |
+|---|---|---|
+|  | list[nn.Module] | A list of layers. |
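To make the tables above concrete, here is a minimal usage sketch; the base model name, the LoRA target modules, and the `axolotl.` package prefix are assumptions for illustration.

```python
# Minimal sketch: wrap a small causal LM with PEFT, then fetch its layers.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

from axolotl.monkeypatch.lora_kernels import get_layers  # package prefix assumed

base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")
peft_model = get_peft_model(
    base, LoraConfig(task_type="CAUSAL_LM", target_modules=["q_proj", "v_proj"])
)

layers = get_layers(peft_model)  # -> list[nn.Module], one entry per transformer block
print(len(layers), type(layers[0]).__name__)
```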
+