fix(validation): add validation for lora target linear with quantize experts (#3461)
* fix: add validation for lora target linear with quantize experts
* chore: fix lint
* chore: comment
* fix: missing link on readme
@@ -45,6 +45,7 @@ lora_target_parameters:
 
 ## Limitations
 
+- `lora_target_linear` is not compatible with `quantize_moe_experts`. See [Expert LoRA targeting](#expert-lora-targeting) instead.
 - `cpu_ram_efficient_loading` hangs / takes a long time with FSDP2 + QLoRA.
 - Total model parameter count may display incorrectly (trainable param count is correct).
 - FSDP LoRA (8-bit) may have a large initial VRAM spike at the first 1-2 steps, which then drops. QLoRA does not exhibit this.
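The added limitation directs users to expert LoRA targeting as the alternative when `quantize_moe_experts` is enabled. Purely as an illustrative sketch (the `adapter` key and the parameter name patterns below are assumptions, not taken from this commit), a config following that guidance might look like:

```yaml
# Illustrative sketch only: quantizing MoE experts rules out lora_target_linear,
# so expert weights are targeted explicitly via lora_target_parameters instead.
# The adapter key and the parameter name patterns are assumptions/placeholders.
adapter: lora

lora_target_linear: false     # incompatible with quantize_moe_experts
quantize_moe_experts: true

# Target the expert projection weights directly (see "Expert LoRA targeting").
lora_target_parameters:
  - "mlp.experts.*.gate_proj.weight"   # placeholder pattern
  - "mlp.experts.*.up_proj.weight"     # placeholder pattern
  - "mlp.experts.*.down_proj.weight"   # placeholder pattern
```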