Built site for gh-pages

This commit is contained in:
Quarto GHA Workflow Runner
2026-03-06 14:26:20 +00:00
parent 8f63599e42
commit 17a84c24d3
5 changed files with 241 additions and 240 deletions


@@ -837,6 +837,7 @@ Note
<section id="limitations" class="level2">
<h2 class="anchored" data-anchor-id="limitations">Limitations</h2>
<ul>
<li><code>lora_target_linear</code> is not compatible with <code>quantize_moe_experts</code>. See <a href="#expert-lora-targeting">Expert LoRA targeting</a> instead.</li>
<li><code>cpu_ram_efficient_loading</code> may hang or take a very long time with FSDP2 + QLoRA.</li>
<li>The total model parameter count may display incorrectly (the trainable parameter count is correct).</li>
<li>FSDP LoRA (8-bit) may show a large initial VRAM spike during the first 1&ndash;2 steps, which then subsides; QLoRA does not exhibit this.</li>