Add troubleshooting note for GLM4 GGUF MTP mismatch (#3559) [skip ci]

* Add troubleshooting note for GLM4 GGUF MTP mismatch

* Fix JSON syntax for num_nextn_predict_layers example

* fix: concise

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
Mario Župan
2026-04-01 16:05:06 +02:00
committed by GitHub
parent 438ea7b045
commit 96ae8bdd1d


@@ -58,6 +58,14 @@ datasets:
- **LoRA kernels**: Incompatible with this model. Must be explicitly disabled (`lora_*_kernel: false`).
- Read more on how to load your own dataset at [docs](https://docs.axolotl.ai/docs/dataset_loading.html).
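The kernel flags referenced above can be disabled explicitly in the training config. A minimal sketch, assuming the usual per-projection flag names (`lora_mlp_kernel`, `lora_qkv_kernel`, `lora_o_kernel`); check your axolotl version's config reference for the exact set:

```yaml
# LoRA kernels are incompatible with this model; disable each explicitly.
lora_mlp_kernel: false
lora_qkv_kernel: false
lora_o_kernel: false
```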
### GGUF / llama.cpp loading error (missing tensors)

If you see `missing tensor 'blk.X.attn_norm.weight'` when loading a GLM-4 / GLM4-MoE model in llama.cpp, the likely cause is `num_nextn_predict_layers` being set to `1` in `config.json` while the MTP (multi-token prediction) weights were not exported, which can happen after PEFT/QLoRA training.

**Fix:** Set `"num_nextn_predict_layers": 0` in your `config.json` before converting to GGUF.
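The fix above can also be applied programmatically. A minimal sketch that patches `config.json` in place; the temp-directory setup is only a stand-in for your exported model directory:

```python
import json
import tempfile
from pathlib import Path

# Stand-in for the exported model directory; in practice, point
# config_path at the config.json next to your model weights.
tmp = Path(tempfile.mkdtemp())
config_path = tmp / "config.json"
config_path.write_text(json.dumps({"num_nextn_predict_layers": 1}))

# Disable the MTP head so llama.cpp does not expect tensors
# that were never exported (e.g. after PEFT/QLoRA training).
config = json.loads(config_path.read_text())
config["num_nextn_predict_layers"] = 0
config_path.write_text(json.dumps(config, indent=2))
```

Run this before invoking the GGUF conversion script so the converted metadata matches the tensors actually present.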
## Optimization Guides
Please check the [Optimizations doc](https://docs.axolotl.ai/docs/optimizations.html).