Add troubleshooting note for GLM4 GGUF MTP mismatch (#3559) [skip ci]

* Add troubleshooting note for GLM4 GGUF MTP mismatch

* Fix JSON syntax for num_nextn_predict_layers example

* fix: concise

---------

Co-authored-by: NanoCode012 <nano@axolotl.ai>
Mario Župan
2026-04-01 16:05:06 +02:00
committed by GitHub
parent 438ea7b045
commit 96ae8bdd1d


@@ -58,6 +58,14 @@ datasets:
- **LoRA kernels**: Incompatible with this model. Must be explicitly disabled (`lora_*_kernel: false`).
- Read more on how to load your own dataset at [docs](https://docs.axolotl.ai/docs/dataset_loading.html).
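The kernel flags referenced above can be disabled explicitly in the training config. A minimal sketch, assuming the usual per-projection flag names (`lora_mlp_kernel`, `lora_qkv_kernel`, `lora_o_kernel`); check your axolotl version's config reference for the exact set:

```yaml
# LoRA kernels are incompatible with this model; disable each explicitly.
lora_mlp_kernel: false
lora_qkv_kernel: false
lora_o_kernel: false
```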
### GGUF / llama.cpp loading error (missing tensors)

If you see `missing tensor 'blk.X.attn_norm.weight'` when loading a GLM-4 / GLM4-MoE model in llama.cpp, the likely cause is `num_nextn_predict_layers` being set to `1` in `config.json` while the MTP (multi-token prediction) weights were not exported, which can happen after PEFT/QLoRA training.

**Fix:** Set `"num_nextn_predict_layers": 0` in your `config.json` before converting to GGUF.
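The fix above can also be applied programmatically. A minimal sketch that patches `config.json` in place; the temp-directory setup is only a stand-in for your exported model directory:

```python
import json
import tempfile
from pathlib import Path

# Stand-in for the exported model directory; in practice, point
# config_path at the config.json next to your model weights.
tmp = Path(tempfile.mkdtemp())
config_path = tmp / "config.json"
config_path.write_text(json.dumps({"num_nextn_predict_layers": 1}))

# Disable the MTP head so llama.cpp does not expect tensors
# that were never exported (e.g. after PEFT/QLoRA training).
config = json.loads(config_path.read_text())
config["num_nextn_predict_layers"] = 0
config_path.write_text(json.dumps(config, indent=2))
```

Run this before invoking the GGUF conversion script so the converted metadata matches the tensors actually present.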
## Optimization Guides
Please check the [Optimizations doc](https://docs.axolotl.ai/docs/optimizations.html).