Address Review Comments:

* deleted redundant docs/llm_compressor.qmd
* incorporated feedback in integration README.md
* added llmcompressor integration to docs/custom_integrations.qmd

Signed-off-by: Rahul Tuli <rtuli@redhat.com>
This commit is contained in:
Rahul Tuli
2025-04-23 18:00:00 -04:00
parent 99c13ef60c
commit f3e876dbfc
3 changed files with 40 additions and 101 deletions


@@ -49,7 +49,8 @@ sections = [
     ("Knowledge Distillation (KD)", "kd"),
     ("Liger Kernels", "liger"),
     ("Language Model Evaluation Harness (LM Eval)", "lm_eval"),
-    ("Spectrum", "spectrum")
+    ("Spectrum", "spectrum"),
+    ("LLMCompressor", "llm_compressor")
 ]
 for section_name, folder_name in sections:


@@ -1,98 +0,0 @@
---
title: "LLMCompressor Sparse Fine-tuning"
format:
html:
toc: true
toc-depth: 3
number-sections: true
execute:
enabled: false
---
# LLMCompressor Integration
Fine-tune sparsified models in Axolotl using [LLMCompressor](https://github.com/vllm-project/llm-compressor).
This integration enables fine-tuning of models **already sparsified** using LLMCompressor.
It hooks into Axolotl's training pipeline via the plugin system and maintains sparsity throughout the fine-tuning process.
---
## Requirements
- Install Axolotl with `llmcompressor` extras:
```bash
pip install "axolotl[llmcompressor]"
```
- Requires `llmcompressor >= 0.5.1`

Installing the `llmcompressor` extra pulls in all dependencies required for sparse model fine-tuning.
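A quick way to confirm the installed `llmcompressor` satisfies the minimum version is a short check with the standard library. This is a minimal sketch; the `meets_minimum` helper is hypothetical, not part of Axolotl or LLMCompressor:

```python
# Hypothetical helper: verify llmcompressor >= 0.5.1 is installed.
from importlib.metadata import PackageNotFoundError, version


def meets_minimum(installed: str, minimum: str = "0.5.1") -> bool:
    """Compare dotted version strings numerically, part by part."""
    def parts(v: str) -> list[int]:
        return [int(p) for p in v.split(".")[:3]]
    return parts(installed) >= parts(minimum)


try:
    found = version("llmcompressor")
    status = "OK" if meets_minimum(found) else "too old"
    print(f"llmcompressor {found}: {status}")
except PackageNotFoundError:
    print("llmcompressor is not installed; run: pip install 'axolotl[llmcompressor]'")
```

Note that the simple integer comparison above does not handle pre-release suffixes such as `rc1`; for production checks, a full version parser is preferable.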
---
## Usage
To enable sparse fine-tuning with this integration, configure your Axolotl YAML like so:
```yaml
plugins:
  - axolotl.integrations.llm_compressor.LLMCompressorPlugin

llmcompressor:
  recipe:
    finetuning_stage:
      finetuning_modifiers:
        ConstantPruningModifier:
          targets: [
            're:.*q_proj.weight',
            're:.*k_proj.weight',
            're:.*v_proj.weight',
            're:.*o_proj.weight',
            're:.*gate_proj.weight',
            're:.*up_proj.weight',
            're:.*down_proj.weight',
          ]
          start: 0

# ... (other Axolotl training arguments)
```
::: {.callout-note}
This plugin **does not prune or sparsify the model**. It is only meant for **fine-tuning models that are already sparsified**.
:::
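The `re:` prefix in the recipe's `targets` marks the remainder of the string as a regular expression to match against parameter names. The sketch below illustrates that matching behavior; the `matches_target` helper is hypothetical, for illustration only:

```python
# Illustrative sketch of how 're:'-prefixed target patterns select
# weight names (matching the recipe convention shown above).
import re

targets = ["re:.*q_proj.weight", "re:.*gate_proj.weight"]


def matches_target(param_name: str, targets: list[str]) -> bool:
    """Return True if param_name matches any target pattern."""
    for t in targets:
        # 're:' marks a regex; anything else is treated as a literal name.
        pattern = t[len("re:"):] if t.startswith("re:") else re.escape(t)
        if re.match(pattern, param_name):
            return True
    return False


print(matches_target("model.layers.0.self_attn.q_proj.weight", targets))  # True
print(matches_target("model.layers.0.self_attn.q_proj.bias", targets))    # False
```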
---
## Pre-Sparsified Checkpoints
You can use:
- Your own LLMCompressor-sparsified model
- Or one from [Neural Magic's Hugging Face page](https://huggingface.co/neuralmagic)
Refer to the [LLMCompressor README](https://github.com/vllm-project/llm-compressor/blob/main/README.md) to learn how to sparsify models or write custom recipes.
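Since the plugin assumes the model is already sparse, it can be useful to sanity-check a checkpoint's sparsity before training. The sketch below counts exactly-zero entries in the target projection weights; `weight_sparsity` is a hypothetical helper using plain nested lists for illustration (with a real checkpoint you would iterate `model.state_dict()` and use tensor operations such as `(t == 0).sum().item()`):

```python
# Hypothetical sketch: estimate the fraction of zero-valued entries in
# the target projection weights of a (toy) state dict.
import re

TARGETS = re.compile(r".*(q|k|v|o|gate|up|down)_proj\.weight$")


def weight_sparsity(named_weights: dict) -> float:
    """Fraction of exactly-zero entries across matching weight matrices."""
    zeros = total = 0
    for name, matrix in named_weights.items():
        if TARGETS.match(name):
            for row in matrix:
                zeros += sum(1 for x in row if x == 0.0)
                total += len(row)
    return zeros / total if total else 0.0


# Tiny illustrative "state dict": q_proj is 50% zeros; the bias is ignored.
fake = {
    "model.layers.0.self_attn.q_proj.weight": [[0.0, 1.0], [0.0, 2.0]],
    "model.layers.0.self_attn.q_proj.bias": [[9.9]],
}
print(f"sparsity over targets: {weight_sparsity(fake):.0%}")  # → 50%
```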
---
## Example Config
A full working example is provided at:
```bash
examples/llama-3/sparse-finetuning.yaml
```
Run fine-tuning using:
```bash
axolotl train examples/llama-3/sparse-finetuning.yaml
```
---
## Learn More
Explore LLMCompressor capabilities, supported modifiers, and detailed examples:
👉 [LLMCompressor GitHub](https://github.com/vllm-project/llm-compressor)