---
title: "Unsloth"
description: "Hyper-optimized QLoRA finetuning for single GPUs"
---

### Overview

Unsloth provides hand-written, optimized kernels for LLM finetuning that slightly improve speed and reduce VRAM usage over standard industry baselines.

::: {.callout-important}
Due to breaking changes in transformers `v4.48.0`, users will need to downgrade to `<=v4.47.1` to use this patch.

This will later be deprecated in favor of [LoRA Optimizations](lora_optims.qmd).
:::

### Installation

The following will install the correct unsloth and extras from source:

```bash
python scripts/unsloth_install.py | sh
```

### Usage

Axolotl exposes a few configuration options to try out unsloth and get most of the performance gains.

Our unsloth integration is currently limited to the following model architectures:

- llama

These options are specific to LoRA finetuning and cannot be used for multi-GPU finetuning:

```yaml
unsloth_lora_mlp: true
unsloth_lora_qkv: true
unsloth_lora_o: true
```

These options are composable and can be used with multi-GPU finetuning:

```yaml
unsloth_cross_entropy_loss: true
unsloth_rms_norm: true
unsloth_rope: true
```

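Putting the pieces together, a single-GPU LoRA run can enable both groups of flags at once. The sketch below is illustrative only: the base model and LoRA hyperparameters are assumptions for the example, not values prescribed by this guide.

```yaml
# Illustrative base model and adapter settings (assumptions, adjust to your setup)
base_model: NousResearch/Llama-2-7b-hf
adapter: lora
lora_r: 16
lora_alpha: 32
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj

# LoRA-specific unsloth patches (single GPU only)
unsloth_lora_mlp: true
unsloth_lora_qkv: true
unsloth_lora_o: true

# Composable kernel patches (also usable with multi-GPU finetuning)
unsloth_cross_entropy_loss: true
unsloth_rms_norm: true
unsloth_rope: true
```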
### Limitations

- Single GPU only; i.e. no multi-GPU support
- No DeepSpeed or FSDP support (both require multi-GPU)
- LoRA and QLoRA support only; no full fine-tunes or fp8 support
- Limited model architecture support: Llama, Phi, Gemma, and Mistral only
- No MoE support