---
title: "Unsloth"
description: "Hyper-optimized QLoRA finetuning for single GPUs"
---

### Overview

Unsloth provides hand-written optimized kernels for LLM finetuning that slightly improve speed and VRAM usage over standard industry baselines.

### Installation

The following installs unsloth from source and downgrades xformers, as unsloth is incompatible with the most up-to-date libraries.

```bash
pip install --no-deps "unsloth @ git+https://github.com/unslothai/unsloth.git"
pip install --no-deps --force-reinstall xformers==0.0.26.post1
```

### Using unsloth with Axolotl

Axolotl exposes a few configuration options to try out unsloth and get most of the performance gains.

Our unsloth integration is currently limited to the following model architectures:

- llama

These options are specific to LoRA finetuning and cannot be used for multi-GPU finetuning:

```yaml
unsloth_lora_mlp: true
unsloth_lora_qkv: true
unsloth_lora_o: true
```

These options are composable and can be used with multi-GPU finetuning:

```yaml
unsloth_cross_entropy_loss: true
unsloth_rms_norm: true
unsloth_rope: true
```

### Limitations

- Single GPU only; no multi-GPU support
- No deepspeed or FSDP support (both require multi-GPU)
- LoRA + QLoRA support only. No full fine-tunes or fp8 support.
- Limited model architecture support: Llama, Phi, Gemma, Mistral only
- No MoE support
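As a rough illustration, the single-GPU unsloth options could sit alongside the rest of a LoRA config like the sketch below. The base model, quantization, and LoRA hyperparameters here are assumptions chosen for the example, not requirements of the unsloth integration:

```yaml
# Hypothetical single-GPU QLoRA config fragment; base_model and LoRA
# hyperparameters are illustrative assumptions, not prescribed values.
base_model: NousResearch/Meta-Llama-3-8B
load_in_4bit: true
adapter: qlora
lora_r: 16
lora_alpha: 32

# unsloth LoRA kernel options (single-GPU LoRA finetuning only)
unsloth_lora_mlp: true
unsloth_lora_qkv: true
unsloth_lora_o: true

# composable unsloth kernels (also usable with multi-GPU finetuning)
unsloth_cross_entropy_loss: true
unsloth_rms_norm: true
unsloth_rope: true
```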