axolotl/docs/getting-started.qmd
NanoCode012 16e32232fb feat(docs): comprehensive improvement (#3564)
Co-authored-by: Wing Lian <wing.lian@gmail.com>
2026-04-02 08:01:26 -04:00

---
title: "Quickstart"
format:
  html:
    toc: true
    toc-depth: 3
    number-sections: true
execute:
  enabled: false
---
This guide will walk you through your first model fine-tuning project with Axolotl.
## Quick Example {#sec-quick-example}
Let's start by fine-tuning a small language model using LoRA. This example uses a 1B parameter model to ensure it runs on most GPUs.
Assuming `axolotl` is installed (if not, see our [Installation Guide](installation.qmd)):
1. Download example configs:
```bash
axolotl fetch examples
```
2. Run the training:
```bash
axolotl train examples/llama-3/lora-1b.yml
```
That's it! Let's understand what just happened.
## Understanding the Process {#sec-understanding}
### The Configuration File {#sec-config}
The YAML configuration file controls everything about your training. Here's what (part of) our example config looks like:
```yaml
base_model: NousResearch/Llama-3.2-1B
load_in_8bit: true
adapter: lora
datasets:
  - path: teknium/GPT4-LLM-Cleaned
    type: alpaca
dataset_prepared_path: last_run_prepared
val_set_size: 0.1
output_dir: ./outputs/lora-out
```
::: {.callout-tip}
`load_in_8bit: true` and `adapter: lora` together enable LoRA adapter fine-tuning.
- To perform full fine-tuning, remove these two lines.
- To perform QLoRA fine-tuning, replace them with `load_in_4bit: true` and `adapter: qlora`.
:::
See our [config options](config-reference.qmd) for more details.
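Putting the callout above into practice, here is the same snippet rewritten for QLoRA. Only the two marked lines change; the rest matches the example config (the `output_dir` is renamed here purely for illustration):

```yaml
base_model: NousResearch/Llama-3.2-1B
load_in_4bit: true   # changed: 4-bit base weights instead of 8-bit
adapter: qlora       # changed: QLoRA instead of LoRA
datasets:
  - path: teknium/GPT4-LLM-Cleaned
    type: alpaca
dataset_prepared_path: last_run_prepared
val_set_size: 0.1
output_dir: ./outputs/qlora-out
```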
### Training {#sec-training}
When you run `axolotl train`, Axolotl:
1. Downloads the base model
2. (If specified) applies QLoRA/LoRA adapter layers
3. Loads and processes the dataset
4. Runs the training loop
5. Saves the trained model and/or LoRA weights
## Your First Custom Training {#sec-custom}
Let's modify the example for your own data:
1. Create a new config file `my_training.yml`:
```yaml
base_model: NousResearch/Nous-Hermes-llama-1b-v1
load_in_8bit: true
adapter: lora
# Training settings
micro_batch_size: 2
num_epochs: 3
learning_rate: 0.0003
# Your dataset
datasets:
  - path: my_data.jsonl # Your local data file
    type: alpaca # Or other format
```
This specific config LoRA fine-tunes the model on instruction-tuning data in the
`alpaca` dataset format, which looks like this:
```json
{
  "instruction": "Write a description of alpacas.",
  "input": "",
  "output": "Alpacas are domesticated South American camelids..."
}
```
See our [Dataset Formats](dataset-formats) guide for the other supported formats and
how to structure them.
2. Prepare your JSONL data in the specified format (in this case, the expected `alpaca`
format):
```json
{"instruction": "Classify this text", "input": "I love this!", "output": "positive"}
{"instruction": "Classify this text", "input": "Not good at all", "output": "negative"}
```
3. Run the training:
```bash
axolotl train my_training.yml
```
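Before training, it can be worth sanity-checking the JSONL file. Below is a minimal validator (a generic helper, not part of Axolotl; `my_data.jsonl` is the filename from the config above):

```python
import json

# Keys every record needs in the alpaca format shown above.
REQUIRED_KEYS = {"instruction", "input", "output"}

def validate_alpaca_jsonl(path):
    """Return the number of valid records; raise on the first malformed line."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)  # raises json.JSONDecodeError on bad JSON
            missing = REQUIRED_KEYS - record.keys()
            if missing:
                raise ValueError(f"line {lineno}: missing keys {sorted(missing)}")
            count += 1
    return count
```

Run it as `validate_alpaca_jsonl("my_data.jsonl")` before kicking off a long training job; a clear error with a line number is much cheaper than a crash mid-preprocessing.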
## Common Tasks {#sec-common-tasks}
::: {.callout-tip}
The same YAML file is used for training, inference, and merging.
:::
### Testing Your Model {#sec-testing}
After training, test your model:
```bash
axolotl inference my_training.yml --lora-model-dir="./outputs/lora-out"
```
More details can be found in [Inference](inference.qmd).
### Using a UI {#sec-ui}
Launch a Gradio interface:
```bash
axolotl inference my_training.yml --lora-model-dir="./outputs/lora-out" --gradio
```
### Preprocessing Data {#sec-preprocessing}
For large datasets, preprocess first:
```bash
axolotl preprocess my_training.yml
```
Make sure to set `dataset_prepared_path:` in your config; this is where the prepared dataset will be saved.
More details can be found in [Dataset Preprocessing](dataset_preprocessing.qmd).
### Merging LoRA weights {#sec-merging-lora}
To merge the LoRA weights back into the base model, run:
```bash
axolotl merge-lora my_training.yml --lora-model-dir="./outputs/lora-out"
```
The merged model will be saved in the `{output_dir}/merged` directory.
More details can be found in [Merging LoRA weights](inference.qmd#sec-merging).
## Next Steps {#sec-next-steps}
Now that you have the basics, explore these guides based on what you want to do:
**Choose your path:**
- [Choosing a Fine-Tuning Method](choosing_method.qmd) — SFT vs LoRA vs QLoRA vs GRPO vs DPO, with hardware recommendations
**Core guides:**
- [Dataset Loading](dataset_loading.qmd) — Loading datasets from various sources
- [Dataset Formats](dataset-formats) — Working with different data formats
- [Optimizations](optimizations.qmd) — Flash attention, gradient checkpointing, sample packing
- [Training Stability & Debugging](training_stability.qmd) — Monitoring metrics, fixing NaN, OOM debugging
**Advanced training methods:**
- [RLHF / Preference Learning](rlhf.qmd) — DPO, KTO, GRPO, EBFT
- [GRPO Training](grpo.qmd) — RL with custom rewards and vLLM generation
- [vLLM Serving](vllm_serving.qmd) — Setting up vLLM for GRPO
**Scaling up:**
- [Multi-GPU Training](multi-gpu.qmd) — DeepSpeed, FSDP, DDP
- [Multi-Node Training](multi-node.qmd) — Distributed training across machines