---
title: "Command Line Interface (CLI)"
format:
  html:
    toc: true
    toc-expand: 2
    toc-depth: 3
execute:
  enabled: false
---

The Axolotl CLI provides a streamlined interface for training and fine-tuning large language models. This guide covers
the CLI commands, their usage, and common examples.

## Basic Commands

All Axolotl commands follow this general structure:

```bash
axolotl <command> [config.yml] [options]
```

The config file can be local or a URL to a raw YAML file.

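For example, you could train straight from a config hosted in the Axolotl examples (the URL is illustrative; any link to a raw YAML file works):

```bash
# Train directly from a remote config (illustrative URL)
axolotl train https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/examples/llama-3/lora-1b.yml
```
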
### Launcher Arguments

For commands that support multi-GPU (`train`, `evaluate`, ...), you can pass launcher-specific arguments using the `--` separator:

```bash
# Pass torchrun arguments
axolotl train config.yml --launcher torchrun -- --nproc_per_node=2 --nnodes=1

# Pass accelerate arguments
axolotl train config.yml --launcher accelerate -- --config_file=accelerate_config.yml --num_processes=4
```

Arguments after `--` are passed directly to the launcher (`torchrun`, `accelerate launch`, etc.).

## Command Reference

### fetch

Downloads example configurations and deepspeed configs to your local machine.

```bash
# Get example YAML files
axolotl fetch examples

# Get deepspeed config files
axolotl fetch deepspeed_configs

# Specify custom destination
axolotl fetch examples --dest path/to/folder
```

### preprocess

Preprocesses and tokenizes your dataset before training. This is recommended for large datasets.

```bash
# Basic preprocessing
axolotl preprocess config.yml

# Preprocessing with one GPU
CUDA_VISIBLE_DEVICES="0" axolotl preprocess config.yml

# Debug mode to see processed examples
axolotl preprocess config.yml --debug

# Debug with limited examples
axolotl preprocess config.yml --debug --debug-num-examples 5
```

Configuration options:

```yaml
dataset_prepared_path: # Local folder for saving preprocessed data
push_dataset_to_hub: # HuggingFace repo to push preprocessed data (optional)
```

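A minimal sketch of these options in a config file (the path and repo name are illustrative):

```yaml
dataset_prepared_path: ./last_run_prepared
push_dataset_to_hub: your-username/my-prepared-dataset  # optional
```
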
### train

Trains or fine-tunes a model using the configuration specified in your YAML file.

```bash
# Basic training
axolotl train config.yml

# Train and set/override specific options
axolotl train config.yml \
    --learning-rate 1e-4 \
    --micro-batch-size 2 \
    --num-epochs 3

# Training without accelerate
axolotl train config.yml --launcher python

# Pass launcher-specific arguments using the -- separator
axolotl train config.yml --launcher torchrun -- --nproc_per_node=2 --nnodes=1
axolotl train config.yml --launcher accelerate -- --config_file=accelerate_config.yml

# Resume training from checkpoint
axolotl train config.yml --resume-from-checkpoint path/to/checkpoint
```

You can run sweeps over multiple hyperparameters by passing in a sweep config:

```bash
# Basic training with sweeps
axolotl train config.yml --sweep path/to/sweep.yaml
```

Example sweep config:

```yaml
_:
  # This section is for dependent variables we need to fix
  - load_in_8bit: false
    load_in_4bit: false
    adapter: lora
  - load_in_8bit: true
    load_in_4bit: false
    adapter: lora

# These are independent variables
learning_rate: [0.0003, 0.0006]
lora_r:
  - 16
  - 32
lora_alpha:
  - 16
  - 32
  - 64
```

### inference

Runs inference using your trained model in either CLI or Gradio interface mode.

```bash
# CLI inference with LoRA
axolotl inference config.yml --lora-model-dir="./outputs/lora-out"

# CLI inference with full model
axolotl inference config.yml --base-model="./completed-model"

# Gradio web interface
axolotl inference config.yml --gradio \
    --lora-model-dir="./outputs/lora-out"

# Inference with input from file
cat prompt.txt | axolotl inference config.yml \
    --base-model="./completed-model"
```

### merge-lora

Merges trained LoRA adapters into the base model.

```bash
# Basic merge
axolotl merge-lora config.yml

# Specify LoRA directory (usually used with checkpoints)
axolotl merge-lora config.yml --lora-model-dir="./lora-output/checkpoint-100"

# Merge using CPU (if out of GPU memory)
CUDA_VISIBLE_DEVICES="" axolotl merge-lora config.yml
```

Configuration options:

```yaml
gpu_memory_limit: # Limit GPU memory usage
lora_on_cpu: # Load LoRA weights on CPU
```

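For example, to keep GPU memory usage down during a merge (values are illustrative):

```yaml
gpu_memory_limit: 20GiB
lora_on_cpu: true
```
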
### merge-sharded-fsdp-weights

Merges sharded FSDP model checkpoints into a single combined checkpoint.

```bash
# Basic merge
axolotl merge-sharded-fsdp-weights config.yml
```

### evaluate

Evaluates a model's performance (loss, etc.) on the train and eval datasets.

```bash
# Basic evaluation
axolotl evaluate config.yml

# Evaluation with launcher arguments
axolotl evaluate config.yml --launcher torchrun -- --nproc_per_node=2
```

### lm-eval

Runs the LM Evaluation Harness on your model.

```bash
# Basic evaluation
axolotl lm-eval config.yml
```

Configuration options:

```yaml
lm_eval_model: # Model to evaluate (local or HF path)

# List of tasks to evaluate
lm_eval_tasks:
  - arc_challenge
  - hellaswag
lm_eval_batch_size: # Batch size for evaluation
output_dir: # Directory to save evaluation results
```

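A filled-in sketch (the model path and values are illustrative):

```yaml
lm_eval_model: ./outputs/merged-model
lm_eval_tasks:
  - arc_challenge
  - hellaswag
lm_eval_batch_size: 8
output_dir: ./lm_eval_results
```
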
See [LM Eval Harness integration docs](https://docs.axolotl.ai/docs/custom_integrations.html#language-model-evaluation-harness-lm-eval) for full configuration details.

### delinearize-llama4

Delinearizes a linearized Llama 4 model into a regular HuggingFace Llama 4 model. This only works with the non-quantized linearized model.

```bash
axolotl delinearize-llama4 --model path/to/model_dir --output path/to/output_dir
```

This is necessary if you want to use the model with other frameworks. If you have an adapter, merge it with the non-quantized linearized model before delinearizing.

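A sketch of that adapter workflow (paths are illustrative; the merged model's location depends on your output settings):

```bash
# 1. Merge the adapter into the non-quantized linearized model
axolotl merge-lora config.yml --lora-model-dir="./outputs/lora-out"

# 2. Delinearize the merged model for use with other frameworks
axolotl delinearize-llama4 --model ./outputs/merged-model --output ./llama4-delinearized
```
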
### quantize

Quantizes a model using the quantization configuration specified in your YAML file.

```bash
axolotl quantize config.yml
```

See [Quantization](./quantize.qmd) for more details.

## Legacy CLI Usage

While the new Click-based CLI is preferred, Axolotl still supports the legacy module-based CLI:

```bash
# Preprocess
python -m axolotl.cli.preprocess config.yml

# Train
accelerate launch -m axolotl.cli.train config.yml

# Inference
accelerate launch -m axolotl.cli.inference config.yml \
    --lora_model_dir="./outputs/lora-out"

# Gradio interface
accelerate launch -m axolotl.cli.inference config.yml \
    --lora_model_dir="./outputs/lora-out" --gradio
```

::: {.callout-important}
When overriding CLI parameters in the legacy CLI, use the same notation as in the YAML file (e.g., `--lora_model_dir`).

**Note:** This differs from the new Click-based CLI, which uses dash notation (e.g., `--lora-model-dir`). Keep this in mind if you're referencing newer documentation or switching between CLI versions.
:::

## Remote Compute with Modal Cloud
|
|
|
|
Axolotl supports running training and inference workloads on Modal cloud infrastructure. This is configured using a
|
|
cloud YAML file alongside your regular Axolotl config.
|
|
|
|
### Cloud Configuration
|
|
|
|
Create a cloud config YAML with your Modal settings:
|
|
|
|
```yaml
|
|
# cloud_config.yml
|
|
provider: modal
|
|
gpu: a100 # Supported: l40s, a100-40gb, a100-80gb, a10g, h100, t4, l4
|
|
gpu_count: 1 # Number of GPUs to use
|
|
timeout: 86400 # Maximum runtime in seconds (24 hours)
|
|
branch: main # Git branch to use (optional)
|
|
|
|
volumes: # Persistent storage volumes
|
|
- name: axolotl-cache
|
|
mount: /workspace/cache
|
|
- name: axolotl-data
|
|
mount: /workspace/data
|
|
- name: axolotl-artifacts
|
|
mount: /workspace/artifacts
|
|
|
|
secrets: # Secrets to inject
|
|
- WANDB_API_KEY
|
|
- HF_TOKEN
|
|
```
|
|
|
|
### Running on Modal Cloud

The following commands support the `--cloud` flag:

```bash
# Preprocess on cloud
axolotl preprocess config.yml --cloud cloud_config.yml

# Train on cloud
axolotl train config.yml --cloud cloud_config.yml

# Run lm-eval on cloud
axolotl lm-eval config.yml --cloud cloud_config.yml
```

### Cloud Configuration Options

```yaml
provider: # Compute provider; currently only `modal` is supported
gpu: # GPU type to use
gpu_count: # Number of GPUs (default: 1)
memory: # RAM in GB (default: 128)
timeout: # Maximum runtime in seconds
timeout_preprocess: # Preprocessing timeout
branch: # Git branch to use
docker_tag: # Custom Docker image tag
volumes: # List of persistent storage volumes

# Environment variables to pass. Can be specified in two ways:
# 1. As a string: Will load the value from the host computer's environment variables
# 2. As a key-value pair: Will use the specified value directly
# Example:
# env:
#   - CUSTOM_VAR             # Loads from host's $CUSTOM_VAR
#   - {CUSTOM_VAR: "value"}  # Uses "value" directly
env:

# Secrets to inject. Same input format as `env`, but for sensitive data.
secrets:
#   - HF_TOKEN
#   - WANDB_API_KEY
```