# Axolotl Setup — miaai (RTX 5080, CUDA 13.2)
## System Info

- GPU: NVIDIA RTX 5080 (16 GB VRAM)
- Driver: 580.126.09 — max CUDA 13.0 (nvcc from conda resolves to 13.2)
- OS: Ubuntu (system Python is 3.13 — do NOT use system Python for ML)
- Axolotl branch: `activeblue/main`
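
To confirm the driver and toolkit numbers above on a given box, two standard checks suffice (nvcc is only present once the conda toolkit from step 4 is installed):

```bash
nvidia-smi       # driver version and the max CUDA the driver supports
nvcc --version   # toolkit compiler version (13.2 from conda, per above)
```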
## One-time Setup

### 1. Install Miniconda

```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p /opt/miniconda3
/opt/miniconda3/bin/conda init bash
source ~/.bashrc
```
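
A quick optional check that the install took (expected paths follow from the step above):

```bash
which conda       # should resolve to /opt/miniconda3/bin/conda
conda --version   # prints the installed conda version
```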
### 2. Create a Python 3.11 environment

```bash
conda create -n axolotl python=3.11 -y
conda activate axolotl
```
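
Verify the env is active before installing anything into it (standard commands; the expected output follows from the step above):

```bash
python --version   # should print Python 3.11.x
which python       # should resolve inside the axolotl env, not /usr/bin
```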
### 3. Clone and sync the repo with upstream

```bash
git clone https://git.activeblue.net/tocmo0nlord/axolotl.git
cd axolotl
git remote add upstream https://github.com/axolotl-ai-cloud/axolotl.git
git fetch upstream
git rebase upstream/main   # keeps activeblue patches on top
git push origin activeblue/main --force-with-lease
```
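
If the rebase stops on conflicts, the standard git recovery flow applies (a generic sketch, not specific to the activeblue patches):

```bash
git status              # lists the conflicted files
# edit each conflicted file, then mark it resolved:
git add <file>
git rebase --continue   # or `git rebase --abort` to undo the rebase
```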
### 4. Install the CUDA toolkit (needed to compile flash-attn)

```bash
conda install -y -c nvidia cuda-toolkit   # resolves nvcc to 13.2 here
export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
```
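
Worth verifying before moving on, since a missing or mismatched nvcc accounts for two of the pitfalls below:

```bash
echo "$CUDA_HOME"   # should print the conda env prefix
which nvcc          # should resolve under $CUDA_HOME/bin
nvcc --version      # the release here must match the torch build in step 5
```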
### 5. Install PyTorch — use cu132 (matches nvcc from conda)

NOTE: torchaudio has no cu132 wheel — skip it; it is not needed for LLM training.

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu132
python -c "import torch; print('CUDA:', torch.version.cuda); print('GPU:', torch.cuda.get_device_name(0))"
```
### 6. Install Axolotl

```bash
pip install -e "."
```

flash-attn compiles its CUDA kernels from source — this takes 15–25 min on 10 cores of an i7-14700K. Always set `MAX_JOBS` to the number of available CPU cores to parallelize and speed up compilation:

```bash
MAX_JOBS=10 pip install flash-attn --no-build-isolation
```
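
Once the build finishes, a minimal import check confirms the compiled kernels load against this torch build:

```bash
python -c "import flash_attn; print(flash_attn.__version__)"
```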
## Every Session (after first-time setup)

```bash
export PATH="/opt/miniconda3/bin:$PATH"
conda activate axolotl
export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
cd /home/tocmo0nlord/axolotl
```
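
To avoid retyping this block, it can live in a small helper script that gets sourced each session. A sketch; the path `~/axolotl-env.sh` is an arbitrary choice, not part of the setup above:

```bash
# ~/axolotl-env.sh (hypothetical); usage: source ~/axolotl-env.sh
export PATH="/opt/miniconda3/bin:$PATH"
conda activate axolotl               # works when sourced from an init'ed shell
export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
cd /home/tocmo0nlord/axolotl
```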
## Run Training

```bash
axolotl train human_chat_qlora.yml
```
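
On a 16 GB card it helps to watch VRAM while the first steps run; standard tooling in a second terminal is enough:

```bash
watch -n 1 nvidia-smi   # refreshes GPU memory and utilization every second
```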
## Common Pitfalls Encountered

| Problem | Cause | Fix |
|---|---|---|
| `externally-managed-environment` | System Python 3.13 blocks pip | Use the conda env, never system pip |
| `No module named torch` (flash-attn) | pip builds in an isolated env | Use `--no-build-isolation` |
| `CUDA_HOME` not set | CUDA toolkit not installed | `conda install cuda-toolkit` from the nvidia channel |
| CUDA version mismatch 13.2 vs 12.8 | Conda nvcc is 13.2, torch was cu128 | Reinstall torch with `--index-url .../cu132` |
| torchaudio not found for cu132 | No cu132 wheel exists | Skip torchaudio — not needed |
| `src refspec main does not match` | Fork default branch is `activeblue/main` | `git push origin activeblue/main` |
| flash-attn compile is slow | Single-threaded by default | Set `MAX_JOBS=<cpu_count>` before pip install |
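
Most of these can be caught up front. A small pre-flight script, purely illustrative and mirroring the checks above:

```bash
#!/usr/bin/env bash
# preflight.sh (illustrative): surfaces the pitfalls listed above before training
set -u

echo "python:    $(python --version 2>&1)"   # expect 3.11.x from the conda env
echo "CUDA_HOME: ${CUDA_HOME:-<not set>}"    # expect the conda env prefix
command -v nvcc >/dev/null && nvcc --version | grep -i release

python - <<'EOF'
import torch
print("torch CUDA:", torch.version.cuda)     # should match nvcc (13.2 here)
print("GPU available:", torch.cuda.is_available())
try:
    import flash_attn
    print("flash-attn:", flash_attn.__version__)
except ImportError:
    print("flash-attn: NOT installed")
EOF
```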