# Axolotl Setup — miaai (RTX 5080, CUDA 13.2)

## System Info

- GPU: NVIDIA RTX 5080 (16GB VRAM)
- Driver: 580.126.09 — max CUDA 13.0 (nvcc from conda resolves to 13.2)
- OS: Ubuntu (system Python is 3.13 — do NOT use system Python for ML)
- Axolotl branch: `activeblue/main`
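
Optional, but a quick way to confirm the card and driver match the numbers above before starting:

```bash
# Should report the RTX 5080, driver 580.126.09, and ~16GB of memory
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
```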

## One-time Setup

### 1. Install Miniconda

```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p /opt/miniconda3
/opt/miniconda3/bin/conda init bash
source ~/.bashrc
```
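
If `conda init` took effect, the `conda` command should now resolve to the new install:

```bash
# Both should point at /opt/miniconda3
which conda
conda --version
```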

### 2. Create Python 3.11 environment

```bash
conda create -n axolotl python=3.11 -y
conda activate axolotl
```
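
Worth verifying that the shell now resolves the env's Python 3.11 rather than the system Python 3.13 (see the pitfalls table below):

```bash
# Both should point into the axolotl conda env, not /usr/bin
which python
python --version   # expect Python 3.11.x
```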

### 3. Clone and sync repo with upstream

```bash
git clone https://git.activeblue.net/tocmo0nlord/axolotl.git
cd axolotl
git remote add upstream https://github.com/axolotl-ai-cloud/axolotl.git
git fetch upstream
git rebase upstream/main   # keeps activeblue patches on top
git push origin activeblue/main --force-with-lease
```
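
After the rebase, the local patches should sit on top of upstream. One optional way to confirm:

```bash
# List the commits on activeblue/main that are not in upstream/main,
# i.e. the local patches the rebase replayed on top
git log --oneline upstream/main..HEAD
```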

### 4. Install CUDA toolkit (needed to compile flash-attn)

```bash
conda install -y -c nvidia cuda-toolkit   # resolves to CUDA 13.2 on this machine
export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
```
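
Quick check that the compiler the flash-attn build will pick up is the conda one, not a system install:

```bash
# nvcc should live under $CONDA_PREFIX/bin and report release 13.2
which nvcc
nvcc --version
echo "$CUDA_HOME"
```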

### 5. Install PyTorch — use cu132 (matches nvcc from conda)

> NOTE: torchaudio has no cu132 wheel — skip it; it is not needed for LLM training.

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu132
python -c "import torch; print('CUDA:', torch.version.cuda); print('GPU:', torch.cuda.get_device_name(0))"
```
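
A slightly stronger check than the one-liner above, assuming nothing beyond the installed torch wheel: run a small matmul to confirm kernels actually launch on this card rather than failing with an arch mismatch.

```bash
# One GPU matmul; failures here usually mean a wheel/arch mismatch
python - <<'EOF'
import torch
print("capability:", torch.cuda.get_device_capability(0))
x = torch.randn(64, 64, device="cuda")
print("matmul ok:", (x @ x).shape)
EOF
```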

### 6. Install Axolotl

```bash
pip install -e "."   # run from the repo root cloned in step 3
```
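
Assuming the console entry point installed cleanly, the CLI used later for training should now be available:

```bash
# Should print usage and the available subcommands
axolotl --help
```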

> **flash-attn compiles its CUDA kernels from source — the build takes 15–25 min with 10 cores on an i7-14700K.**
> Always set `MAX_JOBS` to the number of available CPU cores to parallelize the build
> (each compile job can use several GB of RAM, so scale it down if the build gets OOM-killed):

```bash
MAX_JOBS=10 pip install flash-attn --no-build-isolation
```
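
Once the build finishes, a one-line import confirms the compiled extension loads against the installed torch:

```bash
# Fails with an ImportError if the kernels don't match the torch/CUDA build
python -c "import flash_attn; print(flash_attn.__version__)"
```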

## Every Session (after first-time setup)

```bash
export PATH="/opt/miniconda3/bin:$PATH"
conda activate axolotl
export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
cd /home/tocmo0nlord/axolotl
```
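
To avoid retyping, the same lines can live in a small helper script (hypothetical name `axolotl-env.sh`) that gets sourced at the start of each session:

```bash
# Hypothetical convenience wrapper; adjust paths if the layout differs
cat > ~/axolotl-env.sh <<'EOF'
export PATH="/opt/miniconda3/bin:$PATH"
# Make `conda activate` work inside a sourced script
source /opt/miniconda3/etc/profile.d/conda.sh
conda activate axolotl
export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
cd /home/tocmo0nlord/axolotl
EOF
source ~/axolotl-env.sh
```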

## Run Training

```bash
axolotl train human_chat_qlora.yml
```
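
During the first steps of a run it is worth confirming the 16GB card has memory headroom. One low-tech way, using only nvidia-smi in a second terminal:

```bash
# Refresh GPU memory and utilization once per second
nvidia-smi --query-gpu=memory.used,memory.total,utilization.gpu --format=csv -l 1
```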

## Common Pitfalls Encountered

| Problem | Cause | Fix |
|---|---|---|
| `externally-managed-environment` | System Python 3.13 blocks pip | Use the conda env, never system pip |
| `No module named torch` (flash-attn) | pip builds in an isolated env | Use `--no-build-isolation` |
| `CUDA_HOME not set` | CUDA toolkit not installed | `conda install cuda-toolkit` from the nvidia channel |
| `CUDA version mismatch 13.2 vs 12.8` | Conda nvcc is 13.2, torch was cu128 | Reinstall torch with `--index-url .../cu132` |
| `torchaudio` not found for cu132 | No cu132 wheel exists | Skip torchaudio — not needed |
| `src refspec main does not match any` | Fork default branch is `activeblue/main` | `git push origin activeblue/main` |
| flash-attn compile is slow | Single-threaded by default | Set `MAX_JOBS=<cpu_count>` before `pip install` |