# LLM Trainer
A web-based interface for building fine-tuning datasets and training LLMs on a remote GPU server. The frontend connects to a FastAPI backend that SSHes into your GPU machine, runs the [synthetic-data-kit](https://github.com/anthropics/synthetic-data-kit) pipeline, and streams live output back to the browser.
## Architecture
```
Browser (React/Vite)
        │  REST + WebSocket
        ▼
FastAPI Backend (Docker, port 8080)
        │
        ▼
Remote GPU Server (SSH)
        ├── synthetic-data-kit → parse / generate / curate / export
        └── train.py           → fine-tuning run
```

Ollama (port 11434 on the GPU server) is used for model management — pulling, listing, and deleting models.
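To verify that Ollama is reachable before wiring it into the backend, you can query its `/api/tags` endpoint, which lists the installed models as JSON. A minimal stdlib-only sketch (substitute your own GPU server's address):

```python
import json
from urllib.request import urlopen

def model_names(payload: dict) -> list[str]:
    """Extract model names from the JSON returned by Ollama's /api/tags endpoint."""
    return [m["name"] for m in payload.get("models", [])]

def list_ollama_models(base_url: str) -> list[str]:
    """Query a running Ollama instance for its installed models."""
    with urlopen(f"{base_url}/api/tags") as resp:
        return model_names(json.load(resp))

if __name__ == "__main__":
    # Abridged example of the response shape from GET /api/tags:
    sample = {"models": [{"name": "llama3:8b"}, {"name": "mistral:7b"}]}
    print(model_names(sample))
```

An empty list from `list_ollama_models("http://192.168.2.47:11434")` usually means Ollama is up but has no models pulled yet; a connection error means the URL or firewall needs attention.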
## Pipeline stages
| Stage | Directory | Description |
|-------|-----------|-------------|
| `input` | `/opt/synthetic/…/data/input` | Raw source documents |
| `parsed` | `/opt/synthetic/…/data/parsed` | Ingested plain text |
| `generated` | `/opt/synthetic/…/data/generated` | Raw QA pairs |
| `curated` | `/opt/synthetic/…/data/curated` | Filtered pairs (quality threshold) |
| `final` | `/opt/synthetic/…/data/final` | Export-ready JSONL/CSV |

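Scripts that work against these stages can resolve the directories with a tiny helper. A sketch; `DATA_ROOT` below is an assumption, since the table elides the middle of the path (adjust it to match your server):

```python
from pathlib import Path

# Assumed data root — the table above elides the middle of the path;
# adjust this for your install.
DATA_ROOT = Path("/opt/synthetic/synthetic-data-kit/data")

# Pipeline stages in order, matching the table above.
STAGES = ("input", "parsed", "generated", "curated", "final")

def stage_dir(stage: str) -> Path:
    """Return the directory for a pipeline stage, rejecting unknown names."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    return DATA_ROOT / stage

if __name__ == "__main__":
    for s in STAGES:
        print(s, "→", stage_dir(s))
```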
## Getting started
### Prerequisites
- A remote machine with:
  - SSH access
  - `miniconda3` with a `synthetic-data` conda env containing `synthetic-data-kit`
  - `train.py` at `/opt/synthetic/train.py`
  - Ollama running on port `11434`

---
### Option A — Docker (quickest)
**Additional requirements:** Docker + Docker Compose
```bash
docker compose up --build
```

| Service | URL |
|---------|-----|
| Frontend | http://localhost:3000 |
| Backend API | http://localhost:8080 |
| API docs | http://localhost:8080/docs |

The `OLLAMA_URL` environment variable in `docker-compose.yml` defaults to `http://192.168.2.47:11434` — update it to point to your GPU server.
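If you would rather not edit `docker-compose.yml` in place, Compose merges a `docker-compose.override.yml` automatically, so the setting can live in a local untracked file. A sketch, assuming the backend service is named `backend` in your compose file:

```yaml
# docker-compose.override.yml — merged automatically by `docker compose up`
services:
  backend:
    environment:
      - OLLAMA_URL=http://192.168.2.47:11434
```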
---
### Option B — Install as a system package (Debian / Ubuntu)
The package is published to the local Gitea registry and installs the FastAPI backend as a systemd service with nginx serving the frontend on port **3000**.
**1. Add the apt source** *(skip this step if you already have the Gitea registry in your sources from another package — e.g. `dupfinder.list`)*
```bash
echo "deb [trusted=yes] http://192.168.1.64:3000/api/packages/tocmo0nlord/debian bookworm main" \
  | sudo tee /etc/apt/sources.list.d/llm-trainer.list
sudo apt update
```

**2. Install**
```bash
sudo apt install llm-trainer
```

The installer will automatically:

- Create an `llm-trainer` system user
- Install the FastAPI backend under `/opt/llm-trainer/` with its own Python venv
- Enable and start the `llm-trainer` systemd service
- Configure nginx to serve the React frontend on port **3000** and proxy `/api/` to the backend

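For orientation, the nginx site the package sets up behaves roughly like the sketch below. This is not the literal file the installer writes; the frontend root and upstream port are assumptions based on the layout described above:

```nginx
server {
    listen 3000;

    # React frontend (static build) — path is an assumption
    root /opt/llm-trainer/frontend;
    index index.html;

    location / {
        try_files $uri $uri/ /index.html;
    }

    # Proxy REST + WebSocket traffic to the FastAPI backend
    location /api/ {
        proxy_pass http://127.0.0.1:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```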
| Service | URL |
|---------|-----|
| Frontend | `http://<server-ip>:3000` |
| Backend API | `http://<server-ip>:3000/api` |
| API docs | `http://<server-ip>:8080/docs` |

**3. Configure**
Edit `/etc/llm-trainer/env` to set the Ollama URL for your GPU server:
```ini
OLLAMA_URL=http://192.168.2.47:11434
```

Then restart the service:
```bash
sudo systemctl restart llm-trainer
```

**Service management**
```bash
sudo systemctl status llm-trainer
sudo systemctl restart llm-trainer
sudo journalctl -u llm-trainer -f              # live logs
sudo tail -f /var/log/llm-trainer/backend.log
```

**Uninstall**
```bash
sudo apt remove llm-trainer   # keep config and logs
sudo apt purge llm-trainer    # remove everything including /opt/llm-trainer
```

---
### Building the .deb from source
To rebuild the package (e.g. after code changes):
```bash
# Install build dependencies
sudo apt install -y git nodejs npm python3 python3-pip python3-venv nginx

# Clone and build
git clone http://192.168.1.64:3000/tocmo0nlord/llm-trainer.git
cd llm-trainer
chmod +x packaging/build-deb.sh
./packaging/build-deb.sh
# Produces llm-trainer_1.0.0_amd64.deb in the repo root

# Install locally
sudo dpkg -i llm-trainer_1.0.0_amd64.deb
sudo apt-get install -f   # resolve any missing runtime deps

# Or upload to the Gitea registry
curl -u tocmo0nlord:<token> --upload-file llm-trainer_1.0.0_amd64.deb \
  http://192.168.1.64:3000/api/packages/tocmo0nlord/debian/pool/bookworm/main/upload
```

---
### Configuration
The pipeline reads its config from `/opt/synthetic/synthetic-data-kit/config.yaml` on the remote server. You can edit it live from the **Config Editor** tab in the UI.
## Project structure
```
├── backend/
│   ├── main.py            # FastAPI app — all REST and WebSocket endpoints
│   ├── pipeline.py        # Command builders for synthetic-data-kit stages
│   ├── ssh_client.py      # Paramiko SSH manager (connect, stream, upload, shell)
│   ├── gpu.py             # nvidia-smi GPU stats
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── App.jsx
│   │   └── components/
│   │       ├── ConnectionPanel.jsx   # SSH connect / GPU status
│   │       ├── DocumentManager.jsx   # Upload & browse pipeline files
│   │       ├── PipelineRunner.jsx    # Run ingest → create → curate → save
│   │       ├── QAPairViewer.jsx      # Preview generated QA pairs
│   │       ├── TrainingMonitor.jsx   # Launch training, live log stream
│   │       ├── ModelManager.jsx      # Pull / delete Ollama models
│   │       ├── ConfigEditor.jsx      # Edit remote config.yaml
│   │       └── Terminal.jsx          # Interactive SSH terminal (xterm.js)
│   └── Dockerfile
├── packaging/
│   └── build-deb.sh       # Build a .deb installer
└── docker-compose.yml
```

## API reference
Key endpoints (full docs at `/docs`):
| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/api/connect` | Open SSH connection |
| `GET` | `/api/status` | Connection + GPU status |
| `GET` | `/api/files/{stage}` | List files at a pipeline stage |
| `POST` | `/api/upload` | Upload a file to the `input` stage |
| `WS` | `/api/pipeline/ingest` | Stream ingest (parse) output |
| `WS` | `/api/pipeline/create` | Stream QA pair generation |
| `WS` | `/api/pipeline/curate` | Stream curation / filtering |
| `WS` | `/api/pipeline/save` | Stream export to JSONL/CSV |
| `WS` | `/api/train` | Stream fine-tuning run |
| `WS` | `/api/terminal` | Interactive SSH shell |
| `GET` | `/api/models` | List Ollama models |
| `WS` | `/api/models/pull` | Pull an Ollama model |
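
Since the pipeline stages, training, and terminal all stream over WebSockets, client code differs mostly in the URL it opens. A small illustrative helper (not part of the project's API; host and port are placeholders):

```python
# Streaming (WebSocket) endpoints from the table above.
PIPELINE_STAGES = {"ingest", "create", "curate", "save"}

def ws_url(host: str, endpoint: str, port: int = 8080) -> str:
    """Build the WebSocket URL for a streaming endpoint."""
    if endpoint in PIPELINE_STAGES:
        path = f"/api/pipeline/{endpoint}"
    elif endpoint in {"train", "terminal"}:
        path = f"/api/{endpoint}"
    elif endpoint == "models/pull":
        path = "/api/models/pull"
    else:
        raise ValueError(f"not a WebSocket endpoint: {endpoint!r}")
    return f"ws://{host}:{port}{path}"

if __name__ == "__main__":
    print(ws_url("localhost", "create"))   # ws://localhost:8080/api/pipeline/create
```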