# LLM Trainer

A web-based interface for building fine-tuning datasets and training LLMs on a remote GPU server. The frontend connects to a FastAPI backend that SSHes into your GPU machine, runs the synthetic-data-kit pipeline, and streams live output back to the browser.

## Architecture

```
Browser (React/Vite)
    │
    ▼
FastAPI Backend (Docker, port 8080)
    │  REST + WebSocket
    ▼
Remote GPU Server (SSH)
    ├── synthetic-data-kit  →  parse / generate / curate / export
    └── train.py            →  fine-tuning run
```

Ollama (port 11434 on the GPU server) is used for model management — pulling, listing, and deleting models.
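
Because this is Ollama's standard HTTP API, you can exercise it directly from the backend host to confirm connectivity (replace the IP with your GPU server's):

```bash
# List the models Ollama currently has on the GPU server
curl http://192.168.2.47:11434/api/tags

# Pull a model; "llama3" is only an example name
curl http://192.168.2.47:11434/api/pull -d '{"name": "llama3"}'
```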

Pipeline stages

Stage Directory Description
input /opt/synthetic/…/data/input Raw source documents
parsed /opt/synthetic/…/data/parsed Ingested plain text
generated /opt/synthetic/…/data/generated Raw QA pairs
curated /opt/synthetic/…/data/curated Filtered pairs (quality threshold)
final /opt/synthetic/…/data/final Export-ready JSONL/CSV
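
On the GPU server these stages map to the UI's ingest → create → curate → save actions, which wrap synthetic-data-kit subcommands roughly as follows (a sketch; file names are placeholders, and exact subcommand names and flags depend on your synthetic-data-kit version):

```bash
# Placeholder file names; run inside the synthetic-data conda env
synthetic-data-kit ingest data/input/doc.pdf                  # input  -> parsed
synthetic-data-kit create data/parsed/doc.txt --type qa      # parsed -> generated
synthetic-data-kit curate data/generated/doc_qa_pairs.json   # generated -> curated
synthetic-data-kit save-as data/curated/doc_cleaned.json -f jsonl  # curated -> final
```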

## Getting started

### Prerequisites

- A remote machine with:
  - SSH access
  - miniconda3 with a `synthetic-data` conda env containing synthetic-data-kit
  - `train.py` at `/opt/synthetic/train.py`
  - Ollama running on port 11434
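
To sanity-check the remote machine before connecting, something along these lines works (a sketch; `user@gpu-server` is a placeholder and the conda path assumes a default miniconda3 install):

```bash
ssh user@gpu-server '
  ~/miniconda3/bin/conda run -n synthetic-data synthetic-data-kit --help \
    > /dev/null && echo "synthetic-data-kit: OK"
  test -f /opt/synthetic/train.py && echo "train.py: OK"
  curl -s http://localhost:11434/api/tags > /dev/null && echo "Ollama: OK"'
```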

### Option A — Docker (quickest)

Additional requirements: Docker + Docker Compose

```bash
docker compose up --build
```

| Service | URL |
|---|---|
| Frontend | http://localhost:3000 |
| Backend API | http://localhost:8080 |
| API docs | http://localhost:8080/docs |

The `OLLAMA_URL` environment variable in `docker-compose.yml` defaults to `http://192.168.2.47:11434`; update it to point to your GPU server.
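
If you'd rather not edit the tracked file, a Compose override works too (a sketch; `backend` is an assumed service name, so match it to the one used in `docker-compose.yml`):

```yaml
# docker-compose.override.yml -- merged automatically by `docker compose up`
services:
  backend:            # assumed service name; check docker-compose.yml
    environment:
      OLLAMA_URL: http://<your-gpu-server>:11434
```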


### Option B — Install as a system package (Debian / Ubuntu)

The package is published to the local Gitea registry and installs the FastAPI backend as a systemd service with nginx serving the frontend on port 3000.

**1. Add the apt source** (skip this step if the Gitea registry is already in your sources from another package, e.g. `dupfinder.list`)

```bash
echo "deb [trusted=yes] http://192.168.1.64:3000/api/packages/tocmo0nlord/debian bookworm main" \
  | sudo tee /etc/apt/sources.list.d/llm-trainer.list
sudo apt update
```

**2. Install**

```bash
sudo apt install llm-trainer
```

The installer will automatically:

- Create an `llm-trainer` system user
- Install the FastAPI backend under `/opt/llm-trainer/` with its own Python venv
- Enable and start the `llm-trainer` systemd service
- Configure nginx to serve the React frontend on port 3000 and proxy `/api/` to the backend

| Service | URL |
|---|---|
| Frontend | `http://<server-ip>:3000` |
| Backend API | `http://<server-ip>:3000/api` |
| API docs | `http://<server-ip>:8080/docs` |

**3. Configure**

Edit `/etc/llm-trainer/env` to set the Ollama URL for your GPU server:

```
OLLAMA_URL=http://192.168.2.47:11434
```

Then restart the service:

```bash
sudo systemctl restart llm-trainer
```
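
Once it restarts, a quick way to confirm the backend is reachable through the nginx proxy is the status endpoint (see the API reference below):

```bash
curl http://localhost:3000/api/status   # connection + GPU status as JSON
```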

### Service management

```bash
sudo systemctl status llm-trainer
sudo systemctl restart llm-trainer
sudo journalctl -u llm-trainer -f          # live logs
sudo tail -f /var/log/llm-trainer/backend.log
```

### Uninstall

```bash
sudo apt remove llm-trainer      # keep config and logs
sudo apt purge llm-trainer       # remove everything including /opt/llm-trainer
```

### Building the .deb from source

To rebuild the package (e.g. after code changes):

```bash
# Install build dependencies
sudo apt install -y git nodejs npm python3 python3-pip python3-venv nginx

# Clone and build
git clone http://192.168.1.64:3000/tocmo0nlord/llm-trainer.git
cd llm-trainer
chmod +x packaging/build-deb.sh
./packaging/build-deb.sh
# Produces llm-trainer_1.0.0_amd64.deb in the repo root

# Install locally
sudo dpkg -i llm-trainer_1.0.0_amd64.deb
sudo apt-get install -f   # resolve any missing runtime deps

# Or upload to the Gitea registry
curl -u tocmo0nlord:<token> --upload-file llm-trainer_1.0.0_amd64.deb \
  http://192.168.1.64:3000/api/packages/tocmo0nlord/debian/pool/bookworm/main/upload
```

## Configuration

The pipeline reads its config from `/opt/synthetic/synthetic-data-kit/config.yaml` on the remote server. You can edit it live from the Config Editor tab in the UI.
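
The same file can also be inspected or edited straight over SSH (`user@gpu-server` is a placeholder):

```bash
ssh user@gpu-server cat /opt/synthetic/synthetic-data-kit/config.yaml
```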

## Project structure

```
├── backend/
│   ├── main.py          # FastAPI app — all REST and WebSocket endpoints
│   ├── pipeline.py      # Command builders for synthetic-data-kit stages
│   ├── ssh_client.py    # Paramiko SSH manager (connect, stream, upload, shell)
│   ├── gpu.py           # nvidia-smi GPU stats
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── App.jsx
│   │   └── components/
│   │       ├── ConnectionPanel.jsx   # SSH connect / GPU status
│   │       ├── DocumentManager.jsx   # Upload & browse pipeline files
│   │       ├── PipelineRunner.jsx    # Run ingest → create → curate → save
│   │       ├── QAPairViewer.jsx      # Preview generated QA pairs
│   │       ├── TrainingMonitor.jsx   # Launch training, live log stream
│   │       ├── ModelManager.jsx      # Pull / delete Ollama models
│   │       ├── ConfigEditor.jsx      # Edit remote config.yaml
│   │       └── Terminal.jsx          # Interactive SSH terminal (xterm.js)
│   └── Dockerfile
├── packaging/
│   └── build-deb.sh     # Build a .deb installer
└── docker-compose.yml
```

## API reference

Key endpoints (full docs at `/docs`):

| Method | Path | Description |
|---|---|---|
| POST | /api/connect | Open SSH connection |
| GET | /api/status | Connection + GPU status |
| GET | /api/files/{stage} | List files at a pipeline stage |
| POST | /api/upload | Upload a file to the input stage |
| WS | /api/pipeline/ingest | Stream ingest (parse) output |
| WS | /api/pipeline/create | Stream QA pair generation |
| WS | /api/pipeline/curate | Stream curation / filtering |
| WS | /api/pipeline/save | Stream export to JSONL/CSV |
| WS | /api/train | Stream fine-tuning run |
| WS | /api/terminal | Interactive SSH shell |
| GET | /api/models | List Ollama models |
| WS | /api/models/pull | Pull an Ollama model |
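
As a concrete example, opening a connection and polling status from the shell looks roughly like this (the request fields are assumptions; check the live schema at `/docs`):

```bash
# Hypothetical request body; see /docs for the actual schema
curl -X POST http://localhost:8080/api/connect \
  -H 'Content-Type: application/json' \
  -d '{"host": "192.168.2.47", "username": "user", "password": "secret"}'

curl http://localhost:8080/api/status
```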