LLM Trainer

A web-based interface for building fine-tuning datasets and training LLMs on a remote GPU server. The frontend connects to a FastAPI backend that SSHes into your GPU machine, runs the synthetic-data-kit pipeline, and streams live output back to the browser.

Architecture

Browser (React/Vite)
    │
    ▼
FastAPI Backend (Docker, port 8080)
    │  REST + WebSocket
    ▼
Remote GPU Server (SSH)
    ├── synthetic-data-kit  →  parse / generate / curate / export
    └── train.py            →  fine-tuning run

Ollama (port 11434 on the GPU server) is used for model management — pulling, listing, and deleting models.
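
For quick checks you can also talk to Ollama's HTTP API directly, bypassing the backend. A minimal Python sketch (the host is an example; adjust to your GPU server):

# List the models available on the GPU server's Ollama instance.
import requests

OLLAMA_URL = "http://192.168.2.47:11434"  # example host; point at your GPU server

resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"], model.get("size"))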

Pipeline stages

Stage      Directory                          Description
input      /opt/synthetic/…/data/input        Raw source documents
parsed     /opt/synthetic/…/data/parsed       Ingested plain text
generated  /opt/synthetic/…/data/generated    Raw QA pairs
curated    /opt/synthetic/…/data/curated      Filtered pairs (quality threshold)
final      /opt/synthetic/…/data/final        Export-ready JSONL/CSV

Getting started

Prerequisites

  • A remote machine with:
    • SSH access
    • miniconda3 with a synthetic-data conda env containing synthetic-data-kit
    • train.py at /opt/synthetic/train.py
    • Ollama running on port 11434
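
You can sanity-check these prerequisites from your workstation before connecting through the UI. A hedged Paramiko sketch (host, username, and the miniconda path are placeholders; adjust to your setup):

import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("192.168.2.47", username="ubuntu")  # placeholder host/user

checks = {
    "train.py present":  "test -f /opt/synthetic/train.py",
    "conda env present": "$HOME/miniconda3/bin/conda env list | grep -q synthetic-data",
    "Ollama reachable":  "curl -sf http://localhost:11434/api/version > /dev/null",
}
for name, cmd in checks.items():
    _, stdout, _ = client.exec_command(cmd)
    ok = stdout.channel.recv_exit_status() == 0
    print(f"{name}: {'ok' if ok else 'MISSING'}")
client.close()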

Option A — Docker (quickest)

Additional requirements: Docker + Docker Compose

docker compose up --build

Service      URL
Frontend     http://localhost:3000
Backend API  http://localhost:8080
API docs     http://localhost:8080/docs

The OLLAMA_URL environment variable in docker-compose.yml defaults to http://192.168.2.47:11434 — update it to point to your GPU server.


Option B — Install as a system package (Debian / Ubuntu)

The package is published to the local Gitea registry and installs the FastAPI backend as a systemd service with nginx serving the frontend on port 3000.

1. Add the apt source

echo "deb [trusted=yes] http://192.168.1.64:3000/api/packages/tocmo0nlord/debian bookworm main" \
  | sudo tee /etc/apt/sources.list.d/llm-trainer.list
sudo apt update

2. Install

sudo apt install llm-trainer

The installer will automatically:

  • Create a llm-trainer system user
  • Install the FastAPI backend under /opt/llm-trainer/ with its own Python venv
  • Enable and start the llm-trainer systemd service
  • Configure nginx to serve the React frontend on port 3000 and proxy /api/ to the backend

Service      URL
Frontend     http://<server-ip>:3000
Backend API  http://<server-ip>:3000/api
API docs     http://<server-ip>:8080/docs

3. Configure

Edit /etc/llm-trainer/env to set the Ollama URL for your GPU server:

OLLAMA_URL=http://192.168.2.47:11434

Then restart the service:

sudo systemctl restart llm-trainer

Service management

sudo systemctl status llm-trainer
sudo systemctl restart llm-trainer
sudo journalctl -u llm-trainer -f          # live logs
sudo tail -f /var/log/llm-trainer/backend.log

Uninstall

sudo apt remove llm-trainer      # keep config and logs
sudo apt purge llm-trainer       # remove everything including /opt/llm-trainer

Building the .deb from source

To rebuild the package (e.g. after code changes):

# Install build dependencies
sudo apt install -y git nodejs npm python3 python3-pip python3-venv nginx

# Clone and build
git clone http://192.168.1.64:3000/tocmo0nlord/llm-trainer.git
cd llm-trainer
chmod +x packaging/build-deb.sh
./packaging/build-deb.sh
# Produces llm-trainer_1.0.0_amd64.deb in the repo root

# Install locally
sudo dpkg -i llm-trainer_1.0.0_amd64.deb
sudo apt-get install -f   # resolve any missing runtime deps

# Or upload to the Gitea registry
curl -u tocmo0nlord:<token> --upload-file llm-trainer_1.0.0_amd64.deb \
  http://192.168.1.64:3000/api/packages/tocmo0nlord/debian/pool/bookworm/main/upload

Configuration

The pipeline reads its config from /opt/synthetic/synthetic-data-kit/config.yaml on the remote server. You can edit it live from the Config Editor tab in the UI.
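
If you'd rather inspect it outside the UI, the same file can be read over SFTP. A minimal Paramiko sketch (host and username are placeholders; the config path is the one given above):

import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("192.168.2.47", username="ubuntu")  # placeholder credentials

sftp = client.open_sftp()
with sftp.open("/opt/synthetic/synthetic-data-kit/config.yaml") as f:
    print(f.read().decode())
sftp.close()
client.close()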

Project structure

├── backend/
│   ├── main.py          # FastAPI app — all REST and WebSocket endpoints
│   ├── pipeline.py      # Command builders for synthetic-data-kit stages
│   ├── ssh_client.py    # Paramiko SSH manager (connect, stream, upload, shell)
│   ├── gpu.py           # nvidia-smi GPU stats
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── App.jsx
│   │   └── components/
│   │       ├── ConnectionPanel.jsx   # SSH connect / GPU status
│   │       ├── DocumentManager.jsx   # Upload & browse pipeline files
│   │       ├── PipelineRunner.jsx    # Run ingest → create → curate → save
│   │       ├── QAPairViewer.jsx      # Preview generated QA pairs
│   │       ├── TrainingMonitor.jsx   # Launch training, live log stream
│   │       ├── ModelManager.jsx      # Pull / delete Ollama models
│   │       ├── ConfigEditor.jsx      # Edit remote config.yaml
│   │       └── Terminal.jsx          # Interactive SSH terminal (xterm.js)
│   └── Dockerfile
├── packaging/
│   └── build-deb.sh     # Build a .deb installer
└── docker-compose.yml

API reference

Key endpoints (full docs at /docs):

Method  Path                  Description
POST    /api/connect          Open SSH connection
GET     /api/status           Connection + GPU status
GET     /api/files/{stage}    List files at a pipeline stage
POST    /api/upload           Upload a file to the input stage
WS      /api/pipeline/ingest  Stream ingest (parse) output
WS      /api/pipeline/create  Stream QA pair generation
WS      /api/pipeline/curate  Stream curation / filtering
WS      /api/pipeline/save    Stream export to JSONL/CSV
WS      /api/train            Stream fine-tuning run
WS      /api/terminal         Interactive SSH shell
GET     /api/models           List Ollama models
WS      /api/models/pull      Pull an Ollama model
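
As a rough illustration of driving the backend without the UI, here is a hedged client sketch. The JSON fields for /api/connect are assumptions, not documented above; check /docs for the real schema.

# Hypothetical client: payload fields are guesses; verify against /docs.
import asyncio
import requests
import websockets

BASE = "http://localhost:8080"

# Open the SSH connection to the GPU server (field names are assumptions).
requests.post(f"{BASE}/api/connect", json={
    "host": "192.168.2.47",
    "username": "ubuntu",
    "password": "...",
}, timeout=10).raise_for_status()

async def stream(path: str) -> None:
    # Print live output from one of the WebSocket endpoints as it arrives.
    async with websockets.connect(f"ws://localhost:8080{path}") as ws:
        async for message in ws:
            print(message)

asyncio.run(stream("/api/pipeline/ingest"))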