# LLM Trainer

A web-based interface for building fine-tuning datasets and training LLMs on a remote GPU server. The frontend connects to a FastAPI backend that SSHes into your GPU machine, runs the synthetic-data-kit pipeline, and streams live output back to the browser.

## Architecture

```
Browser (React/Vite)
    │
    ▼
FastAPI Backend (Docker, port 8080)
    │  REST + WebSocket
    ▼
Remote GPU Server (SSH)
    ├── synthetic-data-kit  →  parse / generate / curate / export
    └── train.py            →  fine-tuning run
```

Ollama (port 11434 on the GPU server) is used for model management — pulling, listing, and deleting models.
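
Because this is Ollama's standard HTTP API, you can exercise it directly from the backend host to confirm connectivity (replace the IP with your GPU server's):

```bash
# List the models Ollama currently has on the GPU server
curl http://192.168.2.47:11434/api/tags

# Pull a model; "llama3" is only an example name
curl http://192.168.2.47:11434/api/pull -d '{"name": "llama3"}'
```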

Pipeline stages

Stage Directory Description
input /opt/synthetic/…/data/input Raw source documents
parsed /opt/synthetic/…/data/parsed Ingested plain text
generated /opt/synthetic/…/data/generated Raw QA pairs
curated /opt/synthetic/…/data/curated Filtered pairs (quality threshold)
final /opt/synthetic/…/data/final Export-ready JSONL/CSV
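
On the GPU server these stages map to the UI's ingest → create → curate → save actions, which wrap synthetic-data-kit subcommands roughly as follows (a sketch; file names are placeholders, and exact subcommand names and flags depend on your synthetic-data-kit version):

```bash
# Placeholder file names; run inside the synthetic-data conda env
synthetic-data-kit ingest data/input/doc.pdf                  # input  -> parsed
synthetic-data-kit create data/parsed/doc.txt --type qa      # parsed -> generated
synthetic-data-kit curate data/generated/doc_qa_pairs.json   # generated -> curated
synthetic-data-kit save-as data/curated/doc_cleaned.json -f jsonl  # curated -> final
```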

## Getting started

### Prerequisites

- A remote machine with:
  - SSH access
  - miniconda3 with a `synthetic-data` conda env containing synthetic-data-kit
  - `train.py` at `/opt/synthetic/train.py`
  - Ollama running on port 11434
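
To sanity-check the remote machine before connecting, something along these lines works (a sketch; `user@gpu-server` is a placeholder and the conda path assumes a default miniconda3 install):

```bash
ssh user@gpu-server '
  ~/miniconda3/bin/conda run -n synthetic-data synthetic-data-kit --help \
    > /dev/null && echo "synthetic-data-kit: OK"
  test -f /opt/synthetic/train.py && echo "train.py: OK"
  curl -s http://localhost:11434/api/tags > /dev/null && echo "Ollama: OK"'
```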

### Option A — Docker (quickest)

Additional requirements: Docker + Docker Compose

```bash
docker compose up --build
```

| Service | URL |
|---|---|
| Frontend | http://localhost:3000 |
| Backend API | http://localhost:8080 |
| API docs | http://localhost:8080/docs |

The `OLLAMA_URL` environment variable in `docker-compose.yml` defaults to `http://192.168.2.47:11434`; update it to point to your GPU server.
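
If you'd rather not edit the tracked file, a Compose override works too (a sketch; `backend` is an assumed service name, so match it to the one used in `docker-compose.yml`):

```yaml
# docker-compose.override.yml -- merged automatically by `docker compose up`
services:
  backend:            # assumed service name; check docker-compose.yml
    environment:
      OLLAMA_URL: http://<your-gpu-server>:11434
```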


### Option B — Install as a system package (Debian / Ubuntu)

The package is published to the local Gitea registry and installs the FastAPI backend as a systemd service with nginx serving the frontend on port 3000.

**1. Add the apt source** (skip this step if the Gitea registry is already in your sources from another package, e.g. `dupfinder.list`)

```bash
echo "deb [trusted=yes] http://192.168.1.64:3000/api/packages/tocmo0nlord/debian bookworm main" \
  | sudo tee /etc/apt/sources.list.d/llm-trainer.list
sudo apt update
```

**2. Install**

```bash
sudo apt install llm-trainer
```

The installer will automatically:

- Create an `llm-trainer` system user
- Install the FastAPI backend under `/opt/llm-trainer/` with its own Python venv
- Enable and start the `llm-trainer` systemd service
- Configure nginx to serve the React frontend on port 3000 and proxy `/api/` to the backend

| Service | URL |
|---|---|
| Frontend | `http://<server-ip>:3000` |
| Backend API | `http://<server-ip>:3000/api` |
| API docs | `http://<server-ip>:8080/docs` |

**3. Configure**

Edit `/etc/llm-trainer/env` to set the Ollama URL for your GPU server:

```
OLLAMA_URL=http://192.168.2.47:11434
```

Then restart the service:

```bash
sudo systemctl restart llm-trainer
```
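
Once it restarts, a quick way to confirm the backend is reachable through the nginx proxy is the status endpoint (see the API reference below):

```bash
curl http://localhost:3000/api/status   # connection + GPU status as JSON
```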

### Service management

```bash
sudo systemctl status llm-trainer
sudo systemctl restart llm-trainer
sudo journalctl -u llm-trainer -f          # live logs
sudo tail -f /var/log/llm-trainer/backend.log
```

### Uninstall

```bash
sudo apt remove llm-trainer      # keep config and logs
sudo apt purge llm-trainer       # remove everything including /opt/llm-trainer
```

### Building the .deb from source

To rebuild the package (e.g. after code changes):

```bash
# Install build dependencies
sudo apt install -y git nodejs npm python3 python3-pip python3-venv nginx

# Clone and build
git clone http://192.168.1.64:3000/tocmo0nlord/llm-trainer.git
cd llm-trainer
chmod +x packaging/build-deb.sh
./packaging/build-deb.sh
# Produces llm-trainer_1.0.0_amd64.deb in the repo root

# Install locally
sudo dpkg -i llm-trainer_1.0.0_amd64.deb
sudo apt-get install -f   # resolve any missing runtime deps

# Or upload to the Gitea registry
curl -u tocmo0nlord:<token> --upload-file llm-trainer_1.0.0_amd64.deb \
  http://192.168.1.64:3000/api/packages/tocmo0nlord/debian/pool/bookworm/main/upload
```

## Configuration

The pipeline reads its config from `/opt/synthetic/synthetic-data-kit/config.yaml` on the remote server. You can edit it live from the Config Editor tab in the UI.
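
The same file can also be inspected or edited straight over SSH (`user@gpu-server` is a placeholder):

```bash
ssh user@gpu-server cat /opt/synthetic/synthetic-data-kit/config.yaml
```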

## Project structure

```
├── backend/
│   ├── main.py          # FastAPI app — all REST and WebSocket endpoints
│   ├── pipeline.py      # Command builders for synthetic-data-kit stages
│   ├── ssh_client.py    # Paramiko SSH manager (connect, stream, upload, shell)
│   ├── gpu.py           # nvidia-smi GPU stats
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── App.jsx
│   │   └── components/
│   │       ├── ConnectionPanel.jsx   # SSH connect / GPU status
│   │       ├── DocumentManager.jsx   # Upload & browse pipeline files
│   │       ├── PipelineRunner.jsx    # Run ingest → create → curate → save
│   │       ├── QAPairViewer.jsx      # Preview generated QA pairs
│   │       ├── TrainingMonitor.jsx   # Launch training, live log stream
│   │       ├── ModelManager.jsx      # Pull / delete Ollama models
│   │       ├── ConfigEditor.jsx      # Edit remote config.yaml
│   │       └── Terminal.jsx          # Interactive SSH terminal (xterm.js)
│   └── Dockerfile
├── packaging/
│   └── build-deb.sh     # Build a .deb installer
└── docker-compose.yml
```

## API reference

Key endpoints (full docs at `/docs`):

| Method | Path | Description |
|---|---|---|
| POST | /api/connect | Open SSH connection |
| GET | /api/status | Connection + GPU status |
| GET | /api/files/{stage} | List files at a pipeline stage |
| POST | /api/upload | Upload a file to the input stage |
| WS | /api/pipeline/ingest | Stream ingest (parse) output |
| WS | /api/pipeline/create | Stream QA pair generation |
| WS | /api/pipeline/curate | Stream curation / filtering |
| WS | /api/pipeline/save | Stream export to JSONL/CSV |
| WS | /api/train | Stream fine-tuning run |
| WS | /api/terminal | Interactive SSH shell |
| GET | /api/models | List Ollama models |
| WS | /api/models/pull | Pull an Ollama model |
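
As a concrete example, opening a connection and polling status from the shell looks roughly like this (the request fields are assumptions; check the live schema at `/docs`):

```bash
# Hypothetical request body; see /docs for the actual schema
curl -X POST http://localhost:8080/api/connect \
  -H 'Content-Type: application/json' \
  -d '{"host": "192.168.2.47", "username": "user", "password": "secret"}'

curl http://localhost:8080/api/status
```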