Rewrite README install instructions for end users
Lay out the three install paths (Windows installer, .deb package, manual docker compose) with concrete numbered steps and a 'pick your method' table at the top so users don't have to read past their own platform. Add a using-it walkthrough, a scan-mode explanation, and a short troubleshooting section.
157
README.md
@@ -1,56 +1,157 @@
# Duplicate Finder

A self-hosted Docker web app that scans a photo/video library, detects duplicates using four methods, and lets you review them in a gallery UI. **No files are ever moved, renamed, or deleted** — all decisions are recorded in SQLite only.
Self-hosted web app that scans your photo and video library, finds duplicates four different ways, and lets you review them in a browser. **It never moves, renames, or deletes anything** — every decision is recorded in a SQLite database. A separate tool (coming later) will act on those decisions.

## Quick start
> Once installed, open **http://localhost:8765** in any browser to use it.

---

## Pick your install method

| You have… | Use this |
|---|---|
| **Windows 10/11** | [Windows installer](#windows-1011) (one PowerShell command) |
| **Debian / Ubuntu / Proxmox LXC** | [.deb package](#debian--ubuntu--proxmox) (`apt install`) |
| **Anything else with Docker** | [Docker Compose](#manual-docker-compose) (manual) |

All three installs end up running the same Docker container.

---

### Windows 10/11

**What you need:** Docker Desktop (the installer will check for it and offer to download).

1. Download the latest release zip from the Gitea **Releases** page and extract it anywhere.
2. Right-click `installer\install.ps1` → **Run with PowerShell** (or open an elevated PowerShell and run it).
3. When prompted, type the path to your photos folder (e.g. `D:\Photos`) and a folder for the database (default is fine).
4. The installer starts the container and puts a **DupFinder** shortcut on your desktop.

**Day-to-day use:** double-click the desktop shortcut, or browse to http://localhost:8765.

**Uninstall:** run `installer\uninstall.ps1` as administrator.

---

### Debian / Ubuntu / Proxmox

**What you need:** Docker Engine. If you don't have it: `curl -fsSL https://get.docker.com | sh`.

```bash
# 1. Edit docker-compose.yml — set your photos volume path
# 2. Build and run
docker compose up -d --build
# 3. Open http://localhost:8765
# 4. Enter folder path in UI and click Scan
# 1. Install the package
wget http://192.168.1.64:3000/tocmo0nlord/-/packages/debian/dupfinder/1.0.0/files/amd64/dupfinder_1.0.0_amd64.deb
sudo apt install ./dupfinder_1.0.0_amd64.deb

# 2. Run first-time setup (asks for photos path + data path)
sudo dupfinder setup

# 3. Start it
sudo dupfinder start
```

## Volume mounts

| Container path | Purpose |
|---|---|
| `/photos` | Your photo library, mounted **read-only** |
| `/data` | SQLite database persistence |

Edit `docker-compose.yml` to point these at your NAS paths.

**Manage the service:**

| Command | What it does |
|---|---|
| `sudo dupfinder start` | Start the container |
| `sudo dupfinder stop` | Stop the container |
| `sudo dupfinder restart` | Restart |
| `sudo dupfinder status` | Show systemd status |
| `sudo dupfinder logs` | Tail the logs |
| `dupfinder open` | Open in your default browser |

The service auto-starts on boot via systemd (`dupfinder.service`).

**Uninstall:** `sudo apt remove dupfinder` (your photos and database are left untouched).

## Detection methods

| Method | Color | Description |

---

### Manual Docker Compose

For NAS appliances (Synology, Unraid, TrueNAS), Mac, or any host where you'd rather wire it up yourself.

1. Clone the repo:
   ```bash
   git clone http://192.168.1.64:3000/tocmo0nlord/duplicate-finder.git
   cd duplicate-finder
   ```
2. Open `docker-compose.yml` and change the two volume paths under `dup-finder:`:
   ```yaml
   volumes:
     - /your/photos/path:/photos:ro   # ← your photo library (read-only)
     - /your/data/path:/data          # ← where the SQLite DB lives
   ```
3. Build and start:
   ```bash
   docker compose up -d --build
   ```
4. Open http://localhost:8765.

To stop: `docker compose down`. To update later: `git pull && docker compose up -d --build`.

> **GPU acceleration (optional):** the compose file requests an NVIDIA GPU for faster perceptual hashing. If you don't have one, delete the `deploy.resources.reservations.devices` block — the app falls back to CPU automatically.

---

## Using it

1. Open http://localhost:8765.
2. Click **Browse** and pick the folder you want to scan (it's relative to the container — usually just `/photos`).
3. Pick a scan mode (see below) and click **Scan**.
4. When it finishes, review the duplicate groups. Each group shows the suggested keeper highlighted; click any other photo to pick it instead, or **Keep all** to skip the group.
5. When you're done, click **Download CSV** to export all decisions.

### Scan modes

| Mode | When to use |
|---|---|
| **Incremental** *(default)* | Day-to-day rescans. Re-hashes only changed/new files. Past review decisions are preserved. |
| **New files only** | Fastest option. Indexes only files added since the last scan. |
| **Rebuild groups** | Re-runs duplicate detection on the existing index without re-hashing. |
| **Full reset** | Wipes the entire index and starts from scratch. |
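
To make the difference between a full rescan and an incremental one concrete, here is a minimal Python sketch of how "changed/new" files could be picked out. It assumes a file counts as changed when its size or modification time differs from what the index recorded; `FileStamp` and `files_to_rehash` are hypothetical names for illustration, not the app's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStamp:
    """Hypothetical change signature: size in bytes plus modification time."""
    size: int
    mtime: float


def files_to_rehash(index: dict[str, FileStamp], on_disk: dict[str, FileStamp]) -> list[str]:
    """Return paths that are new, or whose size/mtime differ from the index."""
    return sorted(
        path for path, stamp in on_disk.items()
        if index.get(path) != stamp
    )


# What the previous scan recorded.
index = {
    "/photos/a.jpg": FileStamp(1000, 1.0),
    "/photos/b.jpg": FileStamp(2000, 2.0),
}
# What the scanner sees on disk now.
on_disk = {
    "/photos/a.jpg": FileStamp(1000, 1.0),  # unchanged, skipped
    "/photos/b.jpg": FileStamp(2048, 3.0),  # edited, re-hashed
    "/photos/c.jpg": FileStamp(512, 4.0),   # new, hashed for the first time
}
print(files_to_rehash(index, on_disk))
```

Under this assumption, a **Full reset** is the same scan run against an empty index, so every file is hashed again.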

### Detection methods

| Method | UI color | What it catches |
|---|---|---|
| SHA-256 | Blue | Byte-identical files |
| Perceptual hash | Purple | Visually similar photos (hamming ≤ 10) |
| EXIF timestamp + device | Amber | Same camera, same moment |
| File size + dimensions | Gray | Same size and resolution (low confidence) |
| **SHA-256** | Blue | Byte-identical files |
| **Perceptual hash** | Purple | Visually similar photos (hamming ≤ 10) |
| **EXIF timestamp + device** | Amber | Same camera, same moment |
| **File size + dimensions** | Gray | Same size and resolution (low confidence) |
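
The first two rows of the table can be illustrated with a short Python sketch. It uses only the standard library and assumes perceptual hashes are handled as plain integers; the helper names are hypothetical, and the only value taken from this README is the Hamming threshold of 10.

```python
import hashlib
from collections import defaultdict

PHASH_THRESHOLD = 10  # "hamming <= 10" from the table


def exact_groups(files: dict[str, bytes]) -> list[list[str]]:
    """Group names whose contents share a SHA-256 digest (byte-identical)."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def visually_similar(a: int, b: int) -> bool:
    """Treat two perceptual hashes as a match when few bits differ."""
    return hamming(a, b) <= PHASH_THRESHOLD


print(exact_groups({"a.jpg": b"same", "copy.jpg": b"same", "b.jpg": b"other"}))
print(hamming(0b1011, 0b0001))
```

A SHA-256 match means the bytes are identical, while a perceptual-hash match only means the images look alike, which is why those groups are worth reviewing by eye.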

## Scan modes

| Mode | Description |
|---|---|
| Incremental | Only re-hashes changed/new files. Prior decisions preserved. |
| New files only | Indexes newly added files. Existing decisions untouched. |
| Rebuild groups | Re-runs detection on existing index. No re-hashing. |
| Full reset | Wipes everything and scans from scratch. |

### Google Takeout

Point it at a Google Photos Takeout export and it auto-detects the structure, reads the `.json` sidecars, and restores the correct capture timestamps and original filenames. Takeout files get a flag in the UI.
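
As a rough illustration of what reading a sidecar involves, the Python sketch below parses one and returns the original filename and capture time. The `title` and `photoTakenTime.timestamp` fields are assumed here because Takeout sidecars commonly carry them; which fields the scanner actually reads is not stated in this README, and `read_sidecar` is a hypothetical helper.

```python
import json
from datetime import datetime, timezone


def read_sidecar(text: str) -> tuple[str, datetime]:
    """Return (original filename, capture time in UTC) from sidecar JSON text."""
    meta = json.loads(text)
    # Assumed layout: the timestamp is a string of Unix epoch seconds.
    taken = int(meta["photoTakenTime"]["timestamp"])
    return meta["title"], datetime.fromtimestamp(taken, tz=timezone.utc)


sample = '{"title": "IMG_0001.jpg", "photoTakenTime": {"timestamp": "1577836800"}}'
name, taken_at = read_sidecar(sample)
print(name, taken_at.isoformat())
```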

## Google Takeout

The scanner automatically detects Google Takeout folder structures and reads `.json` sidecar files to restore correct capture timestamps and original filenames. Takeout files are flagged in the UI.

---

## Troubleshooting

**The page won't load at http://localhost:8765**
Check the container is up: `docker ps | grep dup-finder`. If not, see the logs: `docker compose logs dup-finder` (or `sudo dupfinder logs` on Debian).
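
If Docker reports the container as running but the page still won't load, a quick way to tell whether anything is listening on the port is a plain TCP connection test. This is a minimal sketch using Python's standard library; `port_is_open` is a hypothetical helper, and 8765 is the port used throughout this README.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8765, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    print("reachable" if port_is_open() else "nothing is listening on port 8765")
```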

**"Permission denied" reading photos**
The `/photos` mount is read-only by design, but the container still needs read access. Make sure your user (or the docker daemon) can read the folder you mounted.
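
To find out which files are the problem, something like the following Python sketch can be run on the host against the folder you mounted. It only reports what the current user cannot read, so run it as the account that needs access; `unreadable_paths` is a hypothetical helper, not part of the app.

```python
import os


def unreadable_paths(root: str) -> list[str]:
    """List files and folders under root that the current user cannot read."""
    blocked: list[str] = []
    # Folders that cannot be listed are reported through the onerror callback.
    for folder, _dirs, files in os.walk(root, onerror=lambda err: blocked.append(err.filename)):
        for name in files:
            path = os.path.join(folder, name)
            if not os.access(path, os.R_OK):
                blocked.append(path)
    return sorted(blocked)
```

An empty list means every file under the folder is readable by that user.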

**Scan is stuck on "phash"**
Perceptual hashing is the slowest phase — large libraries (>50k photos) on CPU can take hours. Add an NVIDIA GPU and the `deploy.resources` block in compose to get a 10-50× speedup.

**I marked the wrong file as keeper**
Open the group again and click **Unreview**, then re-decide.

---

## What "redundant" means

Marking a file redundant **only writes to the database**. Nothing is moved, renamed, or deleted. This tool produces a decision record only. A separate tool handles file actions.

When you mark a file redundant, **only the database is updated**. Nothing on disk changes. This tool produces a decision record. A future companion tool will use that record to actually move or delete files.
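
A small Python sketch of the idea: recording a decision is a row written to SQLite, and the file itself is never opened. The `decisions` table and its columns are hypothetical and only illustrate the principle; the real schema is not described in this README.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the app would use a file under /data
conn.execute("CREATE TABLE decisions (path TEXT PRIMARY KEY, decision TEXT NOT NULL)")


def mark(conn: sqlite3.Connection, path: str, decision: str) -> None:
    """Record or overwrite the decision for one file; the file is not touched."""
    conn.execute(
        "INSERT OR REPLACE INTO decisions (path, decision) VALUES (?, ?)",
        (path, decision),
    )
    conn.commit()


mark(conn, "/photos/2020/IMG_0001 (1).jpg", "redundant")
mark(conn, "/photos/2020/IMG_0001.jpg", "keep")
rows = conn.execute("SELECT path, decision FROM decisions ORDER BY path").fetchall()
print(rows)
```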

---

## Tech stack

- Python 3.12, FastAPI, Uvicorn
- SQLite (stdlib `sqlite3`)
- Pillow, imagehash, pillow-heif
- PyTorch + CUDA for batched perceptual hashing
- Vanilla JS single-page frontend
- Docker / docker-compose