Duplicate Finder
Self-hosted web app that scans your photo and video library, finds duplicates four different ways, and lets you review them in a browser. It never moves, renames, or deletes anything — every decision is recorded in a SQLite database. A separate tool (coming later) will act on those decisions.
Once installed, open http://localhost:8765 in any browser to use it.
Pick your install method
| You have… | Use this |
|---|---|
| Windows 10/11 | Windows installer (one PowerShell command) |
| Debian / Ubuntu / Proxmox LXC | .deb package (apt install) |
| Anything else with Docker | Docker Compose (manual) |
All three installs end up running the same Docker container.
Windows 10/11
What you need: Docker Desktop (the installer will check for it and offer to download).
- Download the latest release zip from the Gitea Releases page and extract it anywhere.
- Right-click installer\install.ps1 → Run with PowerShell (or open an elevated PowerShell and run it).
- When prompted, type the path to your photos folder (e.g. D:\Photos) and a folder for the database (default is fine).
- The installer starts the container and puts a DupFinder shortcut on your desktop.
Day-to-day use: double-click the desktop shortcut, or browse to http://localhost:8765.
Uninstall: run installer\uninstall.ps1 as administrator.
Debian / Ubuntu / Proxmox
What you need: Docker Engine. If you don't have it: curl -fsSL https://get.docker.com | sh.
# 1. Add the Gitea apt repo
echo "deb [trusted=yes] http://192.168.1.64:3000/api/packages/tocmo0nlord/debian bookworm main" \
| sudo tee /etc/apt/sources.list.d/dupfinder.list
# 2. Install
sudo apt update
sudo apt install dupfinder
# 3. Run first-time setup (asks for photos path + data path)
sudo dupfinder setup
# 4. Start it
sudo dupfinder start
The repo line names bookworm (Debian 12). On Ubuntu and other distros the package still works; the codename in the URL is just how Gitea organizes the registry.
One-shot install without the apt repo:
curl -u tocmo0nlord:<your-token> -O \
  http://192.168.1.64:3000/api/packages/tocmo0nlord/debian/pool/bookworm/main/dupfinder_1.0.0_amd64.deb
sudo apt install ./dupfinder_1.0.0_amd64.deb
Manage the service:
| Command | What it does |
|---|---|
| sudo dupfinder start | Start the container |
| sudo dupfinder stop | Stop the container |
| sudo dupfinder restart | Restart |
| sudo dupfinder status | Show systemd status |
| sudo dupfinder logs | Tail the logs |
| dupfinder open | Open in your default browser |
The service auto-starts on boot via systemd (dupfinder.service).
Uninstall: sudo apt remove dupfinder (your photos and database are left untouched).
Manual Docker Compose
For NAS appliances (Synology, Unraid, TrueNAS), Mac, or any host where you'd rather wire it up yourself.
- Clone the repo:
    git clone http://192.168.1.64:3000/tocmo0nlord/duplicate-finder.git
    cd duplicate-finder
- Open docker-compose.yml and change the two volume paths under the dup-finder service:
    volumes:
      - /your/photos/path:/photos:ro   # ← your photo library (read-only)
      - /your/data/path:/data          # ← where the SQLite DB lives
- Build and start:
    docker compose up -d --build
- Open http://localhost:8765.
To stop: docker compose down. To update later: git pull && docker compose up -d --build.
GPU acceleration (optional): the compose file requests an NVIDIA GPU for faster perceptual hashing. If you don't have one, delete the deploy.resources.reservations.devices block; the app falls back to CPU automatically.
Using it
- Open http://localhost:8765.
- Click Browse and pick the folder you want to scan (it's relative to the container, usually just /photos).
- Pick a scan mode (see below) and click Scan.
- When it finishes, review the duplicate groups. Each group shows the suggested keeper highlighted; click any other photo to pick it instead, or Keep all to skip the group.
- When you're done, click Download CSV to export all decisions.
Scan modes
| Mode | When to use |
|---|---|
| Incremental (default) | Day-to-day rescans. Re-hashes only changed/new files. Past review decisions are preserved. |
| New files only | Fastest option. Indexes only files added since the last scan. |
| Rebuild groups | Re-runs duplicate detection on the existing index without re-hashing. |
| Full reset | Wipes the entire index and starts from scratch. |
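To make the Incremental mode more concrete, here is a minimal sketch of how a rescan could decide which files need re-hashing. It assumes a file counts as changed when its size or modification time differs from what a previous scan stored. The function name files_to_rehash and the in-memory dict index are hypothetical; the app's real change test and database schema may differ.

```python
import os


def files_to_rehash(root: str, index: dict[str, tuple[int, int]]) -> list[str]:
    """Return files under `root` that are new or changed since the last scan.

    `index` maps path -> (size_in_bytes, mtime_ns) as recorded previously.
    This layout is an assumption for illustration only.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            # A missing entry means a new file; a differing entry means it changed.
            if index.get(path) != (st.st_size, st.st_mtime_ns):
                changed.append(path)
    return changed
```

Files whose entry still matches are skipped, which is why an incremental rescan can be much cheaper than a full reset on a mostly unchanged library.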
Detection methods
| Method | UI color | What it catches |
|---|---|---|
| SHA-256 | Blue | Byte-identical files |
| Perceptual hash | Purple | Visually similar photos (Hamming distance ≤ 10) |
| EXIF timestamp + device | Amber | Same camera, same moment |
| File size + dimensions | Gray | Same size and resolution (low confidence) |
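The first two methods in the table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the app's code: perceptual hashes are treated as plain integers, and the helper names (group_identical, hamming, visually_similar) are hypothetical. Only the threshold of 10 comes from the table above.

```python
import hashlib
from collections import defaultdict


def group_identical(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents have the same SHA-256 digest."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    # Only digests shared by two or more files are duplicate groups.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


def visually_similar(a: int, b: int, threshold: int = 10) -> bool:
    """Treat two perceptual hashes as a match when few bits differ."""
    return hamming(a, b) <= threshold
```

A SHA-256 match only fires for byte-identical files, while the Hamming test tolerates small differences such as recompression or resizing, at the cost of occasional false positives.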
Google Takeout
Point it at a Google Photos Takeout export and it auto-detects the structure, reads the .json sidecars, and restores the correct capture timestamps and original filenames. Takeout files get a flag in the UI.
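As a rough sketch of what reading a sidecar involves, the snippet below pulls the original filename and capture time out of one JSON document. It assumes the commonly seen Takeout layout, with a title field and a photoTakenTime.timestamp field holding Unix epoch seconds as a string. Those field names are an assumption about the export format, and read_sidecar is a hypothetical helper, not part of this project.

```python
import json
from datetime import datetime, timezone


def read_sidecar(text: str) -> tuple[str, datetime]:
    """Return (original filename, capture time in UTC) from sidecar JSON text.

    Field names follow the assumed Takeout layout described above.
    """
    meta = json.loads(text)
    title = meta["title"]
    epoch_seconds = int(meta["photoTakenTime"]["timestamp"])
    return title, datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
```

Usage, with a hand-written sample rather than real export data: read_sidecar('{"title": "IMG_0001.jpg", "photoTakenTime": {"timestamp": "1500000000"}}').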
Troubleshooting
The page won't load at http://localhost:8765
Check the container is up: docker ps | grep dup-finder. If not, see the logs: docker compose logs dup-finder (or sudo dupfinder logs on Debian).
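If you want to rule out a browser problem, a quick check is whether anything accepts TCP connections on the port at all. The sketch below uses only the Python standard library; is_listening is a hypothetical helper, and a True result only shows that some process is listening, not that it is this app.

```python
import socket


def is_listening(host: str = "localhost", port: int = 8765, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolvable hosts.
        return False
```

A False result while docker ps shows the container running usually points at the port mapping rather than the app itself.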
"Permission denied" reading photos
The /photos mount is read-only by design, but the container still needs read access. Make sure your user (or the docker daemon) can read the folder you mounted.
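One way to find the offending files is to walk the host folder and list everything the current user cannot read. This is a minimal sketch with a hypothetical helper name (unreadable_paths). It checks the user running the script, which may not be the same user the container runs as, so treat the result as a hint.

```python
import os


def unreadable_paths(root: str) -> list[str]:
    """List files and directories under `root` the current user cannot read."""
    bad = []

    def on_error(err: OSError) -> None:
        # os.walk reports directories it could not list through this callback.
        bad.append(err.filename)

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK):
                bad.append(path)
    return bad
```

An empty list means every file under the folder is readable by the user who ran the check.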
Scan is stuck on "phash"
Perceptual hashing is the slowest phase — large libraries (>50k photos) on CPU can take hours. Add an NVIDIA GPU and the deploy.resources block in compose to get a 10-50× speedup.
I marked the wrong file as keeper
Open the group again and click Unreview, then re-decide.
What "redundant" means
When you mark a file redundant, only the database is updated. Nothing on disk changes. This tool produces a decision record. A future companion tool will use that record to actually move or delete files.
Tech stack
- Python 3.12, FastAPI, Uvicorn
- SQLite (stdlib sqlite3)
- Pillow, imagehash, pillow-heif
- PyTorch + CUDA for batched perceptual hashing
- Vanilla JS single-page frontend
- Docker / docker-compose