# Duplicate Finder

A self-hosted Docker web app that scans a photo/video library, detects duplicates using four methods, and lets you review them in a gallery UI.

**No files are ever moved, renamed, or deleted.** All decisions are recorded in SQLite only.

## Quick start

```bash
# 1. Edit docker-compose.yml: set your photos volume path
# 2. Build and run
docker compose up -d --build
# 3. Open http://localhost:8765
# 4. Enter folder path in UI and click Scan
```

## Volume mounts

| Container path | Purpose |
|---|---|
| `/photos` | Your photo library, mounted **read-only** |
| `/data` | SQLite database persistence |

Edit `docker-compose.yml` to point these at your NAS paths.

## Detection methods

| Method | Color | Description |
|---|---|---|
| SHA-256 | Blue | Byte-identical files |
| Perceptual hash | Purple | Visually similar photos (Hamming distance ≤ 10) |
| EXIF timestamp + device | Amber | Same camera, same moment |
| File size + dimensions | Gray | Same size and resolution (low confidence) |

## Scan modes

| Mode | Description |
|---|---|
| Incremental | Only re-hashes changed/new files. Prior decisions preserved. |
| New files only | Indexes newly added files. Existing decisions untouched. |
| Rebuild groups | Re-runs detection on existing index. No re-hashing. |
| Full reset | Wipes everything and scans from scratch. |

## Google Takeout

The scanner automatically detects Google Takeout folder structures and reads `.json` sidecar files to restore correct capture timestamps and original filenames. Takeout files are flagged in the UI.

## What "redundant" means

Marking a file redundant **only writes to the database**. Nothing is moved, renamed, or deleted. This tool produces a decision record only. A separate tool handles file actions.

## Tech stack

- Python 3.12, FastAPI, Uvicorn
- SQLite (stdlib `sqlite3`)
- Pillow, imagehash, pillow-heif
- Vanilla JS single-page frontend
- Docker / docker-compose
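
To make the first two detection methods more concrete, here is a minimal sketch of how byte-identical grouping and the "Hamming distance ≤ 10" check could work. It is not taken from this project's source. The function names (`sha256_of`, `group_identical`, `hamming`, `visually_similar`) are hypothetical, and it assumes perceptual hashes are stored as hex strings of equal length, which is how `imagehash` renders a hash when converted with `str()`.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Threshold taken from the Detection methods table; everything else is illustrative.
HAMMING_THRESHOLD = 10


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical(paths: list[Path]) -> list[list[Path]]:
    """Return groups of two or more files whose contents are byte-identical."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in paths:
        groups[sha256_of(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]


def hamming(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two hex-encoded hashes."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def visually_similar(hash_a: str, hash_b: str,
                     threshold: int = HAMMING_THRESHOLD) -> bool:
    """Treat two perceptual hashes as a match when they differ in few bits."""
    return hamming(hash_a, hash_b) <= threshold
```

For example, `visually_similar("ffd8e0c0a0908880", "ffd8e0c0a0908881")` compares two hashes that differ in a single bit and would count as a match, while two hashes differing in every bit would not. With `imagehash` itself, subtracting one hash object from another yields the same bit-difference count, so the hex conversion above is only needed when comparing values already stored as text.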