Instead of walking the whole tree first and indexing afterwards, workers
now receive files the instant os.walk yields them. The thread pool is
opened before the walk starts, and each discovered file is submitted
immediately. Completed futures are drained after each directory to keep
memory flat.
Progress message shows:
"Discovering & indexing (8w): 1,234 — 5,678 found so far"
then once walk finishes:
"Indexing (8w): 8,000 / 9,100"
UI: merged Discovery + Indexing into a single "Discover + Index" phase pill.
Indeterminate progress bar stays on until total file count is known.
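A minimal sketch of the streaming pattern described above, not the actual
scanner code. index_file, discover_and_index and the on_progress callback
signature are hypothetical names chosen for illustration; the real per-file
work (hashing, DB writes) is replaced by a size lookup.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def index_file(path):
    # Stand-in for the real per-file work (hypothetical).
    return path, os.path.getsize(path)


def discover_and_index(root, workers=8, on_progress=None):
    results = []
    pending = set()
    found = 0
    # The pool is opened before the walk starts.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                # Submit each file as soon as os.walk yields it.
                pending.add(pool.submit(index_file, os.path.join(dirpath, name)))
                found += 1
            # Drain finished futures after each directory so they do not pile up.
            finished = {f for f in pending if f.done()}
            pending -= finished
            results.extend(f.result() for f in finished)
            if on_progress:
                on_progress(len(results), found, False)  # total not known yet
        # The walk is finished, so `found` is now the real total.
        for future in as_completed(pending):
            results.append(future.result())
            if on_progress:
                on_progress(len(results), found, True)
    return results
```

The third callback argument marks whether the walk has finished, which is
the point where a UI could switch from an indeterminate bar to done/total.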
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace single-threaded indexing loop with ThreadPoolExecutor.
Default workers = min(cpu_count*2, 16), tunable via DUPFINDER_WORKERS
env var. Pre-loads all existing DB records in one query instead of
N per-file queries. Progress message shows worker count and live
done/total count. Skipped files are bulk-stamped in batches of 500.
On an 8-core machine over NAS: ~4-8x faster indexing phase.
On NVMe: up to 16x faster with 16 workers.
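A sketch of how the worker default and the batches of 500 could be
expressed. resolve_worker_count and batches are hypothetical helpers; how
the real code treats an invalid or very large DUPFINDER_WORKERS value is
not stated here, so falling back to the default and clamping to at least 1
are assumptions.

```python
import os


def resolve_worker_count(env=None):
    env = os.environ if env is None else env
    # Default from the commit message: min(cpu_count * 2, 16).
    default = min((os.cpu_count() or 1) * 2, 16)
    raw = env.get("DUPFINDER_WORKERS", "").strip()
    if not raw:
        return default
    try:
        return max(1, int(raw))  # assumption: never go below one worker
    except ValueError:
        return default  # assumption: ignore values that are not integers


def batches(items, size=500):
    # Yield consecutive slices of at most `size` items.
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For example, resolve_worker_count({"DUPFINDER_WORKERS": "4"}) returns 4,
and 1,234 skipped files would be stamped in batches of 500, 500 and 234.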
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- debian/control, postinst, prerm, postrm: standard dpkg package lifecycle
- debian/files/opt/dupfinder/dupfinder-setup.sh: interactive setup;
checks Docker, detects NVIDIA GPU, prompts for photos/data paths,
writes docker-compose.override.yml with GPU reservation if present,
pulls image from registry (builds from source as fallback)
- debian/files/usr/local/bin/dupfinder: CLI wrapper;
setup / start / stop / restart / status / logs / open / update
- debian/files/etc/systemd/system/dupfinder.service: systemd unit,
guards against starting before setup has run
- debian/build-deb.sh: builds .deb and uploads to Gitea package registry;
prints the exact apt sources.list line on success
Install on any Debian/Ubuntu machine:
    echo "deb [trusted=yes] http://192.168.1.64:3000/api/packages/tocmo0nlord/debian bookworm main" \
      | sudo tee /etc/apt/sources.list.d/dupfinder.list
    sudo apt update && sudo apt install dupfinder
    sudo dupfinder setup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GPU:
- Switch Dockerfile base to pytorch/pytorch:2.3.1-cuda12.1-cudnn8-runtime
- Add gpu_hasher.py: batched 2D DCT on GPU via PyTorch matrix multiply,
256 images/batch, produces imagehash-compatible 64-bit hex hashes,
auto-falls back to CPU when CUDA unavailable
- Replace per-image phash loop in scanner.py with phasher.hash_files()
- docker-compose.yml: add nvidia GPU device reservation
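The reason the DCT batches well is that a 2D DCT is two matrix multiplies,
D @ X @ D^T. The sketch below shows that idea in pure Python instead of
PyTorch, so it is a swapped-in illustration and not the code in
gpu_hasher.py. The 32x32 grayscale input, the 8x8 low-frequency block, the
median threshold and the row-major, most-significant-bit-first hex layout
are assumptions about what "imagehash-compatible" means here.

```python
import math
from statistics import median

N = 32    # side of the grayscale input (assumed)
KEEP = 8  # side of the low-frequency block; 8 * 8 = 64 bits


def dct_rows(keep, n):
    # First `keep` rows of an unnormalized DCT-II basis matrix.
    return [[math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i in range(n)]
            for k in range(keep)]


D = dct_rows(KEEP, N)
D_T = [list(col) for col in zip(*D)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def phash_hex(pixels):
    # pixels: N x N nested list of grayscale values.
    low = matmul(matmul(D, pixels), D_T)  # KEEP x KEEP low-frequency block
    flat = [v for row in low for v in row]
    med = median(flat)
    bits = "".join("1" if v > med else "0" for v in flat)
    return format(int(bits, 2), "016x")  # 64 bits as 16 hex digits


def hash_batch(images):
    # On a GPU the per-image loop would become one batched matrix multiply.
    return [phash_hex(img) for img in images]
```

Only the first 8 rows of the DCT matrix are built because the hash uses
just the low-frequency corner, which keeps both multiplies small.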
Hang fix:
- takeout.is_takeout_folder() now stops after inspecting 50 directories
(it previously walked the entire tree, which blocked for minutes on
libraries with 65k+ files)
- Add "Not a Takeout folder" status message so the takeout phase is never silent
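A sketch of the directory cap, assuming detection works by looking for a
marker while walking. default_marker and its file name are hypothetical;
the real detection rule in takeout.py is not described in this message.

```python
import os

MAX_DIRS = 50  # cap from the commit message


def default_marker(dirpath, filenames):
    # Hypothetical marker check, for illustration only.
    return "archive_browser.html" in filenames


def is_takeout_folder(root, max_dirs=MAX_DIRS, has_marker=default_marker):
    # os.walk is lazy, so breaking out stops the traversal early.
    for visited, (dirpath, _dirs, files) in enumerate(os.walk(root), start=1):
        if has_marker(dirpath, files):
            return True
        if visited >= max_dirs:
            break
    return False
```

Because the walk is a generator, a library with tens of thousands of files
costs at most max_dirs directory listings before the function gives up.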
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- build-release.ps1: builds Docker image, saves to tar, bundles
everything into dist\ ready to copy to a flash drive
- installer/install.ps1: checks WSL2, Docker Desktop, loads image
(or builds from source as fallback), prompts for photo/data paths,
writes docker-compose.override.yml, starts container, creates
desktop shortcut
- installer/uninstall.ps1: stops container, optionally removes image
and data, removes shortcut and app directory
- installer/dupfinder-start-stop.ps1: start/stop/restart/open helper
copied to the target machine during install; the desktop shortcut uses
-Action open, which polls until the app is responsive before launching
the browser
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scanner now updates the progress message every 250 files during os.walk
so the UI shows a live count. The progress bar switches to an
indeterminate animated pulse during the discovery and takeout phases (no
known total yet), then reverts to a normal percentage bar once indexing
begins.
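A sketch of throttled progress reporting during discovery. discover and
the report callback are hypothetical names; only the 250-file interval
comes from this change.

```python
import os


def discover(root, report, every=250):
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            found.append(os.path.join(dirpath, name))
            # Report only every `every` files to avoid flooding the UI.
            if len(found) % every == 0:
                report(len(found))
    report(len(found))  # final count once the walk is done
    return found
```

With every=3 and seven files, the callback receives 3, 6 and then 7.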
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New GET /api/browse endpoint lists the subdirectories at any path.
The UI gets a folder icon button next to each path input that opens
a browsable directory tree modal. Escape or Cancel closes it;
clicking a folder navigates into it; Select confirms the choice.
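A sketch of the directory listing such an endpoint might return, kept free
of any web framework. list_subdirs and the response shape (path, parent,
dirs) are assumptions for illustration, not the actual /api/browse
contract.

```python
import os


def list_subdirs(path):
    path = os.path.abspath(path)
    dirs = []
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                # Only real directories; symlinks are skipped in this sketch.
                if entry.is_dir(follow_symlinks=False):
                    dirs.append({"name": entry.name, "path": entry.path})
            except OSError:
                continue  # unreadable entry, leave it out of the listing
    dirs.sort(key=lambda d: d["name"].lower())
    parent = os.path.dirname(path)
    # At the filesystem root the parent equals the path itself.
    return {"path": path, "parent": parent if parent != path else None, "dirs": dirs}
```

Returning the parent alongside the children would let the modal offer an
"up" entry without a second request.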
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>