Instead of walk-everything-first then index, workers now receive files
the instant os.walk yields them. The thread pool is open before the
walk starts; each discovered file is submitted immediately. Completed
futures are drained after each directory to keep memory flat.
Progress message shows:
"Discovering & indexing (8w): 1,234 — 5,678 found so far"
then once walk finishes:
"Indexing (8w): 8,000 / 9,100"
UI: merged Discovery + Indexing into a single "Discover + Index" phase pill.
Indeterminate progress bar stays on until total file count is known.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace single-threaded indexing loop with ThreadPoolExecutor.
Default workers = min(cpu_count*2, 16), tunable via DUPFINDER_WORKERS
env var. Pre-loads all existing DB records in one query instead of
N per-file queries. Progress message shows worker count and live
done/total count. Skipped files bulk-stamped in batches of 500.
On an 8-core machine over NAS: ~4-8x faster indexing phase.
On NVMe: up to 16x faster with 16 workers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GPU:
- Switch Dockerfile base to pytorch/pytorch:2.3.1-cuda12.1-cudnn8-runtime
- Add gpu_hasher.py: batched 2D DCT on GPU via PyTorch matrix multiply,
256 images/batch, produces imagehash-compatible 64-bit hex hashes,
auto-falls back to CPU when CUDA unavailable
- Replace per-image phash loop in scanner.py with phasher.hash_files()
- docker-compose.yml: add nvidia GPU device reservation
Hang fix:
- takeout.is_takeout_folder() now caps at 50 directories (was walking
entire tree — blocked for minutes on 65k+ file libraries)
- Add "Not a Takeout folder" status message so takeout phase is never silent
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scanner now updates message every 250 files during os.walk so the UI
shows a live count. Progress bar switches to an indeterminate animated
pulse during discovery and takeout phases (no known total yet), then
reverts to a normal percentage bar once indexing begins.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New GET /api/browse endpoint lists subdirectories at any path.
UI gets a folder icon button next to each path input that opens
a browsable directory tree modal. Escape or Cancel closes it,
clicking a folder navigates into it, Select confirms the choice.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>