Replace single-threaded indexing loop with ThreadPoolExecutor. Default workers = min(cpu_count*2, 16), tunable via DUPFINDER_WORKERS env var. Pre-loads all existing DB records in one query instead of N per-file queries. Progress message shows worker count and live done/total count. Skipped files bulk-stamped in batches of 500. On an 8-core machine over NAS: ~4-8x faster indexing phase. On NVMe: up to 16x faster with 16 workers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
30 KiB
30 KiB