GPU-accelerated phash + fix discovery/takeout hang

GPU:
- Switch Dockerfile base to pytorch/pytorch:2.3.1-cuda12.1-cudnn8-runtime
- Add gpu_hasher.py: batched 2D DCT on GPU via PyTorch matrix multiply,
  256 images/batch, produces imagehash-compatible 64-bit hex hashes,
  auto-falls back to CPU when CUDA unavailable
- Replace per-image phash loop in scanner.py with phasher.hash_files()
- docker-compose.yml: add nvidia GPU device reservation

Hang fix:
- takeout.is_takeout_folder() now caps at 50 directories (was walking
  entire tree — blocked for minutes on 65k+ file libraries)
- Add "Not a Takeout folder" status message so takeout phase is never
  silent

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
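
For reviewers unfamiliar with the "2D DCT via matrix multiply" trick mentioned in the message, here is a minimal sketch of the idea. It is not the contents of gpu_hasher.py, which this page does not show. It uses NumPy so it runs without a GPU, and the names `dct_matrix` and `phash_batch` are hypothetical. It assumes the images are already grayscale and resized to 32x32, and it is intended to mirror the unnormalized DCT-II, per-image median threshold and hex encoding that imagehash's phash uses.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Unnormalized DCT-II basis: D[k, i] = 2 * cos(pi * k * (2i + 1) / (2n))."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    return 2.0 * np.cos(np.pi * k * (2 * i + 1) / (2 * n))


def phash_batch(pixels: np.ndarray, hash_size: int = 8) -> list[str]:
    """Hash a (batch, n, n) stack of grayscale images; n = 32 gives 64-bit hashes."""
    n = pixels.shape[-1]
    d = dct_matrix(n)
    # 2D DCT of every image at once: D @ X @ D^T, broadcast over the batch axis.
    coeffs = d @ pixels.astype(np.float64) @ d.T
    # Keep the low-frequency top-left block and flatten it row by row.
    low = coeffs[:, :hash_size, :hash_size].reshape(len(pixels), -1)
    # Each image is thresholded against its own median.
    bits = low > np.median(low, axis=1, keepdims=True)
    width = (hash_size * hash_size + 3) // 4  # hex digits per hash
    hashes = []
    for row in bits:
        value = int("".join("1" if b else "0" for b in row), 2)
        hashes.append(format(value, f"0{width}x"))
    return hashes
```

The same two matrix products carry over to `torch.matmul` on a CUDA tensor. One detail to watch when porting: `numpy.median` averages the two middle values of an even-length input, while `torch.median` returns the lower of the two, so the threshold step needs care to stay bit-compatible.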
@@ -50,14 +50,19 @@ def is_takeout_folder(folder_path: str) -> bool:
    adjacent media files. If we find at least 5 such pairs, call it Takeout.
    """
    count = 0
    dirs_checked = 0
    MAX_DIRS = 50  # sample at most 50 directories — fast on any library size

    for root, dirs, files in os.walk(folder_path):
        # Skip hidden dirs
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        dirs_checked += 1
        if dirs_checked > MAX_DIRS:
            break

        file_set = set(files)
        for f in files:
            if not f.endswith(".json"):
                continue
            # Check if a media file exists that this could be a sidecar for
            base = f[:-5]  # strip .json
            if base in file_set:
                count += 1

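
Since the hunk shows only the middle of the function, the following is a self-contained sketch of the same capped-walk idea that can be run in isolation. The name `looks_like_takeout`, the `max_dirs` and `min_pairs` parameters, and the early return once the threshold is reached are assumptions for illustration; the rest of the real `is_takeout_folder()` is not visible here.

```python
import os


def looks_like_takeout(folder_path: str, max_dirs: int = 50, min_pairs: int = 5) -> bool:
    """Sample at most max_dirs directories looking for <media>.json sidecar pairs."""
    pairs = 0
    for checked, (root, dirs, files) in enumerate(os.walk(folder_path), start=1):
        if checked > max_dirs:
            break  # stop sampling instead of walking the whole library
        # Prune hidden directories in place so os.walk never descends into them.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        names = set(files)
        for name in files:
            # "IMG_1.jpg.json" counts only if "IMG_1.jpg" sits next to it.
            if name.endswith(".json") and name[: -len(".json")] in names:
                pairs += 1
                if pairs >= min_pairs:
                    return True
    return False
```

The trade-off of any fixed cap is that a library whose sidecar files only appear beyond the first 50 visited directories would be reported as not being a Takeout export.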