# stash-nfo-sync
A Python service that watches a media directory for .nfo files written by Pinchflat and syncs their metadata into Stash via its GraphQL API.
Includes a one-shot backfill script for existing libraries and optional AI-assisted performer extraction powered by Claude.
## Table of Contents
- How it works
- Requirements
- Installation
- Configuration
- Usage
- Running as a systemd service
- AI Performer Extraction
- Development
- Architecture
- V2 / Future Work
## How it works

Pinchflat downloads YouTube videos and writes Kodi-format `.nfo` sidecar files alongside them. Stash does not read these automatically. stash-nfo-sync bridges that gap:

- **Watcher** — monitors your media directory recursively using `watchdog`. When a new or modified `.nfo` is detected, it debounces the event (waits for the file to stop changing), then processes it.
- **Backfill** — a one-shot script that walks your full media directory and processes every `.nfo` file it finds. Safe to re-run; scenes that are already up to date are skipped.
- **Processor** — for each `.nfo`, triggers a Stash library scan, waits for the video to appear in Stash, then writes the title, studio, release date, description, and YouTube URL to the scene. Scenes are keyed on the YouTube ID to prevent duplicate entries.
### NFO field mapping

| NFO field | Stash field | Notes |
|---|---|---|
| `<title>` | Scene title | |
| `<showtitle>` | Studio name | Leading/trailing whitespace trimmed |
| `<uniqueid>` | Studio code + scene URL | Full YouTube URL built from the ID |
| `<plot>` | Scene details | Optionally fed to AI extractor |
| `<aired>` | Release date | Normalized to YYYY-MM-DD |
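As an illustration of the mapping above, here is a minimal sketch of extracting these fields with the standard library. The real parser lives in `nfo_parser.py` and returns an `NFORecord`; the `<episodedetails>` root element and attributes below are assumptions based on Kodi's episode NFO format:

```python
import xml.etree.ElementTree as ET

SAMPLE_NFO = """<?xml version="1.0" encoding="utf-8"?>
<episodedetails>
  <title>Example Video</title>
  <showtitle> Example Channel </showtitle>
  <uniqueid type="youtube" default="true">dQw4w9WgXcQ</uniqueid>
  <plot>A sample description.</plot>
  <aired>2024-01-15</aired>
</episodedetails>"""


def parse_nfo(text: str) -> dict:
    root = ET.fromstring(text)

    def field(tag: str) -> str:
        # Missing elements become empty strings; whitespace is trimmed.
        return (root.findtext(tag) or "").strip()

    video_id = field("uniqueid")
    return {
        "title": field("title"),
        "studio": field("showtitle"),
        "details": field("plot"),
        "date": field("aired"),  # already YYYY-MM-DD in this sample
        "url": f"https://www.youtube.com/watch?v={video_id}",
    }
```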
## Requirements

- Python 3.11+
- A running Stash instance with API access
- Pinchflat (or any tool that writes Kodi-format `.nfo` files)
- (Optional) An Anthropic API key for AI performer extraction
## Installation

```bash
# Clone the repository
git clone <repo-url>
cd StashEnrichmentScript

# Install the package (use a virtual environment)
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install .

# Copy and edit the example config
cp config.example.yaml config.yaml
```
## Configuration

All settings live in config.yaml. No credentials are hardcoded.

```yaml
stash_url: http://localhost:9999   # URL of your Stash instance
stash_api_key: YOUR_KEY_HERE       # Settings → Security → API Key
media_dir: /path/to/media          # Root of your Pinchflat media library

scan_wait_timeout_seconds: 30      # How long to wait for a scan to pick up a new file
debounce_seconds: 3.0              # Wait this long after the last file event before processing

logging:
  log_file: /var/log/stash-nfo-sync.log
  max_bytes: 10485760              # 10 MB per log file
  backup_count: 5                  # Number of rotated log files to keep

ai_extraction:
  enabled: false                   # Set to true to enable AI performer extraction
  anthropic_api_key: YOUR_KEY_HERE

retry_queue_path: /var/lib/stash-nfo-sync/retry_queue.json
```
### Getting your Stash API key
In Stash: Settings → Security → Authentication → API Key. Generate one if you haven't already.
## Usage

### Watch mode (daemon)

Start the filesystem watcher. It will drain any previously failed paths from the retry queue, then begin monitoring for new or changed .nfo files.

```bash
stash-nfo-sync watch
stash-nfo-sync watch --config /etc/stash-nfo-sync/config.yaml
```
### Backfill

Process every .nfo file in your media directory. Idempotent — scenes already in sync are skipped.

```bash
stash-nfo-sync backfill
```

Preview what would happen without writing anything to Stash:

```bash
stash-nfo-sync backfill --dry-run
```

Reprocess only the paths that previously failed (i.e. paths currently in the retry queue):

```bash
stash-nfo-sync backfill --retry-failed
```

All commands accept `--config PATH` to specify a non-default config file.
### Typical first-run workflow

```bash
# 1. Dry run to see what would change
stash-nfo-sync backfill --dry-run

# 2. Run the full backfill
stash-nfo-sync backfill

# 3. If anything failed (e.g. Stash was busy), retry just those
stash-nfo-sync backfill --retry-failed

# 4. Start the watcher for ongoing sync
stash-nfo-sync watch
```
## Running as a systemd service

A .service unit file is included. To install it:

```bash
# Copy the unit file (adjust its ExecStart path if using a venv)
sudo cp stash-nfo-sync.service /etc/systemd/system/

# Create the config directory and place your config
sudo mkdir -p /etc/stash-nfo-sync
sudo cp config.yaml /etc/stash-nfo-sync/config.yaml

# Create the log and queue directories and hand them to the service user
sudo mkdir -p /var/log /var/lib/stash-nfo-sync
sudo chown stash:stash /var/lib/stash-nfo-sync

# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable stash-nfo-sync
sudo systemctl start stash-nfo-sync

# Check status
sudo systemctl status stash-nfo-sync
sudo journalctl -u stash-nfo-sync -f
```

The service runs as the stash user, restarts on failure, and logs to the system journal (which is also captured to the rotating log file configured in config.yaml).
## AI Performer Extraction
When `ai_extraction.enabled: true`, the `<plot>` text from each `.nfo` is sent to Claude (Haiku model) to extract performer and director names as structured JSON. Extracted names are upserted in Stash and linked to the scene.
The prompt instructs the model to return empty lists rather than guess when names are not clearly stated — no hallucinated credits.
To enable:

```yaml
ai_extraction:
  enabled: true
  anthropic_api_key: sk-ant-...
```
AI extraction failures are non-blocking: if the Claude API is unavailable or returns an unexpected response, the scene is still synced with all other metadata; only the performer/director fields are left empty.
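The defensive-parsing side of that contract can be sketched as follows. The `performers`/`directors` field names are assumptions about the JSON schema the prompt requests, not necessarily what `performer_extractor.py` uses:

```python
import json


def _empty() -> dict[str, list[str]]:
    return {"performers": [], "directors": []}


def parse_extraction(raw: str) -> dict[str, list[str]]:
    """Parse the model's JSON reply defensively.

    Bad JSON or an unexpected shape degrades to empty lists, so a flaky
    AI response never blocks the rest of the metadata sync.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return _empty()
    if not isinstance(data, dict):
        return _empty()
    result: dict[str, list[str]] = {}
    for key in ("performers", "directors"):
        value = data.get(key, [])
        if not isinstance(value, list):
            value = []
        # Keep only non-empty strings; drop anything else the model emitted.
        result[key] = [v.strip() for v in value if isinstance(v, str) and v.strip()]
    return result
```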
## Development

```bash
# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=stash_nfo_sync --cov-report=term-missing
```
### Project structure

```
src/stash_nfo_sync/
├── cli.py                  # Entrypoint; argparse subcommands
├── config.py               # Config loader + validation
├── models.py               # Dataclasses: NFORecord, AppConfig, ProcessResult, …
├── nfo_parser.py           # XML → NFORecord
├── stash_client.py         # Stash GraphQL API wrapper
├── processor.py            # Orchestration: parse → scan → upsert → update
├── performer_extractor.py  # Claude API integration (optional)
├── retry_queue.py          # Atomic JSON file queue
├── watcher.py              # watchdog filesystem observer + debounce
├── backfill.py             # One-shot directory walker
└── logging_setup.py        # Rotating file + stderr log handlers

tests/
├── fixtures/
│   ├── sample.nfo          # Real-world Pinchflat NFO fixture
│   └── config.test.yaml
├── test_nfo_parser.py
├── test_stash_client.py
├── test_retry_queue.py
├── test_processor.py
├── test_performer_extractor.py
└── test_backfill.py
```
### Adding a V2 post-process hook

The `SceneProcessor.process()` method accepts an optional `post_process_hooks` list. Each hook receives `(ProcessResult, NFORecord)` after a successful sync. Hook exceptions are caught and logged — they never block a sync.

```python
def my_hook(result: ProcessResult, record: NFORecord) -> None:
    print(f"Synced: {record.youtube_url} → scene {result.scene_id}")

processor.process(path, post_process_hooks=[my_hook])
```

This is the intended extension point for the planned Tea Leaves connector (see V2 / Future Work).
## Architecture

```
                    ┌──────────┐
 .nfo file ────────▶│ watcher  │──────┐
 (new/modified)     └──────────┘      │
                                      ▼
 stash-nfo-sync backfill ───────▶ processor
                                      │
                     ┌────────────────┼─────────────────────┐
                     ▼                ▼                     ▼
                nfo_parser       stash_client      performer_extractor
                     │                │                 (optional)
                     │                ▼                     │
                     │          [GraphQL API]               │
                     │                                      │
                     └────────────────┬─────────────────────┘
                                      │
                                 (on failure)
                                      ▼
                                 retry_queue
```
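The retry queue is an atomic JSON file (see `retry_queue.py`); atomicity typically comes from the classic write-temp-then-rename pattern, sketched here with illustrative function names:

```python
import json
import os
import tempfile


def save_queue(paths: list[str], queue_path: str) -> None:
    """Atomically persist the retry queue.

    Write to a temp file in the same directory, then os.replace() it over
    the target, so a crash mid-write never leaves a truncated JSON file.
    """
    dir_name = os.path.dirname(queue_path) or "."
    fd, tmp = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(paths, f)
        os.replace(tmp, queue_path)  # atomic on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise


def load_queue(queue_path: str) -> list[str]:
    """Return the queued paths, or an empty list if no queue exists yet."""
    try:
        with open(queue_path) as f:
            return json.load(f)
    except FileNotFoundError:
        return []
```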
Deduplication: The YouTube video ID from `<uniqueid>` is the canonical key. `update_scene` always overwrites metadata, so the backfill is safe to re-run. A quick pre-check compares the existing scene's title, date, and studio before triggering a scan — scenes already in sync are skipped without touching Stash.
Scan-then-poll: Stash's `metadataScan` mutation is asynchronous. After triggering it, `poll_for_scene` queries `findScenes` with the YouTube URL filter every 2 seconds, up to `scan_wait_timeout_seconds`. If the video file hasn't been downloaded yet (Pinchflat sometimes writes the NFO before the video finishes downloading), the poll times out and the path is added to the retry queue for the next startup.
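That scan-then-poll loop amounts to a bounded retry around the scene lookup. A sketch, where the `find_scene` callable stands in for the `findScenes` GraphQL query and the signature is illustrative:

```python
import time


def poll_for_scene(find_scene, youtube_url: str,
                   timeout: float, interval: float = 2.0):
    """Poll `find_scene(youtube_url)` until it returns a scene or the
    timeout elapses. Returns the scene, or None on timeout (at which
    point the caller enqueues the path for retry)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        scene = find_scene(youtube_url)
        if scene is not None:
            return scene
        time.sleep(interval)
    return None
```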
## V2 / Future Work

After metadata is written to Stash, optionally call a Tea Leaves API endpoint with the YouTube URL, title, studio, performers, and Stash scene ID. The exact API shape is TBD. Implementation will use the `post_process_hooks` extension point — no changes to the core processor needed.