netfelix-audio-fix/docs/superpowers/specs/2026-03-27-unified-pipeline-design.md

Unified Media Processing Pipeline

Date: 2026-03-27
Status: Draft

Problem

The app currently handles subtitle extraction and audio cleanup as separate workflows with separate FFmpeg commands. Apple device compatibility (DTS/TrueHD transcoding) is not addressed at all. Users must manually navigate multiple pages and approve items one by one.

Goal

Unify all media processing into a single pipeline per file. One scan, one review, one FFmpeg command. Minimize user interaction by auto-approving high-confidence items and enabling batch confirmation.

Pipeline

Each file goes through three checks. The analyzer evaluates all three and produces one plan with one FFmpeg command.

Step 1: Subtitle Extraction

  • Extract all embedded subtitles to sidecar files on disk
  • Remove subtitle streams from container
  • Sidecar naming: video.en.srt, video.de.forced.srt, video.es.hi.vtt
  • Populates subtitle_files table (feeds the existing subtitle manager)
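The naming scheme above can be sketched as a small helper. This is a minimal illustration, not the project's implementation: the `SubtitleStream` shape, the codec-to-extension map, the `und` fallback for untagged streams, and the `null` return for codecs without a text sidecar format are all assumptions.

```typescript
// Hypothetical shape of an embedded subtitle stream; field names are assumptions.
interface SubtitleStream {
  language: string | null; // e.g. "en", "de"; null when the stream is untagged
  forced: boolean;
  hearingImpaired: boolean;
  codec: string; // FFmpeg codec name, e.g. "subrip", "ass", "webvtt"
}

// Assumed mapping from text-based subtitle codecs to sidecar extensions.
const SIDECAR_EXTENSIONS: Record<string, string> = {
  subrip: "srt",
  ass: "ass",
  webvtt: "vtt",
};

// Builds "<video basename>.<lang>[.forced][.hi].<ext>".
// Returns null when the codec has no entry in the map, leaving the decision to the caller.
function sidecarPath(videoPath: string, stream: SubtitleStream): string | null {
  const ext = SIDECAR_EXTENSIONS[stream.codec];
  if (ext === undefined) return null;
  const base = videoPath.replace(/\.[^./\\]+$/, ""); // strip the container extension
  const parts = [base, stream.language ?? "und"];
  if (stream.forced) parts.push("forced");
  if (stream.hearingImpaired) parts.push("hi");
  parts.push(ext);
  return parts.join(".");
}
```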

Step 2: Audio Cleanup

  • Identify original language audio → set as default, first track
  • Keep configured additional languages (audio_languages config), sorted by config order
  • Remove all other audio tracks
  • Preserve custom titles on kept tracks
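The selection and ordering rules above could look roughly like this. The `AudioStream` shape and the `planAudio` name are hypothetical, and the sketch assumes that when no track matches the original language, the first kept track becomes the default.

```typescript
// Hypothetical audio stream shape; field names are assumptions.
interface AudioStream {
  index: number; // position among the file's audio streams (0:a:<index>)
  language: string | null;
  title: string | null; // custom title, carried over unchanged
}

interface KeptAudio extends AudioStream {
  isDefault: boolean;
}

// Keeps the original-language tracks first, then configured languages in config order.
// Everything else is dropped. Ties keep their original relative order.
function planAudio(
  streams: AudioStream[],
  ogLanguage: string,
  audioLanguages: string[],
): KeptAudio[] {
  const rank = (lang: string | null): number => {
    if (lang === null) return -1; // untagged: not kept
    if (lang === ogLanguage) return 0;
    const i = audioLanguages.indexOf(lang);
    return i === -1 ? -1 : i + 1;
  };
  const kept = streams
    .filter((s) => rank(s.language) >= 0) // filter() copies, so the input is not mutated
    .sort((a, b) => rank(a.language) - rank(b.language) || a.index - b.index);
  return kept.map((s, i) => ({ ...s, isDefault: i === 0 }));
}
```

For a file with German, French, and English tracks (in that order), English as original language, and `audio_languages` set to German only, the result is English (default) followed by German, which corresponds to `-map 0:a:2 -map 0:a:0`.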

Step 3: Apple Compatibility

Check each kept audio stream's codec against the Apple-compatible set:

Compatible (no action): AAC, AC3 (DD), EAC3 (DD+), ALAC, FLAC, MP3, PCM, Opus

Incompatible → transcode:

| Source codec | Container = MKV | Container = MP4 |
|---|---|---|
| DTS / DTS-ES / DTS-HD HRA | EAC3 | EAC3 |
| DTS-HD MA / DTS:X | FLAC | EAC3 |
| TrueHD / TrueHD Atmos | FLAC | EAC3 |

Rationale: FLAC preserves lossless quality and is Apple-compatible (iOS 11+), but only works in MKV containers. EAC3 is the best lossy option for surround that Apple devices decode natively.
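As a sketch, the compatibility table reduces to a lookup keyed on source codec and container. The codec labels below are the normalized names used in the table, not necessarily what the scanner reports, and leaving codecs that the table does not mention untouched is an assumption.

```typescript
type Container = "mkv" | "mp4";
type TargetCodec = "eac3" | "flac";

// Lossy DTS variants: always transcoded to EAC3.
const LOSSY_SOURCES = new Set(["dts", "dts-es", "dts-hd hra"]);
// Lossless sources: FLAC in MKV, EAC3 in MP4.
const LOSSLESS_SOURCES = new Set(["dts-hd ma", "dts:x", "truehd", "truehd atmos"]);

// Returns the codec to transcode to, or null when the stream can be copied as-is.
// Codecs outside the table (compatible or unknown) yield null in this sketch.
function targetCodec(sourceCodec: string, container: Container): TargetCodec | null {
  const codec = sourceCodec.trim().toLowerCase();
  if (LOSSY_SOURCES.has(codec)) return "eac3";
  if (LOSSLESS_SOURCES.has(codec)) return container === "mkv" ? "flac" : "eac3";
  return null;
}
```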

Combined FFmpeg Command

A single FFmpeg invocation handles all three steps:

# Outputs 1 and 2: subtitle extraction, one sidecar file per embedded subtitle stream.
# Output 3: remuxed file (no subs, reordered audio, transcoded where needed).
#   a:0 = OG audio (AAC), already compatible, stream copy
#   a:1 = secondary audio (DTS), transcode to EAC3
ffmpeg -y -i 'input.mkv' \
  -map 0:s:0 'input.en.srt' \
  -map 0:s:1 'input.de.forced.srt' \
  -map 0:v:0 -map 0:a:2 -map 0:a:0 \
  -c:v copy \
  -c:a:0 copy \
  -c:a:1 eac3 \
  -disposition:a:0 default -disposition:a:1 0 \
  -metadata:s:a:0 title='English' \
  -metadata:s:a:1 title='German' \
  'input.tmp.mkv' && mv 'input.tmp.mkv' 'input.mkv'

Note: FFmpeg can output to multiple files in one invocation. Subtitle sidecar extraction and the remuxed output are produced in a single pass.
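One way to assemble such an invocation is as an argument array, which avoids shell quoting when the process is spawned directly. The sketch below is illustrative only: `buildArgs`, `SubtitleOutput`, and `AudioOutput` are hypothetical names, and the per-stream options are grouped by stream rather than by option type, which FFmpeg treats the same way.

```typescript
// Hypothetical plan shapes; names are assumptions for illustration.
interface SubtitleOutput {
  streamIndex: number; // index among subtitle streams (0:s:<n>)
  path: string; // sidecar file to write
}

interface AudioOutput {
  streamIndex: number; // index among the input's audio streams (0:a:<n>)
  codec: "copy" | "eac3" | "flac";
  title: string | null;
}

// Builds the FFmpeg argument list: one output per subtitle sidecar,
// then the remuxed file with reordered audio. The first audio entry becomes default.
function buildArgs(
  input: string,
  tmpOutput: string,
  subs: SubtitleOutput[],
  audio: AudioOutput[],
): string[] {
  const args = ["-y", "-i", input];
  for (const s of subs) args.push("-map", `0:s:${s.streamIndex}`, s.path);
  args.push("-map", "0:v:0");
  for (const a of audio) args.push("-map", `0:a:${a.streamIndex}`);
  args.push("-c:v", "copy");
  audio.forEach((a, i) => {
    // Output stream specifiers (a:<i>) refer to the new order, not the input order.
    args.push(`-c:a:${i}`, a.codec);
    args.push(`-disposition:a:${i}`, i === 0 ? "default" : "0");
    if (a.title !== null) args.push(`-metadata:s:a:${i}`, `title=${a.title}`);
  });
  args.push(tmpOutput);
  return args;
}
```

Renaming the temporary file over the original would then happen in application code after FFmpeg exits successfully.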

is_noop Definition

A file is a no-op (already fully processed) when ALL of:

  • All subtitles already extracted (or none embedded)
  • Audio tracks already in correct order with no unwanted tracks
  • All kept audio codecs are Apple-compatible

is_noop files are marked as done during scan without entering the pipeline.
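The three conditions can be expressed as a single predicate. This is a sketch under assumptions: the `AnalyzedFile` shape is hypothetical, and audio order is compared as lists of stream indexes, so a length mismatch signals unwanted tracks.

```typescript
// Hypothetical analyzer result; field names are assumptions.
interface AnalyzedFile {
  embeddedSubtitleCount: number;
  subsExtracted: boolean;
  currentAudioOrder: number[]; // audio stream indexes as they appear in the file
  targetAudioOrder: number[]; // kept stream indexes in the desired order
  transcodeCodecs: (string | null)[]; // per kept stream; null means stream copy
}

// True only when all three pipeline steps have nothing left to do.
function isNoop(f: AnalyzedFile): boolean {
  const subsDone = f.embeddedSubtitleCount === 0 || f.subsExtracted;
  const audioDone =
    f.currentAudioOrder.length === f.targetAudioOrder.length &&
    f.currentAudioOrder.every((idx, i) => idx === f.targetAudioOrder[i]);
  const codecsDone = f.transcodeCodecs.every((c) => c === null);
  return subsDone && audioDone && codecsDone;
}
```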

Confidence Scoring

Each file gets a confidence score based on how reliably its original (OG) language was detected:

High confidence (auto-approve) — ALL of:

  • OG language is known (not null/unknown)
  • At least two sources agree (any combination of Jellyfin, Radarr, Sonarr), OR only one source exists and it returned a language
  • No needs_review flag from scan

Low confidence (needs review) — ANY of:

  • OG language is null or unknown
  • Sources disagree (e.g., Jellyfin says "eng", Radarr says "fra")
  • needs_review flag set during scan
  • Zero audio tracks match the detected OG language

High-confidence files are pre-approved and sorted to the top of the review column. Low-confidence files require human confirmation.
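A minimal sketch of the scoring rules follows. The `OgEvidence` shape is hypothetical. Two interpretive assumptions are made: any low-confidence condition overrides the high-confidence ones, and a file where several sources exist but only one returned a language is treated as low confidence, since the rules above do not cover that case explicitly.

```typescript
type Confidence = "high" | "low";

// Hypothetical evidence gathered during scan; field names are assumptions.
interface OgEvidence {
  sourceLanguages: Record<string, string | null>; // e.g. { jellyfin: "eng", radarr: "fra" }
  needsReview: boolean;
  audioTrackLanguages: (string | null)[];
}

function scoreConfidence(e: OgEvidence): { ogLanguage: string | null; confidence: Confidence } {
  const reported = Object.values(e.sourceLanguages).filter(
    (l): l is string => l !== null && l !== "unknown",
  );
  const distinct = Array.from(new Set(reported));
  // More than one distinct value means the sources disagree; none means unknown.
  const ogLanguage = distinct.length === 1 ? (distinct[0] ?? null) : null;

  const sourceCount = Object.keys(e.sourceLanguages).length;
  const agreement = reported.length >= 2 || (sourceCount === 1 && reported.length === 1);
  const trackMatches =
    ogLanguage !== null && e.audioTrackLanguages.some((l) => l === ogLanguage);

  const high = ogLanguage !== null && agreement && !e.needsReview && trackMatches;
  return { ogLanguage, confidence: high ? "high" : "low" };
}
```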

Kanban Board UI

Replaces the current separate scan/review/execute pages with a unified pipeline view.

Columns

| Scan | Review | Queued | Processing | Done |
|---|---|---|---|---|
| Incoming from scan | Needs confirmation | Confirmed, waiting | FFmpeg running | Completed |

Card Content

Each card represents one media file:

  • Title: movie name or "S01E03 — Episode Title"
  • OG language: badge with confidence color (green/yellow/red) + inline dropdown to change
  • Pipeline badges: icons showing which steps apply (sub extract, audio reorder, audio transcode)
  • Job type: copy (fast, seconds) vs transcode (slow, minutes)
  • "Approve up to here" button: confirms this card and all cards above it in the Review column
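The "Approve up to here" behavior amounts to taking a prefix of the Review column in display order. A small sketch, assuming cards are identified by numeric plan ids (the function name and id type are hypothetical):

```typescript
// Returns the ids to approve: the clicked card and every card above it.
// An id that is not in the column approves nothing.
function approveUpToHere(reviewColumn: number[], clickedId: number): number[] {
  const position = reviewColumn.indexOf(clickedId);
  return position === -1 ? [] : reviewColumn.slice(0, position + 1);
}
```

Because high-confidence items are sorted to the top, clicking the last high-confidence card confirms exactly that group.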

Series Grouping

  • Series appear as collapsible cards showing series name + episode count
  • OG language is set at series level (one dropdown for the whole series)
  • "Approve series" button confirms all episodes at once
  • Individual episodes can be expanded and overridden if needed
  • Rationale: if a series is English OG, it's unlikely a single episode differs

Processing Column

  • Shows currently running job with progress info
  • For transcode jobs: progress bar (% complete, elapsed, ETA) parsed from the time= field in FFmpeg's stderr output
  • Queue status: idle / running / sleeping / paused until HH:MM
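Progress for transcode jobs can be derived from FFmpeg's periodic status lines, which include a `time=HH:MM:SS.xx` field. The sketch below assumes the total duration is already known from the scan and that elapsed wall-clock time is tracked by the caller. Function names are hypothetical.

```typescript
// Extracts the output position in seconds from an FFmpeg status line.
// Returns null when the line has no parsable time= field (e.g. "time=N/A").
function parseProgressSeconds(line: string): number | null {
  const m = /time=(\d+):(\d{2}):(\d{2}(?:\.\d+)?)/.exec(line);
  if (m === null) return null;
  const [, hours, minutes, seconds] = m;
  return Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds);
}

interface Progress {
  percent: number;
  etaSeconds: number | null; // null until some progress has been made
}

// Linear extrapolation: assumes the remaining part encodes at the average speed so far.
function estimateProgress(
  positionSeconds: number,
  durationSeconds: number,
  elapsedSeconds: number,
): Progress {
  const fraction = durationSeconds > 0 ? Math.min(1, positionSeconds / durationSeconds) : 0;
  const etaSeconds = fraction > 0 ? (elapsedSeconds * (1 - fraction)) / fraction : null;
  return { percent: fraction * 100, etaSeconds };
}
```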

Done Column

  • Completed items with summary of what changed
  • Collapsible, auto-archives

Execution & Scheduling

Job Queue

  • Jobs execute sequentially (one FFmpeg command at a time)
  • Each job tagged as copy or transcode based on whether any audio streams need transcoding
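Tagging follows directly from the per-stream decisions: a job is a transcode as soon as one kept audio stream has a target codec. A sketch, assuming each decision carries a nullable target codec (the function name is hypothetical):

```typescript
type JobType = "copy" | "transcode";

// transcodeCodecs holds one entry per kept audio stream; null means stream copy.
function jobTypeFor(transcodeCodecs: (string | null)[]): JobType {
  return transcodeCodecs.some((codec) => codec !== null) ? "transcode" : "copy";
}
```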

Sleep Between Jobs

  • Configurable job_sleep_seconds (default: 0)
  • Applied after each job completes, before the next starts
  • Changeable at runtime via UI

Schedule Window

  • Configurable schedule_start and schedule_end (e.g., "01:00" and "07:00")
  • schedule_enabled toggle (default: off = run anytime)
  • When enabled: jobs only start within the window
  • A running job is never interrupted — it finishes, then the queue pauses
  • Changeable at runtime via UI
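The window check might be sketched as follows. Two assumptions go beyond the text above: the end minute counts as part of the window, and a window whose start is later than its end (e.g. "22:00" to "06:00") wraps past midnight. Names are hypothetical.

```typescript
interface ScheduleConfig {
  enabled: boolean;
  start: string; // "HH:MM"
  end: string; // "HH:MM"
}

function toMinutes(hhmm: string): number {
  const [h, m] = hhmm.split(":").map(Number);
  return (h ?? 0) * 60 + (m ?? 0);
}

// True when a new job may start at the given local time ("HH:MM").
// Running jobs are not affected; this is only consulted before starting the next one.
function canStartJob(cfg: ScheduleConfig, now: string): boolean {
  if (!cfg.enabled) return true; // schedule off: run anytime
  const n = toMinutes(now);
  const s = toMinutes(cfg.start);
  const e = toMinutes(cfg.end);
  return s <= e
    ? n >= s && n <= e // same-day window, end minute included
    : n >= s || n <= e; // window wraps past midnight
}
```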

Config Keys (added to config table)

job_sleep_seconds: '0'
schedule_enabled: '0'
schedule_start: '01:00'
schedule_end: '07:00'
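Since the config table stores every value as a string, the scheduler needs to convert them before use. A sketch of that conversion, assuming missing keys fall back to the defaults above and invalid or negative sleep values are treated as 0 (the `SchedulerSettings` shape and function name are hypothetical):

```typescript
const CONFIG_DEFAULTS = {
  job_sleep_seconds: "0",
  schedule_enabled: "0",
  schedule_start: "01:00",
  schedule_end: "07:00",
} as const;

type ConfigKey = keyof typeof CONFIG_DEFAULTS;

interface SchedulerSettings {
  sleepSeconds: number;
  scheduleEnabled: boolean;
  scheduleStart: string;
  scheduleEnd: string;
}

// stored holds whatever rows exist in the config table for these keys.
function readSchedulerSettings(stored: Partial<Record<ConfigKey, string>>): SchedulerSettings {
  const get = (key: ConfigKey): string => stored[key] ?? CONFIG_DEFAULTS[key];
  const sleep = Number.parseInt(get("job_sleep_seconds"), 10);
  return {
    sleepSeconds: Number.isFinite(sleep) && sleep > 0 ? sleep : 0,
    scheduleEnabled: get("schedule_enabled") === "1",
    scheduleStart: get("schedule_start"),
    scheduleEnd: get("schedule_end"),
  };
}
```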

Schema Changes

review_plans — new columns

| Column | Type | Description |
|---|---|---|
| confidence | TEXT | high / low, based on OG language source agreement |
| apple_compat | TEXT | direct_play / remux / audio_transcode / video_transcode |
| job_type | TEXT | copy / transcode, determines expected duration |
| subs_extracted | INTEGER | 1 if subtitles already extracted (existing column, kept) |

stream_decisions — new columns

| Column | Type | Description |
|---|---|---|
| transcode_codec | TEXT | Target codec if transcoding (e.g., eac3, flac), NULL if copy |
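The new columns could be applied with an idempotent migration step along these lines. This is a sketch: it assumes a database that supports `ALTER TABLE ... ADD COLUMN` and that the caller can list each table's existing columns. `subs_extracted` is omitted because it already exists.

```typescript
interface NewColumn {
  table: string;
  column: string;
  type: "TEXT" | "INTEGER";
}

const NEW_COLUMNS: NewColumn[] = [
  { table: "review_plans", column: "confidence", type: "TEXT" },
  { table: "review_plans", column: "apple_compat", type: "TEXT" },
  { table: "review_plans", column: "job_type", type: "TEXT" },
  { table: "stream_decisions", column: "transcode_codec", type: "TEXT" },
];

// existingColumns maps table name to the column names it already has.
// Only statements for columns that are still missing are returned.
function migrationSql(existingColumns: Record<string, string[]>): string[] {
  return NEW_COLUMNS
    .filter((c) => !(existingColumns[c.table] ?? []).includes(c.column))
    .map((c) => `ALTER TABLE ${c.table} ADD COLUMN ${c.column} ${c.type}`);
}
```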

jobs — updated job_type values

Current: audio, extract, convert
New: copy (stream copy only), transcode (includes audio re-encoding)

New config defaults

job_sleep_seconds: '0',
schedule_enabled: '0',
schedule_start: '01:00',
schedule_end: '07:00',

Subtitle Manager (unchanged)

The existing subtitle manager remains as a separate page/tool:

  • Browse extracted sidecar files per media item
  • Delete unwanted sidecar files (the Bazarr gap)
  • Language summary view
  • Title harmonization

The pipeline populates subtitle_files during step 1. The manager reads from that table independently. No coupling between the two beyond the shared table.

Out of Scope

  • Configarr / custom format management — handled externally on Unraid
  • Sonarr/Radarr re-search trigger — future feature (flag incompatible files for re-download)
  • Video transcoding (VP9 → H.264, etc.) — rare edge case, handle via re-download
  • Container conversion (MKV ↔ MP4) — not needed for the pipeline, existing MKV convert command stays available

Guided Gates

  • GG-1: Scan a library with mixed codecs (DTS, AAC, TrueHD, EAC3). Verify the analyzer correctly identifies which files need transcoding vs copy-only.
  • GG-2: Process an MKV file whose only audio track is DTS-HD MA. Verify the FFmpeg command transcodes DTS-HD MA → FLAC (lossless) and the output plays on an iPhone without transcoding.
  • GG-3: Process a TrueHD file in MP4 container. Verify it transcodes to EAC3 (not FLAC, since MP4 doesn't support FLAC).
  • GG-4: Run the Kanban board with 20+ items. Use "Approve up to here" to batch-confirm 15 items. Verify all 15 move to Queued.
  • GG-5: Enable the schedule and set the window to a time range that does not include the current time. Verify the queue pauses and shows "paused until HH:MM".
  • GG-6: Process a file that is already fully compliant (Apple-compatible audio, subs extracted, correct order). Verify it's marked is_noop and shows as Done without entering the queue.
  • GG-7: Verify the subtitle manager still works independently — delete a sidecar file, confirm it's removed from disk and subtitle_files table.
  • GG-8: Collapse/expand a series in the review column. Set OG language at series level. Verify all episodes inherit it. Override one episode. Verify only that episode differs.