Abraham Lincoln crashed with exit 234 because the file had 14 dvd_subtitle
streams: our extraction dict only keyed on the long form (dvd_subtitle)
while jellyfin stores the short form (dvdsub), so the lookup fell back
to .srt, ffmpeg picked the srt muxer, and srt can't encode image-based
subs. textbook silent dict miss.
replaced the extension dict with an EXTRACTABLE map that pairs codec →
{ext, codecArg} and explicitly enumerates every codec we can route to a
single-file sidecar. everything else (dvd_subtitle/dvdsub, dvb_subtitle/
dvbsub, unknown codecs) is now skipped at command-build time. the plan
picks up a note like '14 subtitle(s) dropped: dvdsub (eng, est, ind,
kor, jpn, lav, lit, may, chi, chi, tha, vie, rus, ukr) — not extractable
to sidecar' so the user sees exactly what didn't make it.
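a minimal sketch of the allow-list idea described above. the codec entries, the SubStream shape and the planSubtitles name are assumptions for illustration, not the project's actual table; the note string is modeled on the one quoted above.

```typescript
// Hypothetical allow-list: only codecs listed here are routed to a sidecar.
// Entries and codecArg values are illustrative assumptions.
type Extractable = { ext: string; codecArg: string };

const EXTRACTABLE = new Map<string, Extractable>([
  ["subrip", { ext: "srt", codecArg: "copy" }],
  ["ass", { ext: "ass", codecArg: "copy" }],
  ["webvtt", { ext: "vtt", codecArg: "copy" }],
]);

interface SubStream {
  codec: string | null;
  language: string | null;
}

// Split streams into extractable ones and a per-codec note for the rest.
function planSubtitles(streams: SubStream[]) {
  const extract: Array<{ stream: SubStream; target: Extractable }> = [];
  const dropped = new Map<string, string[]>(); // codec -> languages

  for (const stream of streams) {
    const codec = (stream.codec ?? "unknown").toLowerCase();
    const target = EXTRACTABLE.get(codec);
    if (target) {
      extract.push({ stream, target });
      continue;
    }
    // No fallback extension: anything not enumerated is skipped and reported.
    const langs = dropped.get(codec) ?? [];
    langs.push(stream.language ?? "und");
    dropped.set(codec, langs);
  }

  const notes = Array.from(dropped).map(
    ([codec, langs]) =>
      `${langs.length} subtitle(s) dropped: ${codec} (${langs.join(", ")}) - not extractable to sidecar`,
  );
  return { extract, notes };
}
```

the point of the map over a dict-with-default is that a miss can no longer turn into a wrong muxer choice; it can only turn into a visible note.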
also added extractErrorSummary in execute.ts: when a job errors, scan
the last 60 stderr lines for fatal keywords (Error:, Conversion failed!,
Unsupported, Invalid argument, Permission denied, No space left, …),
dedupe the matches, and prepend the summary to the job's stored output.
the review_plan notes get the same summary, which surfaces the real
cause next to the plan instead of burying it under ffmpeg's 200-line
banner.
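a rough sketch of the stderr scan, assuming a plain string input and a string[] result. the keyword list contains only the keywords named above (the real list is longer), and the signature is a guess, not the one in execute.ts.

```typescript
// Only the keywords named in the commit message; the real list has more.
const FATAL_KEYWORDS = [
  "Error:",
  "Conversion failed!",
  "Unsupported",
  "Invalid argument",
  "Permission denied",
  "No space left",
];

// Scan the tail of stderr for lines containing a fatal keyword,
// keeping each distinct line once, in order of first appearance.
function extractErrorSummary(stderr: string, tailLines = 60): string[] {
  const tail = stderr.split(/\r?\n/).slice(-tailLines);
  const seen = new Set<string>();
  const summary: string[] = [];

  for (const raw of tail) {
    const line = raw.trim();
    if (line === "" || seen.has(line)) continue;
    if (FATAL_KEYWORDS.some((kw) => line.includes(kw))) {
      seen.add(line);
      summary.push(line);
    }
  }
  return summary;
}
```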
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
user reported ad astra got the double checkmark instantly after
transcode. that was the expected behaviour of the old code, and the
user was right to flag it: the post-execute verifyDesiredState ran
ffprobe on the file we had just written, so it tautologically matched
the plan every time. not a second opinion.
replaced the flow with the semantics we actually wanted:
1. refreshItem now returns { refreshed: boolean } — true when jellyfin's
DateLastRefreshed actually advanced within the timeout, false when it
didn't. callers can tell 'jellyfin really re-probed' apart from
'we timed out waiting'.
2. handOffToJellyfin post-job: refresh → (only if refreshed=true) fetch
fresh streams → upsertJellyfinItem(source='webhook'). the rescan SQL
sets verified=1 exactly when the fresh analysis sees is_noop=1, so
✓✓ now means 'jellyfin independently re-probed the file we wrote
and agrees it matches the plan'. if jellyfin sees a drifted layout
the plan flips back to pending so the user notices instead of the
job silently rubber-stamping a bad output.
3. dropped the post-execute ffprobe block. the preflight-skipped branch
no longer self-awards verified=1 either; it now does the same hand-
off so jellyfin's re-probe drives the ✓✓ in that branch too.
refreshItem's other two callers (review /rescan, subtitles /rescan)
ignore the return value — their semantics haven't changed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
worked through AUDIT.md. triage:
- finding 2 (subtitle rescan wipes decisions): confirmed. /:id/rescan now
snapshots custom_titles and calls reanalyze() after the stream delete/
insert, mirroring the review rescan flow. exported reanalyze + titleKey
from review.ts so both routes share the logic.
- finding 3 (scan limit accepts NaN/negatives): confirmed. extracted
parseScanLimit into a pure helper, added unit tests covering NaN,
negatives, floats, infinity, numeric strings. invalid input 400s and
releases the scan_running lock.
- finding 4 (parseId lenient): confirmed. tightened the regex to /^\d+$/
so "42abc", "abc42", "+42", "42.0" all return null. rewrote the test
that codified the old lossy behaviour.
- finding 5 (setup_complete set before jellyfin test passes): confirmed.
the /jellyfin endpoint still persists url+key unconditionally, but now
only flips setup_complete=1 on a successful connection test.
- finding 6 (swallowed errors): partial. the mqtt restart and version-
fetch swallows are intentional best-effort with downstream surfaces
(getMqttStatus, UI fallback). only the scan.ts db-update swallow was
a real visibility gap — logs via logError now.
- finding 1 (auth): left as-is. redacting secrets on GET without auth
on POST is security theater; real fix is an auth layer, which is a
design decision not a bugfix.
- AUDIT.md removed from the tree.
- lint fail on ffmpeg.test.ts: formatted.
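for finding 4, a sketch of what a strict parseId could look like. the regex is the one named above; the signature and the safe-integer check are assumptions.

```typescript
// Strict id parser: digits only, no sign, no decimals, no trailing junk.
function parseId(raw: string | undefined): number | null {
  if (raw === undefined || !/^\d+$/.test(raw)) return null;
  const n = Number(raw);
  // Assumption: ids beyond the safe integer range are rejected as well.
  return Number.isSafeInteger(n) ? n : null;
}
```

Number("42abc") is NaN but parseInt("42abc", 10) is 42, which is the kind of lossy behaviour a digits-only regex rules out before any conversion happens.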
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
monitoring the mqtt broker revealed two bugs and one design dead-end:
1. the jellyfin-plugin-webhook publishes pascalcase fields
(NotificationType, ItemId, ItemType) and we were reading camelcase
(event, itemId, itemType). every real payload was rejected by the
first guard — the mqtt path never ingested anything.
2. the plugin has no ItemUpdated / Library.* notifications. file
rewrites on existing items produce zero broker traffic (observed:
transcode + manual refresh metadata + 'recently added' appearance
→ no mqtt messages). ✓✓ via webhook is structurally impossible.
fix the webhook path so brand-new library items actually get ingested,
and narrow ACCEPTED_EVENTS to just 'ItemAdded' (the only library-side
event the plugin emits).
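a sketch of the corrected first guard, assuming the payload arrives as a JSON string. the three PascalCase field names and ACCEPTED_EVENTS come from the notes above; the function name and return shape are hypothetical.

```typescript
// The only library-side event the plugin was observed to emit.
const ACCEPTED_EVENTS = new Set<string>(["ItemAdded"]);

interface WebhookEvent {
  notificationType: string;
  itemId: string;
  itemType: string;
}

// Parse a raw broker message and accept it only if it carries the
// PascalCase fields and an accepted notification type.
function parseWebhookPayload(raw: string): WebhookEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;

  const rec = data as Record<string, unknown>;
  const notificationType = rec["NotificationType"];
  const itemId = rec["ItemId"];
  const itemType = rec["ItemType"];
  if (
    typeof notificationType !== "string" ||
    typeof itemId !== "string" ||
    typeof itemType !== "string"
  ) {
    return null; // camelCase payloads end up here
  }
  if (!ACCEPTED_EVENTS.has(notificationType)) return null;
  return { notificationType, itemId, itemType };
}
```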
move the ✓✓ signal from webhook-corroboration to post-execute ffprobe
via the existing verifyDesiredState helper: after ffmpeg returns 0 we
probe the output file ourselves and flip verified=1 on match. the
preflight-skipped path sets verified=1 too. renamed the db column
webhook_verified → verified (via idempotent RENAME COLUMN migration)
since the signal is no longer webhook-sourced, and updated the Done
column tooltip to reflect that ffprobe is doing the verification.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
audio tracks now get a harmonized title on output (overriding any file
title like 'Audio Description' — review has already filtered out tracks
we don't want to keep). mono/stereo render numerically (1.0/2.0), matching
the .1-suffixed surround layouts. pipeline card rows become two-line so
long titles wrap instead of being clipped by the column.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
adds review_plans.webhook_verified, set to 1 whenever a fresh analysis
(scan or post-execute webhook) sees is_noop=1, cleared if a webhook
later flips the plan off-noop. resurrected the try/catch alter table
migration pattern in server/db/index.ts for the new column.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
two simplifications to how we pick and transcode the one-per-language
audio track, motivated by seeing inconsistent DTS → FLAC vs DTS →
EAC3 outputs in the wild:
transcode target:
- drop the FLAC path entirely. every incompatible source now targets
EAC3 regardless of container or lossless/lossy status
- FLAC for movie audio is bad value: ~2-3× the file size vs EAC3, no
Atmos spatial metadata (TrueHD Atmos → FLAC silently loses Atmos),
no AVR passthrough on Apple TV
- one target = no more container-conditional surprises
winner within a language group (betterAudio):
- new priority: highest channels → Apple-compatible → default → index
- old order put 'default' on top which forced a DTS-HD MA transcode
even when an AC3 track at equal channels was right next to it.
flipping means AC3 beats DTS-HD MA at the same channel count — pure
copy instead of a lossless-then-re-encode round trip
- channel count still dominates, so 7.1 TrueHD still beats 5.1 AC3
(and gets transcoded, which is the right call for real surround)
tests: new case for DTS-HD MA default + AC3 non-default at 5.1 → AC3
wins, job_type=copy. new case for 7.1 TrueHD beats 5.1 AC3 default.
every other existing test still holds.
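the new priority order, sketched as a pairwise pick. the AudioCandidate shape and the contents of APPLE_COMPATIBLE are assumptions, and 'index' is read as lowest stream index wins.

```typescript
interface AudioCandidate {
  streamIndex: number;
  codec: string;
  channels: number;
  isDefault: boolean;
}

// Assumed set of codecs that can be copied without a transcode.
const APPLE_COMPATIBLE = new Set<string>(["aac", "ac3", "eac3", "alac"]);

// Priority: highest channels, then Apple-compatible, then default, then index.
function betterAudio(a: AudioCandidate, b: AudioCandidate): AudioCandidate {
  if (a.channels !== b.channels) return a.channels > b.channels ? a : b;

  const aApple = APPLE_COMPATIBLE.has(a.codec);
  const bApple = APPLE_COMPATIBLE.has(b.codec);
  if (aApple !== bApple) return aApple ? a : b;

  if (a.isDefault !== b.isDefault) return a.isDefault ? a : b;

  return a.streamIndex <= b.streamIndex ? a : b;
}
```

reducing a language group with this function gives the winner; the two new test cases above map directly onto the first two comparisons.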
a release with 2× english (main + director's commentary, or a surround
track plus an audio-description track) was keeping both. the user only
wants one per language. rules, in priority order:
- always drop commentary / audio-description / visually-impaired /
karaoke / sign-language tracks (matched by title regex + the
is_hearing_impaired flag)
- within each kept-language group, pick one winner by:
1. default disposition (main track the muxer chose)
2. highest channel count
3. apple-compatible codec (skip a transcode pass)
4. lowest stream_index for stability
tests cover: commentary dropped even when it matches OG, AD flag
dropped, default beats non-default, higher channels beat default-less
candidates of equal type, Apple-compat tiebreak, per-language dedupe
runs independently, and single-stream files stay noop.
two fixes based on actual behavior of the jellyfin webhook plugin:
- 'Webhook Url' setup value no longer re-serialized with mqtt://. show
the user's broker url verbatim so whatever protocol they use (ws://,
http://, etc.) survives the round trip
- dropped the server-side 'trigger a jellyfin rescan during the test'
machinery. a refresh that doesn't mutate metadata won't fire Item
Added, so relying on it produced false negatives. now we just wait
for any message on the topic; ui instructs the user to hit play on a
movie in jellyfin while the test runs — playback start is a
deterministic trigger, unlike library events
- setup panel now lists Notification Types as 'Item Added, Playback
Start'. playback start is for the test only; the production handler
still filters events down to item added / updated
- MqttSection now renders as a nested block inside the Jellyfin
ConnSection instead of its own card; ConnSection grew a children slot
- when the enable checkbox is off, broker/topic/credentials inputs and
the whole plugin setup panel are hidden; only the toggle + a small
save button remain
- 'Test Connection' became 'Test end-to-end': connects to the broker,
subscribes, picks a random scanned movie/episode, asks jellyfin to
refresh it, and waits for a matching webhook message. the UI walks
through all three steps (broker reachable → jellyfin rescan triggered
→ webhook received) with per-step success/failure so a broken
plugin config is obvious
- new mqtt_enabled config + toggle at top of the section; subscriber
only starts when the box is checked
- moved the whole MqttSection directly below the Jellyfin section so
all jellyfin-adjacent config lives together
- rewrote the plugin setup list to match the actual form order and
group it: 'Top of plugin page' (Server Url = jellyfin base URL),
'Generic destination', 'MQTT settings', 'Template'
- fields the user picks from a dropdown or toggles (Status,
Notification Type, Item Type, Use TLS, Use Credentials, QoS) now
render a 'select' hint instead of a broken Copy button
after ffmpeg finishes we used to block the queue on a jellyfin refresh
+ re-analyze round-trip. now we just kick jellyfin and return. a new
mqtt subscriber listens for library events from jellyfin's webhook
plugin and re-runs upsertJellyfinItem — flipping plans back to pending
when the on-disk streams still don't match, otherwise confirming done.
- execute.ts: hand-off is fire-and-forget; no more sync re-analyze
- rescan.ts: upsertJellyfinItem takes source: 'scan' | 'webhook'.
webhook-sourced rescans can reopen terminal 'done' plans when
is_noop flips back to 0; scan-sourced rescans still treat done as
terminal (keeps the dup-job fix from a06ab34 intact).
- mqtt.ts: long-lived client, auto-reconnect, status feed for UI badge
- webhook.ts: pure processWebhookEvent(db, deps) handler + 5s dedupe
map to kill jellyfin's burst re-fires during library scans
- settings: /api/settings/mqtt{,/status,/test} + /api/settings/
jellyfin/webhook-plugin (checks if the plugin is installed)
- ui: new Settings section with broker form, test button, copy-paste
setup panel for the Jellyfin plugin template. MQTT status badge on
the scan page.
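the 5s dedupe map could look roughly like this. createDeduper and the injected clock are hypothetical; only the 5-second window comes from the notes above.

```typescript
// Returns a function that says whether an event key should be processed.
// Repeats of the same key inside the window are suppressed.
function createDeduper(windowMs = 5000) {
  const lastSeen = new Map<string, number>();

  return function shouldProcess(key: string, now: number = Date.now()): boolean {
    // Drop entries whose window has expired so the map stays small.
    lastSeen.forEach((seenAt, k) => {
      if (now - seenAt >= windowMs) lastSeen.delete(k);
    });
    if (lastSeen.has(key)) return false; // burst re-fire, ignore
    lastSeen.set(key, now);
    return true;
  };
}
```

in this sketch the window is anchored at the first event and repeats do not extend it, so a long burst cannot suppress an item forever.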
root cause of duplicate pipeline entries: rescan.ts flipped done plans
back to pending whenever a post-job jellyfin refresh returned stale
metadata, putting the item back in review and letting a second jobs row
pile up in done. done is now sticky across rescans (error still
re-opens for retries).
second line of defense: before spawning ffmpeg, ffprobe the file and
compare audio count/language/codec order + embedded subtitle count
against the plan. if it already matches, mark the job done with the
reason in jobs.output and skip the spawn. prevents corrupting a
post-processed file with a stale stream-index command.
analyzer removes every subtitle unconditionally (see case 'Subtitle' in
decideAction) and the pipeline extracts all of them to sidecars — the config
was purely informational and only subtitles.ts echoed it back as
'keepLanguages' for a subtitle-manager ui that doesn't exist yet. we'll
revive language preferences inside that manager when it ships.
removes: the settings card + ui state, POST /api/settings/subtitle-languages,
the config default, the SUBTITLE_LANGUAGES env mapping, AnalyzerConfig's
subtitleLanguages field, RescanConfig's subtitleLanguages field, every
caller site (scan.ts / execute.ts / review.ts), and the keepLanguages
surface in subtitles.ts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the old one-window scheduler gated only the job queue. now the scan loop and
the processing queue have independent windows — useful when the container
runs as an always-on service and we only want to hammer jellyfin + ffmpeg
at night.
config keys renamed from schedule_* to scan_schedule_* / process_schedule_*,
plus the existing job_sleep_seconds. scheduler.ts exposes parallel helpers
(isInScanWindow / isInProcessWindow, waitForScanWindow / waitForProcessWindow)
so each caller picks its window without cross-contamination.
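a sketch of a single window check that both helpers could share. the ScheduleWindow shape ("HH:MM" strings plus an enabled flag), the overnight handling and the 'disabled means always open' rule are assumptions; the inclusive end minute matches the earlier '<= vs <' fix.

```typescript
interface ScheduleWindow {
  enabled: boolean;
  start: string; // "HH:MM", local time
  end: string; // "HH:MM", local time, inclusive
}

function toMinutes(hhmm: string): number {
  const parts = hhmm.split(":");
  return Number(parts[0]) * 60 + Number(parts[1]);
}

// True when `now` falls inside the window. Windows where start > end
// are treated as wrapping past midnight (e.g. 22:00 to 06:00).
function isInWindow(w: ScheduleWindow, now: Date): boolean {
  if (!w.enabled) return true; // assumption: no schedule means always allowed
  const cur = now.getHours() * 60 + now.getMinutes();
  const start = toMinutes(w.start);
  const end = toMinutes(w.end);
  return start <= end
    ? cur >= start && cur <= end
    : cur >= start || cur <= end;
}
```

isInScanWindow and isInProcessWindow would then only differ in which config keys they read before calling this.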
scan.ts checks the scan window between items and emits paused/resumed sse.
execute.ts keeps its per-job pause + sleep-between-jobs but now on the
process window. /api/execute/scheduler moved to /api/settings/schedule.
frontend: ScheduleControls popup deleted from the pipeline header, replaced
with a plain Start queue button. settings page grows a Schedule section with
both windows and the job sleep input.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ffmpeg now writes -metadata:s:a:i language=<iso3> on every kept audio track so
files end up with canonical 3-letter tags (en → eng, ger → deu, null → und).
analyzer passes stream.profile (not title) to transcodeTarget so lossless
dts-hd ma in mkv correctly targets flac. is_noop also checks og-is-default and
canonical-language so pipeline-would-change-it cases stop showing as done.
normalizeLanguage gains 2→3 mapping, and mapStream no longer normalizes at
ingest so the raw jellyfin tag survives for the canonical check.
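a sketch of the canonicalization and the per-track metadata args. the three mappings named above (en → eng, ger → deu, null → und) are from this change; the other table entries, the pass-through for unknown 3-letter tags and the languageArgs helper are assumptions.

```typescript
// Partial tables for illustration only.
const TWO_TO_THREE = new Map<string, string>([
  ["en", "eng"],
  ["de", "deu"],
  ["fr", "fra"],
  ["es", "spa"],
  ["ja", "jpn"],
]);

// ISO 639-2 bibliographic codes mapped to their terminologic form.
const BIBLIOGRAPHIC_TO_TERMINOLOGIC = new Map<string, string>([
  ["ger", "deu"],
  ["fre", "fra"],
  ["dut", "nld"],
  ["chi", "zho"],
]);

function normalizeLanguage(tag: string | null | undefined): string {
  if (!tag) return "und";
  const t = tag.trim().toLowerCase();
  if (t.length === 2) return TWO_TO_THREE.get(t) ?? "und";
  if (t.length === 3) return BIBLIOGRAPHIC_TO_TERMINOLOGIC.get(t) ?? t;
  return "und";
}

// Build the -metadata:s:a:<i> language=<iso3> pairs for the kept audio tracks,
// where i is the output audio index.
function languageArgs(keptLanguages: Array<string | null>): string[] {
  return keptLanguages.flatMap((lang, i) => [
    `-metadata:s:a:${i}`,
    `language=${normalizeLanguage(lang)}`,
  ]);
}
```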
per-item scan work runs in a single db.transaction for large sqlite speedups,
extracted into server/services/rescan.ts so execute.ts can reuse it.
on successful job, execute calls jellyfin /Items/{id}/Refresh, waits for
DateLastRefreshed to change, refetches the item, and upserts it through the
same pipeline; plan flips to done iff the fresh streams satisfy is_noop.
schema wiped + rewritten to carry jellyfin_raw, external_raw, profile,
bit_depth, date_last_refreshed, runtime_ticks, original_title, last_executed_at
— so future scans aren't required to stay correct. user must drop data/*.db.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two regressions from the radarr/sonarr fix:
1. ERR_INVALID_URL spam — when radarr_enabled='1' but radarr_url is empty
or not http(s), every per-item fetch threw TypeError. We caught it but
still ate the cost (and the log noise) on every movie. New isUsable()
check on each service: enabled-in-config but URL doesn't parse →
warn ONCE and skip arr lookups for the whole scan.
2. Per-item HTTP storm — for movies not in Radarr's library we used to
hit /api/v3/movie (the WHOLE library) again per item, then two
metadata-lookup calls. With 2000 items that's thousands of extra
round-trips and the scan crawled. Now: pre-load the Radarr/Sonarr
library once into Map<tmdbId,..>+Map<imdbId,..>+Map<tvdbId,..>,
per-item lookups are O(1) memory hits, and only the genuinely-missing
items make a single lookup-endpoint HTTP call.
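the pre-load idea for the movie side, sketched with an in-memory list standing in for the one /api/v3/movie response. the ArrMovie shape and both function names are assumptions; the sonarr tvdbId map would follow the same pattern.

```typescript
interface ArrMovie {
  title: string;
  tmdbId?: number;
  imdbId?: string;
}

interface MovieIndex {
  byTmdb: Map<number, ArrMovie>;
  byImdb: Map<string, ArrMovie>;
  size: number;
}

// Build both lookup maps once per scan from the full library listing.
function buildMovieIndex(movies: ArrMovie[]): MovieIndex {
  const byTmdb = new Map<number, ArrMovie>();
  const byImdb = new Map<string, ArrMovie>();
  for (const m of movies) {
    if (m.tmdbId) byTmdb.set(m.tmdbId, m);
    if (m.imdbId) byImdb.set(m.imdbId, m);
  }
  return { byTmdb, byImdb, size: movies.length };
}

// O(1) per-item lookup; undefined means "not in library", which is the
// only case that should fall through to a lookup-endpoint HTTP call.
function findInLibrary(
  index: MovieIndex,
  ids: { tmdbId?: number; imdbId?: string },
): ArrMovie | undefined {
  if (ids.tmdbId !== undefined) {
    const hit = index.byTmdb.get(ids.tmdbId);
    if (hit) return hit;
  }
  if (ids.imdbId !== undefined) return index.byImdb.get(ids.imdbId);
  return undefined;
}
```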
The startup line now reports the library size:
External language sources: radarr=enabled (https://..., 1287 movies in library), sonarr=...
so you can immediately see whether the cache loaded.
The real reason 8 Mile landed as Turkish: Radarr WAS being called, but the
call path had three silent failure modes that all looked identical from
outside.
1. try { … } catch { return null } swallowed every error. No log when
Radarr was unreachable, when the API key was wrong, when HTTP returned
404/500, or when JSON parsing failed. A miss and a crash looked the
same: null, fall back to Jellyfin's dub guess.
2. /api/v3/movie?tmdbId=X only queries Radarr's LIBRARY. If the movie is
on disk + in Jellyfin but not actively managed in Radarr, returns [].
We then gave up and used the Jellyfin guess.
3. iso6391To6392 fell back to normalizeLanguage(name.slice(0, 3)) for any
unknown language name — pretending 'Mandarin' → 'man' and 'Flemish' →
'fle' are valid ISO 639-2 codes.
Fixes:
- Both services: fetchJson helper logs HTTP errors with context and the
url (api key redacted), plus catches+logs thrown errors.
- Added a metadata-lookup fallback: /api/v3/movie/lookup/tmdb and
/lookup/imdb for Radarr, /api/v3/series/lookup?term=tvdb:X for Sonarr.
These hit TMDB/TVDB via the arr service for titles not in its library.
- Expanded NAME_TO_639_2: Mandarin/Cantonese → zho, Flemish → nld,
Farsi → fas, plus common European langs that were missing.
- Unknown name → return null (log a warning) instead of a made-up 3-char
code. scan.ts then marks needs_review.
- scan.ts: per-item warn when Radarr/Sonarr miss; per-scan summary line
showing hits/misses/no-provider-id tallies.
Run a scan — the logs will now tell you whether Radarr was called, what
it answered, and why it fell back if it did.
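a sketch of the 'unknown name → null' rule. the four mappings shown are the ones named above; the table here is deliberately tiny, and the function name and warn callback are hypothetical.

```typescript
// Tiny excerpt; the real NAME_TO_639_2 table has many more entries.
const NAME_TO_639_2 = new Map<string, string>([
  ["mandarin", "zho"],
  ["cantonese", "zho"],
  ["flemish", "nld"],
  ["farsi", "fas"],
]);

// Resolve a language name to an ISO 639-2 code, or null when unknown.
// Never derives a code from the first three letters of the name.
function languageNameTo6392(
  name: string,
  warn: (msg: string) => void = console.warn,
): string | null {
  const code = NAME_TO_639_2.get(name.trim().toLowerCase());
  if (code === undefined) {
    warn(`unknown language name: ${JSON.stringify(name)}`);
    return null; // caller is expected to mark the item needs_review
  }
  return code;
}
```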
Two bugs compounded:
1. extractOriginalLanguage() in jellyfin.ts picked the FIRST audio stream's
language and called it 'original'. Files sourced from non-English regions
often have a local dub as track 0, so 8 Mile with a Turkish dub first
got labelled Turkish.
2. scan.ts promoted any single-source answer to confidence='high' — even
the pure Jellyfin guess, as long as no second source (Radarr/Sonarr)
contradicted it. Jellyfin's dub-magnet guess should never be green.
Fixes:
- extractOriginalLanguage now prefers the IsDefault audio track and skips
tracks whose title shouts 'dub' / 'commentary' / 'director'. Still a
heuristic, but much less wrong. Fallback to the first track when every
candidate looks like a dub so we have *something* to flag.
- scan.ts: high confidence requires an authoritative source (Radarr/Sonarr)
with no conflict. A Jellyfin-only answer is always low confidence AND
gets needs_review=1 so it surfaces in the pipeline for manual override.
- Data migration (idempotent): downgrade existing plans backed only by the
Jellyfin heuristic to low confidence and mark needs_review=1, so users
don't have to rescan to benefit.
- New server/services/__tests__/jellyfin.test.ts covers the default-track
preference and dub-skip behavior.
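A rough sketch of the revised heuristic, under stated assumptions: the stream shape uses Jellyfin-style field names (Language, IsDefault, Title), and both the regex and the function name are illustrative rather than the code in jellyfin.ts.

```typescript
interface AudioStream {
  Language?: string;
  IsDefault?: boolean;
  Title?: string;
}

// Titles that suggest the track is not the original audio.
const DUB_LIKE = /\bdub(bed)?\b|commentary|director/i;

// Prefer a default track that does not look like a dub; if every track
// looks like a dub, fall back to the first one so there is something to flag.
function guessOriginalLanguage(streams: AudioStream[]): string | null {
  const withLang = streams.filter((s) => Boolean(s.Language));
  if (withLang.length === 0) return null;

  const candidates = withLang.filter((s) => !DUB_LIKE.test(s.Title ?? ""));
  if (candidates.length === 0) return withLang[0]?.Language ?? null;

  const pick = candidates.find((s) => s.IsDefault) ?? candidates[0];
  return pick?.Language ?? null;
}
```

The result is still a guess, which is why a Jellyfin-only answer stays low confidence regardless of which track this picks.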
Bug: every approve path (buildCommand used by review approve/approve-all/
series approve-all/season approve-all/retry/detail preview) was building
an ffmpeg command that -map'd only the 'keep' streams and dropped all
subtitles. For a file like Wuthering Heights with 37 embedded subs, the
run would delete every sub into the void — user expected extraction to
sidecar files per the pipeline contract.
buildPipelineCommand already did the right thing (extract every subtitle
with -map 0:s:N -c:s copy 'basename.lang.srt', then remux kept streams)
but it was only reached by tests. buildCommand now delegates to it — one
call site, subtitle extraction always runs, predictExtractedFiles records
the sidecar paths after job success (same logic, same basePath).
Added a regression test: buildCommand on a 2-subtitle file contains
-map 0:s:0, -map 0:s:1 and the expected 'basename.en.srt'/'.de.srt' paths.
Subtitle extraction lives only in the pipeline now; a file is 'done' when it
matches the desired end state — no embedded subs AND audio matches the
language config. The separate Extract page was redundant.
- delete src/routes/review/subtitles/extract.tsx + SubtitleExtractPage
- delete /api/subtitles/extract-all + /:id/extract endpoints
- delete buildExtractOnlyCommand + unused buildExtractionOutputs from ffmpeg.ts
- detail page: drop Extract button + extractCommand textarea, replace with
'will be extracted via pipeline' note when embedded subs present
- pipeline endpoint: doneCount = is_noop OR status='done' (a file in the
desired state, however it got there); UI label 'N files in desired state'
- nav: drop the now-defunct 'Extract subs' link, default activeOptions.exact
to false so detail subpages (e.g. /review/audio/123) highlight their
parent ('Audio') in the menu; the exact match was what made the menu
feel broken
- execute: actually call isInScheduleWindow/waitForWindow/sleepBetweenJobs in runSequential (they were dead code); emit queue_status SSE events (running/paused/sleeping/idle) so the pipeline's existing QueueStatus listener lights up
- review: POST /:id/retry resets an errored plan to approved, wipes old done/error jobs, rebuilds command from current decisions, queues fresh job
- scan: dev-mode DELETE now also wipes jobs + subtitle_files (previously orphaned after every dev reset)
- biome: migrate config to 2.4 schema, autoformat 68 files (strings + indentation), relax opinionated a11y/hooks-deps/index-key rules that don't fit this codebase
- routeTree.gen.ts regenerated after /nodes removal
- analyzer: rewrite checkAudioOrderChanged to compare actual output order, unify assignTargetOrder with a shared sortKeptStreams util in ffmpeg builder
- review: recompute is_noop via full audio removed/reordered/transcode/subs check on toggle, preserve custom_title across rescan by matching (type,lang,stream_index,title), batch pipeline transcode-reasons query to avoid N+1
- validate: add lib/validate.ts with parseId + isOneOf helpers; replace bare Number(c.req.param('id')) with 400 on invalid ids across review/subtitles
- scan: atomic CAS on scan_running config to prevent concurrent scans
- subtitles: path-traversal guard — only unlink sidecars within the media item's directory; log-and-orphan DB entries pointing outside
- schedule: include end minute in window (<= vs <)
- db: add indexes on review_plans(status,is_noop), stream_decisions(plan_id), media_items(series_jellyfin_id,series_name,type), media_streams(item_id,type), subtitle_files(item_id), jobs(status,item_id)
- remove nodes table, ssh service, nodes api, NodesPage route
- execute.ts: local-only spawn, atomic CAS job claim via UPDATE status
- wrap job done + subtitle_files insert + review_plans status in db transaction
- stream ffmpeg output per line with 500ms throttled flush
- bump version to 2026.04.13
- server-side filter + LIMIT 200 + totalCounts on GET /api/execute
- shared FilterTabs component with status-colored active tabs
- execute page: filter tabs, SSE live count updates, module-level cache
- replace inline tab pills in AudioListPage, SubtitleListPage with FilterTabs
- fix buildExtractOnlyCommand: skip -map 0:a when no audio streams exist
- bump version
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rewrite from monolithic hono jsx to react 19 spa with tanstack router
+ hono json api backend. add scan, review, execute, nodes, and setup
pages. multi-stage dockerfile (node for vite build, bun for runtime).
previously, server/ and src/shared/lib/ were silently excluded by
global gitignore patterns (/server/ from emacs, lib/ from python).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>