The scalar subquery I added in 7d30e6c ran one aggregate scan of
media_streams per row. On a real library (33k items / 212k streams)
a single page took 500+ seconds synchronously, blocking the event
loop and timing out every other request — Library AND Pipeline both
stopped loading.
Swap it for a single batched `GROUP_CONCAT ... WHERE item_id IN (?...)`
query over the current page's ids (max 25), then merge back into rows.
v2026.04.15.10
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per-row audio codec summary (distinct lowercased codecs across an
item's audio streams) via scalar subquery on media_streams, rendered
as "ac3 · aac" in a new monospace Audio column.
v2026.04.15.9
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
worked through AUDIT.md. triage:
- finding 2 (subtitle rescan wipes decisions): confirmed. /:id/rescan now
snapshots custom_titles and calls reanalyze() after the stream delete/
insert, mirroring the review rescan flow. exported reanalyze + titleKey
from review.ts so both routes share the logic.
- finding 3 (scan limit accepts NaN/negatives): confirmed. extracted
parseScanLimit into a pure helper, added unit tests covering NaN,
negatives, floats, infinity, numeric strings. invalid input 400s and
releases the scan_running lock.
- finding 4 (parseId lenient): confirmed. tightened the regex to /^\d+$/
so "42abc", "abc42", "+42", "42.0" all return null. rewrote the test
that codified the old lossy behaviour.
- finding 5 (setup_complete set before jellyfin test passes): confirmed.
the /jellyfin endpoint still persists url+key unconditionally, but now
only flips setup_complete=1 on a successful connection test.
- finding 6 (swallowed errors): partial. the mqtt restart and version-
fetch swallows are intentional best-effort with downstream surfaces
(getMqttStatus, UI fallback). only the scan.ts db-update swallow was
a real visibility gap — logs via logError now.
- finding 1 (auth): left as-is. redacting secrets on GET without auth
on POST is security theater; real fix is an auth layer, which is a
design decision not a bugfix. audit removed from the tree.
- lint fail on ffmpeg.test.ts: formatted.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
analyzer removes every subtitle unconditionally (see case 'Subtitle' in
decideAction) and the pipeline extracts all of them to sidecars — the config
was purely informational and only subtitles.ts echoed it back as
'keepLanguages' for a subtitle-manager ui that doesn't exist yet. we'll
revive language preferences inside that manager when it ships.
removes: the settings card + ui state, POST /api/settings/subtitle-languages,
the config default, the SUBTITLE_LANGUAGES env mapping, AnalyzerConfig's
subtitleLanguages field, RescanConfig's subtitleLanguages field, every
caller site (scan.ts / execute.ts / review.ts), and the keepLanguages
surface in subtitles.ts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the old one-window scheduler gated only the job queue. now the scan loop and
the processing queue have independent windows — useful when the container
runs as an always-on service and we only want to hammer jellyfin + ffmpeg
at night.
config keys renamed from schedule_* to scan_schedule_* / process_schedule_*,
plus the existing job_sleep_seconds. scheduler.ts exposes parallel helpers
(isInScanWindow / isInProcessWindow, waitForScanWindow / waitForProcessWindow)
so each caller picks its window without cross-contamination.
scan.ts checks the scan window between items and emits paused/resumed sse.
execute.ts keeps its per-job pause + sleep-between-jobs but now on the
process window. /api/execute/scheduler moved to /api/settings/schedule.
frontend: ScheduleControls popup deleted from the pipeline header, replaced
with a plain Start queue button. settings page grows a Schedule section with
both windows and the job sleep input.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ffmpeg now writes -metadata:s:a:i language=<iso3> on every kept audio track so
files end up with canonical 3-letter tags (en → eng, ger → deu, null → und).
analyzer passes stream.profile (not title) to transcodeTarget so lossless
dts-hd ma in mkv correctly targets flac. is_noop also checks og-is-default and
canonical-language so pipeline-would-change-it cases stop showing as done.
normalizeLanguage gains 2→3 mapping, and mapStream no longer normalizes at
ingest so the raw jellyfin tag survives for the canonical check.
per-item scan work runs in a single db.transaction for large sqlite speedups,
extracted into server/services/rescan.ts so execute.ts can reuse it.
on successful job, execute calls jellyfin /Items/{id}/Refresh, waits for
DateLastRefreshed to change, refetches the item, and upserts it through the
same pipeline; plan flips to done iff the fresh streams satisfy is_noop.
schema wiped + rewritten to carry jellyfin_raw, external_raw, profile,
bit_depth, date_last_refreshed, runtime_ticks, original_title, last_executed_at
— so future scans aren't required to stay correct. user must drop data/*.db.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two regressions from the radarr/sonarr fix:
1. ERR_INVALID_URL spam — when radarr_enabled='1' but radarr_url is empty
or not http(s), every per-item fetch threw TypeError. We caught it but
still ate the cost (and the log noise) on every movie. New isUsable()
check on each service: enabled-in-config but URL doesn't parse →
warn ONCE and skip arr lookups for the whole scan.
2. Per-item HTTP storm — for movies not in Radarr's library we used to
hit /api/v3/movie (the WHOLE library) again per item, then two
metadata-lookup calls. With 2000 items that's thousands of extra
round-trips and the scan crawled. Now: pre-load the Radarr/Sonarr
library once into Map<tmdbId,..>+Map<imdbId,..>+Map<tvdbId,..>,
per-item lookups are O(1) memory hits, and only the genuinely-missing
items make a single lookup-endpoint HTTP call.
The startup line now reports the library size:
External language sources: radarr=enabled (https://..., 1287 movies in library), sonarr=...
so you can immediately see whether the cache loaded.
All ack'd as real bugs:
frontend
- AudioDetailPage / SubtitleDetailPage / PathsPage / ScanPage /
SubtitleListPage / ExecutePage: load() was a fresh function reference
every render, so 'useEffect(() => load(), [load])' refetched on every
render. Wrap each in useCallback with the right deps ([id], [filter],
or []).
- SetupPage: langsLoaded was useState; setting it inside load() retriggered
the same effect → infinite loop. Switch to useRef. Also wrap saveJellyfin/
Radarr/Sonarr in async fns so they return Promise<void> (matches the
consumer signatures, fixes the latent TS error).
- DashboardPage: redirect target /setup doesn't exist; the route is
/settings.
- ExecutePage: <>...</> fragment with two <tr> children had keys on the
rows but not on the fragment → React reconciliation warning. Use
<Fragment key>. jobTypeLabel + badge variant still branched on the
removed 'subtitle' job_type — relabel to 'Audio Transcode' / 'Audio
Remux' and use 'manual'/'noop' variants.
server
- review.ts + scan.ts: parseLanguageList helper catches JSON errors and
enforces array-of-strings shape with a fallback. A corrupted config
row would otherwise throw mid-scan.
The real reason 8 Mile landed as Turkish: Radarr WAS being called, but the
call path had three silent failure modes that all looked identical from
outside.
1. try { … } catch { return null } swallowed every error. No log when
Radarr was unreachable, when the API key was wrong, when HTTP returned
404/500, or when JSON parsing failed. A miss and a crash looked the
same: null, fall back to Jellyfin's dub guess.
2. /api/v3/movie?tmdbId=X only queries Radarr's LIBRARY. If the movie is
on disk + in Jellyfin but not actively managed in Radarr, returns [].
We then gave up and used the Jellyfin guess.
3. iso6391To6392 fell back to normalizeLanguage(name.slice(0, 3)) for any
unknown language name — pretending 'Mandarin' → 'man' and 'Flemish' →
'fle' are valid ISO 639-2 codes.
Fixes:
- Both services: fetchJson helper logs HTTP errors with context and the
url (api key redacted), plus catches+logs thrown errors.
- Added a metadata-lookup fallback: /api/v3/movie/lookup/tmdb and
/lookup/imdb for Radarr, /api/v3/series/lookup?term=tvdb:X for Sonarr.
These hit TMDB/TVDB via the arr service for titles not in its library.
- Expanded NAME_TO_639_2: Mandarin/Cantonese → zho, Flemish → nld,
Farsi → fas, plus common European langs that were missing.
- Unknown name → return null (log a warning) instead of a made-up 3-char
code. scan.ts then marks needs_review.
- scan.ts: per-item warn when Radarr/Sonarr miss; per-scan summary line
showing hits/misses/no-provider-id tallies.
Run a scan — the logs will now tell you whether Radarr was called, what
it answered, and why it fell back if it did.
Two bugs compounded:
1. extractOriginalLanguage() in jellyfin.ts picked the FIRST audio stream's
language and called it 'original'. Files sourced from non-English regions
often have a local dub as track 0, so 8 Mile with a Turkish dub first
got labelled Turkish.
2. scan.ts promoted any single-source answer to confidence='high' — even
the pure Jellyfin guess, as long as no second source (Radarr/Sonarr)
contradicted it. Jellyfin's dub-magnet guess should never be green.
Fixes:
- extractOriginalLanguage now prefers the IsDefault audio track and skips
tracks whose title shouts 'dub' / 'commentary' / 'director'. Still a
heuristic, but much less wrong. Fallback to the first track when every
candidate looks like a dub so we have *something* to flag.
- scan.ts: high confidence requires an authoritative source (Radarr/Sonarr)
with no conflict. A Jellyfin-only answer is always low confidence AND
gets needs_review=1 so it surfaces in the pipeline for manual override.
- Data migration (idempotent): downgrade existing plans backed only by the
Jellyfin heuristic to low confidence and mark needs_review=1, so users
don't have to rescan to benefit.
- New server/services/__tests__/jellyfin.test.ts covers the default-track
preference and dub-skip behavior.
- execute: actually call isInScheduleWindow/waitForWindow/sleepBetweenJobs in runSequential (they were dead code); emit queue_status SSE events (running/paused/sleeping/idle) so the pipeline's existing QueueStatus listener lights up
- review: POST /:id/retry resets an errored plan to approved, wipes old done/error jobs, rebuilds command from current decisions, queues fresh job
- scan: dev-mode DELETE now also wipes jobs + subtitle_files (previously orphaned after every dev reset)
- biome: migrate config to 2.4 schema, autoformat 68 files (strings + indentation), relax opinionated a11y/hooks-deps/index-key rules that don't fit this codebase
- routeTree.gen.ts regenerated after /nodes removal
- analyzer: rewrite checkAudioOrderChanged to compare actual output order, unify assignTargetOrder with a shared sortKeptStreams util in ffmpeg builder
- review: recompute is_noop via full audio removed/reordered/transcode/subs check on toggle, preserve custom_title across rescan by matching (type,lang,stream_index,title), batch pipeline transcode-reasons query to avoid N+1
- validate: add lib/validate.ts with parseId + isOneOf helpers; replace bare Number(c.req.param('id')) with 400 on invalid ids across review/subtitles
- scan: atomic CAS on scan_running config to prevent concurrent scans
- subtitles: path-traversal guard — only unlink sidecars within the media item's directory; log-and-orphan DB entries pointing outside
- schedule: include end minute in window (<= vs <)
- db: add indexes on review_plans(status,is_noop), stream_decisions(plan_id), media_items(series_jellyfin_id,series_name,type), media_streams(item_id,type), subtitle_files(item_id), jobs(status,item_id)
all server output now prefixed with ISO timestamp and level [INFO/WARN/ERROR].
logs requests, scan start/complete, job lifecycle, errors. skips noisy SSE
endpoints.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jellyfin may use different internal paths (e.g. /tv/) than container mounts
(/series/). path_mappings config (or PATH_MAPPINGS env var) translates at scan
time. configurable via setup ui or env var format: /tv/=/series/,/data/=/movies/
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rewrite from monolithic hono jsx to react 19 spa with tanstack router
+ hono json api backend. add scan, review, execute, nodes, and setup
pages. multi-stage dockerfile (node for vite build, bun for runtime).
previously, server/ and src/shared/lib/ were silently excluded by
global gitignore patterns (/server/ from emacs, lib/ from python).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>