20 Commits

Author SHA1 Message Date
0c595a787e library: batch audio-codec lookup — per-row subquery was O(page×streams)
All checks were successful
Build and Push Docker Image / build (push) Successful in 1m11s
The scalar subquery I added in 7d30e6c ran one aggregate scan of
media_streams per row. On a real library (33k items / 212k streams)
a single page took 500+ seconds synchronously, blocking the event
loop and timing out every other request — Library AND Pipeline both
stopped loading.

Swap it for a single batched `GROUP_CONCAT ... WHERE item_id IN (?...)`
query over the current page's ids (max 25), then merge back into rows.
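
A minimal sketch of the batching described above, not the committed code. Only `media_streams`, `GROUP_CONCAT`, and the `IN (?...)` shape come from the message; the column names (`item_id`, `type`, `codec`), the row shapes, and both helper names are assumptions made for illustration.

```typescript
// Hypothetical row shapes; the real schema may differ.
interface LibraryRow { id: number; name: string; audioCodecs?: string }
interface CodecRow { item_id: number; codecs: string }

// Build ONE query for the whole page instead of one subquery per row.
// Callers should skip the query entirely when `ids` is empty.
function buildAudioCodecQuery(ids: number[]): { sql: string; params: number[] } {
  const placeholders = ids.map(() => "?").join(", ");
  const sql =
    "SELECT item_id, GROUP_CONCAT(DISTINCT LOWER(codec)) AS codecs " +
    "FROM media_streams " +
    `WHERE type = 'Audio' AND item_id IN (${placeholders}) ` +
    "GROUP BY item_id";
  return { sql, params: ids };
}

// Merge the batched result back into the page rows in memory.
function mergeCodecs(rows: LibraryRow[], codecRows: CodecRow[]): LibraryRow[] {
  const byItem = new Map<number, string>();
  for (const r of codecRows) byItem.set(r.item_id, r.codecs);
  return rows.map((row) => ({
    ...row,
    // GROUP_CONCAT separates with "," by default; render as "ac3 · aac".
    audioCodecs: (byItem.get(row.id) ?? "").split(",").filter(Boolean).join(" · "),
  }));
}
```

With a page of at most 25 ids this is one indexed lookup per page rather than one aggregate scan per row.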

v2026.04.15.10

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:42:23 +02:00
7d30e6c1a6 library: rename Scan nav/page to Library, show audio codecs per row
All checks were successful
Build and Push Docker Image / build (push) Successful in 1m4s
Per-row audio codec summary (distinct lowercased codecs across an
item's audio streams) via scalar subquery on media_streams, rendered
as "ac3 · aac" in a new monospace Audio column.

v2026.04.15.9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:10:00 +02:00
a2bdecd298 rework scan page, add ingest-source browsing, bump version to 2026.04.15.8
All checks were successful
Build and Push Docker Image / build (push) Successful in 4m56s
2026-04-15 18:33:08 +02:00
1de5b8a89e address audit findings: subtitle rescan decisions, scan limit, parseId, setup gate
All checks were successful
Build and Push Docker Image / build (push) Successful in 1m30s
worked through AUDIT.md. triage:
- finding 2 (subtitle rescan wipes decisions): confirmed. /:id/rescan now
  snapshots custom_titles and calls reanalyze() after the stream delete/
  insert, mirroring the review rescan flow. exported reanalyze + titleKey
  from review.ts so both routes share the logic.
- finding 3 (scan limit accepts NaN/negatives): confirmed. extracted
  parseScanLimit into a pure helper, added unit tests covering NaN,
  negatives, floats, infinity, numeric strings. invalid input 400s and
  releases the scan_running lock.
- finding 4 (parseId lenient): confirmed. tightened the regex to /^\d+$/
  so "42abc", "abc42", "+42", "42.0" all return null. rewrote the test
  that codified the old lossy behaviour.
- finding 5 (setup_complete set before jellyfin test passes): confirmed.
  the /jellyfin endpoint still persists url+key unconditionally, but now
  only flips setup_complete=1 on a successful connection test.
- finding 6 (swallowed errors): partial. the mqtt restart and version-
  fetch swallows are intentional best-effort with downstream surfaces
  (getMqttStatus, UI fallback). only the scan.ts db-update swallow was
  a real visibility gap — logs via logError now.
- finding 1 (auth): left as-is. redacting secrets on GET without auth
  on POST is security theater; real fix is an auth layer, which is a
  design decision not a bugfix. audit removed from the tree.
- lint fail on ffmpeg.test.ts: formatted.
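
For finding 4, a strict parser along these lines would behave as described. This is a sketch under assumptions: the signature and the safe-integer guard are not taken from the repository; only the `/^\d+$/` regex and the rejected inputs are.

```typescript
// Strict id parser: only plain decimal digit strings are accepted.
// Hypothetical signature; the real parseId may differ.
function parseId(raw: string | undefined): number | null {
  if (raw === undefined || !/^\d+$/.test(raw)) return null;
  const n = Number(raw);
  // Assumption: ids beyond the safe integer range are rejected too.
  return Number.isSafeInteger(n) ? n : null;
}
```

A route handler can then answer 400 whenever the result is `null` instead of passing `NaN` or a truncated number to the database.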

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:41:36 +02:00
6d8a8fa6d6 drop the subtitle-languages setting, it never influenced extraction
All checks were successful
Build and Push Docker Image / build (push) Successful in 53s
analyzer removes every subtitle unconditionally (see case 'Subtitle' in
decideAction) and the pipeline extracts all of them to sidecars — the config
was purely informational and only subtitles.ts echoed it back as
'keepLanguages' for a subtitle-manager ui that doesn't exist yet. we'll
revive language preferences inside that manager when it ships.

removes: the settings card + ui state, POST /api/settings/subtitle-languages,
the config default, the SUBTITLE_LANGUAGES env mapping, AnalyzerConfig's
subtitleLanguages field, RescanConfig's subtitleLanguages field, every
caller site (scan.ts / execute.ts / review.ts), and the keepLanguages
surface in subtitles.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:26:48 +02:00
23dca8bf0b split scheduling into scan + process windows, move controls to settings page
Some checks failed
Build and Push Docker Image / build (push) Failing after 8s
the old one-window scheduler gated only the job queue. now the scan loop and
the processing queue have independent windows — useful when the container
runs as an always-on service and we only want to hammer jellyfin + ffmpeg
at night.

config keys renamed from schedule_* to scan_schedule_* / process_schedule_*,
plus the existing job_sleep_seconds. scheduler.ts exposes parallel helpers
(isInScanWindow / isInProcessWindow, waitForScanWindow / waitForProcessWindow)
so each caller picks its window without cross-contamination.
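
A rough sketch of how two independent windows could share one check. The `TimeWindow` shape, the minutes-since-midnight representation, and the overnight wrap-around are assumptions; only the helper names `isInScanWindow` / `isInProcessWindow` appear in the message. The end minute is treated as inclusive, as in 93ed0ac33c.

```typescript
// Hypothetical config shape: minutes since midnight, 0..1439.
interface TimeWindow { enabled: boolean; startMinute: number; endMinute: number }
interface ScheduleConfig { scan: TimeWindow; process: TimeWindow }

function isInWindow(w: TimeWindow, nowMinute: number): boolean {
  if (!w.enabled) return true; // no schedule configured: always allowed
  if (w.startMinute <= w.endMinute) {
    return nowMinute >= w.startMinute && nowMinute <= w.endMinute;
  }
  // Overnight window, e.g. 22:00 to 06:00 wraps past midnight.
  return nowMinute >= w.startMinute || nowMinute <= w.endMinute;
}

// Each caller consults only its own window.
const isInScanWindow = (cfg: ScheduleConfig, now: number): boolean => isInWindow(cfg.scan, now);
const isInProcessWindow = (cfg: ScheduleConfig, now: number): boolean => isInWindow(cfg.process, now);
```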

scan.ts checks the scan window between items and emits paused/resumed sse.
execute.ts keeps its per-job pause + sleep-between-jobs but now on the
process window. /api/execute/scheduler moved to /api/settings/schedule.

frontend: ScheduleControls popup deleted from the pipeline header, replaced
with a plain Start queue button. settings page grows a Schedule section with
both windows and the job sleep input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:50:25 +02:00
6fcaeca82c write canonical iso3 language metadata, tighten is_noop, store full jellyfin data
Some checks failed
Build and Push Docker Image / build (push) Failing after 16s
ffmpeg now writes -metadata:s:a:i language=<iso3> on every kept audio track so
files end up with canonical 3-letter tags (en → eng, ger → deu, null → und).
analyzer passes stream.profile (not title) to transcodeTarget so lossless
dts-hd ma in mkv correctly targets flac. is_noop also checks og-is-default and
canonical-language so pipeline-would-change-it cases stop showing as done.

normalizeLanguage gains 2→3 mapping, and mapStream no longer normalizes at
ingest so the raw jellyfin tag survives for the canonical check.
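
To illustrate the canonical tagging, here is a small sketch. The function names, the tiny lookup tables, and the pass-through of unknown tags are assumptions, not the repository's `normalizeLanguage`; the example mappings (en → eng, ger → deu, null → und) and the `-metadata:s:a:i language=<iso3>` form come from the message.

```typescript
// Deliberately incomplete tables, for illustration only.
const ISO2_TO_ISO3: Record<string, string> = { en: "eng", de: "deu", fr: "fra", tr: "tur" };
// ISO 639-2 bibliographic codes mapped to their terminologic form.
const BIBLIO_TO_TERM: Record<string, string> = { ger: "deu", fre: "fra", dut: "nld", chi: "zho" };

function canonicalLanguage(tag: string | null | undefined): string {
  const t = (tag ?? "").trim().toLowerCase();
  if (t === "") return "und";
  // Assumption: tags not in either table are passed through unchanged.
  return ISO2_TO_ISO3[t] ?? BIBLIO_TO_TERM[t] ?? t;
}

// One metadata pair per kept audio track, indexed by output position.
function languageMetadataArgs(tags: Array<string | null>): string[] {
  const args: string[] = [];
  tags.forEach((tag, i) => {
    args.push(`-metadata:s:a:${i}`, `language=${canonicalLanguage(tag)}`);
  });
  return args;
}
```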

per-item scan work runs in a single db.transaction for large sqlite speedups,
extracted into server/services/rescan.ts so execute.ts can reuse it.

on successful job, execute calls jellyfin /Items/{id}/Refresh, waits for
DateLastRefreshed to change, refetches the item, and upserts it through the
same pipeline; plan flips to done iff the fresh streams satisfy is_noop.

schema wiped + rewritten to carry jellyfin_raw, external_raw, profile,
bit_depth, date_last_refreshed, runtime_ticks, original_title, last_executed_at
— so future scans aren't required to stay correct. user must drop data/*.db.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 13:56:19 +02:00
b8525be015 scan: validate arr URLs upfront, cache library once per scan
All checks were successful
Build and Push Docker Image / build (push) Successful in 30s
Two regressions from the radarr/sonarr fix:

1. ERR_INVALID_URL spam — when radarr_enabled='1' but radarr_url is empty
   or not http(s), every per-item fetch threw TypeError. We caught it but
   still ate the cost (and the log noise) on every movie. New isUsable()
   check on each service: enabled-in-config but URL doesn't parse →
   warn ONCE and skip arr lookups for the whole scan.

2. Per-item HTTP storm — for movies not in Radarr's library we used to
   hit /api/v3/movie (the WHOLE library) again per item, then two
   metadata-lookup calls. With 2000 items that's thousands of extra
   round-trips and the scan crawled. Now: pre-load the Radarr/Sonarr
   library once into Map<tmdbId,..>+Map<imdbId,..>+Map<tvdbId,..>,
   per-item lookups are O(1) memory hits, and only the genuinely-missing
   items make a single lookup-endpoint HTTP call.
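
Both fixes can be sketched roughly as follows. The names `isUsableUrl` and `indexMovies` and the `ArrMovie` shape are hypothetical; the real `isUsable()` check also looks at the enabled flag, which is omitted here.

```typescript
// Upfront URL validation: must parse and must be http(s).
function isUsableUrl(raw: string | null | undefined): boolean {
  if (!raw) return false;
  try {
    const u = new URL(raw);
    return u.protocol === "http:" || u.protocol === "https:";
  } catch {
    return false; // new URL() throws TypeError on unparseable input
  }
}

// Hypothetical subset of a library entry.
interface ArrMovie { title: string; tmdbId?: number; imdbId?: string }

// Load the library once per scan, then answer per-item lookups from memory.
function indexMovies(movies: ArrMovie[]) {
  const byTmdb = new Map<number, ArrMovie>();
  const byImdb = new Map<string, ArrMovie>();
  for (const m of movies) {
    if (m.tmdbId !== undefined) byTmdb.set(m.tmdbId, m);
    if (m.imdbId) byImdb.set(m.imdbId, m);
  }
  return { byTmdb, byImdb };
}
```

Only items that miss in both maps would then need the single lookup-endpoint call.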

The startup line now reports the library size:
  External language sources: radarr=enabled (https://..., 1287 movies in library), sonarr=...
so you can immediately see whether the cache loaded.
2026-04-13 12:06:17 +02:00
1aafcb4972 apply codex code review: fix useEffect refetch loops, dead routes, subtitle job_type leftovers
All checks were successful
Build and Push Docker Image / build (push) Successful in 36s
All ack'd as real bugs:

frontend
- AudioDetailPage / SubtitleDetailPage / PathsPage / ScanPage /
  SubtitleListPage / ExecutePage: load() was a fresh function reference
  every render, so 'useEffect(() => load(), [load])' refetched on every
  render. Wrap each in useCallback with the right deps ([id], [filter],
  or []).
- SetupPage: langsLoaded was useState; setting it inside load() retriggered
  the same effect → infinite loop. Switch to useRef. Also wrap saveJellyfin/
  Radarr/Sonarr in async fns so they return Promise<void> (matches the
  consumer signatures, fixes the latent TS error).
- DashboardPage: redirect target /setup doesn't exist; the route is
  /settings.
- ExecutePage: <>...</> fragment with two <tr> children had keys on the
  rows but not on the fragment → React reconciliation warning. Use
  <Fragment key>. jobTypeLabel + badge variant still branched on the
  removed 'subtitle' job_type — relabel to 'Audio Transcode' / 'Audio
  Remux' and use 'manual'/'noop' variants.

server
- review.ts + scan.ts: parseLanguageList helper catches JSON errors and
  enforces array-of-strings shape with a fallback. A corrupted config
  row would otherwise throw mid-scan.
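
A defensive parser of the kind described could look like this. The signature, including the explicit `fallback` parameter, is an assumption; only the behaviour (catch JSON errors, enforce array-of-strings, fall back) is from the message.

```typescript
// Returns the parsed list, or `fallback` when the stored value is missing,
// not valid JSON, or not an array of strings.
function parseLanguageList(raw: string | null | undefined, fallback: string[]): string[] {
  if (!raw) return fallback;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (Array.isArray(parsed) && parsed.every((v) => typeof v === "string")) {
      return parsed as string[];
    }
  } catch {
    // Corrupted JSON in the config row: fall through to the fallback.
  }
  return fallback;
}
```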
2026-04-13 12:01:57 +02:00
cafb3852a1 radarr/sonarr: stop silent failures, add metadata lookup fallback, diagnostic logs
All checks were successful
Build and Push Docker Image / build (push) Successful in 25s
The real reason 8 Mile landed as Turkish: Radarr WAS being called, but the
call path had three silent failure modes that all looked identical from
outside.

1. try { … } catch { return null } swallowed every error. No log when
   Radarr was unreachable, when the API key was wrong, when HTTP returned
   404/500, or when JSON parsing failed. A miss and a crash looked the
   same: null, fall back to Jellyfin's dub guess.

2. /api/v3/movie?tmdbId=X only queries Radarr's LIBRARY. If the movie is
   on disk + in Jellyfin but not actively managed in Radarr, returns [].
   We then gave up and used the Jellyfin guess.

3. iso6391To6392 fell back to normalizeLanguage(name.slice(0, 3)) for any
   unknown language name — pretending 'Mandarin' → 'man' and 'Flemish' →
   'fle' are valid ISO 639-2 codes.

Fixes:
- Both services: fetchJson helper logs HTTP errors with context and the
  url (api key redacted), plus catches+logs thrown errors.
- Added a metadata-lookup fallback: /api/v3/movie/lookup/tmdb and
  /lookup/imdb for Radarr, /api/v3/series/lookup?term=tvdb:X for Sonarr.
  These hit TMDB/TVDB via the arr service for titles not in its library.
- Expanded NAME_TO_639_2: Mandarin/Cantonese → zho, Flemish → nld,
  Farsi → fas, plus common European langs that were missing.
- Unknown name → return null (log a warning) instead of a made-up 3-char
  code. scan.ts then marks needs_review.
- scan.ts: per-item warn when Radarr/Sonarr miss; per-scan summary line
  showing hits/misses/no-provider-id tallies.
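
The "unknown name → null" rule can be sketched as below. The table here is a small illustrative subset with lowercase keys, and `languageNameTo6392` plus the injectable `warn` parameter are hypothetical; the listed mappings (Mandarin/Cantonese → zho, Flemish → nld, Farsi → fas) are the ones named above.

```typescript
// Illustrative subset only; the real NAME_TO_639_2 table is larger.
const NAME_TO_CODE: Record<string, string> = {
  english: "eng",
  german: "deu",
  turkish: "tur",
  mandarin: "zho",
  cantonese: "zho",
  flemish: "nld",
  farsi: "fas",
};

// Never invent a code: an unknown name yields null plus a warning,
// so the caller can mark the item for review.
function languageNameTo6392(
  name: string,
  warn: (msg: string) => void = console.warn,
): string | null {
  const code = NAME_TO_CODE[name.trim().toLowerCase()];
  if (code === undefined) {
    warn(`unknown language name: ${name}`);
    return null;
  }
  return code;
}
```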

Run a scan — the logs will now tell you whether Radarr was called, what
it answered, and why it fell back if it did.
2026-04-13 11:46:26 +02:00
50d3e50280 fix '8 Mile is Turkish': jellyfin guesses never earn high confidence
All checks were successful
Build and Push Docker Image / build (push) Successful in 28s
Two bugs compounded:

1. extractOriginalLanguage() in jellyfin.ts picked the FIRST audio stream's
   language and called it 'original'. Files sourced from non-English regions
   often have a local dub as track 0, so 8 Mile with a Turkish dub first
   got labelled Turkish.

2. scan.ts promoted any single-source answer to confidence='high' — even
   the pure Jellyfin guess, as long as no second source (Radarr/Sonarr)
   contradicted it. Jellyfin's dub-magnet guess should never be green.

Fixes:
- extractOriginalLanguage now prefers the IsDefault audio track and skips
  tracks whose title shouts 'dub' / 'commentary' / 'director'. Still a
  heuristic, but much less wrong. Fallback to the first track when every
  candidate looks like a dub so we have *something* to flag.
- scan.ts: high confidence requires an authoritative source (Radarr/Sonarr)
  with no conflict. A Jellyfin-only answer is always low confidence AND
  gets needs_review=1 so it surfaces in the pipeline for manual override.
- Data migration (idempotent): downgrade existing plans backed only by the
  Jellyfin heuristic to low confidence and mark needs_review=1, so users
  don't have to rescan to benefit.
- New server/services/__tests__/jellyfin.test.ts covers the default-track
  preference and dub-skip behavior.
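
One possible reading of the new heuristic, as a sketch rather than the actual `extractOriginalLanguage`: drop tracks whose title looks like a dub or commentary, prefer the default track among the rest, and fall back to the first audio track when everything was filtered out. The function name and the stream shape (a subset of Jellyfin's media stream fields) are assumptions.

```typescript
// Assumed subset of a Jellyfin media stream.
interface AudioStream {
  Type: string;
  Language?: string | null;
  IsDefault?: boolean;
  Title?: string | null;
}

const NOT_ORIGINAL = /dub|commentary|director/i;

function guessOriginalLanguage(streams: AudioStream[]): string | null {
  const audio = streams.filter((s) => s.Type === "Audio");
  if (audio.length === 0) return null;
  const candidates = audio.filter((s) => !NOT_ORIGINAL.test(s.Title ?? ""));
  // Prefer the default track, then the first plausible one,
  // then the first track at all so there is something to flag.
  const pick = candidates.find((s) => s.IsDefault) ?? candidates[0] ?? audio[0];
  return pick?.Language ?? null;
}
```

Whatever this returns is still only a guess, which is why the scan side keeps it at low confidence.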
2026-04-13 11:39:59 +02:00
874f04b7a5 wire scheduler into queue, add retry, dev-reset cleanup, biome 2.4 migrate
- execute: actually call isInScheduleWindow/waitForWindow/sleepBetweenJobs in runSequential (they were dead code); emit queue_status SSE events (running/paused/sleeping/idle) so the pipeline's existing QueueStatus listener lights up
- review: POST /:id/retry resets an errored plan to approved, wipes old done/error jobs, rebuilds command from current decisions, queues fresh job
- scan: dev-mode DELETE now also wipes jobs + subtitle_files (previously orphaned after every dev reset)
- biome: migrate config to 2.4 schema, autoformat 68 files (strings + indentation), relax opinionated a11y/hooks-deps/index-key rules that don't fit this codebase
- routeTree.gen.ts regenerated after /nodes removal
2026-04-13 07:41:19 +02:00
93ed0ac33c fix analyzer + api boundary + perf + scheduler hardening
- analyzer: rewrite checkAudioOrderChanged to compare actual output order, unify assignTargetOrder with a shared sortKeptStreams util in ffmpeg builder
- review: recompute is_noop via full audio removed/reordered/transcode/subs check on toggle, preserve custom_title across rescan by matching (type,lang,stream_index,title), batch pipeline transcode-reasons query to avoid N+1
- validate: add lib/validate.ts with parseId + isOneOf helpers; replace bare Number(c.req.param('id')) with 400 on invalid ids across review/subtitles
- scan: atomic CAS on scan_running config to prevent concurrent scans
- subtitles: path-traversal guard — only unlink sidecars within the media item's directory; log-and-orphan DB entries pointing outside
- schedule: include end minute in window (<= vs <)
- db: add indexes on review_plans(status,is_noop), stream_decisions(plan_id), media_items(series_jellyfin_id,series_name,type), media_streams(item_id,type), subtitle_files(item_id), jobs(status,item_id)
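
The subtitles path-traversal guard in the list above could be approximated like this. `isInsideDir` is a hypothetical helper, it does not resolve symlinks, and it assumes files in nested subdirectories count as "within" the item's directory.

```typescript
import * as path from "node:path";

// True only when `candidate` resolves to a location strictly below `dir`.
function isInsideDir(dir: string, candidate: string): boolean {
  const base = path.resolve(dir);
  const target = path.resolve(candidate);
  const rel = path.relative(base, target);
  if (rel === "" || path.isAbsolute(rel)) return false;
  // A relative path that climbs out starts with ".." as its first segment.
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}
```

A caller would unlink the sidecar only when the check passes, and otherwise log the entry and leave the file alone.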
2026-04-13 07:31:48 +02:00
ecb0732185 store confidence, apple_compat, job_type, transcode_codec during scan
2026-03-27 01:45:56 +01:00
76d3b1acfb remove path mappings, add subtitle summary endpoint, cache setup page, bump version
All checks were successful
Build and Push Docker Image / build (push) Successful in 1m50s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 12:02:26 +01:00
f562cb42d9 show file name in scan log, fix progress total by using Jellyfin page callback
All checks were successful
Build and Push Docker Image / build (push) Successful in 33s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 10:19:45 +01:00
a4d5eb59e1 add configurable audio languages, sortable language lists in settings
All checks were successful
Build and Push Docker Image / build (push) Successful in 1m9s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 09:51:03 +01:00
3682ee98e0 add structured logging with timestamps for docker/unraid log viewer
All checks were successful
Build and Push Docker Image / build (push) Successful in 15s
all server output now prefixed with ISO timestamp and level [INFO/WARN/ERROR].
logs requests, scan start/complete, job lifecycle, errors. skips noisy SSE
endpoints.
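
A line format matching that description might be produced like this. The exact layout (timestamp, then bracketed level, then message) and the `formatLogLine` name are assumptions.

```typescript
type Level = "INFO" | "WARN" | "ERROR";

// Prefix every message with an ISO-8601 UTC timestamp and its level.
function formatLogLine(level: Level, message: string, now: Date = new Date()): string {
  return `${now.toISOString()} [${level}] ${message}`;
}
```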

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 17:29:22 +01:00
ef785de955 add path mappings to translate jellyfin library paths to container mount paths
All checks were successful
Build and Push Docker Image / build (push) Successful in 20s
jellyfin may use different internal paths (e.g. /tv/) than container mounts
(/series/). path_mappings config (or PATH_MAPPINGS env var) translates at scan
time. configurable via setup ui or env var format: /tv/=/series/,/data/=/movies/
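
Parsing and applying that env var format can be sketched as follows. Both function names are hypothetical, and first-match-wins prefix replacement is an assumption about how overlapping mappings resolve.

```typescript
type PathMapping = [string, string];

// Parse "/tv/=/series/,/data/=/movies/" into [from, to] pairs.
function parsePathMappings(raw: string): PathMapping[] {
  const out: PathMapping[] = [];
  for (const pair of raw.split(",")) {
    const trimmed = pair.trim();
    const idx = trimmed.indexOf("=");
    if (idx <= 0) continue; // skip entries with no "=" or an empty source prefix
    out.push([trimmed.slice(0, idx), trimmed.slice(idx + 1)]);
  }
  return out;
}

// Rewrite the first matching prefix; paths with no match pass through.
function applyPathMappings(p: string, mappings: PathMapping[]): string {
  for (const [from, to] of mappings) {
    if (p.startsWith(from)) return to + p.slice(from.length);
  }
  return p;
}
```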

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 16:57:22 +01:00
5ac44b7551 restructure to react spa + hono api, fix missing server/ and lib/
rewrite from monolithic hono jsx to react 19 spa with tanstack router
+ hono json api backend. add scan, review, execute, nodes, and setup
pages. multi-stage dockerfile (node for vite build, bun for runtime).

previously, server/ and src/shared/lib/ were silently excluded by
global gitignore patterns (/server/ from emacs, lib/ from python).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:57:40 +01:00