Commit Graph

29 Commits

Author SHA1 Message Date
felixfoertsch 43d934ff68 background processing with stop, remove inbox skip, progressive column updates, fix header height
processing runs in background, items appear in columns progressively via
SSE-triggered reloads. stop button aborts mid-run, remaining items stay in
inbox. remove skip/skip-all from inbox. fix column header height jitter by
giving subtitle slot a fixed height.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 09:52:58 +02:00
felixfoertsch 648a17b763 rename sortInbox to processInbox, integrate language resolution into process step
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 09:04:18 +02:00
felixfoertsch 94610d05b7 strip Sonarr/Radarr lookups from scan path, make upsertJellyfinItem Jellyfin-only
RescanConfig now only carries audioLanguages. Radarr/Sonarr library
loading, language resolution, and resolveSeriesTvdb callback removed
from rescan.ts, scan.ts, and webhook.ts. RescanResult no longer tracks
radarrHit/sonarrHit/missingProviderId counters. Tests updated: removed
authoritative-source and resolved-TVDB-enables-Sonarr tests (moving to
processInbox in a later task), added assertion that scan never sets
sonarr/radarr as language source.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 08:54:43 +02:00
felixfoertsch a1da644aa3 fix sonarr OG-language miss when Jellyfin omits SeriesProviderIds
Build and Push Docker Image / build (push) Successful in 3m49s
episodes without SeriesProviderIds.Tvdb (older Jellyfin, un-refreshed items)
fell back to the episode-level TVDB ID, which never matched Sonarr's
series-keyed library. add resolveSeriesTvdb callback that fetches the series
item from Jellyfin to get the correct series TVDB ID, with per-scan caching.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 07:39:24 +02:00
felixfoertsch ee9add076a collapse Library page into a compact Pipeline header
Build and Push Docker Image / build (push) Successful in 1m17s
the separate Library page was mostly duplicating what the pipeline
columns + the item detail page already show. move its two useful bits —
the stats row (total / scanned / needs action / no change / approved /
done / errors) and the scan control bar — into a compact two-row header
above the pipeline columns, drop the library items table entirely, and
redirect "/" to "/pipeline".

the scan SSE buffering logic moves verbatim into PipelineHeader.tsx so
the progress bar and the stats refresh on completion keep working. the
dead /api/scan/items endpoint and its parseScanItemsQuery +
buildScanItemsWhere helpers (plus their tests) go away with the UI;
/api/scan, /api/scan/start, /api/scan/stop, /api/scan/events stay.

nav loses "Library" — Pipeline is the only entry point now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 00:16:21 +02:00
felixfoertsch 05a1345750 rip subtitle manager → sibling project, keep extraction only
Build and Push Docker Image / build (push) Successful in 2m31s
subtitle management (list/detail pages, /api/subtitles, subtitle_files
table, SubtitleFile types, predictExtractedFiles, nav link) moved to a
new sibling project at ~/Developer/netfelix-subtitles-manager/ where
it'll be rebuilt standalone later. this project now owns audio fixing
+ subtitle extraction only.

extraction still runs end to end: analyzeItem still marks every subtitle
stream as "remove from container", buildExtractionOutputs still wires
the -map 0:s:N + sidecar outputs into the ffmpeg command, and execute.ts
still flips review_plans.subs_extracted so verify.ts can check desired
state — just derived from the streams directly instead of by writing a
row per file to the now-gone subtitle_files table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:50:52 +02:00
felixfoertsch a21bcefb54 stream auto review progress over SSE so large inboxes don't feel frozen
Build and Push Docker Image / build (push) Successful in 1m18s
sortInbox is now async, yields every 10 items, and emits inbox_sort_start
+ inbox_sort_progress via optional hooks. the pipeline route handler
wires those hooks to the existing job events stream and guards against
a second concurrent sort with a 409.

the inbox column swaps its Auto Review button for a live "Sorting N/T"
counter and progress bar while the sort is in flight; the auto-process
toggle hides to give the progress the full subtitle line. the previous
behaviour was a frozen button for the entire duration of the sort.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 20:56:51 +02:00
felixfoertsch 76a16ba84c reanalyze plans during auto review so config changes take effect
Build and Push Docker Image / build (push) Successful in 4m0s
sortInbox used to distribute plans by the auto_class and stream_decisions
captured at scan time, which meant toggling an audio_languages entry and
then running "back to inbox" + "auto review" re-queued the item with the
stale decisions. now sortInbox re-runs the analyzer per plan against the
current audio_languages before distributing, matching the user's mental
model that auto review = re-apply the rules.

reanalyze() takes audioLanguages explicitly so callers can pass it in
once and tests can drive it without reaching into the singleton db.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 20:38:14 +02:00
felixfoertsch b9b4a50e8a rescan: auto-sort inbox after scan when auto_processing is on
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 10:43:03 +02:00
felixfoertsch 0c595a787e library: batch audio-codec lookup — per-row subquery was O(page×streams)
Build and Push Docker Image / build (push) Successful in 1m11s
The scalar subquery I added in 7d30e6c ran one aggregate scan of
media_streams per row. On a real library (33k items / 212k streams)
a single page took 500+ seconds synchronously, blocking the event
loop and timing out every other request — Library AND Pipeline both
stopped loading.

Swap it for a single batched `GROUP_CONCAT ... WHERE item_id IN (?...)`
query over the current page's ids (max 25), then merge back into rows.

v2026.04.15.10

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:42:23 +02:00
felixfoertsch 7d30e6c1a6 library: rename Scan nav/page to Library, show audio codecs per row
Build and Push Docker Image / build (push) Successful in 1m4s
Per-row audio codec summary (distinct lowercased codecs across an
item's audio streams) via scalar subquery on media_streams, rendered
as "ac3 · aac" in a new monospace Audio column.

v2026.04.15.9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:10:00 +02:00
felixfoertsch a2bdecd298 rework scan page, add ingest-source browsing, bump version to 2026.04.15.8
Build and Push Docker Image / build (push) Successful in 4m56s
2026-04-15 18:33:08 +02:00
felixfoertsch 1de5b8a89e address audit findings: subtitle rescan decisions, scan limit, parseId, setup gate
Build and Push Docker Image / build (push) Successful in 1m30s
worked through AUDIT.md. triage:
- finding 2 (subtitle rescan wipes decisions): confirmed. /:id/rescan now
  snapshots custom_titles and calls reanalyze() after the stream delete/
  insert, mirroring the review rescan flow. exported reanalyze + titleKey
  from review.ts so both routes share the logic.
- finding 3 (scan limit accepts NaN/negatives): confirmed. extracted
  parseScanLimit into a pure helper, added unit tests covering NaN,
  negatives, floats, infinity, numeric strings. invalid input 400s and
  releases the scan_running lock.
- finding 4 (parseId lenient): confirmed. tightened the regex to /^\d+$/
  so "42abc", "abc42", "+42", "42.0" all return null. rewrote the test
  that codified the old lossy behaviour.
- finding 5 (setup_complete set before jellyfin test passes): confirmed.
  the /jellyfin endpoint still persists url+key unconditionally, but now
  only flips setup_complete=1 on a successful connection test.
- finding 6 (swallowed errors): partial. the mqtt restart and version-
  fetch swallows are intentional best-effort with downstream surfaces
  (getMqttStatus, UI fallback). only the scan.ts db-update swallow was
  a real visibility gap — logs via logError now.
- finding 1 (auth): left as-is. redacting secrets on GET without auth
  on POST is security theater; real fix is an auth layer, which is a
  design decision not a bugfix. audit removed from the tree.
- lint fail on ffmpeg.test.ts: formatted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:41:36 +02:00
felixfoertsch 6d8a8fa6d6 drop the subtitle-languages setting, it never influenced extraction
Build and Push Docker Image / build (push) Successful in 53s
analyzer removes every subtitle unconditionally (see case 'Subtitle' in
decideAction) and the pipeline extracts all of them to sidecars — the config
was purely informational and only subtitles.ts echoed it back as
'keepLanguages' for a subtitle-manager ui that doesn't exist yet. we'll
revive language preferences inside that manager when it ships.

removes: the settings card + ui state, POST /api/settings/subtitle-languages,
the config default, the SUBTITLE_LANGUAGES env mapping, AnalyzerConfig's
subtitleLanguages field, RescanConfig's subtitleLanguages field, every
caller site (scan.ts / execute.ts / review.ts), and the keepLanguages
surface in subtitles.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:26:48 +02:00
felixfoertsch 23dca8bf0b split scheduling into scan + process windows, move controls to settings page
Build and Push Docker Image / build (push) Failing after 8s
the old one-window scheduler gated only the job queue. now the scan loop and
the processing queue have independent windows — useful when the container
runs as an always-on service and we only want to hammer jellyfin + ffmpeg
at night.

config keys renamed from schedule_* to scan_schedule_* / process_schedule_*,
plus the existing job_sleep_seconds. scheduler.ts exposes parallel helpers
(isInScanWindow / isInProcessWindow, waitForScanWindow / waitForProcessWindow)
so each caller picks its window without cross-contamination.

scan.ts checks the scan window between items and emits paused/resumed sse.
execute.ts keeps its per-job pause + sleep-between-jobs but now on the
process window. /api/execute/scheduler moved to /api/settings/schedule.

frontend: ScheduleControls popup deleted from the pipeline header, replaced
with a plain Start queue button. settings page grows a Schedule section with
both windows and the job sleep input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:50:25 +02:00
felixfoertsch 6fcaeca82c write canonical iso3 language metadata, tighten is_noop, store full jellyfin data
Build and Push Docker Image / build (push) Failing after 16s
ffmpeg now writes -metadata:s:a:i language=<iso3> on every kept audio track so
files end up with canonical 3-letter tags (en → eng, ger → deu, null → und).
analyzer passes stream.profile (not title) to transcodeTarget so lossless
dts-hd ma in mkv correctly targets flac. is_noop also checks og-is-default and
canonical-language so pipeline-would-change-it cases stop showing as done.

normalizeLanguage gains 2→3 mapping, and mapStream no longer normalizes at
ingest so the raw jellyfin tag survives for the canonical check.

per-item scan work runs in a single db.transaction for large sqlite speedups,
extracted into server/services/rescan.ts so execute.ts can reuse it.

on successful job, execute calls jellyfin /Items/{id}/Refresh, waits for
DateLastRefreshed to change, refetches the item, and upserts it through the
same pipeline; plan flips to done iff the fresh streams satisfy is_noop.

schema wiped + rewritten to carry jellyfin_raw, external_raw, profile,
bit_depth, date_last_refreshed, runtime_ticks, original_title, last_executed_at
— so future scans aren't required to stay correct. user must drop data/*.db.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 13:56:19 +02:00
felixfoertsch b8525be015 scan: validate arr URLs upfront, cache library once per scan
Build and Push Docker Image / build (push) Successful in 30s
Two regressions from the radarr/sonarr fix:

1. ERR_INVALID_URL spam — when radarr_enabled='1' but radarr_url is empty
   or not http(s), every per-item fetch threw TypeError. We caught it but
   still ate the cost (and the log noise) on every movie. New isUsable()
   check on each service: enabled-in-config but URL doesn't parse →
   warn ONCE and skip arr lookups for the whole scan.

2. Per-item HTTP storm — for movies not in Radarr's library we used to
   hit /api/v3/movie (the WHOLE library) again per item, then two
   metadata-lookup calls. With 2000 items that's thousands of extra
   round-trips and the scan crawled. Now: pre-load the Radarr/Sonarr
   library once into Map<tmdbId,..>+Map<imdbId,..>+Map<tvdbId,..>,
   per-item lookups are O(1) memory hits, and only the genuinely-missing
   items make a single lookup-endpoint HTTP call.

The startup line now reports the library size:
  External language sources: radarr=enabled (https://..., 1287 movies in library), sonarr=...
so you can immediately see whether the cache loaded.
2026-04-13 12:06:17 +02:00
felixfoertsch 1aafcb4972 apply codex code review: fix useEffect refetch loops, dead routes, subtitle job_type leftovers
Build and Push Docker Image / build (push) Successful in 36s
All ack'd as real bugs:

frontend
- AudioDetailPage / SubtitleDetailPage / PathsPage / ScanPage /
  SubtitleListPage / ExecutePage: load() was a fresh function reference
  every render, so 'useEffect(() => load(), [load])' refetched on every
  render. Wrap each in useCallback with the right deps ([id], [filter],
  or []).
- SetupPage: langsLoaded was useState; setting it inside load() retriggered
  the same effect → infinite loop. Switch to useRef. Also wrap saveJellyfin/
  Radarr/Sonarr in async fns so they return Promise<void> (matches the
  consumer signatures, fixes the latent TS error).
- DashboardPage: redirect target /setup doesn't exist; the route is
  /settings.
- ExecutePage: <>...</> fragment with two <tr> children had keys on the
  rows but not on the fragment → React reconciliation warning. Use
  <Fragment key>. jobTypeLabel + badge variant still branched on the
  removed 'subtitle' job_type — relabel to 'Audio Transcode' / 'Audio
  Remux' and use 'manual'/'noop' variants.

server
- review.ts + scan.ts: parseLanguageList helper catches JSON errors and
  enforces array-of-strings shape with a fallback. A corrupted config
  row would otherwise throw mid-scan.
2026-04-13 12:01:57 +02:00
felixfoertsch cafb3852a1 radarr/sonarr: stop silent failures, add metadata lookup fallback, diagnostic logs
Build and Push Docker Image / build (push) Successful in 25s
The real reason 8 Mile landed as Turkish: Radarr WAS being called, but the
call path had three silent failure modes that all looked identical from
outside.

1. try { … } catch { return null } swallowed every error. No log when
   Radarr was unreachable, when the API key was wrong, when HTTP returned
   404/500, or when JSON parsing failed. A miss and a crash looked the
   same: null, fall back to Jellyfin's dub guess.

2. /api/v3/movie?tmdbId=X only queries Radarr's LIBRARY. If the movie is
   on disk + in Jellyfin but not actively managed in Radarr, returns [].
   We then gave up and used the Jellyfin guess.

3. iso6391To6392 fell back to normalizeLanguage(name.slice(0, 3)) for any
   unknown language name — pretending 'Mandarin' → 'man' and 'Flemish' →
   'fle' are valid ISO 639-2 codes.

Fixes:
- Both services: fetchJson helper logs HTTP errors with context and the
  url (api key redacted), plus catches+logs thrown errors.
- Added a metadata-lookup fallback: /api/v3/movie/lookup/tmdb and
  /lookup/imdb for Radarr, /api/v3/series/lookup?term=tvdb:X for Sonarr.
  These hit TMDB/TVDB via the arr service for titles not in its library.
- Expanded NAME_TO_639_2: Mandarin/Cantonese → zho, Flemish → nld,
  Farsi → fas, plus common European langs that were missing.
- Unknown name → return null (log a warning) instead of a made-up 3-char
  code. scan.ts then marks needs_review.
- scan.ts: per-item warn when Radarr/Sonarr miss; per-scan summary line
  showing hits/misses/no-provider-id tallies.

Run a scan — the logs will now tell you whether Radarr was called, what
it answered, and why it fell back if it did.
2026-04-13 11:46:26 +02:00
felixfoertsch 50d3e50280 fix '8 Mile is Turkish': jellyfin guesses never earn high confidence
Build and Push Docker Image / build (push) Successful in 28s
Two bugs compounded:

1. extractOriginalLanguage() in jellyfin.ts picked the FIRST audio stream's
   language and called it 'original'. Files sourced from non-English regions
   often have a local dub as track 0, so 8 Mile with a Turkish dub first
   got labelled Turkish.

2. scan.ts promoted any single-source answer to confidence='high' — even
   the pure Jellyfin guess, as long as no second source (Radarr/Sonarr)
   contradicted it. Jellyfin's dub-magnet guess should never be green.

Fixes:
- extractOriginalLanguage now prefers the IsDefault audio track and skips
  tracks whose title shouts 'dub' / 'commentary' / 'director'. Still a
  heuristic, but much less wrong. Fallback to the first track when every
  candidate looks like a dub so we have *something* to flag.
- scan.ts: high confidence requires an authoritative source (Radarr/Sonarr)
  with no conflict. A Jellyfin-only answer is always low confidence AND
  gets needs_review=1 so it surfaces in the pipeline for manual override.
- Data migration (idempotent): downgrade existing plans backed only by the
  Jellyfin heuristic to low confidence and mark needs_review=1, so users
  don't have to rescan to benefit.
- New server/services/__tests__/jellyfin.test.ts covers the default-track
  preference and dub-skip behavior.
2026-04-13 11:39:59 +02:00
felixfoertsch 874f04b7a5 wire scheduler into queue, add retry, dev-reset cleanup, biome 2.4 migrate
- execute: actually call isInScheduleWindow/waitForWindow/sleepBetweenJobs in runSequential (they were dead code); emit queue_status SSE events (running/paused/sleeping/idle) so the pipeline's existing QueueStatus listener lights up
- review: POST /:id/retry resets an errored plan to approved, wipes old done/error jobs, rebuilds command from current decisions, queues fresh job
- scan: dev-mode DELETE now also wipes jobs + subtitle_files (previously orphaned after every dev reset)
- biome: migrate config to 2.4 schema, autoformat 68 files (strings + indentation), relax opinionated a11y/hooks-deps/index-key rules that don't fit this codebase
- routeTree.gen.ts regenerated after /nodes removal
2026-04-13 07:41:19 +02:00
felixfoertsch 93ed0ac33c fix analyzer + api boundary + perf + scheduler hardening
- analyzer: rewrite checkAudioOrderChanged to compare actual output order, unify assignTargetOrder with a shared sortKeptStreams util in ffmpeg builder
- review: recompute is_noop via full audio removed/reordered/transcode/subs check on toggle, preserve custom_title across rescan by matching (type,lang,stream_index,title), batch pipeline transcode-reasons query to avoid N+1
- validate: add lib/validate.ts with parseId + isOneOf helpers; replace bare Number(c.req.param('id')) with 400 on invalid ids across review/subtitles
- scan: atomic CAS on scan_running config to prevent concurrent scans
- subtitles: path-traversal guard — only unlink sidecars within the media item's directory; log-and-orphan DB entries pointing outside
- schedule: include end minute in window (<= vs <)
- db: add indexes on review_plans(status,is_noop), stream_decisions(plan_id), media_items(series_jellyfin_id,series_name,type), media_streams(item_id,type), subtitle_files(item_id), jobs(status,item_id)
2026-04-13 07:31:48 +02:00
felixfoertsch ecb0732185 store confidence, apple_compat, job_type, transcode_codec during scan 2026-03-27 01:45:56 +01:00
felixfoertsch 76d3b1acfb remove path mappings, add subtitle summary endpoint, cache setup page, bump version
Build and Push Docker Image / build (push) Successful in 1m50s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 12:02:26 +01:00
felixfoertsch f562cb42d9 show file name in scan log, fix progress total by using Jellyfin page callback
Build and Push Docker Image / build (push) Successful in 33s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 10:19:45 +01:00
felixfoertsch a4d5eb59e1 add configurable audio languages, sortable language lists in settings
Build and Push Docker Image / build (push) Successful in 1m9s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 09:51:03 +01:00
felixfoertsch 3682ee98e0 add structured logging with timestamps for docker/unraid log viewer
Build and Push Docker Image / build (push) Successful in 15s
all server output now prefixed with ISO timestamp and level [INFO/WARN/ERROR].
logs requests, scan start/complete, job lifecycle, errors. skips noisy SSE
endpoints.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 17:29:22 +01:00
felixfoertsch ef785de955 add path mappings to translate jellyfin library paths to container mount paths
Build and Push Docker Image / build (push) Successful in 20s
jellyfin may use different internal paths (e.g. /tv/) than container mounts
(/series/). path_mappings config (or PATH_MAPPINGS env var) translates at scan
time. configurable via setup ui or env var format: /tv/=/series/,/data/=/movies/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 16:57:22 +01:00
felixfoertsch 5ac44b7551 restructure to react spa + hono api, fix missing server/ and lib/
rewrite from monolithic hono jsx to react 19 spa with tanstack router
+ hono json api backend. add scan, review, execute, nodes, and setup
pages. multi-stage dockerfile (node for vite build, bun for runtime).

previously, server/ and src/shared/lib/ were silently excluded by
global gitignore patterns (/server/ from emacs, lib/ from python).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:57:40 +01:00