From 59f4d54cfef74f9feb673e855969fd0c8ebe08d8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Felix=20F=C3=B6rtsch?=
Date: Sat, 14 Mar 2026 04:55:40 +0100
Subject: [PATCH] add design spec: memory fix, reader UI redesign

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 ...6-03-14-memory-fix-and-reader-ui-design.md | 125 ++++++++++++++++++
 1 file changed, 125 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-03-14-memory-fix-and-reader-ui-design.md

diff --git a/docs/superpowers/specs/2026-03-14-memory-fix-and-reader-ui-design.md b/docs/superpowers/specs/2026-03-14-memory-fix-and-reader-ui-design.md
new file mode 100644
index 0000000..aa6ecb2
--- /dev/null
+++ b/docs/superpowers/specs/2026-03-14-memory-fix-and-reader-ui-design.md
@@ -0,0 +1,125 @@
+# Memory Fix and Reader UI Redesign
+
+**Date:** 2026-03-14
+**Status:** Approved
+**Platforms:** iOS and macOS
+
+## Problem Statement
+
+Two issues with the current Vorleser app:
+
+1. **Memory crash on iOS** — The TTS synthesis loop accumulates MLX tensors across sentences without releasing them. iOS hits its ~1-2 GB memory ceiling and crashes after a few sentences of playback.
+2. **Reading UI is basic** — Text is shown one chapter at a time. Users want a proper ebook reading experience: continuous scrolling or paged (Apple Books-style) reading with chapter navigation, preserved formatting, and auto-follow during playback.
+
+## Design
+
+### 1. Memory Fix
+
+**Root cause:** Each `synthesizer.synthesize(text:)` call creates MLX tensors (model activations, attention matrices, audio output) that accumulate because:
+- Swift ARC doesn't eagerly release autoreleased ObjC/C++ objects inside async loops
+- MLX holds a GPU memory pool that grows unless explicitly drained
+
+**Changes:**
+1. Wrap each `synthesizer.synthesize()` call in `AudioEngine.playbackLoop()` — including the prefetch task — in `autoreleasepool { }` to force release of ObjC-bridged temporaries.
+2. After each synthesis call, invoke `MLX.GPU.drain()` (or equivalent MLXUtilsLibrary cleanup API) to release the GPU memory pool.
+3. Cache `Book.sentences` — currently a computed property that re-segments the entire book on every access. Change to a stored property computed once at init.
+
+### 2. EPUB Parser — Preserving Structure and Formatting
+
+Currently `EPUBParser` strips all HTML to plain text and collapses whitespace. For a proper reading experience, output `NSAttributedString` instead.
+
+**Changes to BookParser module:**
+- `Chapter` stores `attributedText: NSAttributedString` alongside the plain `text: String` property (derived via `.string`).
+- `EPUBParser` walks the SwiftSoup DOM and builds an `NSAttributedString`:
+  - `<p>` → paragraph with spacing
+  - `<br>` → line break
+  - `<b>`, `<strong>` → bold trait
+  - `<i>`, `<em>` → italic trait
+  - `<h1>`–`<h6>` → bold + larger font size
+  - Everything else → body font
+- Use dynamic type (`UIFont.preferredFont` / `NSFont.preferredFont`) as the base, respecting system font size settings.
+- `PlainTextParser` similarly produces `NSAttributedString` with paragraph breaks on `\n\n`.
+
+**Offset compatibility:** `SentenceSegmenter` continues to operate on plain `String`. Character offsets remain valid because `NSAttributedString.string` matches the plain text used for segmentation.
+
+### 3. Reading UI — BookTextView (per platform)
+
+Replace `ReadingTextView` (iOS) and `MacReadingTextView` (macOS) with a single `BookTextView` per platform.
+
+**Responsibilities:**
+- Display the **full book** as one `NSAttributedString` (all chapters concatenated with chapter break spacing).
+- Sentence highlighting via temporary text attributes (yellow background on active sentence range).
+- Tap/click → character offset callback for tap-to-play.
+- Programmatic scrolling to a character offset (chapter jumps, auto-follow during playback).
+
+**Two reading modes:**
+
+1. **Scroll mode:** The text view uses standard scrolling (`UITextView` / `NSScrollView`). The full book is one tall scrollable column.
+
+2. **Book (paged) mode:** TextKit 2 pagination — the `NSTextLayoutManager` lays text into page-sized `NSTextContainer`s. Navigation via `UIPageViewController` (iOS) or horizontal swipe/arrow keys (macOS). Pages reflow dynamically on rotation / window resize.
+
+**Chapter navigation:**
+- Each `Chapter` knows its character offset within the full book text.
+- "Jump to chapter" computes the offset → scroll mode scrolls to it; paged mode finds which page contains that offset and navigates there.
+
+**Auto-follow during playback:**
+- Observe `engine.currentPosition` → compute highlighted sentence range → apply highlight attribute → scroll/page to keep the active sentence visible.
+
+### 4. ReaderViewModel — Shared Logic
+
+Extract duplicated logic from `ReaderView` (iOS) and `MacReaderView` (macOS) into a shared `ReaderViewModel` (`@Observable` class).
+
+**Owns:**
+- Book loading and parsing
+- Synthesizer initialization
+- Reference to `AudioEngine`
+- Current chapter index, reading mode (scroll / paged), full attributed string
+- Chapter offset table for jump navigation
+- Highlight range computation
+
+**Platform views become thin shells:**
+- Toolbar: chapter picker, mode toggle (scroll / book)
+- `BookTextView` (platform-specific wrapper)
+- `PlaybackControls`
+- Wire to the view model
+
+### 5. Playback Controls and Mode Switching
+
+**Playback controls** unchanged (play/pause, skip forward/back at sentence granularity). Auto-follow keeps the active sentence visible regardless of mode.
+
+**Mode toggle:**
+- Segmented control or toolbar button: scroll ↔ book mode.
+- Switching modes preserves reading position (same character offset stays visible).
+- Mode preference persisted on `StoredBook`.
+
+**Position persistence:** Existing `lastPosition` (character offset) continues to work — offsets are display-mode-independent.
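
Reviewer sketch (not part of the patch): the design above relies on a single character offset as the shared currency for chapter jumps, paged navigation, and mode switching. The following is a minimal illustration of that lookup, assuming the view model keeps sorted start offsets for chapters and for the current page layout. `OffsetTable`, `index(containing:)`, and all sample offsets are hypothetical names and values chosen for this example; the spec does not define them.

```swift
// Hypothetical helper, not defined by the spec: maps a character offset to the
// entry (chapter or page) whose start offset is the largest one <= offset.
struct OffsetTable {
    /// Start offsets in ascending order; the first entry is normally 0.
    let starts: [Int]

    /// Index of the entry containing `offset`, or nil if the table is empty
    /// or the offset lies before the first entry.
    func index(containing offset: Int) -> Int? {
        var low = 0
        var high = starts.count  // search window is [low, high)
        // Binary search for the first start strictly greater than `offset`.
        while low < high {
            let mid = (low + high) / 2
            if starts[mid] <= offset {
                low = mid + 1
            } else {
                high = mid
            }
        }
        return low == 0 ? nil : low - 1
    }
}

// Made-up sample data: three chapters and one possible paged layout of the same text.
let chapters = OffsetTable(starts: [0, 1_200, 3_400])
let pages = OffsetTable(starts: [0, 500, 1_000, 1_500, 2_000, 2_500, 3_000, 3_500])

// "Jump to chapter": paged mode looks up the page holding the chapter's start offset.
let chapterStart = chapters.starts[1]
let targetPage = pages.index(containing: chapterStart)

// Mode switch: the saved character offset is the only state carried across modes.
let lastPosition = 2_750
let pageAfterSwitch = pages.index(containing: lastPosition)
let chapterAtPosition = chapters.index(containing: lastPosition)
```

When pages reflow on rotation or window resize, only the page table would need rebuilding under this scheme; the stored offset and the chapter table stay valid, which is the property the position-persistence paragraph depends on.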
+
+## Files Changed
+
+**VorleserKit package:**
+- `AudioEngine.swift` — autoreleasepool, GPU drain
+- `Book.swift` — cache sentences as stored property
+- `Chapter.swift` — add `attributedText: NSAttributedString`
+- `EPUBParser.swift` — build NSAttributedString from HTML DOM
+- `PlainTextParser.swift` — build NSAttributedString from plain text
+- New: `ReaderViewModel.swift` — shared reader logic
+
+**iOS app (`Vorleser-iOS/`):**
+- New: `BookTextView.swift` — UIViewRepresentable wrapping UITextView
+- New: paged mode view with UIPageViewController + TextKit 2
+- `ReaderView.swift` — rewrite as thin shell over ReaderViewModel
+- Remove: `ReadingTextView.swift`
+
+**macOS app (`Vorleser-macOS/`):**
+- New: `BookTextView.swift` — NSViewRepresentable wrapping NSTextView
+- New: paged mode view with TextKit 2 pagination + swipe/arrows
+- `MacReaderView.swift` — rewrite as thin shell over ReaderViewModel
+- Remove: `MacReadingTextView.swift`
+
+## Not In Scope
+
+- Voice selection UI
+- Bookmarks
+- Table of contents sidebar
+- Search within book
+- Night mode / theme customization
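
Reviewer sketch (not part of the patch): the `Book.swift` entry under Files Changed turns `sentences` from a computed property into a stored one. A rough illustration of that change follows, under stated assumptions: the real `Book` and `SentenceSegmenter` types are not shown in this spec, so `Sentence`, `NaiveSegmenter`, and the splitting rule are hypothetical stand-ins.

```swift
// Hypothetical minimal types; the real VorleserKit definitions may differ.
struct Sentence: Equatable {
    let text: String
    let offset: Int  // position of the sentence's first character in the full text
}

enum NaiveSegmenter {
    // Stand-in segmenter: ends a sentence at ".", "!" or "?".
    // Real sentence segmentation is considerably more involved.
    static func segment(_ text: String) -> [Sentence] {
        var sentences: [Sentence] = []
        var current = ""
        var start = 0
        for (index, character) in text.enumerated() {
            // Skip whitespace between sentences so offsets point at real text.
            if current.isEmpty && character.isWhitespace { continue }
            if current.isEmpty { start = index }
            current.append(character)
            if character == "." || character == "!" || character == "?" {
                sentences.append(Sentence(text: current, offset: start))
                current = ""
            }
        }
        // Keep a trailing fragment that has no terminator.
        if !current.isEmpty {
            sentences.append(Sentence(text: current, offset: start))
        }
        return sentences
    }
}

struct Book {
    let text: String
    /// Stored, not computed: segmentation runs once in `init`,
    /// so repeated reads during playback do not re-segment the book.
    let sentences: [Sentence]

    init(text: String) {
        self.text = text
        self.sentences = NaiveSegmenter.segment(text)
    }
}

let book = Book(text: "Hello there. How are you? Fine")
```

Offsets in this sketch count Swift `Character`s, whereas `NSAttributedString` ranges are measured in UTF-16 code units, so a real implementation would need to settle on one convention for the highlight ranges to line up.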