diff --git a/docs/superpowers/specs/2026-03-14-memory-fix-and-reader-ui-design.md b/docs/superpowers/specs/2026-03-14-memory-fix-and-reader-ui-design.md
new file mode 100644
index 0000000..aa6ecb2
--- /dev/null
+++ b/docs/superpowers/specs/2026-03-14-memory-fix-and-reader-ui-design.md
@@ -0,0 +1,125 @@
+# Memory Fix and Reader UI Redesign
+
+**Date:** 2026-03-14
+**Status:** Approved
+**Platforms:** iOS and macOS
+
+## Problem Statement
+
+Two issues with the current Vorleser app:
+
+1. **Memory crash on iOS** — The TTS synthesis loop accumulates MLX tensors across sentences without releasing them. iOS hits its ~1-2 GB memory ceiling and crashes after a few sentences of playback.
+2. **Reading UI is basic** — Text is shown one chapter at a time. Users want a proper ebook reading experience: continuous scrolling or paged (Apple Books-style) reading with chapter navigation, preserved formatting, and auto-follow during playback.
+
+## Design
+
+### 1. Memory Fix
+
+**Root cause:** Each `synthesizer.synthesize(text:)` call creates MLX tensors (model activations, attention matrices, audio output) that accumulate because:
+- Swift ARC doesn't eagerly release autoreleased ObjC/C++ objects inside async loops
+- MLX holds a GPU memory pool that grows unless explicitly drained
+
+**Changes:**
+1. Wrap each `synthesizer.synthesize()` call in `AudioEngine.playbackLoop()` — including the prefetch task — in `autoreleasepool { }` to force release of ObjC-bridged temporaries.
+2. After each synthesis call, invoke `MLX.GPU.drain()` (or equivalent MLXUtilsLibrary cleanup API) to release the GPU memory pool.
+3. Cache `Book.sentences` — currently a computed property that re-segments the entire book on every access. Change to a stored property computed once at init.
+
+### 2. EPUB Parser — Preserving Structure and Formatting
+
+Currently `EPUBParser` strips all HTML to plain text and collapses whitespace. For a proper reading experience, output `NSAttributedString` instead.
+
+**Changes to BookParser module:**
+- `Chapter` stores `attributedText: NSAttributedString` alongside the plain `text: String` property (derived via `.string`).
+- `EPUBParser` walks the SwiftSoup DOM and builds an `NSAttributedString`:
+  - `<p>` → paragraph with spacing
+  - `<br>` → line break
+  - `<b>`, `<strong>` → bold trait
+  - `<i>`, `<em>` → italic trait
+  - `<h1>`–`<h6>` → bold + larger font size
+  - Everything else → body font
+- Use dynamic type (`UIFont.preferredFont` / `NSFont.preferredFont`) as the base, respecting system font size settings.
+- `PlainTextParser` similarly produces `NSAttributedString` with paragraph breaks on `\n\n`.
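+
+The tag-to-trait mapping above can be sketched independently of UIKit/AppKit. This is a minimal sketch, not the parser itself: `RunStyle`, `BlockEffect`, and both functions are hypothetical names, the usual HTML tag names are assumed, and the heading scale of 1.4 is an assumed value (the spec only says "larger"). A real implementation would turn the result into font attributes on top of the dynamic-type body font.
+
+```swift
+// Hypothetical, platform-neutral description of the traits a text run carries.
+struct RunStyle: Equatable {
+    var bold = false
+    var italic = false
+    var sizeScale = 1.0   // multiplier applied to the dynamic-type body size
+}
+
+// How a tag affects block layout, independent of font traits.
+enum BlockEffect: Equatable {
+    case none, paragraph, lineBreak
+}
+
+// Style for content inside `tag`, starting from the style inherited from its parent,
+// so nested tags (e.g. <em> inside <strong>) combine their traits.
+func style(forTag tag: String, inheriting parent: RunStyle) -> RunStyle {
+    var style = parent
+    switch tag.lowercased() {
+    case "b", "strong":
+        style.bold = true
+    case "i", "em":
+        style.italic = true
+    case "h1", "h2", "h3", "h4", "h5", "h6":
+        style.bold = true
+        style.sizeScale = 1.4   // assumption: the spec does not give a value
+    default:
+        break                   // everything else keeps the inherited (body) style
+    }
+    return style
+}
+
+// Block-level effect of a tag: paragraph spacing, a line break, or nothing.
+func blockEffect(forTag tag: String) -> BlockEffect {
+    switch tag.lowercased() {
+    case "p": return .paragraph
+    case "br": return .lineBreak
+    default: return .none
+    }
+}
+```
+
+Passing the parent style down during the DOM walk is what lets `<strong><em>text</em></strong>` come out bold and italic without special cases.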
+
+**Offset compatibility:** `SentenceSegmenter` continues to operate on plain `String`. Character offsets remain valid because `NSAttributedString.string` matches the plain text used for segmentation.
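+
+One detail worth pinning down is the unit of those offsets. `NSAttributedString` ranges count UTF-16 code units, while counting Swift `Character`s can give a different number for the same position. The sketch below assumes the segmenter returns `Range<String.Index>` values (the spec does not say) and uses a hypothetical helper, `attributedRange(for:in:)`, to convert them with Foundation's `NSRange(_:in:)`.
+
+```swift
+import Foundation
+
+// Convert a sentence range found in the plain String into the NSRange
+// (UTF-16 code units) that NSAttributedString APIs expect.
+func attributedRange(for sentence: Range<String.Index>, in text: String) -> NSRange {
+    NSRange(sentence, in: text)
+}
+
+let text = "Hi 👋. Bye."
+let attributed = NSAttributedString(string: text)
+
+// `attributed.string` is the same text the segmenter sees.
+let sentence = text.range(of: "Bye.")!
+let range = attributedRange(for: sentence, in: text)
+
+// The emoji is one Character but two UTF-16 units, so the two counts differ:
+// 6 Characters precede "Bye." but its NSRange location is 7.
+let characterOffset = text.distance(from: text.startIndex, to: sentence.lowerBound)
+print(characterOffset, range.location)
+```
+
+If stored offsets such as `lastPosition` are Character counts, they need the same conversion before being used as highlight or scroll ranges.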
+
+### 3. Reading UI — BookTextView (per platform)
+
+Replace `ReadingTextView` (iOS) and `MacReadingTextView` (macOS) with a single `BookTextView` per platform.
+
+**Responsibilities:**
+- Display the **full book** as one `NSAttributedString` (all chapters concatenated with chapter break spacing).
+- Sentence highlighting via temporary text attributes (yellow background on active sentence range).
+- Tap/click → character offset callback for tap-to-play.
+- Programmatic scrolling to a character offset (chapter jumps, auto-follow during playback).
+
+**Two reading modes:**
+
+1. **Scroll mode:** The text view uses standard scrolling (`UITextView` / `NSScrollView`). The full book is one tall scrollable column.
+
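+2. **Book (paged) mode:** TextKit 2 pagination — the `NSTextLayoutManager` lays text into page-sized `NSTextContainer`s. Navigation via `UIPageViewController` (iOS) or horizontal swipe/arrow keys (macOS). Pages reflow dynamically on rotation / window resize. Caveat: `NSTextLayoutManager` exposes a single `textContainer`, so laying text into several page-sized containers may require TextKit 1's `NSLayoutManager` or one layout manager per page; confirm this before implementation.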
+
+**Chapter navigation:**
+- Each `Chapter` knows its character offset within the full book text.
+- "Jump to chapter" computes the offset → scroll mode scrolls to it; paged mode finds which page contains that offset and navigates there.
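+
+The chapter offset table and the reverse lookup (which chapter contains a given offset) can be sketched as follows. This is a sketch under assumptions: the function names are hypothetical, offsets are plain integers in whatever unit the app uses, and chapters are joined by a fixed-length break.
+
+```swift
+// Start offset of each chapter in the concatenated book text.
+// `separatorLength` is the length of the chapter-break text inserted between chapters.
+func chapterStartOffsets(chapterLengths: [Int], separatorLength: Int) -> [Int] {
+    var offsets: [Int] = []
+    var next = 0
+    for length in chapterLengths {
+        offsets.append(next)
+        next += length + separatorLength
+    }
+    return offsets
+}
+
+// Index of the chapter containing `offset`: the last chapter whose start is <= offset.
+// Binary search over the sorted start offsets; returns nil for an empty table
+// or an offset before the first chapter.
+func chapterIndex(containing offset: Int, in starts: [Int]) -> Int? {
+    guard let first = starts.first, offset >= first else { return nil }
+    var low = 0
+    var high = starts.count - 1
+    while low < high {
+        let mid = (low + high + 1) / 2
+        if starts[mid] <= offset {
+            low = mid
+        } else {
+            high = mid - 1
+        }
+    }
+    return low
+}
+
+let starts = chapterStartOffsets(chapterLengths: [100, 50, 75], separatorLength: 2)
+// starts == [0, 102, 154]
+```
+
+Jumping to chapter `i` then means scrolling (or paging) to `starts[i]`, and the same table can keep the chapter picker in sync while the reader scrolls.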
+
+**Auto-follow during playback:**
+- Observe `engine.currentPosition` → compute highlighted sentence range → apply highlight attribute → scroll/page to keep the active sentence visible.
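+
+The auto-follow step reduces to two small decisions: which sentence is active, and whether it is already visible. The sketch below assumes sentence ranges are available as sorted integer ranges and that the view can report its visible character range; both function names are hypothetical.
+
+```swift
+// Index of the sentence whose range contains the playback position, if any.
+// A linear scan keeps the sketch short; a binary search would suit long books better.
+func activeSentenceIndex(at position: Int, in sentences: [Range<Int>]) -> Int? {
+    sentences.firstIndex { $0.contains(position) }
+}
+
+// True when the sentence is not fully inside the currently visible range,
+// i.e. the view should scroll or turn the page.
+func needsScroll(toShow sentence: Range<Int>, visible: Range<Int>) -> Bool {
+    let fullyVisible = visible.lowerBound <= sentence.lowerBound
+        && sentence.upperBound <= visible.upperBound
+    return !fullyVisible
+}
+```
+
+Scrolling only when `needsScroll` is true avoids moving the text on every sentence change, which matters most in scroll mode.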
+
+### 4. ReaderViewModel — Shared Logic
+
+Extract duplicated logic from `ReaderView` (iOS) and `MacReaderView` (macOS) into a shared `ReaderViewModel` (`@Observable` class).
+
+**Owns:**
+- Book loading and parsing
+- Synthesizer initialization
+- Reference to `AudioEngine`
+- Current chapter index, reading mode (scroll / paged), full attributed string
+- Chapter offset table for jump navigation
+- Highlight range computation
+
+**Platform views become thin shells:**
+- Toolbar: chapter picker, mode toggle (scroll / book)
+- `BookTextView` (platform-specific wrapper)
+- `PlaybackControls`
+- Wire to the view model
+
+### 5. Playback Controls and Mode Switching
+
+**Playback controls** unchanged (play/pause, skip forward/back at sentence granularity). Auto-follow keeps the active sentence visible regardless of mode.
+
+**Mode toggle:**
+- Segmented control or toolbar button: scroll ↔ book mode.
+- Switching modes preserves reading position (same character offset stays visible).
+- Mode preference persisted on `StoredBook`.
+
+**Position persistence:** Existing `lastPosition` (character offset) continues to work — offsets are display-mode-independent.
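+
+A minimal sketch of what gets persisted, assuming the mode is stored as a string-backed enum. `ReadingMode`, `ReadingState`, and `switchMode(_:to:)` are hypothetical stand-ins; the actual shape of `StoredBook` is not shown in this spec.
+
+```swift
+import Foundation
+
+// Hypothetical persisted representation of the reading mode.
+enum ReadingMode: String, Codable {
+    case scroll
+    case paged
+}
+
+// Stand-in for the fields `StoredBook` would carry.
+struct ReadingState: Codable, Equatable {
+    var lastPosition: Int          // character offset, independent of display mode
+    var readingMode: ReadingMode
+}
+
+// Switching modes changes only the mode; the offset is carried over unchanged.
+func switchMode(_ state: ReadingState, to mode: ReadingMode) -> ReadingState {
+    ReadingState(lastPosition: state.lastPosition, readingMode: mode)
+}
+```
+
+Because only the offset is stored, neither mode needs to know how the other laid out the text: scroll mode scrolls to the offset, and paged mode looks up the page that contains it.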
+
+## Files Changed
+
+**VorleserKit package:**
+- `AudioEngine.swift` — autoreleasepool, GPU drain
+- `Book.swift` — cache sentences as stored property
+- `Chapter.swift` — add `attributedText: NSAttributedString`
+- `EPUBParser.swift` — build NSAttributedString from HTML DOM
+- `PlainTextParser.swift` — build NSAttributedString from plain text
+- New: `ReaderViewModel.swift` — shared reader logic
+
+**iOS app (`Vorleser-iOS/`):**
+- New: `BookTextView.swift` — UIViewRepresentable wrapping UITextView
+- New: paged mode view with UIPageViewController + TextKit 2
+- `ReaderView.swift` — rewrite as thin shell over ReaderViewModel
+- Remove: `ReadingTextView.swift`
+
+**macOS app (`Vorleser-macOS/`):**
+- New: `BookTextView.swift` — NSViewRepresentable wrapping NSTextView
+- New: paged mode view with TextKit 2 pagination + swipe/arrows
+- `MacReaderView.swift` — rewrite as thin shell over ReaderViewModel
+- Remove: `MacReadingTextView.swift`
+
+## Not In Scope
+
+- Voice selection UI
+- Bookmarks
+- Table of contents sidebar
+- Search within book
+- Night mode / theme customization