add design spec: memory fix, reader UI redesign

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 04:55:40 +01:00
parent 68eee5dd6d
commit 59f4d54cfe


@@ -0,0 +1,125 @@
# Memory Fix and Reader UI Redesign
**Date:** 2026-03-14
**Status:** Approved
**Platforms:** iOS and macOS
## Problem Statement
Two issues with the current Vorleser app:
1. **Memory crash on iOS** — The TTS synthesis loop accumulates MLX tensors across sentences without releasing them. iOS hits its ~1-2 GB memory ceiling and crashes after a few sentences of playback.
2. **Reading UI is basic** — Text is shown one chapter at a time. Users want a proper ebook reading experience: continuous scrolling or paged (Apple Books-style) reading with chapter navigation, preserved formatting, and auto-follow during playback.
## Design
### 1. Memory Fix
**Root cause:** Each `synthesizer.synthesize(text:)` call creates MLX tensors (model activations, attention matrices, audio output) that accumulate because:
- Autoreleased ObjC/C++ objects are only released when the enclosing autorelease pool drains, which does not reliably happen between iterations of a long-running async loop
- MLX holds a GPU memory pool that grows unless explicitly drained
**Changes:**
1. In `AudioEngine.playbackLoop()`, wrap each `synthesizer.synthesize()` call, including the one in the prefetch task, in `autoreleasepool { }` to force release of ObjC-bridged temporaries.
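2. After each synthesis call, invoke `MLX.GPU.drain()` or whichever cache-clearing call the MLX Swift version in use actually provides (for example `MLX.GPU.clearCache()`, or an equivalent MLXUtilsLibrary cleanup API) to release the GPU memory pool.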
3. Cache `Book.sentences` — currently a computed property that re-segments the entire book on every access. Change to a stored property computed once at init.
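
The three changes above can be sketched roughly as follows. This is a minimal illustration, not the actual `AudioEngine` code: the `Synthesizing` protocol, `withPool`, `synthesizeAll`, and the `segment` closure are hypothetical names, `synthesize(text:)` is assumed to be synchronous (a plain `autoreleasepool` closure cannot contain `await`), and the GPU cleanup is passed in as a closure because the exact MLX call is still open.

```swift
import Foundation

// Hypothetical stand-in for the real synthesizer; assumed synchronous.
protocol Synthesizing {
    func synthesize(text: String) throws -> [Float]
}

/// Runs `body` inside an autorelease pool where the Objective-C runtime exists,
/// and simply calls it elsewhere.
func withPool<T>(_ body: () throws -> T) rethrows -> T {
    #if canImport(ObjectiveC)
    return try autoreleasepool(invoking: body)
    #else
    return try body()
    #endif
}

/// Per-sentence loop: pool around synthesis, then an explicit GPU cache release.
func synthesizeAll(
    sentences: [String],
    with synthesizer: Synthesizing,
    releaseGPUCache: () -> Void,   // assumption: wraps whichever MLX cleanup call is chosen
    play: ([Float]) -> Void
) throws {
    for sentence in sentences {
        // Bridged temporaries created during synthesis die when the pool drains.
        let samples = try withPool { try synthesizer.synthesize(text: sentence) }
        releaseGPUCache()
        play(samples)
    }
}

/// `sentences` is a stored property, so segmentation runs once at init
/// instead of on every access.
struct Book {
    let text: String
    let sentences: [String]

    init(text: String, segment: (String) -> [String]) {
        self.text = text
        self.sentences = segment(text)
    }
}
```

The prefetch task would go through the same `withPool` wrapper, so both synthesis paths release their temporaries at the same point.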
### 2. EPUB Parser — Preserving Structure and Formatting
Currently `EPUBParser` strips all HTML to plain text and collapses whitespace. For a proper reading experience, output `NSAttributedString` instead.
**Changes to BookParser module:**
- `Chapter` stores `attributedText: NSAttributedString`; the existing plain `text: String` property stays and is derived from it via `.string`.
- `EPUBParser` walks the SwiftSoup DOM and builds an `NSAttributedString`:
  - `<p>` → paragraph with spacing
  - `<br>` → line break
  - `<b>`, `<strong>` → bold trait
  - `<i>`, `<em>` → italic trait
  - `<h1>` through `<h6>` → bold + larger font size
  - Everything else → body font
- Use dynamic type (`UIFont.preferredFont` / `NSFont.preferredFont`) as the base, respecting system font size settings.
- `PlainTextParser` similarly produces `NSAttributedString` with paragraph breaks on `\n\n`.
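
The tag mapping above could look roughly like the sketch below. To stay self-contained it makes several simplifying assumptions: `HTMLNode` is a hypothetical in-memory tree standing in for the SwiftSoup DOM, `Traits` and the `.traits` attribute key are illustrative placeholders for real `UIFont` / `NSFont` and paragraph-style attributes, and block elements are terminated with a plain newline rather than paragraph spacing.

```swift
import Foundation

// Hypothetical stand-in for a parsed HTML node; the real parser walks SwiftSoup elements.
enum HTMLNode {
    case text(String)
    case element(tag: String, children: [HTMLNode])
}

// Illustrative style description; a real implementation would resolve this to a font.
struct Traits: Equatable {
    var bold = false
    var italic = false
    var headingLevel: Int? = nil
}

extension NSAttributedString.Key {
    // Illustrative custom key, not a system attribute.
    static let traits = NSAttributedString.Key(rawValue: "vorleser.traits")
}

/// Recursively appends `node` to `out`, inheriting traits from enclosing elements.
func render(_ node: HTMLNode, inherited: Traits, into out: NSMutableAttributedString) {
    switch node {
    case .text(let string):
        out.append(NSAttributedString(string: string, attributes: [.traits: inherited]))
    case .element(let tag, let children):
        if tag == "br" {
            out.append(NSAttributedString(string: "\n", attributes: [.traits: inherited]))
            return
        }
        var traits = inherited
        var isBlock = false
        switch tag {
        case "b", "strong":
            traits.bold = true
        case "i", "em":
            traits.italic = true
        case "h1", "h2", "h3", "h4", "h5", "h6":
            traits.bold = true
            traits.headingLevel = Int(tag.dropFirst())
            isBlock = true
        case "p":
            isBlock = true
        default:
            break   // everything else keeps the inherited (body) traits
        }
        for child in children {
            render(child, inherited: traits, into: out)
        }
        if isBlock {
            out.append(NSAttributedString(string: "\n", attributes: [.traits: traits]))
        }
    }
}

// Usage: <p>Hello <b>world</b></p>
let sampleTree = HTMLNode.element(tag: "p", children: [
    .text("Hello "),
    .element(tag: "b", children: [.text("world")]),
])
let rendered = NSMutableAttributedString()
render(sampleTree, inherited: Traits(), into: rendered)
```

Because traits are inherited down the tree, nested markup such as `<b><i>…</i></b>` accumulates both traits without special cases.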
**Offset compatibility:** `SentenceSegmenter` continues to operate on plain `String`. Character offsets remain valid because `NSAttributedString.string` matches the plain text used for segmentation.
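
One detail worth making explicit: `NSAttributedString` ranges are `NSRange` values counted in UTF-16 code units, while offsets into a Swift `String` are often counted in `Character`s. The spec does not state which unit `SentenceSegmenter` uses, so the sketch below assumes `Character` offsets and shows the conversion; `nsRange(forCharacterOffset:length:in:)` is a hypothetical helper name.

```swift
import Foundation

/// Converts a Character-based offset and length into the UTF-16 based NSRange
/// that NSAttributedString expects. Returns nil if the span is out of bounds.
func nsRange(forCharacterOffset offset: Int, length: Int, in text: String) -> NSRange? {
    guard offset >= 0, length >= 0,
          let start = text.index(text.startIndex, offsetBy: offset, limitedBy: text.endIndex),
          let end = text.index(start, offsetBy: length, limitedBy: text.endIndex)
    else { return nil }
    return NSRange(start..<end, in: text)
}

// The emoji is one Character but two UTF-16 code units, so the two offsets differ.
let sampleText = "Hi 👋. Zweiter Satz."
let sampleAttributed = NSAttributedString(string: sampleText)
let secondSentence = nsRange(forCharacterOffset: 6, length: 13, in: sampleText)
```

If the segmenter already reports UTF-16 offsets, no conversion is needed and the offsets can be used in `NSRange` directly.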
### 3. Reading UI — BookTextView (per platform)
Replace `ReadingTextView` (iOS) and `MacReadingTextView` (macOS) with a single `BookTextView` per platform.
**Responsibilities:**
- Display the **full book** as one `NSAttributedString` (all chapters concatenated with chapter break spacing).
- Sentence highlighting via temporary text attributes (yellow background on active sentence range).
- Tap/click → character offset callback for tap-to-play.
- Programmatic scrolling to a character offset (chapter jumps, auto-follow during playback).
**Two reading modes:**
1. **Scroll mode:** The text view uses standard scrolling (`UITextView` on iOS, `NSTextView` inside an `NSScrollView` on macOS). The full book is one tall scrollable column.
2. **Book (paged) mode:** TextKit 2 pagination — the `NSTextLayoutManager` lays text into page-sized `NSTextContainer`s. Navigation via `UIPageViewController` (iOS) or horizontal swipe/arrow keys (macOS). Pages reflow dynamically on rotation / window resize.
**Chapter navigation:**
- Each `Chapter` knows its character offset within the full book text.
- "Jump to chapter" computes the offset → scroll mode scrolls to it; paged mode finds which page contains that offset and navigates there.
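
A minimal sketch of the offset table behind chapter navigation, assuming chapters are concatenated with a fixed-length separator. The function names and the separator length are illustrative; the lengths must be measured in the same unit as the offsets used elsewhere.

```swift
/// Start offset of each chapter in the concatenated book text.
func chapterOffsets(lengths: [Int], separatorLength: Int) -> [Int] {
    var offsets: [Int] = []
    var cursor = 0
    for length in lengths {
        offsets.append(cursor)
        cursor += length + separatorLength
    }
    return offsets
}

/// Index of the last entry in `starts` that is <= `offset` (binary search).
/// `starts` must be sorted ascending. Returns nil if `offset` precedes every entry.
func index(containing offset: Int, in starts: [Int]) -> Int? {
    guard let first = starts.first, offset >= first else { return nil }
    var low = 0
    var high = starts.count - 1
    while low < high {
        let mid = (low + high + 1) / 2
        if starts[mid] <= offset {
            low = mid
        } else {
            high = mid - 1
        }
    }
    return low
}

// Usage: three chapters joined by a two-character break.
let sampleOffsets = chapterOffsets(lengths: [100, 50, 75], separatorLength: 2)
let sampleChapter = index(containing: 120, in: sampleOffsets)
```

The same lookup works for paged mode if `starts` holds the first character offset of each page, which gives the page to navigate to for a chapter jump.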
**Auto-follow during playback:**
- Observe `engine.currentPosition` → compute highlighted sentence range → apply highlight attribute → scroll/page to keep the active sentence visible.
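
The auto-follow step could be driven by a small tracker like the one below, which reports only actual highlight changes so the view re-applies attributes and scrolls once per sentence rather than on every position update. `HighlightTracker` is a hypothetical type, and sentence ranges are assumed to be sorted, non-overlapping offset ranges in the full book text.

```swift
struct HighlightTracker {
    let sentences: [Range<Int>]
    private(set) var highlight: Range<Int>?

    init(sentences: [Range<Int>]) {
        self.sentences = sentences
        self.highlight = nil
    }

    /// Updates the highlight for a new playback position.
    /// Returns true when the highlight changed and the view should redraw and scroll.
    mutating func update(position: Int) -> Bool {
        // Fast path: still inside the currently highlighted sentence.
        if let current = highlight, current.contains(position) {
            return false
        }
        // Linear scan for clarity; a binary search would suit very long books.
        let next = sentences.first { $0.contains(position) }
        guard next != highlight else { return false }
        // In a gap between sentences `next` is nil, which clears the highlight.
        highlight = next
        return true
    }
}
```

Whether the highlight should be cleared or kept while the position sits between two sentences is a design choice; this sketch clears it.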
### 4. ReaderViewModel — Shared Logic
Extract duplicated logic from `ReaderView` (iOS) and `MacReaderView` (macOS) into a shared `ReaderViewModel` (`@Observable` class).
**Owns:**
- Book loading and parsing
- Synthesizer initialization
- Reference to `AudioEngine`
- Current chapter index, reading mode (scroll / paged), full attributed string
- Chapter offset table for jump navigation
- Highlight range computation
**Platform views become thin shells:**
- Toolbar: chapter picker, mode toggle (scroll / book)
- `BookTextView` (platform-specific wrapper)
- `PlaybackControls`
- Wire to the view model
### 5. Playback Controls and Mode Switching
**Playback controls** remain unchanged (play/pause, skip forward/back at sentence granularity). Auto-follow keeps the active sentence visible regardless of mode.
**Mode toggle:**
- Segmented control or toolbar button: scroll ↔ book mode.
- Switching modes preserves reading position (same character offset stays visible).
- Mode preference persisted on `StoredBook`.
**Position persistence:** Existing `lastPosition` (character offset) continues to work — offsets are display-mode-independent.
## Files Changed
**VorleserKit package:**
- `AudioEngine.swift` — autoreleasepool, GPU drain
- `Book.swift` — cache sentences as stored property
- `Chapter.swift` — add `attributedText: NSAttributedString`
- `EPUBParser.swift` — build NSAttributedString from HTML DOM
- `PlainTextParser.swift` — build NSAttributedString from plain text
- New: `ReaderViewModel.swift` — shared reader logic
**iOS app (`Vorleser-iOS/`):**
- New: `BookTextView.swift` — UIViewRepresentable wrapping UITextView
- New: paged mode view with UIPageViewController + TextKit 2
- `ReaderView.swift` — rewrite as thin shell over ReaderViewModel
- Remove: `ReadingTextView.swift`
**macOS app (`Vorleser-macOS/`):**
- New: `BookTextView.swift` — NSViewRepresentable wrapping NSTextView
- New: paged mode view with TextKit 2 pagination + swipe/arrows
- `MacReaderView.swift` — rewrite as thin shell over ReaderViewModel
- Remove: `MacReadingTextView.swift`
## Not In Scope
- Voice selection UI
- Bookmarks
- Table of contents sidebar
- Search within book
- Night mode / theme customization