add design spec: memory fix, reader UI redesign
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,125 @@
# Memory Fix and Reader UI Redesign

**Date:** 2026-03-14
**Status:** Approved
**Platforms:** iOS and macOS

## Problem Statement

The current Vorleser app has two issues:

1. **Memory crash on iOS:** The TTS synthesis loop accumulates MLX tensors across sentences without releasing them. The app hits the iOS memory ceiling (~1-2 GB) and crashes after a few sentences of playback.
2. **Reading UI is basic:** Text is shown one chapter at a time. Users want a proper ebook reading experience: continuous scrolling or paged (Apple Books-style) reading with chapter navigation, preserved formatting, and auto-follow during playback.

## Design

### 1. Memory Fix

**Root cause:** Each `synthesizer.synthesize(text:)` call creates MLX tensors (model activations, attention matrices, audio output) that accumulate because:
- Swift ARC doesn't eagerly release autoreleased ObjC/C++ objects inside async loops; they stay alive until the enclosing autorelease pool drains
- MLX holds a GPU memory pool that grows unless explicitly drained

**Changes:**
1. Wrap each `synthesizer.synthesize()` call in `AudioEngine.playbackLoop()`, including the one in the prefetch task, in `autoreleasepool { }` to force release of ObjC-bridged temporaries. The pool body is synchronous, so the wrapped call cannot `await`.
2. After each synthesis call, invoke `MLX.GPU.drain()` (or equivalent MLXUtilsLibrary cleanup API) to release the GPU memory pool.
3. Cache `Book.sentences`, which is currently a computed property that re-segments the entire book on every access. Change it to a stored property computed once at init.

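The three changes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the app's code: `Synthesizing`, `withAutoreleasePool`, `releaseGPUPool`, `handleAudio`, and the naive `segment` helper are hypothetical stand-ins (the spec only names `synthesizer.synthesize(text:)`, `Book.sentences`, and `SentenceSegmenter`), and synthesis is assumed to be a synchronous call.

```swift
import Foundation

/// Hypothetical stand-in for the real synthesizer type.
protocol Synthesizing {
    func synthesize(text: String) throws -> [Float]
}

/// Runs `body` in its own autorelease pool where the ObjC runtime exists;
/// elsewhere it simply calls `body`.
func withAutoreleasePool<T>(_ body: () throws -> T) rethrows -> T {
    #if canImport(ObjectiveC)
    return try autoreleasepool(invoking: body)
    #else
    return try body()
    #endif
}

/// Naive splitter used only for illustration; the app uses `SentenceSegmenter`.
func segment(_ text: String) -> [String] {
    text.split(separator: ".")
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty }
}

struct Book {
    let text: String
    /// Stored once at init instead of being re-segmented on every access.
    let sentences: [String]

    init(text: String) {
        self.text = text
        self.sentences = segment(text)
    }
}

/// Synthesizes sentence by sentence and releases temporaries after each one.
/// `releaseGPUPool` stands in for whatever MLX cleanup call the app ends up using.
func synthesizeAll(
    in book: Book,
    using synthesizer: Synthesizing,
    releaseGPUPool: () -> Void,
    handleAudio: ([Float]) -> Void
) throws {
    for sentence in book.sentences {
        let samples = try withAutoreleasePool {
            try synthesizer.synthesize(text: sentence)
        }
        releaseGPUPool()
        handleAudio(samples)
    }
}
```

Passing the cleanup step in as a closure keeps the loop testable without MLX; the prefetch task would go through the same wrapper.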
### 2. EPUB Parser: Preserving Structure and Formatting

Currently `EPUBParser` strips all HTML to plain text and collapses whitespace. For a proper reading experience, it should output `NSAttributedString` instead.

**Changes to BookParser module:**
- `Chapter` stores `attributedText: NSAttributedString` alongside the plain `text: String` property (derived via `.string`).
- `EPUBParser` walks the SwiftSoup DOM and builds an `NSAttributedString`:
  - `<p>` → paragraph with spacing
  - `<br>` → line break
  - `<b>`, `<strong>` → bold trait
  - `<i>`, `<em>` → italic trait
  - `<h1>`–`<h6>` → bold + larger font size
  - Everything else → body font
  - Use dynamic type (`UIFont.preferredFont` / `NSFont.preferredFont`) as the base, respecting system font size settings.
- `PlainTextParser` similarly produces `NSAttributedString` with paragraph breaks on `\n\n`.

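The tag mapping above can be sketched as a recursive walk. To stay independent of SwiftSoup and of UIKit/AppKit fonts, this sketch makes two loud assumptions: `Node` is a hypothetical, simplified stand-in for the parsed DOM, and styles are recorded under a hypothetical `vorleser.style` attribute key instead of real `.font` and `.paragraphStyle` attributes.

```swift
import Foundation

/// Hypothetical, simplified stand-in for the SwiftSoup DOM.
enum Node {
    case text(String)
    case element(tag: String, children: [Node])
}

/// Hypothetical attribute key. A real parser would set `.font` and
/// `.paragraphStyle` rather than a style name.
let styleKey = NSAttributedString.Key("vorleser.style")

let blockTags: Set<String> = ["p", "h1", "h2", "h3", "h4", "h5", "h6"]

/// Maps a tag to the style its text should carry. Styles do not combine
/// here, so bold inside italic simply becomes bold.
func style(forTag tag: String, inherited: String) -> String {
    switch tag {
    case "b", "strong": return "bold"
    case "i", "em": return "italic"
    case "h1", "h2", "h3", "h4", "h5", "h6": return "heading"
    default: return inherited
    }
}

/// Depth-first walk that appends styled text to `output`.
func render(_ node: Node, inherited: String = "body",
            into output: NSMutableAttributedString) {
    switch node {
    case .text(let string):
        output.append(NSAttributedString(string: string,
                                         attributes: [styleKey: inherited]))
    case .element(let tag, let children):
        if tag == "br" {
            output.append(NSAttributedString(string: "\n"))
            return
        }
        let childStyle = style(forTag: tag, inherited: inherited)
        for child in children {
            render(child, inherited: childStyle, into: output)
        }
        // Block elements end with a newline so paragraphs stay separated.
        if blockTags.contains(tag) {
            output.append(NSAttributedString(string: "\n"))
        }
    }
}
```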
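**Offset compatibility:** `SentenceSegmenter` continues to operate on plain `String`. Character offsets remain valid because `NSAttributedString.string` matches the plain text used for segmentation. `NSAttributedString` ranges are `NSRange`s counted in UTF-16 code units, so offsets counted in Swift `Character`s must be converted before they are used as attribute ranges.
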
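A small sketch of how a segmenter offset could be turned into a range for the attributed string. It assumes the segmenter reports offsets in Swift `Character`s, which the spec does not state; `nsRange(forCharacterRange:in:)` is a hypothetical helper name.

```swift
import Foundation

/// Converts a range counted in Swift `Character`s into the UTF-16 based
/// `NSRange` used by `NSAttributedString`. Returns nil if out of bounds.
func nsRange(forCharacterRange range: Range<Int>, in text: String) -> NSRange? {
    guard range.lowerBound >= 0, range.upperBound <= text.count else { return nil }
    let start = text.index(text.startIndex, offsetBy: range.lowerBound)
    let end = text.index(start, offsetBy: range.count)
    return NSRange(start..<end, in: text)
}

// The leading emoji is one Character but two UTF-16 code units,
// so the two kinds of offset differ by one from that point on.
let sample = "\u{1F4D6} Intro. Next one."
let attributed = NSAttributedString(string: sample)
if let range = nsRange(forCharacterRange: 9..<18, in: sample) {
    print(attributed.attributedSubstring(from: range).string)
}
```

If the segmenter already reports UTF-16 offsets, no conversion is needed; the point is only that both sides must agree on the unit.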
### 3. Reading UI: BookTextView (per platform)

Replace `ReadingTextView` (iOS) and `MacReadingTextView` (macOS) with one `BookTextView` implementation per platform.

**Responsibilities:**
- Display the **full book** as one `NSAttributedString` (all chapters concatenated with chapter break spacing).
- Highlight the active sentence via temporary text attributes (yellow background on the active sentence range).
- Report the character offset of a tap/click through a callback, for tap-to-play.
- Scroll programmatically to a character offset (chapter jumps, auto-follow during playback).

**Two reading modes:**

1. **Scroll mode:** The text view uses standard scrolling (`UITextView` on iOS, `NSTextView` inside an `NSScrollView` on macOS). The full book is one tall scrollable column.

2. **Book (paged) mode:** TextKit 2 pagination, where the `NSTextLayoutManager` lays text into page-sized `NSTextContainer`s. Navigation is via `UIPageViewController` (iOS) or horizontal swipe/arrow keys (macOS). Pages reflow dynamically on rotation / window resize.

**Chapter navigation:**
- Each `Chapter` knows its character offset within the full book text.
- "Jump to chapter" computes the offset. Scroll mode scrolls to it; paged mode finds the page that contains the offset and navigates there.

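One way to sketch the offset table and the page lookup, assuming chapters are joined with a fixed-length separator and that paged mode keeps a sorted list of page start offsets. Both function names and the separator length are hypothetical.

```swift
/// Start offset of each chapter in the concatenated book text, assuming
/// chapters are joined with a separator of `separatorLength` units.
func chapterOffsets(lengths: [Int], separatorLength: Int) -> [Int] {
    var offsets: [Int] = []
    var next = 0
    for length in lengths {
        offsets.append(next)
        next += length + separatorLength
    }
    return offsets
}

/// Index of the last entry in `starts` (sorted ascending) that is <= `offset`.
/// Works for chapter starts and, in paged mode, for page starts.
func indexOfRange(containing offset: Int, starts: [Int]) -> Int? {
    guard let first = starts.first, offset >= first else { return nil }
    var low = 0
    var high = starts.count - 1
    while low < high {
        // Round up so the loop always makes progress when low + 1 == high.
        let mid = (low + high + 1) / 2
        if starts[mid] <= offset {
            low = mid
        } else {
            high = mid - 1
        }
    }
    return low
}

// Jumping to chapter 1: take its offset, then find the page that holds it.
let chapterStarts = chapterOffsets(lengths: [100, 250, 80], separatorLength: 2)
let pageStarts = [0, 90, 180, 270, 360]
let targetPage = indexOfRange(containing: chapterStarts[1], starts: pageStarts)
```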
**Auto-follow during playback:**
- Observe `engine.currentPosition` → compute highlighted sentence range → apply highlight attribute → scroll/page to keep the active sentence visible.

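The auto-follow pipeline can be sketched as a pure function from playback position to a view update. This assumes the view can report the character range it currently shows; `FollowUpdate` and `followUpdate(position:sentenceRanges:visible:)` are hypothetical names.

```swift
/// What the text view should do for the current playback position.
struct FollowUpdate: Equatable {
    let highlight: Range<Int>
    let needsScroll: Bool
}

/// Finds the sentence containing `position` and decides whether the view
/// has to scroll or turn the page to keep that sentence fully visible.
func followUpdate(
    position: Int,
    sentenceRanges: [Range<Int>],
    visible: Range<Int>
) -> FollowUpdate? {
    guard let active = sentenceRanges.first(where: { $0.contains(position) }) else {
        return nil // Position falls between sentences; keep the current state.
    }
    let fullyVisible = visible.lowerBound <= active.lowerBound
        && active.upperBound <= visible.upperBound
    return FollowUpdate(highlight: active, needsScroll: !fullyVisible)
}
```

Scrolling only when the sentence leaves the visible range avoids moving the text on every sentence change. The linear search is for brevity; a sorted lookup would suit long books better.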
### 4. ReaderViewModel: Shared Logic

Extract duplicated logic from `ReaderView` (iOS) and `MacReaderView` (macOS) into a shared `ReaderViewModel` (`@Observable` class).

**Owns:**
- Book loading and parsing
- Synthesizer initialization
- Reference to `AudioEngine`
- Current chapter index, reading mode (scroll / paged), full attributed string
- Chapter offset table for jump navigation
- Highlight range computation

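A skeleton of what such a view model might look like, limited to the state that needs no synthesizer or `AudioEngine`. Property and method names other than `ReaderViewModel` are assumptions, as is the `"\n\n"` chapter break; offsets here are in UTF-16 units because they come from `NSAttributedString.length`.

```swift
import Foundation
import Observation

enum ReadingMode {
    case scroll, paged
}

@Observable
final class ReaderViewModel {
    var readingMode: ReadingMode = .scroll
    private(set) var currentChapterIndex = 0
    private(set) var fullText = NSAttributedString()
    private(set) var chapterOffsets: [Int] = []

    /// Concatenates the chapters and records where each one starts.
    func load(chapters: [NSAttributedString]) {
        let combined = NSMutableAttributedString()
        let chapterBreak = NSAttributedString(string: "\n\n")
        var offsets: [Int] = []
        for (index, chapter) in chapters.enumerated() {
            if index > 0 { combined.append(chapterBreak) }
            offsets.append(combined.length)
            combined.append(chapter)
        }
        fullText = combined
        chapterOffsets = offsets
        currentChapterIndex = 0
    }

    /// Returns the offset the platform view should scroll or page to,
    /// or nil if the chapter index is out of range.
    func jump(toChapter index: Int) -> Int? {
        guard chapterOffsets.indices.contains(index) else { return nil }
        currentChapterIndex = index
        return chapterOffsets[index]
    }
}
```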
**Platform views become thin shells:**
- Toolbar: chapter picker, mode toggle (scroll / book)
- `BookTextView` (platform-specific wrapper)
- `PlaybackControls`
- Wire to the view model

### 5. Playback Controls and Mode Switching

**Playback controls** are unchanged (play/pause, skip forward/back at sentence granularity). Auto-follow keeps the active sentence visible regardless of mode.

**Mode toggle:**
- Segmented control or toolbar button: scroll ↔ book mode.
- Switching modes preserves reading position (same character offset stays visible).
- Mode preference persisted on `StoredBook`.

**Position persistence:** Existing `lastPosition` (character offset) continues to work because offsets are display-mode-independent.

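A sketch of why the mode toggle leaves the position alone: the mode and the offset are separate fields, and switching touches only one of them. `ReaderState` is a hypothetical stand-in for the relevant fields on `StoredBook`, and the JSON round trip is only an assumption about persistence for the sake of illustration.

```swift
import Foundation

enum ReadingMode: String, Codable {
    case scroll, paged
}

/// Hypothetical stand-in for the persisted reader fields on `StoredBook`.
struct ReaderState: Codable, Equatable {
    var lastPosition: Int
    var readingMode: ReadingMode

    /// Switching modes changes only the mode; the character offset is kept
    /// so the same text stays visible after the switch.
    func switchingMode() -> ReaderState {
        var copy = self
        copy.readingMode = (readingMode == .scroll) ? .paged : .scroll
        return copy
    }
}

/// Encodes and decodes the state, as any Codable-based store would.
func roundTrip(_ state: ReaderState) throws -> ReaderState {
    let data = try JSONEncoder().encode(state)
    return try JSONDecoder().decode(ReaderState.self, from: data)
}
```

After the switch, the view only needs to scroll or page to `lastPosition` in the new layout.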
## Files Changed

**VorleserKit package:**
- `AudioEngine.swift`: autoreleasepool, GPU drain
- `Book.swift`: cache sentences as stored property
- `Chapter.swift`: add `attributedText: NSAttributedString`
- `EPUBParser.swift`: build NSAttributedString from HTML DOM
- `PlainTextParser.swift`: build NSAttributedString from plain text
- New: `ReaderViewModel.swift`, shared reader logic

**iOS app (`Vorleser-iOS/`):**
- New: `BookTextView.swift`, UIViewRepresentable wrapping UITextView
- New: paged mode view with UIPageViewController + TextKit 2
- `ReaderView.swift`: rewrite as thin shell over ReaderViewModel
- Remove: `ReadingTextView.swift`

**macOS app (`Vorleser-macOS/`):**
- New: `BookTextView.swift`, NSViewRepresentable wrapping NSTextView
- New: paged mode view with TextKit 2 pagination + swipe/arrows
- `MacReaderView.swift`: rewrite as thin shell over ReaderViewModel
- Remove: `MacReadingTextView.swift`

## Not In Scope

- Voice selection UI
- Bookmarks
- Table of contents sidebar
- Search within book
- Night mode / theme customization