feat: Improve book quality — stronger evaluator, more refinement attempts, quality-first model selection

- Fix: chapter quality evaluation now uses model_logic (free Pro) instead of model_writer (Flash).
  The model that wrote the chapter was also scoring it, causing circular, lenient grading.
- Increase max_attempts in write_chapter from 2 to 3 for more refinement passes per chapter
  (both fixes are sketched below, after the commit metadata).
- Update auto model selection prompt (ai/setup.py) to prioritize quality over budget framing:
  free/preview/exp models preferred by capability (Pro > Flash, 2.5 > 2.0 > 1.5), not just cost.
  Writer role now allowed to use best free Flash/Pro preview — not restricted to basic Flash only.
- Bump version to 3.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 21:28:49 -05:00
parent f740174257
commit 6684ec2bf5
4 changed files with 20 additions and 16 deletions
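
A minimal sketch of the two fixes described above: model_logic, model_writer, write_chapter, and max_attempts are named in the commit message itself, while the signatures, the verdict shape, and the loop body are assumptions.

```python
# Sketch only: names come from the commit message, structure is assumed.

def evaluate_chapter(chapter_text: str, rubric: str, models) -> dict:
    """Score a chapter against the rubric with the Logic model.

    Previously models.model_writer graded its own output, which made the
    scoring circular and lenient; models.model_logic (free Pro) now grades.
    """
    prompt = (f"Score this chapter against the rubric.\n\n"
              f"RUBRIC:\n{rubric}\n\nCHAPTER:\n{chapter_text}")
    return models.model_logic.generate(prompt)  # was: models.model_writer

def write_chapter(beats: str, rubric: str, models, max_attempts: int = 3):  # was 2
    """Draft a chapter, then evaluate and refine up to max_attempts times."""
    draft = models.model_writer.generate(beats)
    for _ in range(max_attempts):
        verdict = evaluate_chapter(draft, rubric, models)
        if verdict.get("passed"):
            break
        draft = models.model_writer.generate(
            f"Revise the chapter per this feedback:\n{verdict}\n\n{draft}")
    return draft
```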


@@ -115,7 +115,7 @@ Open `http://localhost:5000`.
- **Dynamic Pacing:** Monitors story progress during writing and inserts bridge chapters to slow a rushing plot or removes redundant ones detected mid-stream — without restarting.
- **Series Continuity:** When generating Book 2+, carries forward character visual tracking, established relationships, plot threads, and a cumulative "Story So Far" summary.
- **Persona Refinement Loop:** Every 5 chapters, analyzes actual written text to refine the author persona model, maintaining stylistic consistency throughout the book.
- **Consistency Checker (`editor.py`):** Scores chapters on 8 rubrics (engagement, voice, sensory detail, scene execution, etc.) and flags AI-isms ("tapestry", "palpable tension") and weak filter verbs ("felt", "realized").
- **Consistency Checker (`editor.py`):** Scores chapters on 13 rubrics (engagement, voice, sensory detail, scene execution, dialogue, pacing, staging, prose dynamics, clarity, etc.) and flags AI-isms ("tapestry", "palpable tension") and weak filter verbs ("felt", "realized"). Chapter evaluation now uses the Logic model (free Pro) rather than the Writer model, ensuring stricter and more accurate scoring. The flagging pass is sketched after this list.
- **Dynamic Character Injection (`writer.py`):** Only injects characters explicitly named in the chapter's `scene_beats` plus the POV character into the writer prompt. Eliminates token waste from unused characters and reduces hallucinated appearances. The character filter is sketched below.
- **Smart Context Tail (`writer.py`):** Extracts the final ~1,000 tokens of the previous chapter (the actual ending) rather than blindly truncating from the front. Ensures the hand-off point — where characters are standing and what was last said — is always preserved. The tail extraction is sketched below.
- **Stateful Scene Tracking (`bible_tracker.py`):** After each chapter, the tracker records each character's `current_location`, `time_of_day`, and `held_items` in addition to appearance and events. This scene state is injected into subsequent chapter prompts so the writer knows exactly where characters are, what time it is, and what they're carrying. A minimal scene-state structure is sketched below.
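
A sketch of the consistency checker's flagging pass. The two AI-isms and two filter verbs are the examples from the bullet above; the full lists and the 13-rubric scoring live in `editor.py` and are not shown in this diff.

```python
import re

# Illustrative word lists; editor.py's actual lists are longer.
AI_ISMS = ["tapestry", "palpable tension"]
FILTER_VERBS = ["felt", "realized"]

def flag_weak_prose(chapter_text: str) -> dict[str, list[str]]:
    """Return the AI-isms and filter verbs found in the chapter."""
    lowered = chapter_text.lower()
    hits = {"ai_isms": [p for p in AI_ISMS if p in lowered],
            "filter_verbs": []}
    for verb in FILTER_VERBS:
        # \b keeps "felt" from matching inside "heartfelt"
        if re.search(rf"\b{verb}\b", lowered):
            hits["filter_verbs"].append(verb)
    return hits
```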
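A sketch of the character-injection rule: keep the POV character plus anyone explicitly named in `scene_beats`. The dict shapes are assumptions.

```python
def select_characters(all_characters: dict[str, str],
                      scene_beats: str, pov: str) -> dict[str, str]:
    """Drop characters the chapter never names, keeping the POV character."""
    return {name: sheet for name, sheet in all_characters.items()
            if name == pov or name in scene_beats}
```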
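A sketch of the smart context tail. The 4-characters-per-token ratio is a common rule of thumb, not necessarily the estimator `writer.py` uses.

```python
def context_tail(previous_chapter: str, max_tokens: int = 1000) -> str:
    """Return roughly the last max_tokens tokens of the previous chapter."""
    max_chars = max_tokens * 4
    if len(previous_chapter) <= max_chars:
        return previous_chapter
    tail = previous_chapter[-max_chars:]
    cut = tail.find("\n\n")  # open the tail at a clean paragraph break
    return tail[cut + 2:] if cut != -1 else tail
```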
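A minimal version of the scene state the tracker records. The field names come from the bullet above; the container and prompt rendering are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class SceneState:
    current_location: str
    time_of_day: str
    held_items: list[str] = field(default_factory=list)

def scene_state_block(states: dict[str, SceneState]) -> str:
    """Render per-character scene state for the next chapter's prompt."""
    return "\n".join(
        f"{name}: at {s.current_location}, {s.time_of_day}, "
        f"carrying {', '.join(s.held_items) or 'nothing'}"
        for name, s in states.items()
    )
```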
@@ -130,7 +130,7 @@ Open `http://localhost:5000`.
### AI Infrastructure (`ai/`)
- **Resilient Model Wrapper:** Wraps every Gemini API call with up to 3 retries and exponential backoff, handles quota errors and rate limits, and can switch to an alternative model mid-stream. The retry loop is sketched at the end of this list.
- **Auto Model Selection:** On startup, a bootstrapper model queries the Gemini API and selects the optimal models for Logic, Writer, Artist, and Image roles. Selection is cached for 24 hours.
- **Auto Model Selection:** On startup, a bootstrapper model queries the Gemini API and selects the optimal models for Logic, Writer, Artist, and Image roles. Selection is cached for 24 hours. The selection algorithm now prioritizes quality — free/preview/exp models are preferred by capability (Pro > Flash, 2.5 > 2.0 > 1.5) rather than by cost alone. A capability-first sort key is sketched below.
- **Vertex AI Support:** If `GCP_PROJECT` is set and OAuth credentials are present, initializes Vertex AI automatically for Imagen image generation. The initialization check is sketched below.
- **Payload Guardrails:** Every generation call estimates the prompt token count before dispatch. If the payload exceeds 30,000 tokens, a warning is logged so runaway context injection is surfaced immediately. A sketch of this check closes the section.
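
A sketch of the resilient wrapper's retry loop. `generate_fn` and `fallback_fn` stand in for the wrapper's real interface, which is not shown in this diff.

```python
import random
import time

def resilient_generate(generate_fn, prompt: str,
                       retries: int = 3, fallback_fn=None):
    """Retry with exponential backoff; optionally switch to a fallback model."""
    for attempt in range(retries):
        try:
            return generate_fn(prompt)
        except Exception:  # the real wrapper matches quota/rate-limit errors
            if attempt == retries - 1:
                if fallback_fn is not None:
                    return fallback_fn(prompt)  # alternative model mid-stream
                raise
            time.sleep(2 ** attempt + random.random())  # 1s, then 2s, plus jitter
```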
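A static sort key encoding the stated capability ordering. In the tool itself the ranking is performed by a bootstrapper model reading the selection prompt, so this is only illustrative.

```python
def capability_key(model_name: str) -> tuple[int, float, bool]:
    """Higher tuple = more capable: Pro > Flash, 2.5 > 2.0 > 1.5, free variants preferred."""
    name = model_name.lower()
    tier = 2 if "pro" in name else 1 if "flash" in name else 0
    version = next((float(v) for v in ("2.5", "2.0", "1.5") if v in name), 0.0)
    free = "preview" in name or "exp" in name
    return (tier, version, free)

candidates = ["gemini-1.5-flash", "gemini-2.5-pro-preview", "gemini-2.0-flash-exp"]
print(max(candidates, key=capability_key))  # gemini-2.5-pro-preview
```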
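The Vertex AI check, roughly; the region default is an assumption, and the `vertexai` module comes from the google-cloud-aiplatform package.

```python
import os

project = os.environ.get("GCP_PROJECT")
if project:  # only init Vertex when a project is configured
    import vertexai
    vertexai.init(project=project, location="us-central1")
```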
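The payload guardrail, sketched. The 30,000-token threshold is from the bullet above; the characters-to-tokens estimate is an assumption.

```python
import logging

PAYLOAD_LIMIT_TOKENS = 30_000

def check_payload(prompt: str) -> int:
    """Log a warning when the estimated prompt size exceeds the limit."""
    estimated = len(prompt) // 4  # rough chars-per-token heuristic
    if estimated > PAYLOAD_LIMIT_TOKENS:
        logging.warning("Prompt ~%d tokens exceeds %d: possible runaway context injection",
                        estimated, PAYLOAD_LIMIT_TOKENS)
    return estimated
```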