Mike Wichers 1f01fedf00 Auto-commit: v2.9 — Fix background task hangs (OAuth headless guard, SQLite timeouts, log touch)

- ai/setup.py: Added threading import; OAuth block now detects background/headless
  threads and skips run_local_server to prevent indefinite blocking. Logs a clear
  warning and falls back to ADC for Vertex AI. Token file only written when creds
  are not None.
- web/tasks.py: All sqlite3.connect() calls now use timeout=30, check_same_thread=False.
  OperationalError on the initial status update is caught and logged via utils.log.
  generate_book_task now touches initial_log immediately so the UI polling endpoint
  always finds an existing file even if the worker crashes on the next line.
- ai_blueprint.md: Bumped to v2.9; Section 12.D sub-items 1-3 marked ✅; item 13
  added to summary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-21 10:50:00 -05:00

AI Context Optimization Blueprint (v2.9)

This blueprint outlines architectural improvements to how AI context is managed during the writing process. The goal is to give the AI (Claude/Gemini) better, highly targeted context upfront, which should improve first-draft quality and reduce reliance on expensive, time-consuming quality checks and rewrites (currently up to 5 attempts).

0. Model Selection & Review (New Step)

Current Process: Model selection logic lives in ai/setup.py, which determines the optimal models from API queries and falls back to defaults such as gemini-2.0-flash. The models are instantiated in ai/models.py. The active selection is cached in data/model_cache.json and can be viewed via templates/system_status.html.

Actionable Review Steps: Every time a change is made to this blueprint or related files, the following steps must be completed to review the models, update the version, and ensure changes are saved properly:

Check the System Status UI: Navigate to /system/status in the web application. This UI displays the "AI Model Selection" and "All Models Ranked".
Verify Cache (data/model_cache.json): Check this file to see the currently cached models for the roles (logic, writer, artist).
Review Selection Logic (ai/setup.py): Examine select_best_models() to understand the criteria and prompt used for model selection (e.g., favoring gemini-2.x over 1.5, using Flash for speed and Pro for complex reasoning).
Force Refresh: Use the "Refresh & Optimize" button in the System Status UI or call ai.init_models(force=True) to force a re-evaluation of available models from the Google API and update the cache.
Update Version & Commit: Ensure the ai_blueprint.md version is bumped and a git commit is made reflecting the changes.
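
The cache check in the steps above could be scripted along these lines. This is a minimal sketch, not the project's code: it assumes data/model_cache.json is a flat JSON object mapping each role to a model name, and load_model_cache and missing_roles are hypothetical helper names.

```python
import json
from pathlib import Path

# Roles named in the blueprint; the flat role -> model-name layout is an assumption.
EXPECTED_ROLES = ("logic", "writer", "artist")


def load_model_cache(path):
    """Return the cached selection as a dict, or {} if the file does not exist."""
    cache_file = Path(path)
    if not cache_file.exists():
        return {}
    return json.loads(cache_file.read_text(encoding="utf-8"))


def missing_roles(cache):
    """List the expected roles that have no model assigned in the cache."""
    return [role for role in EXPECTED_ROLES if not cache.get(role)]


# Example with an in-memory cache instead of the real file.
sample_cache = {"logic": "gemini-2.0-flash", "writer": "gemini-2.0-flash"}
print(missing_roles(sample_cache))
```

A non-empty result would suggest running the forced refresh described above before committing.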

1. Context Trimming & Relevance Filtering (The "Less is More" Approach)

Current Problem: story/writer.py injects the entire list of characters (chars_for_writer) into the prompt for every chapter. As the book grows, this wastes tokens, dilutes the AI's attention, and causes hallucinations where random characters appear in scenes they don't belong in.

Solution:

Dynamic Character Injection: ✅ Only inject characters who are explicitly mentioned in the chapter's scene_beats, plus the POV character. (Implemented v1.5.0)
RAG for Lore/Locations: ✅ Lightweight retrieval system implemented — chapter beats tagged with locations/key_items, lore index built via update_lore_index in bible_tracker.py, only relevant entries injected per chapter. (Implemented v2.5 — see Section 8)
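
The dynamic character injection described above can be sketched as a simple filter. This is an illustration under stated assumptions, not the code in story/writer.py: it assumes each character is a dict with a "name" key and that scene_beats is a list of strings, and select_characters_for_chapter is a hypothetical name.

```python
import re


def select_characters_for_chapter(characters, scene_beats, pov_name):
    """Return only the characters named in the beats, plus the POV character."""
    beats_text = " ".join(scene_beats).lower()
    selected = []
    for char in characters:
        name = char["name"]
        # Whole-word match so that "Ann" does not match "Annabel".
        pattern = r"\b" + re.escape(name.lower()) + r"\b"
        if name == pov_name or re.search(pattern, beats_text):
            selected.append(char)
    return selected


characters = [
    {"name": "Mara", "role": "protagonist"},
    {"name": "Tobias", "role": "mentor"},
    {"name": "Ann", "role": "innkeeper"},
]
beats = ["Tobias warns of the storm.", "Annabel's ship is late."]
chapter_chars = select_characters_for_chapter(characters, beats, pov_name="Mara")
print([c["name"] for c in chapter_chars])  # ['Mara', 'Tobias']
```

Exact name matching misses nicknames and pronoun-only references, so a real implementation would likely need an alias list per character.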

2. Structured "Story So Far" (State Management)

Current Problem: prev_sum is likely a growing narrative blob. prev_content is truncated blindly to 2000 tokens, which can cut off the actual ending of the previous chapter (the most important part for continuity).

Solution:

Smart Truncation: ✅ Instead of truncating prev_content blindly, take the last 1000 tokens of the previous chapter, ensuring the immediate hand-off (where characters are standing, what they just said) is perfectly preserved. (Implemented v1.5.0 via utils.truncate_to_tokens tail logic)
Thread Tracking: ✅ Story So Far refactored into structured story_state.json via story/state.py — active_threads, immediate_handoff (3 sentences), and resolved_threads; injected as structured prompt context in engine.py, replacing the raw summary blob. (Implemented v2.5 — see Section 9)
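
As a rough illustration of both ideas above, the sketch below keeps the tail of the previous chapter and renders a structured state as prompt text. It makes several assumptions: whitespace-separated words stand in for model tokens, immediate_handoff is stored as a single string, and tail_tokens and render_story_state are hypothetical names rather than the functions in utils or story/state.py.

```python
def tail_tokens(text, max_tokens):
    """Keep the last max_tokens words of text (words approximate tokens here)."""
    if max_tokens <= 0:
        return ""
    words = text.split()
    if len(words) <= max_tokens:
        return text
    return " ".join(words[-max_tokens:])


def render_story_state(state):
    """Turn the structured state into a compact block for the writer prompt."""
    lines = ["ACTIVE THREADS:"]
    lines += [f"- {thread}" for thread in state["active_threads"]]
    lines += ["IMMEDIATE HANDOFF:", state["immediate_handoff"]]
    lines.append("RESOLVED THREADS:")
    lines += [f"- {thread}" for thread in state["resolved_threads"]]
    return "\n".join(lines)


previous_chapter = "The storm broke at dusk. Mara barred the door and turned to Tobias."
handoff = tail_tokens(previous_chapter, 8)

story_state = {
    "active_threads": ["Who sent the unsigned letter?"],
    "immediate_handoff": handoff,
    "resolved_threads": ["Mara found the harbor key."],
}
print(render_story_state(story_state))
```

Taking the tail rather than the head guarantees the final lines of the chapter survive truncation, which is the part the next chapter must pick up from.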

3. Pre-Flight Scene Expansion (Fixing it before writing)

Current Problem: The system relies heavily on evaluate_chapter_quality to catch bad pacing, missing beats, or "telling, not showing" errors after the chapter is written. This causes repeated rewrite loops.

Solution:

Beat Expansion Step: ✅ Before sending the prompt to the model_writer, use an inexpensive, fast model to expand the scene_beats into a "Director's Treatment." This treatment explicitly outlines the sensory details, emotional shifts, and entry/exit staging for the chapter. (Implemented v2.0 — expand_beats_to_treatment in story/writer.py)
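
A minimal sketch of such a pre-flight step follows. The prompt wording is illustrative only, and the model is passed in as a plain callable (generate) so that no specific SDK call is assumed; build_treatment_prompt and expand_beats_sketch are hypothetical names, not the project's expand_beats_to_treatment.

```python
def build_treatment_prompt(scene_beats):
    """Build the expansion prompt from a list of beat strings."""
    numbered = "\n".join(f"{i}. {beat}" for i, beat in enumerate(scene_beats, 1))
    return (
        "Expand these scene beats into a Director's Treatment.\n"
        "For each beat, describe sensory details, emotional shifts, "
        "and entry/exit staging.\n\n"
        f"Beats:\n{numbered}"
    )


def expand_beats_sketch(scene_beats, generate):
    """Send the prompt to a fast model; generate is any prompt -> text callable."""
    return generate(build_treatment_prompt(scene_beats))


# Stand-in for a real model call, used only to show the data flow.
def fake_model(prompt):
    return "TREATMENT FOR:\n" + prompt


treatment = expand_beats_sketch(
    ["Mara enters the archive.", "She finds the ledger missing."],
    fake_model,
)
print(treatment)
```

The returned treatment would then be added to the writer prompt alongside the original beats.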

4. Enhanced Bible Tracker (Stateful World)

Current Problem: bible_tracker.py updates character clothing, descriptors, and speech styles, but does not track location states, time of day, or inventory/items.

Solution:

✅ Expanded update_tracking to include current_location, time_of_day, and held_items. (Implemented v1.5.0)
✅ This explicit "Scene State" is passed to the writer prompt so the AI doesn't have to guess if it's day or night, or if a character is still holding a specific artifact from two chapters ago. (Implemented v1.5.0)
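
One way to pass this scene state to the writer is to render it as a short text block. The sketch below uses the three field names from the blueprint but assumes held_items maps each character name to a list of item names; format_scene_state is a hypothetical helper.

```python
def format_scene_state(state):
    """Render the tracked scene state as plain lines for the writer prompt."""
    lines = [
        f"Location: {state['current_location']}",
        f"Time of day: {state['time_of_day']}",
    ]
    # Sort by character name so the prompt is stable between runs.
    for name, items in sorted(state["held_items"].items()):
        held = ", ".join(items) if items else "nothing"
        lines.append(f"{name} is holding: {held}")
    return "\n".join(lines)


scene_state = {
    "current_location": "Harbor archive",
    "time_of_day": "night",
    "held_items": {"Tobias": [], "Mara": ["brass key", "lantern"]},
}
print(format_scene_state(scene_state))
```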

5. UI/UX: Asynchronous Model Optimization (Refresh & Optimize)

Current Problem: Clicking "Refresh & Optimize" in templates/system_status.html submits a form that blocks the UI and results in a full page refresh. This creates a clunky, blocking experience.

Solution:

✅ Frontend (templates/system_status.html): Converted the <form> submission into an asynchronous AJAX fetch() call with a spinner and disabled button state during processing. (Implemented v2.2)
✅ Backend (web/routes/admin.py): Updated the optimize_models route to detect AJAX requests and return a JSON status response instead of performing a hard redirect. (Implemented v2.2)
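
The backend branch can be sketched without a web framework. This assumes the frontend marks its fetch() call with an X-Requested-With: XMLHttpRequest header or an Accept: application/json header, which is a common convention and not confirmed by the source; wants_json and optimize_response are hypothetical names, and the returned dicts stand in for real response objects.

```python
def wants_json(headers):
    """Return True if the request headers look like an AJAX/JSON request."""
    # Header names are case-insensitive, so normalize the keys first.
    normalized = {key.lower(): value for key, value in headers.items()}
    if normalized.get("x-requested-with") == "XMLHttpRequest":
        return True
    return "application/json" in normalized.get("accept", "")


def optimize_response(headers, success):
    """Choose a JSON status payload for AJAX callers, a redirect otherwise."""
    if wants_json(headers):
        return {"kind": "json", "body": {"status": "ok" if success else "error"}}
    return {"kind": "redirect", "location": "/system/status"}


print(optimize_response({"X-Requested-With": "XMLHttpRequest"}, success=True))
print(optimize_response({"Accept": "text/html"}, success=True))
```

Keeping the redirect path means the button still works if JavaScript is disabled and the plain form is submitted.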

6. Eliminating AI-Isms and Enforcing Genre Authenticity (v2.3)

Current Problem: Despite the existing style_guidelines.json and basic prompts, the AI's writing often falls back on predictable phrases ("testament to," "shiver down spine," "a sense of") and lacks a human-like voice. In particular, it fails to adapt deeply to specific genre conventions.