# AI Context Optimization Blueprint (v2.9)

This blueprint outlines architectural improvements for how AI context is managed during the writing process. The goal is to provide the AI (Claude/Gemini) with **better, highly-targeted context upfront**, which will dramatically improve first-draft quality and reduce the reliance on expensive, time-consuming quality checks and rewrites (currently up to 5 attempts).

## 0. Model Selection & Review (New Step)

**Current Process:** Model selection logic exists in `ai/setup.py` (which determines optimal models based on API queries and falls back to defaults like `gemini-2.0-flash`), and the models are instantiated in `ai/models.py`. The active selection is cached in `data/model_cache.json` and viewed via `templates/system_status.html`.

**Actionable Review Steps:** Every time a change is made to this blueprint or related files, the following steps must be completed to review the models, update the version, and ensure changes are saved properly:

1. **Check the System Status UI**: Navigate to `/system/status` in the web application. This UI displays the "AI Model Selection" and "All Models Ranked".
2. **Verify Cache (`data/model_cache.json`)**: Check this file to see the currently cached models for the roles (`logic`, `writer`, `artist`).
3. **Review Selection Logic (`ai/setup.py`)**: Examine `select_best_models()` to understand the criteria and prompt used for model selection (e.g., favoring `gemini-2.x` over `1.5`, using Flash for speed and Pro for complex reasoning).
4. **Force Refresh**: Use the "Refresh & Optimize" button in the System Status UI or call `ai.init_models(force=True)` to force a re-evaluation of available models from the Google API and update the cache.
5. **Update Version & Commit**: Ensure the `ai_blueprint.md` version is bumped and a git commit is made reflecting the changes.

## 1. Context Trimming & Relevance Filtering (The "Less is More" Approach)

**Current Problem:** `story/writer.py` injects the *entire* list of characters (`chars_for_writer`) into the prompt for every chapter. As the book grows, this wastes tokens, dilutes the AI's attention, and causes hallucinations where random characters appear in scenes they don't belong in.

**Solution:**

- **Dynamic Character Injection:** ✅ Only inject characters who are explicitly mentioned in the chapter's `scene_beats`, plus the POV character. *(Implemented v1.5.0)*
- **RAG for Lore/Locations:** ✅ Lightweight retrieval system implemented: chapter beats are tagged with `locations`/`key_items`, a lore index is built via `update_lore_index` in `bible_tracker.py`, and only relevant entries are injected per chapter. *(Implemented v2.5; see Section 8)*

## 2. Structured "Story So Far" (State Management)

**Current Problem:** `prev_sum` is likely a growing narrative blob. `prev_content` is truncated blindly to 2000 tokens, which might chop off the actual ending of the previous chapter (the most important part for continuity).

**Solution:**

- **Smart Truncation:** ✅ Instead of truncating `prev_content` blindly, take the *last* 1000 tokens of the previous chapter, ensuring the immediate hand-off (where characters are standing, what they just said) is perfectly preserved. *(Implemented v1.5.0 via `utils.truncate_to_tokens` tail logic)*
- **Thread Tracking:** ✅ `Story So Far` refactored into a structured `story_state.json` via `story/state.py`, with `active_threads`, `immediate_handoff` (3 sentences), and `resolved_threads`; it is injected as structured prompt context in `engine.py`, replacing the raw summary blob. *(Implemented v2.5; see Section 9)*

## 3. Pre-Flight Scene Expansion (Fixing it before writing)

**Current Problem:** The system relies heavily on `evaluate_chapter_quality` to catch bad pacing, missing beats, or "tell not show" errors. This causes loops of rewriting.
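
To make the cost of this pattern concrete, the write/evaluate/rewrite cycle can be sketched roughly as below. This is an illustration under stated assumptions, not the project's code: `write_with_retries`, the `write` and `evaluate` callables, the `DraftResult` shape, and the pass threshold are all hypothetical. Only the idea of a quality check (`evaluate_chapter_quality`) and the cap of up to 5 attempts come from this blueprint.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

MAX_ATTEMPTS = 5  # the blueprint cites "up to 5 attempts"


@dataclass
class DraftResult:
    text: str       # best draft seen
    score: float    # score of that draft
    attempts: int   # number of writer calls made
    passed: bool    # True if a draft met the threshold


def write_with_retries(
    write: Callable[[str], str],
    evaluate: Callable[[str], Tuple[float, str]],
    prompt: str,
    pass_score: float = 7.0,  # hypothetical threshold, not from the source
    max_attempts: int = MAX_ATTEMPTS,
) -> DraftResult:
    """Draft, score, and rewrite until a draft passes or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    best_text, best_score = "", float("-inf")
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        draft = write(current_prompt)       # one writer-model call
        score, critique = evaluate(draft)   # one evaluator-model call
        if score > best_score:
            best_text, best_score = draft, score
        if score >= pass_score:
            return DraftResult(draft, score, attempt, True)
        # Feed the critique back into the next attempt's prompt.
        current_prompt = f"{prompt}\n\nReviewer critique to address:\n{critique}"

    # No draft passed: fall back to the best-scoring one.
    return DraftResult(best_text, best_score, max_attempts, False)
```

In this sketch every failed pass costs one writer call plus one evaluator call, so a chapter that exhausts all five attempts makes ten model calls. Better context before the first draft aims to make the first iteration the one that passes.
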

**Solution:**

- **Beat Expansion Step:** ✅ Before sending the prompt to the `model_writer`, use an inexpensive, fast model to expand the `scene_beats` into a "Director's Treatment." This treatment explicitly outlines the sensory details, emotional shifts, and entry/exit staging for the chapter. *(Implemented v2.0: `expand_beats_to_treatment` in `story/writer.py`)*

## 4. Enhanced Bible Tracker (Stateful World)

**Current Problem:** `bible_tracker.py` updates character clothing, descriptors, and speech styles, but does not track location states, time of day, or inventory/items.

**Solution:**

- ✅ Expanded `update_tracking` to include `current_location`, `time_of_day`, and `held_items`. *(Implemented v1.5.0)*
- ✅ This explicit "Scene State" is passed to the writer prompt so the AI doesn't have to guess if it's day or night, or if a character is still holding a specific artifact from two chapters ago. *(Implemented v1.5.0)*

## 5. UI/UX: Asynchronous Model Optimization (Refresh & Optimize)

**Current Problem:** Clicking "Refresh & Optimize" in `templates/system_status.html` submits a form that blocks the UI and results in a full page refresh. This creates a clunky, blocking experience.

**Solution:**

- ✅ **Frontend (`templates/system_status.html`):** Converted the `