Root cause: `Consumer(huey, workers=1, worker_type='thread', loglevel=20)` raised TypeError on every app start because Huey 2.6.0 does not accept a `loglevel` keyword argument. The exception was silently caught and only printed to stdout, so the consumer never ran and all tasks stayed 'queued' forever, causing the 'Preparing environment / Waiting for logs' hang.

Fixes:
- web/app.py: Remove invalid `loglevel=20` from Consumer(); configure Huey logging via logging.basicConfig(WARNING) instead. Add persistent error logging to data/consumer_error.log for future diagnosis.
- core/config.py: Replace emoji print() calls with ASCII-safe equivalents to prevent UnicodeEncodeError on Windows cp1252 terminals at import time.
- core/config.py: Update VERSION to 2.9 (was stale at 1.5.0).
- ai_blueprint.md: Bump to v2.10, document root cause and fixes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AI Context Optimization Blueprint (v2.10)
This blueprint outlines architectural improvements for how AI context is managed during the writing process. The goal is to provide the AI (Claude/Gemini) with better, highly-targeted context upfront, which will dramatically improve first-draft quality and reduce the reliance on expensive, time-consuming quality checks and rewrites (currently up to 5 attempts).
0. Model Selection & Review (New Step)
Current Process:
Model selection logic lives in ai/setup.py, which determines the optimal models from API queries and falls back to defaults such as gemini-2.0-flash. The models are instantiated in ai/models.py. The active selection is cached in data/model_cache.json and displayed via templates/system_status.html.
Actionable Review Steps: Every time a change is made to this blueprint or related files, the following steps must be completed to review the models, update the version, and ensure changes are saved properly:
- Check the System Status UI: Navigate to `/system/status` in the web application. This UI displays the "AI Model Selection" and "All Models Ranked".
- Verify Cache (`data/model_cache.json`): Check this file to see the currently cached models for the roles (`logic`, `writer`, `artist`).
- Review Selection Logic (`ai/setup.py`): Examine `select_best_models()` to understand the criteria and prompt used for model selection (e.g., favoring `gemini-2.x` over `1.5`, using Flash for speed and Pro for complex reasoning).
- Force Refresh: Use the "Refresh & Optimize" button in the System Status UI or call `ai.init_models(force=True)` to force a re-evaluation of available models from the Google API and update the cache.
- Update Version & Commit: Ensure the `ai_blueprint.md` version is bumped and a git commit is made reflecting the changes.
1. Context Trimming & Relevance Filtering (The "Less is More" Approach)
Current Problem:
story/writer.py injects the entire list of characters (chars_for_writer) into the prompt for every chapter. As the book grows, this wastes tokens, dilutes the AI's attention, and causes hallucinations where random characters appear in scenes they don't belong in.
Solution:
- Dynamic Character Injection: ✅ Only inject characters who are explicitly mentioned in the chapter's `scene_beats`, plus the POV character. (Implemented v1.5.0)
- RAG for Lore/Locations: ✅ Lightweight retrieval system implemented: chapter beats tagged with `locations`/`key_items`, lore index built via `update_lore_index` in `bible_tracker.py`, only relevant entries injected per chapter. (Implemented v2.5; see Section 8)
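The dynamic character injection described above can be sketched as follows. This is a minimal illustration, not the code in `story/writer.py`: the function name `select_characters`, the `{"name": ...}` character shape, and the whole-word matching rule are all assumptions made for the example.

```python
import re


def select_characters(all_chars, scene_beats, pov_name):
    """Keep only characters named in the beats, plus the POV character.

    Assumes (hypothetically) that each character is a dict with a "name" key
    and that scene_beats is a list of plain strings.
    """
    beat_text = " ".join(scene_beats).lower()
    selected = []
    for char in all_chars:
        name = char["name"]
        # Whole-word match so a short name like "Al" does not match "Alice".
        pattern = r"\b" + re.escape(name.lower()) + r"\b"
        if name == pov_name or re.search(pattern, beat_text):
            selected.append(char)
    return selected


chars = [{"name": "Mara"}, {"name": "Joss"}, {"name": "Al"}, {"name": "Alice"}]
beats = ["Alice confronts Joss at the docks."]
print([c["name"] for c in select_characters(chars, beats, pov_name="Mara")])
```

Matching on whole words keeps short names from pulling in unrelated characters, which is the failure mode the section is trying to avoid.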
2. Structured "Story So Far" (State Management)
Current Problem:
prev_sum is likely a growing narrative blob. prev_content is truncated blindly to 2000 tokens, which might chop off the actual ending of the previous chapter (the most important part for continuity).
Solution:
- Smart Truncation: ✅ Instead of truncating `prev_content` blindly, take the last 1000 tokens of the previous chapter, ensuring the immediate hand-off (where characters are standing, what they just said) is perfectly preserved. (Implemented v1.5.0 via `utils.truncate_to_tokens` tail logic)
- Thread Tracking: ✅ `Story So Far` refactored into structured `story_state.json` via `story/state.py`, with `active_threads`, `immediate_handoff` (3 sentences), and `resolved_threads`; injected as structured prompt context in `engine.py`, replacing the raw summary blob. (Implemented v2.5; see Section 9)
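To make the difference between a head slice and a tail slice concrete, here is a small sketch. It is not the real `utils.truncate_to_tokens`; the helper name `tail_tokens` is hypothetical, and tokens are approximated by whitespace-separated words, which only roughly tracks model tokens.

```python
def tail_tokens(text, max_tokens=1000):
    """Return roughly the last max_tokens tokens of text.

    Assumption: a "token" is approximated by a whitespace-separated word.
    """
    tokens = text.split()
    if len(tokens) <= max_tokens:
        return text
    # Keep the END of the chapter, where the hand-off to the next scene lives.
    return " ".join(tokens[-max_tokens:])


chapter = "The storm broke at dawn. Mara reached the gate and knocked twice."
print(tail_tokens(chapter, max_tokens=6))
```

A head slice of the same chapter would keep the storm and drop the knock at the gate, which is exactly the continuity detail the next chapter needs.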
3. Pre-Flight Scene Expansion (Fixing it before writing)
Current Problem:
The system relies heavily on evaluate_chapter_quality to catch bad pacing, missing beats, or "tell not show" errors. This causes loops of rewriting.
Solution:
- Beat Expansion Step: ✅ Before sending the prompt to the `model_writer`, use an inexpensive, fast model to expand the `scene_beats` into a "Director's Treatment." This treatment explicitly outlines the sensory details, emotional shifts, and entry/exit staging for the chapter. (Implemented v2.0: `expand_beats_to_treatment` in `story/writer.py`)
4. Enhanced Bible Tracker (Stateful World)
Current Problem:
bible_tracker.py updates character clothing, descriptors, and speech styles, but does not track location states, time of day, or inventory/items.
Solution:
- ✅ Expanded `update_tracking` to include `current_location`, `time_of_day`, and `held_items`. (Implemented v1.5.0)
- ✅ This explicit "Scene State" is passed to the writer prompt so the AI doesn't have to guess if it's day or night, or if a character is still holding a specific artifact from two chapters ago. (Implemented v1.5.0)
5. UI/UX: Asynchronous Model Optimization (Refresh & Optimize)
Current Problem:
Clicking "Refresh & Optimize" in templates/system_status.html submits a form that blocks the UI and results in a full page refresh. This creates a clunky, blocking experience.
Solution:
- ✅ Frontend (`templates/system_status.html`): Converted the `<form>` submission into an asynchronous AJAX `fetch()` call with a spinner and disabled button state during processing. (Implemented v2.2)
- ✅ Backend (`web/routes/admin.py`): Updated the `optimize_models` route to detect AJAX requests and return a JSON status response instead of performing a hard redirect. (Implemented v2.2)
6. Eliminating AI-Isms and Enforcing Genre Authenticity (v2.3)
Current Problem:
Despite the existing style_guidelines.json and basic prompts, the AI writing often falls back on predictable phrases ("testament to," "shiver down spine," "a sense of") and lacks true human-like voice, especially failing to deeply adapt to specific genre conventions.
Solution & Implementation Plan:
- ✅ Genre-Specific Instructions: `story/writer.py` now calls `get_genre_instructions(genre)` to inject genre-tailored mandates (Thriller, Romance, Fantasy, Sci-Fi, Horror, Historical, General Fiction) into every draft prompt. (Implemented v2.3)
- ✅ Deep POV Mandate: The draft prompt in `story/writer.py` includes a `DEEP_POV_MANDATE` block that explicitly bans summary mode and all filter words, with concrete rewrite examples. (Implemented v2.3)
- ✅ Prose Filter Enhancements: The default `ai_isms` list in `story/style_persona.py` expanded from 12 to 33+ banned phrases. (Implemented v2.3)
- ✅ Enforce Show, Don't Tell via Evaluation: `story/editor.py` `evaluate_chapter_quality` now includes a `DEEP_POV_ENFORCEMENT` block with automatic fail conditions for filter word density and summary mode. (Implemented v2.3)
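The banned-phrase and filter-word checks above are enforced through prompts in the project, but the underlying idea can be illustrated with a local, deterministic check. The sketch below is an assumption-laden example: the function names, the short phrase list (taken from the examples in this section), and the `FILTER_WORDS` set are illustrative and are not the contents of `style_guidelines.json`.

```python
import re

# Illustrative lists only; the project's real lists live elsewhere.
AI_ISMS = ["testament to", "a sense of"]
FILTER_WORDS = {"felt", "saw", "heard", "noticed", "realized"}


def find_ai_isms(text, banned=AI_ISMS):
    """Return the banned phrases that occur in text (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in banned if phrase in lowered]


def filter_word_density(text, filter_words=FILTER_WORDS):
    """Fraction of words in text that are filter words (0.0 for empty text)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in filter_words)
    return hits / len(words)


sample = "She felt a sense of dread."
print(find_ai_isms(sample))
print(round(filter_word_density(sample), 3))
```

A check like this could run before the model-based evaluation to reject obvious offenders cheaply; where to set the density threshold is a tuning decision the source does not specify.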
7. Regular Maintenance of AI-Isms (Continuous Improvement) — v2.4
Current Problem:
AI models evolve, and new overused phrases regularly emerge. The static list in data/style_guidelines.json will become outdated. The refresh_style_guidelines() function already exists in story/style_persona.py but has no UI or scheduled trigger.
Solution & Implementation Plan:
- ✅ Admin UI Trigger: Added "Refresh Style Rules" button to `templates/system_status.html` using the same async AJAX spinner pattern as "Refresh & Optimize". (Implemented v2.4)
- ✅ Backend Route: Added `/admin/refresh-style-guidelines` route in `web/routes/admin.py` that calls `style_persona.refresh_style_guidelines(model_logic)` and returns JSON status with counts. (Implemented v2.4)
- ✅ Logging: Route logs the updated counts to `data/app.log` via `utils.log`. (Implemented v2.4)
8. Lore & Location Context Retrieval (RAG-Lite) — v2.5
Current Problem:
The remaining half of Section 1 — prev_sum and the style_block carry all world-building as a monolithic blob. Locations, artifacts, and lore details not relevant to the current chapter waste tokens and dilute the AI's focus, causing it to hallucinate setting details or ignore established world rules.
Solution & Implementation Plan:
- ✅ Tag Beats with Locations/Items: Chapter schema supports optional `locations` and `key_items` arrays. `story/writer.py` reads these from the chapter dict. (Implemented v2.5)
- ✅ Lore Index in Bible: Added `update_lore_index(folder, chapter_text, current_lore)` to `story/bible_tracker.py`. Index is stored in `tracking_lore.json` and loaded into `tracking['lore']`. (Implemented v2.5)
- ✅ Retrieval in `write_chapter`: `story/writer.py` matches chapter `locations`/`key_items` against the lore index and injects a `LORE_CONTEXT` block into the prompt. (Implemented v2.5)
- ✅ Fallback: If chapter has no `locations`/`key_items` or lore index is empty, `lore_block` is empty and behaviour is unchanged. (Implemented v2.5)
- ✅ Engine Wiring: `cli/engine.py` loads `tracking_lore.json` on resume, calls `update_lore_index` after each chapter, and saves to `tracking_lore.json`. (Implemented v2.5)
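The retrieval and fallback behaviour listed above might look roughly like the sketch below. The lore index shape (a dict mapping an entry name to a description string), the exact-name lookup, and the function name `build_lore_block` are assumptions for illustration; the real matching logic in `story/writer.py` is not shown in this document.

```python
def build_lore_block(chapter, lore_index):
    """Build a LORE_CONTEXT block from the entries a chapter references.

    Returns an empty string when nothing matches, so the prompt is unchanged.
    """
    wanted = (chapter.get("locations") or []) + (chapter.get("key_items") or [])
    lines = []
    for name in wanted:
        entry = lore_index.get(name)
        if entry:
            lines.append(f"- {name}: {entry}")
    if not lines:
        return ""
    return "LORE_CONTEXT:\n" + "\n".join(lines)


lore = {
    "Harbor": "Fog-bound port controlled by the guild.",
    "Tower": "Abandoned watchtower north of the city.",
}
chapter = {"locations": ["Harbor"], "key_items": ["Brass Key"]}
print(build_lore_block(chapter, lore))
```

Only the `Harbor` entry is injected: `Tower` is not referenced by the chapter, and `Brass Key` has no index entry yet.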
9. Structured "Story So Far" — Thread Tracking — v2.5
Current Problem:
The remaining half of Section 2 — prev_sum is a growing unstructured narrative blob. As chapters accumulate, the AI receives an ever-longer wall of prose-summary as context, which dilutes attention, buries the most important recent state, and causes continuity drift.
Solution & Implementation Plan:
- ✅ Structured Summary Schema: New `story/state.py` module. After each chapter, `update_story_state()` uses `model_logic` to extract and save `story_state.json` with `active_threads`, `immediate_handoff` (exactly 3 sentences), and `resolved_threads`. (Implemented v2.5)
- ✅ Prompt Injection: `cli/engine.py` calls `story_state.format_for_prompt(current_story_state, chapter_beats)` before each `write_chapter` call. The formatted string replaces `prev_sum` as the context. Falls back to the raw `summary` blob if no structured state exists yet. (Implemented v2.5)
- ✅ State Update Step: `cli/engine.py` calls `story_state.update_story_state()` after each chapter is written and accepted, saving `story_state.json` in the book folder. (Implemented v2.5)
- ✅ Continuity Guard: `format_for_prompt()` always places `IMMEDIATE STORY HANDOFF` first, followed by `ACTIVE PLOT THREADS`. Resolved threads are only included if referenced in the next chapter's beats. (Implemented v2.5)
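The ordering rule in the continuity guard can be sketched as below. This is not the project's `format_for_prompt()`: the function is deliberately named `format_for_prompt_sketch`, threads are assumed to be short strings, and "referenced in the beats" is approximated by sharing at least one word of five or more letters, which is an assumption made purely for the example.

```python
import re


def format_for_prompt_sketch(state, chapter_beats):
    """Format story state with the hand-off first, then active threads."""
    parts = ["IMMEDIATE STORY HANDOFF:", state.get("immediate_handoff", "")]
    parts.append("ACTIVE PLOT THREADS:")
    parts.extend(f"- {thread}" for thread in state.get("active_threads", []))

    # Assumed relevance test: a resolved thread is "referenced" when it shares
    # a word of five or more letters with the upcoming chapter's beats.
    beat_words = set(re.findall(r"[a-z]+", " ".join(chapter_beats).lower()))
    relevant = []
    for thread in state.get("resolved_threads", []):
        keywords = {w for w in re.findall(r"[a-z]+", thread.lower()) if len(w) >= 5}
        if keywords & beat_words:
            relevant.append(thread)
    if relevant:
        parts.append("RESOLVED THREADS (referenced in this chapter):")
        parts.extend(f"- {thread}" for thread in relevant)
    return "\n".join(parts)


state = {
    "immediate_handoff": "Mara stands at the gate.",
    "active_threads": ["Find the missing ledger"],
    "resolved_threads": ["The harbor fire was arson", "Joss repaid the debt"],
}
print(format_for_prompt_sketch(state, ["Mara returns to the harbor."]))
```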
10. Consistency Report Quick Fix (v2.6)
Current Problem:
The templates/consistency_report.html page displays issues found in the manuscript but does not provide a direct action to fix them. It only suggests using the "Read & Edit" or "Modify & Re-run" features.
Solution & Implementation Plan:
- ✅ Frontend Action: Added "Redo Book" form to `templates/consistency_report.html` footer with a text input for the revision instruction and a confirmation prompt on submit. (Implemented v2.6)
- ✅ Backend Route: Added `/project/<run_id>/revise_book/<book_folder>` route in `web/routes/run.py`. Route creates a new `Run` record and queues `generate_book_task` with the user's instruction as `feedback` and `source_run_id` pointing to the original run. The existing bible refinement logic in `generate_book_task` applies the instruction to the bible before regenerating. (Implemented v2.6)
11. Series Continuity & Book Number Awareness (v2.7)
Current Problem:
The system generates books for a series, but the prompts in story/planner.py (specifically enrich and plan_structure) and the writing prompts do not explicitly pass the series_metadata (such as is_series, series_title, book_number, and total_books) to the LLM. The AI doesn't know if it's generating Book 1, Book 2, or Book 3, leading to inconsistent pacing and continuity across a series.
Solution & Implementation Plan:
- ✅ Planner Prompts Update: Modified `enrich()` and `plan_structure()` in `story/planner.py` to extract `bp.get('series_metadata', {})` and inject a `SERIES_CONTEXT` block ("This is Book X of Y in the Z series", with position-aware guidance: Book 1 = establish, middle books = escalate, final book = resolve) into the prompt when `is_series` is true. (Implemented v2.7)
- ✅ Writer Prompts Update: `story/writer.py` `write_chapter()` builds and injects the same `SERIES_CONTEXT` block into the chapter writing prompt and passes it as `series_context` to `evaluate_chapter_quality()` in `story/editor.py`. `editor.py` `evaluate_chapter_quality()` now accepts an optional `series_context` parameter and injects it into the evaluation METADATA so the editor scores arcs relative to the book's position in the series. (Implemented v2.7)
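A position-aware `SERIES_CONTEXT` block of the kind described above could be built along these lines. The helper name `build_series_context` and the guidance wording are assumptions; only the metadata keys and the establish/escalate/resolve scheme come from this section.

```python
def build_series_context(series_metadata):
    """Return a SERIES_CONTEXT string, or "" when the book is not in a series."""
    if not series_metadata.get("is_series"):
        return ""
    number = series_metadata.get("book_number", 1)
    total = series_metadata.get("total_books", 1)
    title = series_metadata.get("series_title", "")

    # Guidance wording is illustrative, not the project's actual prompt text.
    if number <= 1:
        guidance = "Establish the world, the cast, and the central conflict."
    elif number >= total:
        guidance = "Resolve the series-long arcs."
    else:
        guidance = "Escalate the stakes and deepen existing arcs."
    return (
        f"SERIES_CONTEXT: This is Book {number} of {total} "
        f"in the {title} series. {guidance}"
    )


meta = {"is_series": True, "series_title": "Tidewatch", "book_number": 2, "total_books": 3}
print(build_series_context(meta))
```

Building the block in one helper and passing the same string to the planner, writer, and editor keeps all three prompts consistent about where the book sits in the series.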
12. Infrastructure & UI Bug Fixes (v2.8)
Problems Found & Fixed:
A. API Timeout Hangs (Spinning Logs)
The Gemini SDK had no timeout configured on any network call, causing threads to hang indefinitely:
- `ai/models.py`: `generate_content()` had no timeout → runs spun forever on API errors.
- `ai/setup.py`: all three `genai.list_models()` calls had no timeout → model init could hang.
- `ai/models.py`: the retry handler called `init_models(force=True)`, a second network call during an existing failure, cascading the hang.
Fixes Applied:
- ✅ `ai/models.py`: Added `_GENERATION_TIMEOUT = 180` class variable; all `generate_content()` calls now merge `request_options={"timeout": 180}`. Removed `init_models(force=True)` from retry handler. (Implemented v2.8)
- ✅ `ai/setup.py`: Added `_LIST_MODELS_TIMEOUT = {"timeout": 30}` passed to all three `genai.list_models()` call sites (`get_optimal_model`, `select_best_models`, `init_models`). (Implemented v2.8)
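Merging a default timeout into call keyword arguments, as the first fix describes, can be done without touching the SDK at all. The sketch below shows only the merge step; the helper name `with_timeout` is hypothetical, and letting a caller-supplied timeout take precedence over the default is an assumption rather than documented behaviour of `ai/models.py`.

```python
_GENERATION_TIMEOUT = 180  # seconds, matching the value quoted above


def with_timeout(kwargs, timeout=_GENERATION_TIMEOUT):
    """Return a copy of kwargs whose request_options carries a timeout.

    Assumption: an explicit timeout from the caller wins over the default.
    """
    merged = dict(kwargs)
    request_options = dict(merged.get("request_options") or {})
    request_options.setdefault("timeout", timeout)
    merged["request_options"] = request_options
    return merged


print(with_timeout({"temperature": 0.2}))
print(with_timeout({"request_options": {"timeout": 5}}))
```

The merged dict would then be unpacked into the generation call, so every call site gets a timeout without each one having to remember to set it.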
B. Huey Consumer Never Started (Tasks Queued But Never Executed)
web/app.py started the Huey background consumer inside if __name__ == "__main__":, which only runs when the script is executed directly. Under flask run, gunicorn, or any WSGI runner the block is never reached — tasks were queued in queue.db but never processed.
- ✅ `web/app.py`: Moved Huey consumer start to module level with a Werkzeug reloader guard (`WERKZEUG_RUN_MAIN`) and a `FLASK_TESTING` guard to prevent duplicate/test-time consumers. Consumer runs as a daemon thread. (Implemented v2.8)
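The guard logic can be sketched independently of Huey. The exact conditions in `web/app.py` are not shown in this document, so the example below is an assumption: it treats `FLASK_TESTING` as an environment variable, takes a hypothetical `debug` flag to tell whether the Werkzeug reloader is active, and relies on the reloader setting `WERKZEUG_RUN_MAIN` to `"true"` in the child process that actually serves requests.

```python
import threading


def should_start_consumer(environ, debug):
    """Decide whether this process should start the background consumer."""
    if environ.get("FLASK_TESTING"):
        return False  # never start a consumer during tests
    if debug:
        # With the reloader on, only the reloaded child should own the consumer;
        # the watching parent process would otherwise start a duplicate.
        return environ.get("WERKZEUG_RUN_MAIN") == "true"
    return True


def start_in_daemon_thread(target):
    """Run target in a daemon thread so it never blocks interpreter exit."""
    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    return thread


print(should_start_consumer({}, debug=False))
print(should_start_consumer({"WERKZEUG_RUN_MAIN": "true"}, debug=True))
```

In the application, `environ` would be `os.environ` and `target` would be the consumer's run method; both are left as parameters here so the decision logic can be tested without a running server.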
C. "Create New Book" Showing Nothing
Three bugs combined to produce a blank page or silent failure when creating a new project:
- ✅ `templates/project_setup.html`: `{{ s.tropes|join(', ') }}` and `{{ s.formatting_rules|join(', ') }}` raised Jinja2 `UndefinedError` when AI analysis failed and the fallback dict lacked those keys → 500 blank page. Fixed to `{{ (s.tropes or [])|join(', ') }}`. (Implemented v2.8)
- ✅ `web/routes/project.py` (`project_setup_wizard`): When `model_logic` was `None`, the route silently redirected to the dashboard with a flash the user missed. Now renders the setup form with a complete default suggestions dict (all fields populated, lists as `[]`) and a visible `"warning"` flash so the user can fill in details manually. (Implemented v2.8)
- ✅ `web/routes/project.py` (`create_project_final`): `planner.enrich()` was called with the full project bible dict. `enrich()` reads `bp.get('manual_instruction')` from the top level (got `'A generic story'` fallback; the real concept was in `bible['books'][0]['manual_instruction']`), and wrote enriched data into a new `book_metadata` key instead of the bible's `books[0]`. Fixed to build a proper per-book blueprint, call enrich, and merge `characters`, `plot_beats`, and `structure_prompt` back into the correct bible locations. (Implemented v2.8)
D. "Waiting for logs" / "Preparing environment" Background Task Hangs
The UI gets stuck indefinitely because the background Huey worker thread hangs before emitting the first "Starting Job" log, or fails to connect to the database.
Places that impact this and their fixes:
- ✅ OAuth Browser Prompt in Background Thread (`ai/setup.py`): Added `import threading`; the OAuth block now checks `threading.current_thread() is not threading.main_thread()`. If running headlessly, `run_local_server` is skipped, `creds` is set to `None`, and a clear warning is logged. Vertex AI falls back to ADC. Token is only written if `creds` is not `None`. (Implemented v2.9)
- ✅ SQLite Database Locking Timeout (`web/tasks.py`): All `sqlite3.connect()` calls now use `timeout=30, check_same_thread=False`. The initial status-update `OperationalError` is caught and logged via `utils.log` so it appears in the log file rather than silently disappearing. (Implemented v2.9)
- ✅ Missing Initial Log File Creation (`web/tasks.py`, `generate_book_task`): The `initial_log` path is now `open(…, 'a')`-touched immediately after construction and before `utils.set_log_file()`, guaranteeing the file exists for UI polling even if the worker crashes on the very next line. (Implemented v2.9)
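The three fixes above rest on standard-library calls that can be shown in isolation. The helper names below (`is_background_thread`, `connect`, `touch_log`) are hypothetical and the paths in the demo are temporary; only the thread check and the `timeout=30, check_same_thread=False` arguments are taken from the text.

```python
import os
import sqlite3
import tempfile
import threading


def is_background_thread():
    """True when called from any thread other than the main thread."""
    return threading.current_thread() is not threading.main_thread()


def connect(db_path):
    """Open SQLite with a 30 s lock wait, usable across worker threads."""
    return sqlite3.connect(db_path, timeout=30, check_same_thread=False)


def touch_log(path):
    """Create the log file if missing, without truncating existing content."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "a", encoding="utf-8"):
        pass


workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "logs", "run.log")
touch_log(log_path)  # file exists before any real work starts
conn = connect(os.path.join(workdir, "queue.db"))
print(os.path.exists(log_path), conn.execute("SELECT 1").fetchone())
conn.close()
```

Opening in append mode matters for the log touch: it creates the file when absent but leaves earlier output intact if the task is retried.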
Summary of Actionable Changes for Implementation Mode:
- ✅ Modify `writer.py` to filter `chars_for_writer` based on characters named in `beats`. (Implemented in v1.5.0)
- ✅ Modify `writer.py` `prev_content` logic to extract the tail of the chapter, not a blind slice. (Implemented in v1.5.0 via `utils.truncate_to_tokens` tail logic)
- ✅ Update `bible_tracker.py` to track time of day and location states. (Implemented in v1.5.0)
- ✅ Add a pre-processing function to expand chapter beats into staging directions before generating the prose draft. (Implemented in v2.0: `expand_beats_to_treatment` in `story/writer.py`)
- ✅ (v2.2) Update "Refresh & Optimize" action in UI to be an async fetch call with a processing flag instead of a full page reload, and update `admin.py` to handle JSON responses.
- ✅ (v2.3) Updated writing prompts and evaluation rubrics across `story/writer.py`, `story/editor.py`, and `story/style_persona.py` to aggressively filter AI-isms, enforce Deep POV via a non-negotiable mandate, add genre-specific writing instructions, and fail chapters that rely on "telling" rather than "showing" via filter-word density checks in the evaluator.
- ✅ (v2.4) Add "Refresh Style Rules" button to `system_status.html` and `/admin/refresh-style-guidelines` route in `admin.py`. (Implemented v2.4)
- ✅ (v2.5) Lore & Location RAG-Lite: `update_lore_index` in `bible_tracker.py`, `tracking_lore.json`, lore retrieval in `writer.py`, wired in `engine.py`. (Implemented v2.5)
- ✅ (v2.5) Structured Story State (Thread Tracking): new `story/state.py`, `story_state.json`, structured prompt context replacing raw summary blob in `engine.py`. (Implemented v2.5)
- ✅ (v2.6) "Redo Book" form in `consistency_report.html` + `revise_book` route in `run.py` that creates a new run with the instruction applied as bible feedback. (Implemented v2.6)
- ✅ (v2.7) Series Continuity Fix: `series_metadata` (`is_series`, `series_title`, `book_number`, `total_books`) injected as `SERIES_CONTEXT` into `story/planner.py` (`enrich`, `plan_structure`), `story/writer.py` (`write_chapter`), and `story/editor.py` (`evaluate_chapter_quality`) prompts with position-aware guidance per book number. (Implemented v2.7)
- ✅ (v2.8) Infrastructure & UI Bug Fixes: API timeouts (180s generation, 30s list_models) in `ai/models.py` + `ai/setup.py`; Huey consumer moved to module level with reloader guard in `web/app.py`; Jinja2 `UndefinedError` fix for `tropes`/`formatting_rules` in `project_setup.html`; `project_setup_wizard` now renders form instead of silent redirect when models fail; `create_project_final` `enrich()` call fixed to use correct per-book blueprint structure. (Implemented v2.8)
- ✅ (v2.9) Background Task Hang Fixes: OAuth headless guard in `ai/setup.py` (skips `run_local_server` in non-main threads, logs warning, falls back to ADC); SQLite `timeout=30, check_same_thread=False` on all connections in `web/tasks.py`; initial log file touched immediately in `generate_book_task` so UI polling never sees an empty/missing file. (Implemented v2.9)
- ✅ (v2.10) Huey Consumer Startup Fix: `Consumer.__init__()` in Huey 2.6.0 does NOT accept a `loglevel` keyword argument; the previous call `Consumer(huey, workers=1, worker_type='thread', loglevel=20)` raised `TypeError` on every app start, silently killing the consumer. All tasks stayed `queued` forever, causing the "Preparing environment / Waiting for logs" hang. Fixed by removing `loglevel=20`; Huey logging now configured via `logging.basicConfig`. Consumer startup errors now written to `data/consumer_error.log` for diagnosis. Also removed emoji characters from `print()` calls in `core/config.py` that caused `UnicodeEncodeError` on Windows `cp1252` terminals. Updated `VERSION` to `2.9` in `config.py`. (Implemented v2.10)
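The v2.10 failure mode (a constructor rejecting an unexpected keyword, with the error swallowed) and the persistent error log can be reproduced with a stand-in class. The sketch below does not import Huey: `FakeConsumer` is a hypothetical stand-in whose constructor simply lacks a `loglevel` parameter, and `start_consumer` is an illustrative helper, not the code in `web/app.py`.

```python
import logging
import os
import tempfile
import traceback

# Configure logging globally instead of passing a log level to the consumer.
logging.basicConfig(level=logging.WARNING)


class FakeConsumer:
    """Stand-in for a consumer whose constructor has no `loglevel` parameter."""

    def __init__(self, queue, workers=1, worker_type="thread"):
        self.queue = queue
        self.workers = workers
        self.worker_type = worker_type


def start_consumer(make_consumer, error_log_path):
    """Build a consumer; on failure, append the traceback to a file."""
    try:
        return make_consumer()
    except Exception:
        # Persist the failure so it is not lost when stdout is not captured.
        with open(error_log_path, "a", encoding="utf-8") as fh:
            fh.write(traceback.format_exc())
        return None


error_log = os.path.join(tempfile.mkdtemp(), "consumer_error.log")
bad = start_consumer(lambda: FakeConsumer("q", workers=1, loglevel=20), error_log)
good = start_consumer(lambda: FakeConsumer("q", workers=1), error_log)
print(bad is None, good is not None)
```

After the first call, the error log contains the `TypeError` traceback naming the unexpected `loglevel` argument, which is the kind of evidence that was missing when the original failure was only printed to stdout.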