
Mike Wichers 2100ca2312 feat: Implement ai_blueprint.md action plan — architectural review & optimisations

Steps 1–7 of the ai_blueprint.md action plan executed:

DOCUMENTATION (Steps 1–3, 6–7):
- docs/current_state_analysis.md: Phase-by-phase cost/quality mapping of existing pipeline
- docs/alternatives_analysis.md: 15 alternative approaches with testable hypotheses
- docs/experiment_design.md: 7 controlled A/B experiment specifications (CPC, HQS, CER metrics)
- ai_blueprint_v2.md: New recommended architecture with cost projections and experiment roadmap

CODE IMPROVEMENTS (Step 4 — Experiments 1–4 implemented):
- story/writer.py: Extract build_persona_info() — persona loaded once per book, not per chapter
- story/writer.py: Adaptive scoring thresholds — SCORE_PASSING scales 6.5→7.5 by chapter position
- story/writer.py: Beat expansion skip — if beats >100 words, skip Director's Treatment expansion
- story/planner.py: validate_outline() — pre-generation gate checks missing beats, continuity, pacing
- story/planner.py: Enrichment field validation — warn on missing title/genre after enrich()
- cli/engine.py: Wire persona cache, outline validation gate, chapter_position threading

Expected savings: ~285K tokens per 30-chapter novel (~7% cost reduction)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-22 22:01:30 -05:00


Current State Analysis: BookApp AI Pipeline

Date: 2026-02-22
Scope: Mapping existing codebase to the four phases defined in ai_blueprint.md
Status: Completed (fulfills Action Plan Step 1)

Overview

BookApp is an AI-powered novel generation engine using Google Gemini. The pipeline is structured into four phases that map directly to the review framework in ai_blueprint.md. This document catalogues the current implementation, identifies efficiency metrics, and surfaces limitations in each phase.

Phase 1: Foundation & Ideation ("The Seed")

Primary File: story/planner.py (lines 1–86)
Supporting: story/style_persona.py (lines 81–104), core/config.py

What Happens

User provides a minimal manual_instruction (can be a single sentence).
enrich(bp, folder, context) calls the Logic model to expand this into:
- book_metadata: title, genre, tone, time period, structure type, formatting rules, content warnings
- characters: 2–8 named characters with roles and descriptions
- plot_beats: 5–7 concrete narrative beats
If the project is part of a series, context from previous books is injected.
create_initial_persona() generates a fictional author persona (name, bio, age, gender).

Costs (Per Book)

Task	Model	Input Tokens	Output Tokens	Cost (Pro-Exp)
`enrich()`	Logic	~10K	~3K	FREE
`create_initial_persona()`	Logic	~5.5K	~1.5K	FREE
Phase 1 Total	—	~15.5K	~4.5K	FREE

Known Limitations

ID	Issue	Impact
P1-L1	`enrich()` silently returns original BP on exception (line 84)	Invalid enrichment passes downstream without warning
P1-L2	`filter_characters()` blacklists keywords like "TBD", "protagonist" — can cull valid names	Characters named "The Protagonist" are silently dropped
P1-L3	Single-pass persona creation — no quality check on output	Generic personas produce poor voice throughout book
P1-L4	No validation that required `book_metadata` fields are non-null	Downstream crashes when title/genre are missing
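A check along the following lines could address P1-L4. It is a minimal sketch, not the project's actual validation: the helper name `missing_metadata_fields` and the `REQUIRED_METADATA_FIELDS` tuple are hypothetical, and only `title` and `genre` are treated as required because those are the two fields named above.

```python
from typing import Any

# Hypothetical list of required fields; only title and genre are named in
# the limitations table, so the real set may be larger.
REQUIRED_METADATA_FIELDS = ("title", "genre")


def missing_metadata_fields(bp: dict[str, Any]) -> list[str]:
    """Return required book_metadata fields that are absent, None, or blank."""
    metadata = bp.get("book_metadata") or {}
    missing = []
    for field in REQUIRED_METADATA_FIELDS:
        value = metadata.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing
```

Calling this right after `enrich()` returns would let the caller warn or abort before Phase 2 starts, rather than crashing later on a null title or genre.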

Phase 2: Structuring & Outlining

Primary File: story/planner.py (lines 89–290)
Supporting: story/style_persona.py

What Happens

plan_structure(bp, folder) maps plot beats to a structural framework (Hero's Journey, Three-Act, etc.) and produces ~10–15 events.
expand(events, pass_num, ...) iteratively enriches the outline. It is called depth times (1–4, depending on the length preset), and each pass caps the outline at chapter count × 1.5 events.
create_chapter_plan(events, bp, folder) converts events into concrete chapter objects with POV, pacing, and estimated word count.
get_style_guidelines() loads or refreshes the AI-ism blacklist and filter-word list.

Depth Strategy

Preset	Depth	Expand Calls	Approx Events
Flash Fiction	1	1	1
Short Story	1	1	5
Novella	2	2	15
Novel	3	3	30
Epic	4	4	50
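The depth table and the per-pass event ceiling can be expressed as a small lookup. This is a sketch under assumptions: the names `PRESET_DEPTH`, `event_ceiling`, and `expand_passes` are hypothetical, rounding the ceiling up is a guess, and 1-based pass numbering is assumed because the document does not say how `pass_num` is counted.

```python
import math

# Depth per length preset, copied from the Depth Strategy table.
PRESET_DEPTH = {
    "Flash Fiction": 1,
    "Short Story": 1,
    "Novella": 2,
    "Novel": 3,
    "Epic": 4,
}


def event_ceiling(chapter_count: int) -> int:
    """Upper bound on outline events per expand() pass: chapter count x 1.5.

    Rounding up is an assumption; the source only gives the multiplier.
    """
    return math.ceil(chapter_count * 1.5)


def expand_passes(preset: str) -> list[int]:
    """Pass numbers (assumed 1-based) for which expand() would be called."""
    return list(range(1, PRESET_DEPTH[preset] + 1))
```

For a 30-chapter Novel this yields three passes with a ceiling of 45 events per pass.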

Costs (30-Chapter Novel)

Task	Calls	Input Tokens	Cost (Pro-Exp)
`plan_structure`	1	~15K	FREE
`expand` × 3	3	~12K each	FREE
`create_chapter_plan`	1	~14K	FREE
`get_style_guidelines`	1	~8K	FREE
Phase 2 Total	6	~73K	FREE

Known Limitations

ID	Issue	Impact
P2-L1	Sequential `expand()` calls — each call unaware of final state	Redundant inter-call work; could be one multi-step prompt
P2-L2	No continuity validation on outline — character deaths/revivals not detected	Plot holes remain until expensive Phase 3 rewrite
P2-L3	Static chapter plan — cannot adapt if early chapters reveal pacing problem	Dynamic interventions in Phase 4 are costly workarounds
P2-L4	POV assignment is AI-generated, not validated against narrative logic	Wrong POV on key scenes; caught only during editing
P2-L5	Word count estimates are rough (actual length varies by roughly ±30%)	Writer overshoots/undershoots target; word count normalization fails

Phase 3: The Writing Engine (Drafting)

Primary File: story/writer.py
Orchestrated by: cli/engine.py

What Happens

For each chapter:

expand_beats_to_treatment() — Logic model expands sparse beats into a "Director's Treatment" (staging, sensory anchors, emotional arc, subtext).
write_chapter() constructs a ~310-line prompt injecting:
- Author persona (bio, sample text, sample files from disk)
- Filtered characters (only those named in beats + POV character)
- Character tracking state (location, clothing, held items)
- Lore context (relevant locations/items from tracking)
- Style guidelines + genre-specific mandates
- Smart context tail: last ~1000 tokens of previous chapter
- Director's Treatment
Writer model generates first draft.
Logic model evaluates on 13 rubrics (1–10 scale). Automatic fail conditions apply for filter-word density, summary mode, and labeled emotions.
Iterative quality loop (up to 3 attempts):
- Score ≥ 8.0 → Auto-accept
- 7.0 ≤ Score < 8.0 → Accept once max attempts are reached
- 6.0 ≤ Score < 7.0 → Refinement pass (Writer model)
- Score < 6.0 → Full rewrite (Pro model)
Every 5 chapters: refine_persona() updates author bio based on actual written text.
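The quality loop's thresholds can be read as a small decision function. The sketch below is one interpretation, not the code in story/writer.py: the function name and return labels are hypothetical, refining a 7.0–8.0 draft while attempts remain is an assumption, and the `flag_for_review` outcome covers a case (attempts exhausted below 7.0) that this document does not specify.

```python
AUTO_ACCEPT = 8.0
ACCEPTABLE = 7.0
REWRITE_BELOW = 6.0


def next_action(score: float, attempt: int, max_attempts: int = 3) -> str:
    """Decide what to do with a draft after evaluation `attempt` (1-based)."""
    if score >= AUTO_ACCEPT:
        return "accept"
    if attempt >= max_attempts:
        # Out of attempts: a draft at or above 7.0 is accepted. The outcome
        # below 7.0 is not stated in the document, so it is only flagged.
        return "accept" if score >= ACCEPTABLE else "flag_for_review"
    if score < REWRITE_BELOW:
        return "full_rewrite"  # escalate to the Pro model
    # Attempts remain and the draft is below auto-accept. Handling the
    # 7.0-8.0 band like the 6.0-7.0 band is an assumption.
    return "refine"  # Writer model refinement pass
```

For example, `next_action(5.9, attempt=2)` returns `"full_rewrite"`, while `next_action(7.4, attempt=3)` returns `"accept"`.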

Key Innovations

Dynamic Character Injection: Only injects characters named in chapter beats (saves ~5K tokens/chapter).
Smart Context Tail: Takes last ~1000 tokens of previous chapter (not first 1000) — preserves handoff point.
Auto Model Escalation: Low-scoring drafts trigger switch to Pro model for full rewrite.
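The first two ideas are simple enough to sketch. The code below is illustrative only: both function names are hypothetical, whitespace-separated words stand in for model tokens, and name matching is a plain case-insensitive substring test, which may differ from what story/writer.py does.

```python
def context_tail(previous_chapter: str, max_tokens: int = 1000) -> str:
    """Keep the END of the previous chapter so the handoff point survives.

    Whitespace-separated words approximate tokens here (an assumption).
    """
    if max_tokens <= 0:
        return ""
    words = previous_chapter.split()
    return " ".join(words[-max_tokens:])


def characters_for_chapter(
    characters: list[dict], beats: list[str], pov_character: str
) -> list[dict]:
    """Select only characters named in the chapter beats, plus the POV character."""
    beat_text = " ".join(beats).lower()
    return [
        c
        for c in characters
        if c["name"] == pov_character or c["name"].lower() in beat_text
    ]
```

Taking the tail rather than the head means the prompt always ends where the previous chapter ended, which is the part the next chapter has to continue from.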

Costs (30-Chapter Novel, Mixed Model Strategy)

Task	Calls	Input Tokens	Output Tokens	Cost Estimate
`expand_beats_to_treatment` × 30	30	~5K	~2K	FREE (Logic)
`write_chapter` draft × 30	30	~25K	~3.5K	~$0.087 (Writer)
Evaluation × 30	30	~20K	~1.5K	FREE (Logic)
Refinement passes × 15 (est.)	15	~20K	~3K	~$0.090 (Writer)
`refine_persona` × 6	6	~6K	~1.5K	FREE (Logic)
Phase 3 Total	~111	~1.9M	~310K	~$0.18

Known Limitations

ID	Issue	Impact
P3-L1	Persona files re-read from disk on every chapter	I/O overhead; persona doesn't change between reads
P3-L2	Beat expansion called even when beats are already detailed (>100 words)	Wastes ~5K tokens/chapter on ~30% of chapters
P3-L3	Full rewrite triggered at score < 6.0 discards the entire draft	If a draft scores 5.9, the ~25K input and ~3.5K output tokens spent on it are wasted
P3-L4	No priority weighting for climax chapters	Ch 28 (climax) uses same resources/attempts as Ch 3 (setup)
P3-L5	Previous chapter context hard-capped at 1000 tokens	For long chapters, might miss setup context from earlier pages
P3-L6	Scoring thresholds fixed regardless of book position	Strict standards in early chapters = expensive refinement for setup scenes
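P3-L1 and P3-L2 both lend themselves to small guards. The following is a sketch under assumptions rather than the project's implementation: the function names are hypothetical, the 100-word limit is applied to the combined word count of all beats in a chapter, and persona samples are assumed to be `.txt` files in a single directory.

```python
from functools import lru_cache
from pathlib import Path

BEAT_DETAIL_WORD_LIMIT = 100  # threshold named in P3-L2


def needs_treatment_expansion(beats: list[str]) -> bool:
    """Return False when the beats are already detailed (more than 100 words).

    Counting words across all beats of the chapter is an assumption.
    """
    word_count = sum(len(beat.split()) for beat in beats)
    return word_count <= BEAT_DETAIL_WORD_LIMIT


@lru_cache(maxsize=None)
def load_persona_samples(persona_dir: str) -> tuple[str, ...]:
    """Read persona sample files once; later calls hit the in-memory cache.

    The directory layout (*.txt files) is hypothetical.
    """
    paths = sorted(Path(persona_dir).glob("*.txt"))
    return tuple(p.read_text(encoding="utf-8") for p in paths)
```

Because `lru_cache` keys on the directory string, the cache would need `load_persona_samples.cache_clear()` whenever `refine_persona()` rewrites the sample files.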

Phase 4: Review & Refinement (Editing)

Primary Files: story/editor.py, story/bible_tracker.py
Orchestrated by: cli/engine.py

What Happens

During writing loop (every chapter):

update_tracking() refreshes character state (location, clothing, held items, speech style, events).
update_lore_index() extracts canonical descriptions of locations and items.

Every 2 chapters:

check_pacing() detects if story is rushing or repeating beats; triggers ADD_BRIDGE or CUT_NEXT interventions.

After writing completes:

analyze_consistency() scans entire manuscript for plot holes and contradictions.
harvest_metadata() extracts newly invented characters not in the original bible.
check_and_propagate() cascades chapter edits forward through the manuscript.
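The per-chapter cadence described above (together with the every-5-chapters persona refresh from Phase 3) can be sketched as a scheduling helper. The helper names are hypothetical, and triggering on chapter numbers divisible by the interval is an assumption about how cli/engine.py counts.

```python
from collections import Counter

PACING_INTERVAL = 2            # check_pacing() every 2 chapters
PERSONA_REFRESH_INTERVAL = 5   # refine_persona() every 5 chapters


def tasks_after_chapter(chapter_number: int) -> list[str]:
    """List the maintenance calls that follow a given chapter (1-based)."""
    tasks = ["update_tracking", "update_lore_index"]
    if chapter_number % PACING_INTERVAL == 0:
        tasks.append("check_pacing")
    if chapter_number % PERSONA_REFRESH_INTERVAL == 0:
        tasks.append("refine_persona")
    return tasks


def call_counts(total_chapters: int) -> Counter:
    """Count how often each task runs over a whole book."""
    counts: Counter = Counter()
    for chapter in range(1, total_chapters + 1):
        counts.update(tasks_after_chapter(chapter))
    return counts
```

For a 30-chapter novel, `call_counts(30)` gives 30 tracking updates, 30 lore updates, 15 pacing checks, and 6 persona refinements, which matches the call counts in the cost tables.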

13 Evaluation Rubrics

Engagement & tension
Scene execution (no summaries)
Voice & tone
Sensory immersion
Show, Don't Tell / Deep POV (auto-fail trigger)
Character agency
Pacing
Genre appropriateness
Dialogue authenticity
Plot relevance
Staging & flow
Prose dynamics (sentence variety)
Clarity & readability

Automatic fail conditions: filter-word density > 1/120 words → cap at 5; summary mode detected → cap at 6; >3 labeled emotions → cap at 5.
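The three automatic fail conditions amount to score caps applied after rubric scoring. A minimal sketch, assuming the evaluator already supplies the counts; the function and parameter names are hypothetical.

```python
def apply_auto_fail_caps(
    score: float,
    word_count: int,
    filter_word_count: int,
    summary_mode: bool,
    labeled_emotions: int,
) -> float:
    """Cap a rubric score according to the automatic fail conditions."""
    capped = score
    # Filter-word density above 1 per 120 words caps the score at 5.
    # Integer arithmetic avoids float rounding at the exact boundary.
    if filter_word_count * 120 > word_count:
        capped = min(capped, 5.0)
    # Summary mode detected caps the score at 6.
    if summary_mode:
        capped = min(capped, 6.0)
    # More than 3 labeled emotions caps the score at 5.
    if labeled_emotions > 3:
        capped = min(capped, 5.0)
    return capped
```

Using `min` means the caps never raise a score, and when several conditions fire the lowest cap wins.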

Costs (30-Chapter Novel)

Task	Calls	Input Tokens	Cost (Pro-Exp)
`update_tracking` × 30	30	~18K	FREE
`update_lore_index` × 30	30	~15K	FREE
`check_pacing` × 15	15	~18K	FREE
`analyze_consistency`	1	~25K	FREE
`harvest_metadata`	1	~25K	FREE
Phase 4 Total	77	~1.34M	FREE

Known Limitations

ID	Issue	Impact
P4-L1	Consistency check is post-generation only	Plot holes caught too late to fix cheaply
P4-L2	Ripple propagation (`check_and_propagate`) has no cost ceiling	A single user edit in Ch 5 can trigger 100K+ tokens of cascading rewrites
P4-L3	`rewrite_chapter_content()` uses Logic model instead of Writer model	Less creative rewrite output — Logic model optimizes reasoning, not prose
P4-L4	`check_pacing()` sampling only looks at recent chapters, not cumulative arc	Slow-building issues across 10+ chapters not detected until critical
P4-L5	No quality metric for the evaluator itself	Can't confirm if 13-rubric scores are calibrated correctly
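One possible shape for the missing cost ceiling in P4-L2 is a token budget that stops the cascade once it is exhausted. This is purely a sketch: `chapters_within_budget` is a hypothetical helper, and it assumes a per-chapter rewrite cost estimate is available before `check_and_propagate()` runs.

```python
def chapters_within_budget(estimated_costs: list[int], token_ceiling: int) -> list[int]:
    """Return indices of downstream chapters that fit under the ceiling.

    Chapters are taken in manuscript order; propagation stops at the first
    chapter whose estimated rewrite cost would exceed the remaining budget.
    """
    spent = 0
    selected = []
    for index, cost in enumerate(estimated_costs):
        if spent + cost > token_ceiling:
            break
        spent += cost
        selected.append(index)
    return selected
```

With four downstream chapters estimated at 30K tokens each and a 100K ceiling, only the first three would be rewritten automatically; the rest could be queued for manual review.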

Cross-Phase Summary

Total Costs (30-Chapter Novel)

Phase	Token Budget	Cost Estimate
Phase 1: Ideation	~20K	FREE
Phase 2: Outline	~73K	FREE
Phase 3: Writing	~2.2M	~$0.18
Phase 4: Review	~1.34M	FREE
Imagen Cover (3 images)	—	~$0.12
Total	~3.63M	~$0.30

Assumes quality-first model selection (Pro-Exp for Logic, Flash for Writer)

Efficiency Frontier

Best case (all chapters pass first attempt): ~$0.18 text + $0.04 cover = ~$0.22
Worst case (30% rewrite rate with Pro escalations): ~$0.45 text + $0.12 cover = ~$0.57
Budget per blueprint goal: $2.00 total. The current system uses roughly 11–29% of that budget (best case $0.22, worst case $0.57), or 15% for the typical $0.30 run.
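The budget shares follow directly from the figures above. The short calculation below works in cents to avoid floating-point noise; the scenario labels and names are illustrative.

```python
BUDGET_CENTS = 200  # $2.00 blueprint goal

# Text cost + cover cost, in cents, taken from the figures above.
SCENARIOS_CENTS = {
    "best case": 18 + 4,
    "expected": 18 + 12,   # Total Costs table
    "worst case": 45 + 12,
}


def budget_share_percent(cost_cents: int, budget_cents: int = BUDGET_CENTS) -> float:
    """Return a cost as a percentage of the total budget."""
    return 100 * cost_cents / budget_cents


shares = {name: budget_share_percent(cost) for name, cost in SCENARIOS_CENTS.items()}
print(shares)  # {'best case': 11.0, 'expected': 15.0, 'worst case': 28.5}
```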

Top 5 Immediate Optimization Opportunities

Priority	ID	Change	Savings
1	P3-L1	Cache persona per book (not per chapter)	~90K tokens
2	P3-L2	Skip beat expansion for detailed beats	~45K tokens
3	P2-L2	Add pre-generation outline validation	Prevent expensive rewrites
4	P1-L1	Fix silent failure in `enrich()`	Prevent silent corrupt state
5	P3-L6	Adaptive scoring thresholds by chapter position	~15% fewer refinement passes
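For priority 5, the commit summary describes SCORE_PASSING scaling from 6.5 to 7.5 by chapter position. One way that could look is a linear ramp. This is a sketch, not the implemented formula: the function name is hypothetical, `chapter_position` is assumed to be a 0.0–1.0 fraction of the book, and linear interpolation is an assumption.

```python
def passing_score(chapter_position: float, low: float = 6.5, high: float = 7.5) -> float:
    """Scale the passing threshold from `low` (first chapter) to `high` (last).

    chapter_position is assumed to be a fraction in [0.0, 1.0]; values
    outside that range are clamped.
    """
    position = min(max(chapter_position, 0.0), 1.0)
    return low + (high - low) * position


def position_of(chapter_number: int, total_chapters: int) -> float:
    """Map a 1-based chapter number onto [0.0, 1.0]."""
    if total_chapters <= 1:
        return 1.0
    return (chapter_number - 1) / (total_chapters - 1)
```

Under this scheme setup chapters pass at 6.5 and the final chapter must reach 7.5, so refinement effort shifts toward the climax.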
