bookapp/docs/current_state_analysis.md
Mike Wichers 2100ca2312 feat: Implement ai_blueprint.md action plan — architectural review & optimisations
Steps 1–7 of the ai_blueprint.md action plan executed:

DOCUMENTATION (Steps 1–3, 6–7):
- docs/current_state_analysis.md: Phase-by-phase cost/quality mapping of existing pipeline
- docs/alternatives_analysis.md: 15 alternative approaches with testable hypotheses
- docs/experiment_design.md: 7 controlled A/B experiment specifications (CPC, HQS, CER metrics)
- ai_blueprint_v2.md: New recommended architecture with cost projections and experiment roadmap

CODE IMPROVEMENTS (Step 4 — Experiments 1–4 implemented):
- story/writer.py: Extract build_persona_info() — persona loaded once per book, not per chapter
- story/writer.py: Adaptive scoring thresholds — SCORE_PASSING scales 6.5→7.5 by chapter position
- story/writer.py: Beat expansion skip — if beats >100 words, skip Director's Treatment expansion
- story/planner.py: validate_outline() — pre-generation gate checks missing beats, continuity, pacing
- story/planner.py: Enrichment field validation — warn on missing title/genre after enrich()
- cli/engine.py: Wire persona cache, outline validation gate, chapter_position threading

Expected savings: ~285K tokens per 30-chapter novel (~7% cost reduction)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 22:01:30 -05:00


Current State Analysis: BookApp AI Pipeline

Date: 2026-02-22
Scope: Mapping existing codebase to the four phases defined in ai_blueprint.md
Status: Completed — fulfills Action Plan Step 1


Overview

BookApp is an AI-powered novel generation engine using Google Gemini. The pipeline is structured into four phases that map directly to the review framework in ai_blueprint.md. This document catalogues the current implementation, identifies efficiency metrics, and surfaces limitations in each phase.


Phase 1: Foundation & Ideation ("The Seed")

Primary File: story/planner.py (lines 1–86)
Supporting: story/style_persona.py (lines 81–104), core/config.py

What Happens

  1. User provides a minimal manual_instruction (can be a single sentence).
  2. enrich(bp, folder, context) calls the Logic model to expand this into:
    • book_metadata: title, genre, tone, time period, structure type, formatting rules, content warnings
    • characters: 2–8 named characters with roles and descriptions
    • plot_beats: 5–7 concrete narrative beats
  3. If the project is part of a series, context from previous books is injected.
  4. create_initial_persona() generates a fictional author persona (name, bio, age, gender).
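The flow above can be sketched roughly in code. This is an illustrative reconstruction, not the actual planner.py implementation: the `Blueprint` dataclass, the `enrich_sketch()` name, and the placeholder parsing are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class Blueprint:
    """Illustrative stand-in for the blueprint ("bp") object."""
    manual_instruction: str
    book_metadata: dict = field(default_factory=dict)
    characters: list = field(default_factory=list)
    plot_beats: list = field(default_factory=list)

def enrich_sketch(bp: Blueprint, series_context: str = "") -> Blueprint:
    """Expand a one-line instruction into metadata, characters, and beats."""
    prompt = f"Expand this premise into a full book blueprint: {bp.manual_instruction}"
    if series_context:
        # For series projects, context from previous books is injected (step 3).
        prompt += f"\nPrevious books in series: {series_context}"
    # raw = logic_model.generate(prompt)  # Logic model call, free on Pro-Exp
    # The response would then be parsed into the three blueprint sections:
    bp.book_metadata = {"title": "", "genre": "", "tone": ""}  # placeholder parse
    bp.characters = []   # typically 2-8 named characters
    bp.plot_beats = []   # typically 5-7 concrete beats
    return bp
```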

Costs (Per Book)

| Task | Model | Input Tokens | Output Tokens | Cost (Pro-Exp) |
|------|-------|--------------|---------------|----------------|
| enrich() | Logic | ~10K | ~3K | FREE |
| create_initial_persona() | Logic | ~5.5K | ~1.5K | FREE |
| **Phase 1 Total** | | **~15.5K** | **~4.5K** | **FREE** |

Known Limitations

| ID | Issue | Impact |
|----|-------|--------|
| P1-L1 | enrich() silently returns original BP on exception (line 84) | Invalid enrichment passes downstream without warning |
| P1-L2 | filter_characters() blacklists keywords like "TBD", "protagonist" — can cull valid names | Characters named "The Protagonist" are silently dropped |
| P1-L3 | Single-pass persona creation — no quality check on output | Generic personas produce poor voice throughout book |
| P1-L4 | No validation that required book_metadata fields are non-null | Downstream crashes when title/genre are missing |
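P1-L1 and P1-L4 suggest a cheap hardening pattern: wrap the enrichment call so failures are logged and flagged rather than silently swallowed. A minimal sketch — `enrich_safe` and the `_enrichment_failed` flag are illustrative names, not existing code:

```python
import logging

log = logging.getLogger("bookapp.planner")

def enrich_safe(bp: dict, enrich_fn) -> dict:
    """Call enrich_fn(bp); on failure, log and flag instead of failing silently."""
    try:
        enriched = enrich_fn(bp)
    except Exception as exc:
        # P1-L1: surface the failure instead of passing the raw BP downstream
        log.warning("enrich() failed; using unenriched blueprint: %s", exc)
        return {**bp, "_enrichment_failed": True}
    # P1-L4: required metadata fields must be non-null after enrichment
    missing = [k for k in ("title", "genre")
               if not enriched.get("book_metadata", {}).get(k)]
    if missing:
        log.warning("enrichment left required fields empty: %s", missing)
    return enriched
```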

Phase 2: Structuring & Outlining

Primary File: story/planner.py (lines 89–290)
Supporting: story/style_persona.py

What Happens

  1. plan_structure(bp, folder) maps plot beats to a structural framework (Hero's Journey, Three-Act, etc.) and produces ~10–15 events.
  2. expand(events, pass_num, ...) iteratively enriches the outline. Called depth times (1–4, based on length preset). Each pass targets chapter count × 1.5 events as ceiling.
  3. create_chapter_plan(events, bp, folder) converts events into concrete chapter objects with POV, pacing, and estimated word count.
  4. get_style_guidelines() loads or refreshes the AI-ism blacklist and filter-word list.
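The iterative expansion in step 2 can be sketched as a capped loop. This is an illustrative reconstruction; the real expand() in story/planner.py takes more arguments, and the stand-in event strings replace actual Logic-model output:

```python
def expand_outline(events: list, depth: int, chapter_count: int) -> list:
    """Run up to `depth` expansion passes, stopping at the event ceiling."""
    ceiling = int(chapter_count * 1.5)  # each pass targets chapters x 1.5 events
    for pass_num in range(1, depth + 1):
        if len(events) >= ceiling:
            break  # outline already dense enough; skip remaining passes
        # events = expand(events, pass_num, ...)  # real Logic-model call
        events = events + [f"expanded-event-{len(events) + 1}"]  # stand-in output
    return events
```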

Depth Strategy

| Preset | Depth | Expand Calls | Approx. Events |
|--------|-------|--------------|----------------|
| Flash Fiction | 1 | 1 | 1 |
| Short Story | 1 | 1 | 5 |
| Novella | 2 | 2 | 15 |
| Novel | 3 | 3 | 30 |
| Epic | 4 | 4 | 50 |

Costs (30-Chapter Novel)

| Task | Calls | Input Tokens | Cost (Pro-Exp) |
|------|-------|--------------|----------------|
| plan_structure | 1 | ~15K | FREE |
| expand × 3 | 3 | ~12K each | FREE |
| create_chapter_plan | 1 | ~14K | FREE |
| get_style_guidelines | 1 | ~8K | FREE |
| **Phase 2 Total** | **6** | **~73K** | **FREE** |

Known Limitations

| ID | Issue | Impact |
|----|-------|--------|
| P2-L1 | Sequential expand() calls — each call unaware of final state | Redundant inter-call work; could be one multi-step prompt |
| P2-L2 | No continuity validation on outline — character deaths/revivals not detected | Plot holes remain until expensive Phase 3 rewrite |
| P2-L3 | Static chapter plan — cannot adapt if early chapters reveal pacing problem | Dynamic interventions in Phase 4 are costly workarounds |
| P2-L4 | POV assignment is AI-generated, not validated against narrative logic | Wrong POV on key scenes; caught only during editing |
| P2-L5 | Word count estimates are rough (~±30% actual variance) | Writer overshoots/undershoots target; word count normalization fails |

Phase 3: The Writing Engine (Drafting)

Primary File: story/writer.py
Orchestrated by: cli/engine.py

What Happens

For each chapter:

  1. expand_beats_to_treatment() — Logic model expands sparse beats into a "Director's Treatment" (staging, sensory anchors, emotional arc, subtext).
  2. write_chapter() constructs a ~310-line prompt injecting:
    • Author persona (bio, sample text, sample files from disk)
    • Filtered characters (only those named in beats + POV character)
    • Character tracking state (location, clothing, held items)
    • Lore context (relevant locations/items from tracking)
    • Style guidelines + genre-specific mandates
    • Smart context tail: last ~1000 tokens of previous chapter
    • Director's Treatment
  3. Writer model generates first draft.
  4. Logic model evaluates on 13 rubrics (1–10 scale). Automatic fail conditions apply for filter-word density, summary mode, and labeled emotions.
  5. Iterative quality loop (up to 3 attempts):
    • Score ≥ 8.0 → Auto-accept
    • Score ≥ 7.0 → Accept after max attempts
    • Score < 7.0 → Refinement pass (Writer model)
    • Score < 6.0 → Full rewrite (Pro model)
  6. Every 5 chapters: refine_persona() updates author bio based on actual written text.
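Steps 4–5 form a score-gated loop. A sketch of that control flow, with the thresholds taken from the text; score(), refine(), and rewrite() are placeholders for the actual model calls, and the exact accept/retry ordering in story/writer.py may differ:

```python
def quality_loop(draft: str, score, refine, rewrite, max_attempts: int = 3) -> str:
    """Iteratively improve a chapter draft until it scores high enough."""
    for attempt in range(max_attempts):
        s = score(draft)              # Logic model, 13 rubrics, 1-10 scale
        if s >= 8.0:
            return draft              # auto-accept
        if s < 6.0:
            draft = rewrite(draft)    # full rewrite on the Pro model
        else:
            draft = refine(draft)     # refinement pass on the Writer model
    return draft  # after max attempts, a score >= 7.0 is accepted as-is
```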

Key Innovations

  • Dynamic Character Injection: Only injects characters named in chapter beats (saves ~5K tokens/chapter).
  • Smart Context Tail: Takes last ~1000 tokens of previous chapter (not first 1000) — preserves handoff point.
  • Auto Model Escalation: Low-scoring drafts trigger switch to Pro model for full rewrite.
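The Smart Context Tail is simple to express. A minimal sketch, with a whitespace split standing in for the real tokenizer (which this document does not specify):

```python
def context_tail(previous_chapter: str, max_tokens: int = 1000) -> str:
    """Keep the last ~max_tokens of the previous chapter, not the first."""
    tokens = previous_chapter.split()  # stand-in for a real tokenizer
    return " ".join(tokens[-max_tokens:])  # tail preserves the handoff point
```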

Costs (30-Chapter Novel, Mixed Model Strategy)

| Task | Calls | Input Tokens | Output Tokens | Cost Estimate |
|------|-------|--------------|---------------|---------------|
| expand_beats_to_treatment × 30 | 30 | ~5K | ~2K | FREE (Logic) |
| write_chapter draft × 30 | 30 | ~25K | ~3.5K | ~$0.087 (Writer) |
| Evaluation × 30 | 30 | ~20K | ~1.5K | FREE (Logic) |
| Refinement passes × 15 (est.) | 15 | ~20K | ~3K | ~$0.090 (Writer) |
| refine_persona × 6 | 6 | ~6K | ~1.5K | FREE (Logic) |
| **Phase 3 Total** | **~111** | **~1.9M** | **~310K** | **~$0.18** |

Known Limitations

| ID | Issue | Impact |
|----|-------|--------|
| P3-L1 | Persona files re-read from disk on every chapter | I/O overhead; persona doesn't change between reads |
| P3-L2 | Beat expansion called even when beats are already detailed (>100 words) | Wastes ~5K tokens/chapter on ~30% of chapters |
| P3-L3 | Full rewrite triggered at score < 6.0 — discards entire draft | If draft scores 5.9, all 25K output tokens wasted |
| P3-L4 | No priority weighting for climax chapters | Ch 28 (climax) uses same resources/attempts as Ch 3 (setup) |
| P3-L5 | Previous chapter context hard-capped at 1000 tokens | For long chapters, might miss setup context from earlier pages |
| P3-L6 | Scoring thresholds fixed regardless of book position | Strict standards in early chapters = expensive refinement for setup scenes |

Phase 4: Review & Refinement (Editing)

Primary Files: story/editor.py, story/bible_tracker.py
Orchestrated by: cli/engine.py

What Happens

During writing loop (every chapter):

  • update_tracking() refreshes character state (location, clothing, held items, speech style, events).
  • update_lore_index() extracts canonical descriptions of locations and items.

Every 2 chapters:

  • check_pacing() detects if story is rushing or repeating beats; triggers ADD_BRIDGE or CUT_NEXT interventions.

After writing completes:

  • analyze_consistency() scans entire manuscript for plot holes and contradictions.
  • harvest_metadata() extracts newly invented characters not in the original bible.
  • check_and_propagate() cascades chapter edits forward through the manuscript.
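The cadence above can be sketched as an orchestration loop. Illustrative only — cli/engine.py wires the real calls into story/editor.py and story/bible_tracker.py, and the function is parameterized here so the cadence logic stands alone:

```python
def review_loop(chapters, update_tracking, update_lore_index, check_pacing):
    """Run per-chapter tracking every chapter and pacing checks every 2 chapters."""
    for i, chapter in enumerate(chapters, start=1):
        update_tracking(chapter)        # every chapter: character state
        update_lore_index(chapter)      # every chapter: canonical lore
        if i % 2 == 0:
            check_pacing(chapters[:i])  # every 2 chapters: pacing interventions
```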

13 Evaluation Rubrics

  1. Engagement & tension
  2. Scene execution (no summaries)
  3. Voice & tone
  4. Sensory immersion
  5. Show, Don't Tell / Deep POV (auto-fail trigger)
  6. Character agency
  7. Pacing
  8. Genre appropriateness
  9. Dialogue authenticity
  10. Plot relevance
  11. Staging & flow
  12. Prose dynamics (sentence variety)
  13. Clarity & readability

Automatic fail conditions: filter-word density > 1/120 words → cap at 5; summary mode detected → cap at 6; >3 labeled emotions → cap at 5.
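The fail conditions reduce to score capping. A sketch with the thresholds taken directly from the text; the real evaluator runs behind the Logic model, so the function name and signature are assumptions:

```python
def apply_auto_fail_caps(score: float, word_count: int, filter_words: int,
                         summary_mode: bool, labeled_emotions: int) -> float:
    """Cap a rubric score when any automatic fail condition trips."""
    if word_count and filter_words / word_count > 1 / 120:
        score = min(score, 5.0)   # filter-word density > 1 per 120 words
    if summary_mode:
        score = min(score, 6.0)   # chapter drifts into summary mode
    if labeled_emotions > 3:
        score = min(score, 5.0)   # more than 3 labeled emotions
    return score
```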

Costs (30-Chapter Novel)

| Task | Calls | Input Tokens | Cost (Pro-Exp) |
|------|-------|--------------|----------------|
| update_tracking × 30 | 30 | ~18K | FREE |
| update_lore_index × 30 | 30 | ~15K | FREE |
| check_pacing × 15 | 15 | ~18K | FREE |
| analyze_consistency | 1 | ~25K | FREE |
| harvest_metadata | 1 | ~25K | FREE |
| **Phase 4 Total** | **77** | **~1.34M** | **FREE** |

Known Limitations

| ID | Issue | Impact |
|----|-------|--------|
| P4-L1 | Consistency check is post-generation only | Plot holes caught too late to cheaply fix |
| P4-L2 | Ripple propagation (check_and_propagate) has no cost ceiling | A single user edit in Ch 5 can trigger 100K+ tokens of cascading rewrites |
| P4-L3 | rewrite_chapter_content() uses Logic model instead of Writer model | Less creative rewrite output — Logic model optimizes reasoning, not prose |
| P4-L4 | check_pacing() sampling only looks at recent chapters, not cumulative arc | Slow-building issues across 10+ chapters not detected until critical |
| P4-L5 | No quality metric for the evaluator itself | Can't confirm if 13-rubric scores are calibrated correctly |

Cross-Phase Summary

Total Costs (30-Chapter Novel)

| Phase | Token Budget | Cost Estimate |
|-------|--------------|---------------|
| Phase 1: Ideation | ~20K | FREE |
| Phase 2: Outline | ~73K | FREE |
| Phase 3: Writing | ~2.2M | ~$0.18 |
| Phase 4: Review | ~1.34M | FREE |
| Imagen Cover (3 images) | | ~$0.12 |
| **Total** | **~3.63M** | **~$0.30** |

*Assumes quality-first model selection (Pro-Exp for Logic, Flash for Writer).*

Efficiency Frontier

  • Best case (all chapters pass first attempt): ~$0.18 text + $0.04 cover = ~$0.22
  • Worst case (30% rewrite rate with Pro escalations): ~$0.45 text + $0.12 cover = ~$0.57
  • Budget per blueprint goal: $2.00 total — the current system runs at 15–29% of budget
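The frontier figures above check out arithmetically against the $2.00 budget goal; a quick verification (variable names are ours, figures are from this document):

```python
# Sanity-checking the efficiency-frontier figures against the budget goal.
best_case = 0.18 + 0.04     # text + cover, every chapter passes first attempt
worst_case = 0.45 + 0.12    # 30% rewrite rate with Pro escalations
nominal = 0.30              # typical run, from the cross-phase summary above
budget = 2.00

nominal_share = nominal / budget     # 0.15 -> 15% of budget
worst_share = worst_case / budget    # 0.285 -> ~29% of budget
```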

Top 5 Immediate Optimization Opportunities

| Priority | ID | Change | Savings |
|----------|----|--------|---------|
| 1 | P3-L1 | Cache persona per book (not per chapter) | ~90K tokens |
| 2 | P3-L2 | Skip beat expansion for detailed beats | ~45K tokens |
| 3 | P2-L2 | Add pre-generation outline validation | Prevents expensive rewrites |
| 4 | P1-L1 | Fix silent failure in enrich() | Prevents silent corrupt state |
| 5 | P3-L6 | Adaptive scoring thresholds by chapter position | ~15% fewer refinement passes |
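Opportunity 1 (P3-L1) is a one-decorator change in spirit. A sketch, assuming persona files live as text files in a directory — the commit above names build_persona_info() as the real extraction, so this wiring is purely illustrative:

```python
from functools import lru_cache
from pathlib import Path

@lru_cache(maxsize=1)
def load_persona_info(persona_dir: str) -> str:
    """Read persona bio and sample files once per book; later calls hit the cache."""
    parts = [p.read_text() for p in sorted(Path(persona_dir).glob("*.txt"))]
    return "\n\n".join(parts)
```

One caveat: since refine_persona() rewrites the bio every 5 chapters, a real cache would need invalidation at those points (e.g. `load_persona_info.cache_clear()`), or the stale persona would be injected for the rest of the book.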