Mike Wichers 2100ca2312 feat: Implement ai_blueprint.md action plan — architectural review & optimisations

Steps 1–7 of the ai_blueprint.md action plan executed:

DOCUMENTATION (Steps 1–3, 6–7):
- docs/current_state_analysis.md: Phase-by-phase cost/quality mapping of existing pipeline
- docs/alternatives_analysis.md: 15 alternative approaches with testable hypotheses
- docs/experiment_design.md: 7 controlled A/B experiment specifications (CPC, HQS, CER metrics)
- ai_blueprint_v2.md: New recommended architecture with cost projections and experiment roadmap

CODE IMPROVEMENTS (Step 4 — Experiments 1–4 implemented):
- story/writer.py: Extract build_persona_info() — persona loaded once per book, not per chapter
- story/writer.py: Adaptive scoring thresholds — SCORE_PASSING scales 6.5→7.5 by chapter position
- story/writer.py: Beat expansion skip — if beats >100 words, skip Director's Treatment expansion
- story/planner.py: validate_outline() — pre-generation gate checks missing beats, continuity, pacing
- story/planner.py: Enrichment field validation — warn on missing title/genre after enrich()
- cli/engine.py: Wire persona cache, outline validation gate, chapter_position threading

Expected savings: ~285K tokens per 30-chapter novel (~7% cost reduction)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-22 22:01:30 -05:00

BookApp: AI-Powered Series Engine

An automated pipeline for planning, drafting, and publishing novels using Google Gemini. Supports both a browser-based Web UI and an interactive CLI Wizard.

Quick Start

Install core dependencies: pip install -r requirements.txt
Install web dependencies: pip install -r web/requirements_web.txt
Configure: Copy .env.example to .env and add your GEMINI_API_KEY.
Launch: python -m web.app
Open: http://localhost:5000

CLI Mode (No Browser)

python -m cli.wizard

The wizard guides you through creating or loading a project, defining characters and plot beats, and launching a generation run directly from the terminal. It auto-detects incomplete runs and offers to resume them.

Admin Access

The /admin panel allows managing users and performing factory resets. It is restricted to accounts with the Admin role.

Via environment variables (recommended for Docker): Set ADMIN_USERNAME and ADMIN_PASSWORD — the account is auto-created on startup.

Via manual promotion: Register a normal account, then set is_admin = 1 in the database.

Docker Setup (Recommended for Servers)

1. Git Setup

Push this project to a Git repository (GitHub, GitLab, or a self-hosted Gitea). Ensure .env, token.json, credentials.json, and data/ are in .gitignore.

2. Server Preparation (One-Time)

Place secrets on the server manually — they must not be in Git.

Run python -m cli.wizard locally to generate token.json (Google OAuth).
SSH into your server and create a data folder:
```
mkdir -p /opt/bookapp
```
Upload token.json and credentials.json to /opt/bookapp. The data/ subfolder is created automatically on first run.

3. Portainer Stack Setup

Go to Stacks > Add stack > Repository.
Set Repository URL and Compose path (docker-compose.yml).
Enable Authentication and supply a Gitea Personal Access Token if your repo is private.

Add Environment variables:

| Variable | Description |
| --- | --- |
| `HOST_PATH` | Server folder for persistent data (e.g., `/opt/bookapp`) |
| `GEMINI_API_KEY` | Your Google Gemini API key (required) |
| `ADMIN_USERNAME` | Admin account username |
| `ADMIN_PASSWORD` | Admin account password |
| `FLASK_SECRET_KEY` | Random string for session encryption |
| `FLASK_DEBUG` | `False` in production |
| `GCP_PROJECT` | Google Cloud Project ID (required for Imagen / Vertex AI) |
| `GCP_LOCATION` | GCP region (default: `us-central1`) |
| `MODEL_LOGIC` | Override the reasoning model (e.g., `models/gemini-1.5-pro-latest`) |
| `MODEL_WRITER` | Override the writing model |
| `MODEL_ARTIST` | Override the visual-prompt model |
| `MODEL_IMAGE` | Override the image generation model |
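
FLASK_SECRET_KEY only needs to be a long, unpredictable string. One possible way to produce one uses Python's standard `secrets` module; the helper name `make_secret_key` and the 32-byte length are illustrative choices, not project requirements.

```python
import secrets


def make_secret_key(num_bytes: int = 32) -> str:
    """Return a random hex string (two hex characters per byte)."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into the FLASK_SECRET_KEY environment variable.
    print(make_secret_key())
```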

Click Deploy the stack.

4. Updating the App

Make changes locally and push to Git.
In Portainer: Stack > Editor > Pull and redeploy.

Managing Files

The Docker volume maps /app/data in the container to ${HOST_PATH}/data on your server.

Add personas/fonts: Drop files into ${HOST_PATH}/data/personas/ or ${HOST_PATH}/data/fonts/.
Download books: Use the Web Dashboard download links.
Backup: Archive the entire ${HOST_PATH} directory.

Native Setup (No Docker)

pip install -r requirements.txt
pip install -r web/requirements_web.txt
python -m web.app

Open http://localhost:5000.

Features

Web UI (`web/`)

Project Dashboard: Create and monitor generation jobs from the browser.
Real-time Logs: Console output is streamed to the browser and stored in the database.
Chapter Editor: Edit chapters directly in the browser; manual edits are preserved across artifact regenerations and synced back to character/plot tracking state.
Chapter Navigation: Prev/Next buttons on every chapter card in the manuscript reader let you jump between chapters without scrolling.
Download Bible: Download the project's bible.json directly from any run's detail page for offline review or cloning.
Run Tagging: Label runs with comma-separated tags (e.g. dark-ending, v2, favourite) to organise and track experiments.
Run Deletion: Delete completed or failed runs and their filesystem data from the run detail page.
Cover Regeneration: Submit written feedback to regenerate the cover image iteratively.
Admin Panel: Manage all users, view spend, and perform factory resets at /admin.
Per-User API Keys: Each user can supply their own Gemini API key; costs are tracked per account.

CLI Wizard (`cli/`)

Interactive Setup: Menu-driven interface (via Rich) for creating projects, managing personas, and defining characters and plot beats.
Smart Resume: Detects in-progress runs via lock files and prompts to resume.
Interactive Mode: Optionally review and approve/reject each chapter before generation continues.
Stop Signal: Create a .stop file in the run directory to gracefully abort a long run without corrupting state.
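
The stop signal described above can be pictured as a check performed between chapters. This is a minimal sketch, not the engine's actual code: only the `.stop` file name comes from this README, while `stop_requested`, `write_chapters`, and the `write_one` callback are hypothetical names.

```python
from pathlib import Path
from typing import Callable, Iterable, List

STOP_FILENAME = ".stop"  # file name taken from the README


def stop_requested(run_dir: Path) -> bool:
    """Return True if a .stop file exists in the run directory."""
    return (run_dir / STOP_FILENAME).exists()


def write_chapters(
    run_dir: Path,
    chapter_numbers: Iterable[int],
    write_one: Callable[[int], None],
) -> List[int]:
    """Hypothetical loop: the stop file is checked before each chapter starts,
    so a chapter that has begun is always finished and saved."""
    completed = []
    for num in chapter_numbers:
        if stop_requested(run_dir):
            break
        write_one(num)
        completed.append(num)
    return completed
```

Checking only at chapter boundaries is what keeps the abort graceful in this sketch: no chapter is left half-written.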

Story Generation (`story/`)

Adaptive Structure: Chooses a narrative framework (Hero's Journey, Three-Act, Single Scene, etc.) based on the selected length preset and expands it through multiple depth levels.
Dynamic Pacing: Monitors story progress during writing, inserting bridge chapters when the plot is rushing and removing redundant chapters detected mid-stream, all without restarting the run.
Series Continuity: When generating Book 2+, carries forward character visual tracking, established relationships, plot threads, and a cumulative "Story So Far" summary.
Persona Refinement Loop: Every 5 chapters, analyzes actual written text to refine the author persona model, maintaining stylistic consistency throughout the book.
Persona Cache: The author persona (including writing sample files) is loaded once at the start of the writing phase and reused for every chapter, eliminating redundant file I/O. The cache is refreshed whenever the persona is refined.
Outline Validation Gate (planner.py): Before the writing phase begins, a Logic-model pass checks the chapter plan for missing required beats, character continuity issues, pacing imbalances, and POV logic errors. Issues are logged as warnings so the writer can review them before generation begins.
Adaptive Scoring Thresholds (writer.py): Quality passing thresholds scale with chapter position — setup chapters use a lower bar (6.5) to avoid over-spending refinement tokens on early exposition, while climax chapters use a stricter bar (7.5) to ensure the most important scenes receive maximum effort.
Smart Beat Expansion Skip (writer.py): If a chapter's scene beats are already detailed (>100 words total), the Director's Treatment expansion step is skipped, saving ~5K tokens per chapter.
Consistency Checker (editor.py): Scores chapters on 13 rubrics (including engagement, voice, sensory detail, scene execution, dialogue, pacing, staging, prose dynamics, and clarity) and flags AI-isms ("tapestry", "palpable tension") and weak filter verbs ("felt", "realized"). Chapter evaluation uses the Logic model (free Pro) rather than the Writer model, which gives stricter and more accurate scoring.
Dynamic Character Injection (writer.py): Only injects characters explicitly named in the chapter's scene_beats plus the POV character into the writer prompt. Eliminates token waste from unused characters and reduces hallucinated appearances.
Smart Context Tail (writer.py): Extracts the final ~1,000 tokens of the previous chapter (the actual ending) rather than blindly truncating from the front. Ensures the hand-off point — where characters are standing and what was last said — is always preserved.
Stateful Scene Tracking (bible_tracker.py): After each chapter, the tracker records each character's current_location, time_of_day, and held_items in addition to appearance and events. This scene state is injected into subsequent chapter prompts so the writer knows exactly where characters are, what time it is, and what they're carrying.
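
Three of the rules above (adaptive scoring thresholds, the beat expansion skip, and dynamic character injection) are simple enough to sketch. The 6.5 and 7.5 bounds and the 100-word limit come from this README. Everything else is an assumption for illustration: the function names, the 0.0 to 1.0 `chapter_position` scale, the linear interpolation between the two bounds, and the whole-name matching rule.

```python
import re
from typing import Dict, List

SCORE_MIN = 6.5        # passing bar for setup chapters (from the README)
SCORE_MAX = 7.5        # passing bar for climax chapters (from the README)
BEAT_WORD_LIMIT = 100  # beats longer than this skip expansion (from the README)


def passing_score(chapter_position: float) -> float:
    """Scale the passing bar with position, 0.0 = first chapter, 1.0 = last.
    Linear interpolation is an assumption; the real curve may differ."""
    position = min(max(chapter_position, 0.0), 1.0)
    return SCORE_MIN + (SCORE_MAX - SCORE_MIN) * position


def should_skip_expansion(scene_beats: List[str]) -> bool:
    """Skip the expansion step when the beats total more than 100 words."""
    total_words = sum(len(beat.split()) for beat in scene_beats)
    return total_words > BEAT_WORD_LIMIT


def characters_for_prompt(
    characters: List[Dict], scene_beats: List[str], pov_character: str
) -> List[Dict]:
    """Keep the POV character plus any character named in the scene beats."""
    beats_text = " ".join(scene_beats)
    selected = []
    for character in characters:
        name = character["name"]
        pattern = r"\b" + re.escape(name) + r"\b"
        named_in_beats = re.search(pattern, beats_text, re.IGNORECASE)
        if name == pov_character or named_in_beats:
            selected.append(character)
    return selected
```

Matching on word boundaries rather than plain substrings avoids false hits such as a character called "Eve" being pulled in by the word "never".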

Marketing Assets (`marketing/`)

Cover Art: Generates a visual prompt from book themes and tracking data, then calls Imagen (Gemini or Vertex AI) to produce the cover. Evaluates image quality with multimodal AI critique before accepting.
Back-Cover Blurb: Writes 150–200 word marketing copy in a 4-part structure (Hook, Stakes, Tension, Close) with genre-specific tone (thriller=urgent, romance=emotional, etc.).

Export (`export/`)

EPUB: eBook file with cover image, chapter structure, and formatted text (bold, italics, headers). Ready for Kindle / Apple Books.
DOCX: Word document for manual editing.

AI Infrastructure (`ai/`)

Resilient Model Wrapper: Wraps every Gemini API call with up to 3 retries and exponential backoff, handles quota errors and rate limits, and can switch to an alternative model mid-stream.
Auto Model Selection: On startup, a bootstrapper model queries the Gemini API and selects the optimal models for Logic, Writer, Artist, and Image roles. Selection is cached for 24 hours. The selection algorithm now prioritizes quality — free/preview/exp models are preferred by capability (Pro > Flash, 2.5 > 2.0 > 1.5) rather than by cost alone.
Vertex AI Support: If GCP_PROJECT is set and OAuth credentials are present, initializes Vertex AI automatically for Imagen image generation.
Payload Guardrails: Every generation call estimates the prompt token count before dispatch. If the payload exceeds 30,000 tokens, a warning is logged so runaway context injection is surfaced immediately.
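
The retry behaviour of the resilient wrapper can be sketched as follows. This is not the `ResilientModel` class itself: `call_with_retries`, its parameters, and the one-second base delay are hypothetical. Only the limit of 3 retries and the use of exponential backoff come from this README.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    max_retries: int = 3,      # "up to 3 retries" (from the README)
    base_delay: float = 1.0,   # assumed starting delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on failure wait base_delay, then 2x, then 4x, and retry.
    The final failure is re-raised to the caller."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            # A real wrapper would likely catch only quota and rate-limit
            # errors; catching Exception keeps this sketch library-agnostic.
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter is a design choice for this sketch: it lets the backoff schedule be tested without real waiting.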

AI Context Optimization (`core/utils.py`)

System Status Model Optimization (templates/system_status.html, web/routes/admin.py): The model refresh on the System Status page runs as an asynchronous fetch request, so the page does not freeze while the available models are re-evaluated.
Context Truncation: truncate_to_tokens(text, max_tokens) enforces hard caps on large context variables — previous chapter text, story summaries, and character data — before they are injected into prompts, preventing token overflows on large manuscripts.
AI Response Cache: An in-memory cache (_AI_CACHE) keyed by MD5 hash of inputs prevents redundant API calls for deterministic tasks such as persona analysis. Results are reused for identical inputs within the same session.
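
The two helpers above might look roughly like this. The names `truncate_to_tokens(text, max_tokens)` and `_AI_CACHE` and the use of MD5 come from this README. The rest is assumed for illustration: the four-characters-per-token estimate, keeping the start of the text when truncating, the `cached_call` helper, and the exact inputs that go into the hash.

```python
import hashlib
from typing import Callable, Dict

CHARS_PER_TOKEN = 4  # rough heuristic, assumed; real tokenizers vary

_AI_CACHE: Dict[str, str] = {}


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Cap text at roughly max_tokens using a character-count estimate.
    This sketch keeps the start of the text; that choice is an assumption."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    return text if len(text) <= max_chars else text[:max_chars]


def cached_call(prompt: str, model_name: str, generate: Callable[[str], str]) -> str:
    """Return a cached result for identical (model, prompt) inputs,
    calling generate(prompt) only on a cache miss."""
    key = hashlib.md5(f"{model_name}\n{prompt}".encode("utf-8")).hexdigest()
    if key not in _AI_CACHE:
        _AI_CACHE[key] = generate(prompt)
    return _AI_CACHE[key]
```

Because the cache lives in process memory, it is emptied on restart, which matches the "within the same session" behaviour described above.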

Cost Tracking

Every AI call logs input/output token counts and estimated USD cost (using cached pricing per model). Cumulative project cost is stored in the database and displayed per user and per run.
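
As a rough illustration of the arithmetic, the estimate for one call is the token counts multiplied by per-model prices. The model name, the prices, and the log field names below are placeholders invented for this sketch; they are not the project's pricing table or the actual layout of usage_log.json.

```python
from typing import Dict, List

# Hypothetical prices in USD per one million tokens.
PRICING_PER_MILLION = {
    "example-model": {"input": 1.00, "output": 4.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one call from its token counts."""
    price = PRICING_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def log_call(usage_log: List[Dict], model: str, input_tokens: int, output_tokens: int) -> Dict:
    """Append one usage entry (field names are assumptions) and return it."""
    entry = {
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd": estimate_cost(model, input_tokens, output_tokens),
    }
    usage_log.append(entry)
    return entry


def total_cost(usage_log: List[Dict]) -> float:
    """Cumulative cost across all logged calls."""
    return sum(entry["cost_usd"] for entry in usage_log)
```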

Project Structure

BookApp/
├── ai/                  # Gemini/Vertex AI authentication and resilient model wrapper
│   ├── models.py        # ResilientModel class with retry logic
│   └── setup.py         # Model initialization and auto-selection
├── cli/                 # Terminal interface and generation orchestrator
│   ├── engine.py        # Full generation pipeline (plan → write → export)
│   └── wizard.py        # Interactive menu-driven setup wizard
├── core/                # Central configuration and shared utilities
│   ├── config.py        # Environment variable loading, presets, AI safety settings
│   └── utils.py         # Logging, JSON cleaning, usage tracking, filename utils
├── export/              # Manuscript compilation
│   └── exporter.py      # EPUB and DOCX generation
├── marketing/           # Post-generation asset creation
│   ├── assets.py        # Orchestrates cover + blurb creation
│   ├── blurb.py         # Back-cover marketing copy generation
│   ├── cover.py         # Cover art generation and iterative refinement
│   └── fonts.py         # Google Fonts downloader/cache
├── story/               # Core creative AI pipeline
│   ├── bible_tracker.py # Character state and plot event tracking
│   ├── editor.py        # Chapter quality scoring and AI-ism detection
│   ├── planner.py       # Story structure and chapter plan generation
│   ├── style_persona.py # Author persona creation and refinement
│   └── writer.py        # Chapter-by-chapter writing with persona/context injection
├── templates/           # Jinja2 HTML templates for the web application
├── web/                 # Flask web application
│   ├── app.py           # App factory, blueprint registration, admin auto-creation
│   ├── db.py            # SQLAlchemy models: User, Project, Run, LogEntry
│   ├── helpers.py       # admin_required decorator, project lock check, CSRF utils
│   ├── tasks.py         # Huey background task queue (generate, rewrite, regenerate)
│   ├── requirements_web.txt
│   └── routes/
│       ├── admin.py     # User management and factory reset
│       ├── auth.py      # Login, register, session management
│       ├── persona.py   # Author persona CRUD and sample file upload
│       ├── project.py   # Project creation wizard and job queuing
│       └── run.py       # Run status, logs, downloads, chapter editing, cover regen
├── docker-compose.yml
├── Dockerfile
├── requirements.txt     # Core AI/generation dependencies
└── README.md

Environment Variables

All variables are loaded from a .env file in the project root (never commit this file).

| Variable | Required | Description |
| --- | --- | --- |
| `GEMINI_API_KEY` | Yes | Google Gemini API key |
| `FLASK_SECRET_KEY` | No | Session encryption key (default: insecure dev value; change in production) |
| `ADMIN_USERNAME` | No | Auto-creates an admin account on startup |
| `ADMIN_PASSWORD` | No | Password for the auto-created admin account |
| `GCP_PROJECT` | No | Google Cloud Project ID (enables Vertex AI / Imagen) |
| `GCP_LOCATION` | No | GCP region (default: `us-central1`) |
| `GOOGLE_APPLICATION_CREDENTIALS` | No | Path to OAuth2 credentials JSON for Vertex AI |
| `MODEL_LOGIC` | No | Override the reasoning model |
| `MODEL_WRITER` | No | Override the creative writing model |
| `MODEL_ARTIST` | No | Override the visual-prompt model |
| `MODEL_IMAGE` | No | Override the image generation model |
| `FLASK_DEBUG` | No | Enable Flask debug mode (`True`/`False`) |
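
A minimal sketch of how these variables could be read, assuming they are already present in the process environment. This is not the contents of core/config.py: `load_settings` and the returned key names are hypothetical, and treating an unset FLASK_DEBUG as off is an assumption. The variable names and the `us-central1` default come from the table above.

```python
import os
from typing import Any, Dict, Mapping, Optional

MODEL_ROLES = ("LOGIC", "WRITER", "ARTIST", "IMAGE")


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Read settings from env (defaults to os.environ).
    Raises if the one required variable is missing."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is required")
    return {
        "gemini_api_key": api_key,
        "gcp_project": env.get("GCP_PROJECT"),
        "gcp_location": env.get("GCP_LOCATION", "us-central1"),
        "flask_debug": env.get("FLASK_DEBUG", "False").strip().lower() == "true",
        # Only roles with an explicit MODEL_* override appear here.
        "model_overrides": {
            role: env[f"MODEL_{role}"]
            for role in MODEL_ROLES
            if env.get(f"MODEL_{role}")
        },
    }
```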

Length Presets

The Length setting controls structural complexity, not just word count. It determines the narrative framework, chapter count, and the number of depth-expansion passes the planner performs.

| Preset | Approx. Words | Chapters | Depth | Description |
| --- | --- | --- | --- | --- |
| Flash Fiction | 500 – 1.5k | 1 | 1 | A single scene or moment. |
| Short Story | 5k – 10k | 5 | 1 | One conflict, few characters. |
| Novella | 20k – 40k | 15 | 2 | Developed plot, A & B stories. |
| Novel | 60k – 80k | 30 | 3 | Deep subplots, slower pacing. |
| Epic | 100k+ | 50 | 4 | Massive scope, world-building focus. |
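
For readers who think in code, the table can be pictured as a lookup structure. The values are copied from the table; the `LengthPreset` type, the `LENGTH_PRESETS` name, and the `get_preset` helper are hypothetical and may not match how core/config.py stores presets.

```python
from typing import NamedTuple


class LengthPreset(NamedTuple):
    approx_words: str  # human-readable range, as in the table
    chapters: int      # chapter count
    depth: int         # number of depth-expansion passes


LENGTH_PRESETS = {
    "Flash Fiction": LengthPreset("500 - 1.5k", 1, 1),
    "Short Story": LengthPreset("5k - 10k", 5, 1),
    "Novella": LengthPreset("20k - 40k", 15, 2),
    "Novel": LengthPreset("60k - 80k", 30, 3),
    "Epic": LengthPreset("100k+", 50, 4),
}


def get_preset(name: str) -> LengthPreset:
    """Look up a preset by name, failing with a readable message."""
    try:
        return LENGTH_PRESETS[name]
    except KeyError:
        valid = ", ".join(LENGTH_PRESETS)
        raise ValueError(f"Unknown preset {name!r}; expected one of: {valid}") from None
```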

Note: This engine is designed for linear fiction. Branching narratives ("Choose Your Own Adventure") are not currently supported.

Data Structure & File Dictionary

All data is stored in data/, making backup and migration simple.

Folder Hierarchy

data/
├── users/
│   └── {user_id}/
│       └── {Project_Name}/
│           ├── bible.json              # Project source of truth
│           └── runs/
│               └── run_{id}/
│                   ├── web_console.log
│                   └── Book_{N}_{Title}/
│                       ├── manuscript.json
│                       ├── tracking_events.json
│                       ├── tracking_characters.json
│                       ├── chapters.json
│                       ├── events.json
│                       ├── final_blueprint.json
│                       ├── usage_log.json
│                       ├── cover_art_prompt.txt
│                       ├── {Title}.epub
│                       └── {Title}.docx
├── personas/
│   └── personas.json
├── fonts/                              # Cached Google Fonts
└── style_guidelines.json              # Global AI writing rules

File Dictionary

| File | Scope | Description |
| --- | --- | --- |
| `bible.json` | Project | Master plan: series title, author metadata, character list, and high-level plot outline for every book. |
| `manuscript.json` | Book | Every written chapter in order. Used to resume generation if interrupted. |
| `events.json` | Book | Structural outline (e.g., Hero's Journey beats) produced by the planner. |
| `chapters.json` | Book | Detailed writing plan: title, POV character, pacing, estimated word count per chapter. |
| `tracking_events.json` | Book | Cumulative plot summary and chronological event log for continuity. |
| `tracking_characters.json` | Book | Current state of every character (appearance, clothing, location, injuries, speech patterns). |
| `final_blueprint.json` | Book | Post-generation metadata snapshot: captures new characters and plot points invented during writing. |
| `usage_log.json` | Book | AI token counts and estimated USD cost per call, per book. |
| `cover_art_prompt.txt` | Book | Exact prompt submitted to Imagen / Vertex AI for cover generation. |
| `{Title}.epub` | Book | Compiled eBook, ready for Kindle / Apple Books. |
| `{Title}.docx` | Book | Compiled Word document for manual editing. |

JSON Data Schemas

`bible.json`

{
  "project_metadata": {
    "title": "Series Title",
    "author": "Author Name",
    "genre": "Sci-Fi",
    "is_series": true,
    "style": {
      "tone": "Dark",
      "pov_style": "Third Person Limited"
    }
  },
  "characters": [
    {
      "name": "Jane Doe",
      "role": "Protagonist",
      "description": "Physical and personality details..."
    }
  ],
  "books": [
    {
      "book_number": 1,
      "title": "Book One Title",
      "manual_instruction": "High-level plot summary...",
      "plot_beats": ["Beat 1", "Beat 2"]
    }
  ]
}
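
A quick sanity check against this schema can catch an incomplete bible before a run starts. The sketch below is illustrative only: `missing_metadata_fields` and `books_without_beats` are hypothetical helpers, and treating `title` and `genre` as the required fields is an assumption.

```python
from typing import Dict, List, Sequence


def missing_metadata_fields(
    bible: Dict, required: Sequence[str] = ("title", "genre")
) -> List[str]:
    """Return the required project_metadata fields that are absent or empty."""
    metadata = bible.get("project_metadata", {})
    return [field for field in required if not metadata.get(field)]


def books_without_beats(bible: Dict) -> List[int]:
    """Return the book_number of every book with no plot_beats."""
    return [
        book.get("book_number")
        for book in bible.get("books", [])
        if not book.get("plot_beats")
    ]
```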

`manuscript.json`

[
  {
    "num": 1,
    "title": "Chapter Title",
    "pov_character": "Jane Doe",
    "content": "# Chapter 1\n\nThe raw markdown text of the chapter..."
  }
]
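
Since manuscript.json is what allows an interrupted run to resume, the resume point can be derived from the highest `num` already written. This is a sketch of the idea, not the engine's actual resume logic; `next_chapter_number` is a hypothetical helper.

```python
import json
from pathlib import Path


def next_chapter_number(manuscript_path: Path) -> int:
    """Return the chapter number to write next.
    A missing or empty manuscript means starting from chapter 1."""
    if not manuscript_path.exists():
        return 1
    chapters = json.loads(manuscript_path.read_text(encoding="utf-8"))
    return max((chapter["num"] for chapter in chapters), default=0) + 1
```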

`tracking_characters.json`

{
  "Jane Doe": {
    "descriptors": ["Blue eyes", "Tall"],
    "likes_dislikes": ["Loves coffee"],
    "last_worn": "Red dress (Ch 4)",
    "major_events": ["Injured leg in Ch 2"],
    "current_location": "The King's Throne Room",
    "time_of_day": "Late afternoon",
    "held_items": ["Iron sword", "Stolen ledger"]
  }
}
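
The scene-state fields (`current_location`, `time_of_day`, `held_items`) are what get injected into later chapter prompts. One plausible way to turn this file into prompt text is sketched below; the `scene_state_lines` helper and the sentence format are assumptions, not the project's actual prompt template.

```python
from typing import Dict, List


def scene_state_lines(tracking: Dict[str, Dict]) -> List[str]:
    """Build one plain-text line per character from the tracked scene state."""
    lines = []
    for name, state in tracking.items():
        location = state.get("current_location", "unknown location")
        time_of_day = state.get("time_of_day", "unknown time")
        items = ", ".join(state.get("held_items", [])) or "nothing"
        lines.append(f"{name}: at {location}, {time_of_day}, carrying {items}")
    return lines
```

Applied to the example above, this yields a single line placing Jane Doe in The King's Throne Room in the late afternoon with the iron sword and the stolen ledger.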
