feat: Improve revision pipeline quality — 6 targeted enhancements (v3.1)
1. editor.py — Fix rewrite_chapter_content to use model_writer (was model_logic). Chapter rewrites now use the creative writing model, not the cheaper analysis model.
2. editor.py — evaluate_chapter_quality now uses keep_head=True so the evaluator sees the chapter opening (engagement hook, sensory anchoring) as well as the ending; long chapters are no longer scored on the tail only.
3. editor.py — Consistency-analysis sampling upgraded to head+middle+tail (was head+tail), giving the LLM a complete view of each chapter's events.
4. writer.py — max_attempts is now adaptive: climax/resolution chapters (position >= 0.75) receive 3 refinement attempts; others keep 2.
5. writer.py — Polish-skip threshold tightened from 0.012 to 0.008 (1 filter word per 125 words vs. 1 per 83 words), so more borderline drafts are cleaned.
6. style_persona.py — Persona validation sample increased from 200 to 400 words for more reliable voice-quality assessment.

Version bumped: 3.0 → 3.1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
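The threshold arithmetic in change #5 can be sketched standalone. This is an illustrative sketch, not the engine's code: `FILTER_WORDS` is a hypothetical stand-in for writer.py's `_fw_set`, and the function names are invented for the example.

```python
# Sketch of the polish-skip decision: a draft skips the two-pass polish
# only if its filter-word density is below the threshold.
# FILTER_WORDS is a stand-in for the engine's real _fw_set.
FILTER_WORDS = {"felt", "saw", "heard", "realized", "noticed"}

def filter_word_density(text: str) -> float:
    """Fraction of words that are weak filter verbs."""
    words = text.lower().split()
    hits = sum(1 for w in words if w in FILTER_WORDS)
    return hits / max(len(words), 1)

def should_skip_polish(text: str, threshold: float = 0.008) -> bool:
    # New threshold 0.008 ≈ 1 filter word per 125 words;
    # the old 0.012 allowed ≈ 1 per 83 words before polishing.
    return filter_word_density(text) < threshold

# A 100-word draft with one filter word has density 0.01:
draft = " ".join(["word"] * 99 + ["felt"])
should_skip_polish(draft, threshold=0.012)  # True  — old threshold skips polish
should_skip_polish(draft, threshold=0.008)  # False — new threshold polishes it
```

A draft in the 0.008–0.012 density band is exactly the "borderline" case the commit describes: previously skipped, now cleaned.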
README.md
@@ -104,6 +104,15 @@ Open `http://localhost:5000`.
 - **Admin Panel:** Manage all users, view spend, and perform factory resets at `/admin`.
 - **Per-User API Keys:** Each user can supply their own Gemini API key; costs are tracked per account.
+
+### Cost-Effective by Design
+
+This engine was built with the goal of producing high-quality fiction at the lowest possible cost. This is achieved through several architectural optimizations:
+
+* **Tiered AI Models**: The system uses cheaper, faster models (like Gemini Pro) for structural and analytical tasks—planning the plot, scoring chapter quality, and ensuring consistency. The more powerful and expensive creative models are reserved for the actual writing process.
+* **Intelligent Context Management**: To minimize the number of tokens sent to the AI, the system is very selective about the data it includes in each request. For example, when writing a chapter, it only injects data for the characters who are currently in the scene, rather than the entire cast.
+* **Adaptive Workflows**: The engine avoids unnecessary work. If a user provides a detailed outline for a chapter, the system skips the AI step that would normally expand on a basic idea, saving both time and money. It also adjusts its quality standards based on the chapter's importance, spending more effort on a climactic scene than on a simple transition.
+* **Caching**: The system caches the results of deterministic AI tasks. If it needs to perform the same analysis twice, it reuses the original result instead of making a new API call.
 
 ### CLI Wizard (`cli/`)
 - **Interactive Setup:** Menu-driven interface (via Rich) for creating projects, managing personas, and defining characters and plot beats.
 - **Smart Resume:** Detects in-progress runs via lock files and prompts to resume.
@@ -118,8 +127,13 @@ Open `http://localhost:5000`.
 - **Persona Cache:** The author persona (including writing sample files) is loaded once at the start of the writing phase and reused for every chapter, eliminating redundant file I/O. The cache is refreshed whenever the persona is refined.
 - **Outline Validation Gate (`planner.py`):** Before the writing phase begins, a Logic-model pass checks the chapter plan for missing required beats, character continuity issues, pacing imbalances, and POV logic errors. Issues are logged as warnings so the writer can review them before generation begins.
 - **Adaptive Scoring Thresholds (`writer.py`):** Quality passing thresholds scale with chapter position — setup chapters use a lower bar (6.5) to avoid over-spending refinement tokens on early exposition, while climax chapters use a stricter bar (7.5) to ensure the most important scenes receive maximum effort.
+- **Adaptive Refinement Attempts (`writer.py`):** Climax and resolution chapters (position ≥ 75% through the book) receive up to 3 refinement attempts; earlier chapters keep 2. This concentrates quality effort on the scenes readers remember most.
+- **Stricter Polish Pass (`writer.py`):** The filter-word threshold for skipping the two-pass polish has been tightened from 1-per-83-words to 1-per-125-words, so more borderline drafts are cleaned before evaluation.
 - **Smart Beat Expansion Skip (`writer.py`):** If a chapter's scene beats are already detailed (>100 words total), the Director's Treatment expansion step is skipped, saving ~5K tokens per chapter.
-- **Consistency Checker (`editor.py`):** Scores chapters on 13 rubrics (engagement, voice, sensory detail, scene execution, dialogue, pacing, staging, prose dynamics, clarity, etc.) and flags AI-isms ("tapestry", "palpable tension") and weak filter verbs ("felt", "realized"). Chapter evaluation now uses the Logic model (free Pro) rather than the Writer model, ensuring stricter and more accurate scoring.
+- **Consistency Checker (`editor.py`):** Scores chapters on 13 rubrics (engagement, voice, sensory detail, scene execution, dialogue, pacing, staging, prose dynamics, clarity, etc.) and flags AI-isms ("tapestry", "palpable tension") and weak filter verbs ("felt", "realized"). Chapter evaluation now uses head+tail sampling (`keep_head=True`) ensuring the evaluator sees the chapter opening (hooks, sensory anchoring) as well as the ending — long chapters no longer receive scores based only on their tail.
+- **Rewrite Model Upgrade (`editor.py`):** Manual chapter rewrites and user-triggered edits now use `model_writer` (the creative writing model) instead of `model_logic`, producing significantly better prose quality on rewritten content.
+- **Improved Consistency Sampling (`editor.py`):** The mid-generation consistency analysis now samples head + middle + tail of each chapter (instead of head + tail only), giving the continuity LLM a complete picture of each chapter's events for more accurate contradiction detection.
+- **Larger Persona Validation Sample (`style_persona.py`):** The persona validation test passage has been increased from 200 words to 400 words, giving the scorer enough material to reliably assess sentence rhythm, filter-word habits, and deep POV quality before accepting a persona.
 - **Dynamic Character Injection (`writer.py`):** Only injects characters explicitly named in the chapter's `scene_beats` plus the POV character into the writer prompt. Eliminates token waste from unused characters and reduces hallucinated appearances.
 - **Smart Context Tail (`writer.py`):** Extracts the final ~1,000 tokens of the previous chapter (the actual ending) rather than blindly truncating from the front. Ensures the hand-off point — where characters are standing and what was last said — is always preserved.
 - **Stateful Scene Tracking (`bible_tracker.py`):** After each chapter, the tracker records each character's `current_location`, `time_of_day`, and `held_items` in addition to appearance and events. This scene state is injected into subsequent chapter prompts so the writer knows exactly where characters are, what time it is, and what they're carrying.
@@ -66,4 +66,4 @@ LENGTH_DEFINITIONS = {
 }
 
 # --- SYSTEM ---
-VERSION = "3.0"
+VERSION = "3.1"
@@ -67,7 +67,7 @@ def evaluate_chapter_quality(text, chapter_title, genre, model, folder, series_c
    }}
    """
    try:
-        response = model.generate_content([prompt, utils.truncate_to_tokens(text, 7500)])
+        response = model.generate_content([prompt, utils.truncate_to_tokens(text, 7500, keep_head=True)])
        model_name = getattr(model, 'name', ai_models.logic_model_name)
        utils.log_usage(folder, model_name, response.usage_metadata)
        data = json.loads(utils.clean_json(response.text))
@@ -129,7 +129,13 @@ def analyze_consistency(bp, manuscript, folder):
    chapter_summaries = []
    for ch in manuscript:
        text = ch.get('content', '')
-        excerpt = text[:1000] + "\n...\n" + text[-1000:] if len(text) > 2000 else text
+        if len(text) > 3000:
+            mid = len(text) // 2
+            excerpt = text[:800] + "\n...\n" + text[mid - 200:mid + 200] + "\n...\n" + text[-800:]
+        elif len(text) > 1600:
+            excerpt = text[:800] + "\n...\n" + text[-800:]
+        else:
+            excerpt = text
        chapter_summaries.append(f"Ch {ch.get('num')}: {excerpt}")
 
    context = "\n".join(chapter_summaries)
@@ -236,8 +242,8 @@ def rewrite_chapter_content(bp, manuscript, chapter_num, instruction, folder):
    """
 
    try:
-        response = ai_models.model_logic.generate_content(prompt)
-        utils.log_usage(folder, ai_models.model_logic.name, response.usage_metadata)
+        response = ai_models.model_writer.generate_content(prompt)
+        utils.log_usage(folder, ai_models.model_writer.name, response.usage_metadata)
        try:
            data = json.loads(utils.clean_json(response.text))
            return data.get('content'), data.get('summary')
@@ -121,7 +121,7 @@ def validate_persona(bp, persona_details, folder):
 
    sample_prompt = f"""
 ROLE: Fiction Writer
-TASK: Write a 200-word opening scene that perfectly demonstrates this author's voice.
+TASK: Write a 400-word opening scene that perfectly demonstrates this author's voice.
 
 AUTHOR_PERSONA:
 Name: {name}
@@ -131,7 +131,7 @@ def validate_persona(bp, persona_details, folder):
 TONE: {tone}
 
 RULES:
-- Exactly ~200 words of prose (no chapter header, no commentary)
+- Exactly ~400 words of prose (no chapter header, no commentary)
 - Must reflect the persona's stated sentence structure, vocabulary, and voice
 - Show, don't tell — no filter words (felt, saw, heard, realized, noticed)
 - Deep POV: immerse the reader in a character's immediate experience
@@ -380,7 +380,7 @@ def write_chapter(chap, bp, folder, prev_sum, tracking=None, prev_content=None,
    _draft_word_list = current_text.lower().split() if current_text else []
    _fw_hit_count = sum(1 for w in _draft_word_list if w in _fw_set)
    _fw_density = _fw_hit_count / max(len(_draft_word_list), 1)
-    _skip_polish = _fw_density < 0.012  # < ~1 filter word per 83 words → draft already clean
+    _skip_polish = _fw_density < 0.008  # < ~1 filter word per 125 words → draft already clean
 
    if current_text and not _skip_polish:
        utils.log("WRITER", f" -> Two-pass polish (Pro model, FW density {_fw_density:.3f})...")
@@ -427,7 +427,11 @@ def write_chapter(chap, bp, folder, prev_sum, tracking=None, prev_content=None,
    elif current_text:
        utils.log("WRITER", f" -> Draft clean (FW density {_fw_density:.3f}). Skipping polish pass.")
 
-    # Reduced from 3 → 2 attempts since polish pass already refines prose before evaluation
-    max_attempts = 2
+    # Adaptive attempts: climax/resolution chapters (position >= 0.75) get 3 passes;
+    # earlier chapters keep 2 (polish pass already refines prose before evaluation).
+    if chapter_position is not None and chapter_position >= 0.75:
+        max_attempts = 3
+    else:
+        max_attempts = 2
    SCORE_AUTO_ACCEPT = 8
    # Adaptive passing threshold: lenient for early setup chapters, strict for climax/resolution.