feat: Improve book quality — stronger evaluator, more refinement attempts, quality-first model selection
- Fix: chapter quality evaluation now uses model_logic (free Pro) instead of model_writer (Flash). The model that wrote the chapter was also scoring it, causing circular, lenient grading.
- Increase max_attempts in write_chapter from 2 to 3 for more refinement passes per chapter.
- Update auto model selection prompt (ai/setup.py) to prioritize quality over budget framing: free/preview/exp models preferred by capability (Pro > Flash, 2.5 > 2.0 > 1.5), not just cost. Writer role now allowed to use best free Flash/Pro preview, not restricted to basic Flash only.
- Bump version to 3.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
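
The evaluator change described above can be illustrated with a minimal sketch. It is not the code from this commit: the three callables are hypothetical stand-ins for calls to model_writer and model_logic, `min_score` is an assumed pass threshold, and counting the first draft as attempt 1 is also an assumption.

```python
from typing import Callable, Tuple


def write_chapter(
    draft: Callable[[str], str],          # hypothetical: drafting call to model_writer
    evaluate: Callable[[str], float],     # hypothetical: scoring call to model_logic
    refine: Callable[[str, float], str],  # hypothetical: refinement call to model_writer
    brief: str,
    min_score: float = 7.0,               # assumed threshold, not taken from the commit
    max_attempts: int = 3,                # the commit raises this from 2 to 3
) -> Tuple[str, float, int]:
    """Draft a chapter, then refine until a separate evaluator accepts it."""
    text = draft(brief)
    score = evaluate(text)  # scored by a different model than the one that wrote it
    attempts = 1
    while score < min_score and attempts < max_attempts:
        text = refine(text, score)
        score = evaluate(text)
        attempts += 1
    return text, score, attempts
```

Keeping `evaluate` separate from `draft` and `refine` is the point of the fix: the model that writes a chapter never grades its own output.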
26
ai/setup.py
@@ -76,14 +76,14 @@ def select_best_models(force_refresh=False):
 prompt = f"""
 ROLE: AI Model Architect
 TASK: Select the optimal Gemini models for a book-writing application.
-PRIMARY OBJECTIVE: Keep total book generation cost under $2.00. Quality is secondary to this budget.
+PRIMARY OBJECTIVE: Maximize book quality. Free/preview/exp models are $0.00 — use the BEST quality free model available for every role. Only fall back to paid Flash when no free alternative exists, and only if it fits within the budget cap.

 AVAILABLE_MODELS:
 {json.dumps(compatible)}

 PRICING_CONTEXT (USD per 1M tokens — use these to calculate actual book cost):
 - FREE TIER: Any model with 'exp', 'beta', or 'preview' in name = $0.00. Always prefer these.
-e.g. gemini-2.0-pro-exp = FREE, gemini-2.5-pro-preview = FREE.
+e.g. gemini-2.0-pro-exp = FREE, gemini-2.5-pro-preview = FREE, gemini-2.5-flash-preview = FREE.
 - gemini-2.5-flash / gemini-2.5-flash-preview: ~$0.075 Input / $0.30 Output.
 - gemini-2.0-flash: ~$0.10 Input / $0.40 Output.
 - gemini-1.5-flash: ~$0.075 Input / $0.30 Output.
@@ -92,9 +92,9 @@ def select_best_models(force_refresh=False):

 BOOK TOKEN BUDGET (30-chapter novel — use this to calculate real cost before deciding):
 Logic role total: ~265,000 input tokens + ~55,000 output tokens
-(planning, state tracking, consistency checks, director treatments per chapter)
+(planning, state tracking, consistency checks, director treatments, chapter evaluation per chapter)
 Writer role total: ~450,000 input tokens + ~135,000 output tokens
-(drafting, evaluation, refinement per chapter — 2 passes max)
+(drafting, refinement per chapter — 3 passes max)
 Artist role total: ~30,000 input tokens + ~8,000 output tokens
 (cover art prompt design, cover layout, blurb, image quality evaluation — text calls only)

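
The prompt asks the model to compute the real book cost from the token budget and the per-1M-token prices. A rough sketch of that arithmetic follows, using only the figures quoted in the prompt. The function names are hypothetical, and treating any name containing 'exp', 'beta', or 'preview' as free (even where the price list also quotes a paid rate) is an assumption taken from the FREE TIER line.

```python
# (input_tokens, output_tokens) per role, from the BOOK TOKEN BUDGET above.
TOKEN_BUDGET = {
    "logic": (265_000, 55_000),
    "writer": (450_000, 135_000),
    "artist": (30_000, 8_000),
}

# (input_price, output_price) in USD per 1M tokens, from PRICING_CONTEXT.
PRICES = {
    "gemini-2.5-flash": (0.075, 0.30),
    "gemini-2.0-flash": (0.10, 0.40),
    "gemini-1.5-flash": (0.075, 0.30),
}

FREE_MARKERS = ("exp", "beta", "preview")


def price_per_million(model: str) -> tuple:
    """Return (input, output) USD per 1M tokens; free-tier names cost nothing."""
    if any(marker in model for marker in FREE_MARKERS):
        return (0.0, 0.0)  # assumption: the FREE TIER rule overrides the price list
    return PRICES[model]


def role_cost(role: str, model: str) -> float:
    """Cost in USD of running one role on one model for a whole book."""
    tokens_in, tokens_out = TOKEN_BUDGET[role]
    price_in, price_out = price_per_million(model)
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000


def book_cost(assignment: dict) -> float:
    """Total text-generation cost in USD for a role -> model assignment."""
    return sum(role_cost(role, model) for role, model in assignment.items())


all_paid = {role: "gemini-2.0-flash" for role in TOKEN_BUDGET}
print(round(book_cost(all_paid), 4))  # about 0.1537 USD
```

With these figures, even an all-paid gemini-2.0-flash assignment comes to roughly $0.15 of text generation for the whole book.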
@@ -107,19 +107,23 @@ def select_best_models(force_refresh=False):
 (leaving $0.15 headroom for Imagen cover generation, total book target: $2.00).

 SELECTION RULES (apply in order):
-1. FREE FIRST: If a free/exp model exists (any tier, any quality), pick it for Logic. Cost = $0.
-2. FLASH FOR WRITER: Flash is sufficient for fiction prose. Never pick a paid Pro for Writer.
+1. FREE/PREVIEW ALWAYS WINS: Always pick the highest-quality free/exp/preview model for each role.
+Free models cost $0 regardless of tier — a free Pro beats a paid Flash every time.
+2. QUALITY FOR WRITER: The Writer role produces all fiction prose. Prefer the best free Flash or
+free Pro variant available. If no free model exists for Writer, use the cheapest paid Flash
+that keeps the total budget under $1.85. Never use a paid stable Pro for Writer.
 3. CALCULATE: For non-free models, compute the actual book cost using the token budget above.
 Reject any combination that exceeds $2.00 total.
-4. QUALITY TIEBREAK: Among models with similar cost, prefer newer generation (2.x > 1.5).
+4. QUALITY TIEBREAK: Among models with identical cost (e.g. both free), prefer the highest
+generation and capability: Pro > Flash, 2.5 > 2.0 > 1.5, stable > exp only if cost equal.
 5. NO THINKING MODELS: Too slow and expensive for any role.

 ROLES:
-- LOGIC: Planning, JSON adherence, plot consistency. Free/exp Pro ideal; Flash acceptable.
-- WRITER: Creative prose, chapter drafting. Flash 2.x is sufficient — do NOT use paid Pro.
-- ARTIST: Visual prompts for cover art. Cheapest capable Flash model.
+- LOGIC: Planning, JSON adherence, plot consistency, AND chapter quality evaluation. Best free/exp Pro is ideal; free Flash preview acceptable if no free Pro exists.
+- WRITER: Creative prose, chapter drafting and refinement. Best available free Flash or free Pro variant. Never use a paid stable Pro.
+- ARTIST: Visual prompts for cover art. Cheapest capable Flash model (free preferred).
 - PRO_REWRITE: Emergency full-chapter rewrite (rare, ~1-2x per book). Best free/exp Pro available.
-If no free Pro exists, use best Flash — do not use paid Pro even here.
+If no free Pro exists, use best free Flash preview — do not use paid models here.

 OUTPUT_FORMAT (JSON only, no markdown):
 {{
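
The selection rules are written for the model to apply, but the free-first and tiebreak ordering can also be sketched deterministically. The following is only an illustration under stated assumptions: `rank_key` and `pick_best` are hypothetical names, model capabilities are inferred purely from the model name, and ranking Pro vs. Flash ahead of generation is one reading of rule 4. Budget checks (rule 3) and the per-role restrictions are left out.

```python
import re
from typing import List, Optional

FREE_MARKERS = ("exp", "beta", "preview")


def rank_key(model: str) -> tuple:
    """Sort key: free before paid, then Pro before Flash, then newer generation."""
    is_free = any(marker in model for marker in FREE_MARKERS)
    is_pro = "pro" in model.split("-")  # token match, so "preview" never counts as Pro
    match = re.search(r"(\d+\.\d+)", model)
    generation = float(match.group(1)) if match else 0.0
    return (is_free, is_pro, generation)


def pick_best(models: List[str]) -> Optional[str]:
    """Return the top-ranked model, skipping thinking models (rule 5)."""
    candidates = [m for m in models if "thinking" not in m]
    return max(candidates, key=rank_key, default=None)


available = [
    "gemini-2.0-flash",
    "gemini-2.5-flash-preview",
    "gemini-2.0-pro-exp",
    "gemini-2.0-flash-thinking-exp",
]
print(pick_best(available))  # gemini-2.0-pro-exp
```

Because the key puts `is_free` first, a free Pro always outranks a paid Flash, which matches rule 1. Among free candidates, Pro is preferred over Flash before generation is compared.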