bookapp

thethreemagi/bookapp

Commit Graph

Author	SHA1	Message	Date

Mike Wichers

f869700070

feat: Add evaluation report pipeline for prompt tuning feedback

Adds a full per-chapter evaluation logging system that captures every
score, critique, and quality decision made during writing, then renders
a self-contained HTML report shareable with critics or prompt engineers.

New file — story/eval_logger.py:
- append_eval_entry(folder, entry): writes per-chapter eval data to
  eval_log.json in the book folder (called from write_chapter() at
  every return point).
- generate_html_report(folder, bp): reads eval_log.json and produces a
  self-contained HTML file (no external deps) with:
    • Summary cards (avg score, auto-accepted, rewrites, below-threshold)
    • Score timeline bar chart (one bar per chapter, colour-coded)
    • Score distribution histogram
    • Chapter breakdown table with expand-on-click critique details
      (attempt number, score, decision badge, full critique text)
    • Critique pattern frequency table (keyword mining across all critiques)
    • Auto-generated prompt tuning observations (systemic issues, POV
      character weak spots, pacing type analysis, climax vs. early
      chapter comparison)
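
The logging and keyword-mining steps listed above could be sketched roughly as follows. This is a minimal illustration, not the code in story/eval_logger.py: only the names append_eval_entry and eval_log.json come from the commit message. The entry layout (an "attempts" list holding "critique" strings), the temp-file write, and the helper critique_keyword_counts are assumptions made for the example.

```python
import json
import os
from collections import Counter

EVAL_LOG_NAME = "eval_log.json"  # file name taken from the commit message


def append_eval_entry(folder, entry):
    """Append one chapter's eval record to eval_log.json in the book folder."""
    path = os.path.join(folder, EVAL_LOG_NAME)
    entries = []
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as fh:
            entries = json.load(fh)
    entries.append(entry)
    # Assumption: write to a temp file and swap it in, so an interrupted
    # write cannot leave a truncated log behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(entries, fh, indent=2)
    os.replace(tmp_path, path)


def critique_keyword_counts(entries, keywords):
    """Hypothetical helper: count critiques that mention each keyword."""
    counts = Counter()
    for entry in entries:
        # Assumed layout: each entry holds a list of attempts with a critique.
        for attempt in entry.get("attempts", []):
            text = attempt.get("critique", "").lower()
            for keyword in keywords:
                if keyword.lower() in text:
                    counts[keyword] += 1
    return counts
```

Keeping the log as a single JSON list means the report generator can load everything with one json.load call, at the cost of rewriting the file on every append. For book-length logs that trade-off is probably acceptable.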

story/writer.py:
- Imports time and eval_logger.
- Initialises _eval_entry dict (chapter metadata + polish flags + thresholds)
  after all threshold variables are set.
- Records each evaluation attempt's score, critique (truncated to 700 chars),
  and decision (auto_accepted / full_rewrite / refinement / accepted /
  below_threshold / eval_error / refinement_failed) before every return.
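
As a rough sketch of the recording step described above: the decision labels and the 700-character truncation come from the commit message, while the function names new_eval_entry and record_attempt and the dict keys are hypothetical and may differ from what write_chapter() actually uses.

```python
import time

MAX_CRITIQUE_CHARS = 700  # truncation length stated in the commit message

# Decision labels listed in the commit message.
DECISIONS = {
    "auto_accepted", "full_rewrite", "refinement", "accepted",
    "below_threshold", "eval_error", "refinement_failed",
}


def new_eval_entry(chapter_num, title, thresholds):
    """Hypothetical initialiser: chapter metadata plus the active thresholds."""
    return {
        "chapter": chapter_num,
        "title": title,
        "thresholds": dict(thresholds),
        "started_at": time.time(),
        "attempts": [],
    }


def record_attempt(eval_entry, score, critique, decision):
    """Append one evaluation attempt; the critique is cut to 700 characters."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    eval_entry["attempts"].append({
        "attempt": len(eval_entry["attempts"]) + 1,  # 1-based attempt number
        "score": score,
        "critique": (critique or "")[:MAX_CRITIQUE_CHARS],
        "decision": decision,
    })
    return eval_entry
```

Validating the decision label at record time keeps typos out of the log, so the report's decision badges only ever see the seven known values.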

web/routes/run.py:
- Imports story_eval_logger.
- New route GET /project/<run_id>/eval_report/<book_folder>: loads
  eval_log.json, calls generate_html_report(), returns the HTML as a
  downloadable attachment named eval_report_<title>.html.
  Returns a user-friendly "not yet available" page if no log exists.
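
The route's response logic might look something like the framework-neutral sketch below. It is not the code in web/routes/run.py. The names build_eval_report_response and attachment_filename, the render_report callable (standing in for generate_html_report()), the filename sanitising, and the 404 status for the missing-log page are all assumptions, since the commit message specifies none of them.

```python
import os
import re


def attachment_filename(title):
    """Build eval_report_<title>.html, replacing characters unsafe in a header."""
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", title).strip("_") or "untitled"
    return f"eval_report_{safe}.html"


def build_eval_report_response(book_dir, title, render_report):
    """Return (status, headers, body) for the eval report download.

    render_report is a hypothetical callable taking the book folder and
    returning the report HTML as a string.
    """
    log_path = os.path.join(book_dir, "eval_log.json")
    if not os.path.exists(log_path):
        # Assumption: 404 status; the commit only promises a friendly page.
        body = ("<html><body><p>The eval report is not yet available. "
                "It appears once at least one chapter has been evaluated."
                "</p></body></html>")
        return 404, {"Content-Type": "text/html; charset=utf-8"}, body
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # The attachment disposition makes the browser download the file.
        "Content-Disposition":
            f'attachment; filename="{attachment_filename(title)}"',
    }
    return 200, headers, render_report(book_dir)
```

In a web framework the returned tuple would be mapped onto that framework's response object. Keeping the logic in a plain function makes the missing-log branch easy to test without a running server.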

templates/run_details.html:
- Adds "Eval Report" (btn-outline-info) button next to "Check Consistency"
  in each book's artifact section.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-24 08:03:32 -05:00

1 Commit