feat: Add evaluation report pipeline for prompt tuning feedback

Adds a full per-chapter evaluation logging system that captures every
score, critique, and quality decision made during writing, then renders
a self-contained HTML report shareable with critics or prompt engineers.

New file (story/eval_logger.py):
- append_eval_entry(folder, entry): writes per-chapter eval data to
  eval_log.json in the book folder (called from write_chapter() at
  every return point).
- generate_html_report(folder, bp): reads eval_log.json and produces a
  self-contained HTML file (no external deps) with:
    • Summary cards (avg score, auto-accepted, rewrites, below-threshold)
    • Score timeline bar chart (one bar per chapter, colour-coded)
    • Score distribution histogram
    • Chapter breakdown table with expand-on-click critique details
      (attempt number, score, decision badge, full critique text)
    • Critique pattern frequency table (keyword mining across all critiques)
    • Auto-generated prompt tuning observations (systemic issues, POV
      character weak spots, pacing type analysis, climax vs. early
      chapter comparison)
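
For readers without the diff to hand, the append step described above can be
sketched roughly as follows. This is an illustration only, not the code in
story/eval_logger.py: it assumes eval_log.json holds a JSON list of entry
dicts, and the temp-file-then-rename write and the return value are additions
made for the example.

```python
import json
import os
import tempfile

EVAL_LOG_NAME = "eval_log.json"  # file name taken from the commit message


def append_eval_entry(folder, entry):
    """Append one chapter's evaluation entry to <folder>/eval_log.json.

    Assumption: the log is a JSON list of dicts; the real schema may differ.
    Returns the number of entries now in the log (hypothetical convenience).
    """
    path = os.path.join(folder, EVAL_LOG_NAME)
    try:
        with open(path, "r", encoding="utf-8") as fh:
            entries = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        entries = []  # start a fresh log if it is missing or unreadable
    if not isinstance(entries, list):
        entries = []
    entries.append(entry)
    # Write to a temp file in the same folder, then rename over the log,
    # so an interrupted run cannot leave a half-written eval_log.json.
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(entries, fh, indent=2)
    os.replace(tmp_path, path)
    return len(entries)
```

Because the function re-reads the file on every call, it can be called from
each return point of write_chapter() without keeping any state in memory.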

story/writer.py:
- Imports time and eval_logger.
- Initialises _eval_entry dict (chapter metadata + polish flags + thresholds)
  after all threshold variables are set.
- Records each evaluation attempt's score, critique (truncated to 700 chars),
  and decision (auto_accepted / full_rewrite / refinement / accepted /
  below_threshold / eval_error / refinement_failed) before every return.
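
A minimal sketch of the per-attempt record described above. The helper name
record_attempt, the "attempts" key, and the field names are hypothetical and
chosen for illustration; only the 700-character limit and the decision labels
come from this commit message.

```python
CRITIQUE_MAX_CHARS = 700  # truncation limit stated in the commit message

# Decision labels listed in the commit message.
DECISIONS = {
    "auto_accepted", "full_rewrite", "refinement", "accepted",
    "below_threshold", "eval_error", "refinement_failed",
}


def record_attempt(eval_entry, attempt, score, critique, decision):
    """Add one evaluation attempt to the chapter's eval entry (hypothetical helper)."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    eval_entry.setdefault("attempts", []).append({
        "attempt": attempt,
        "score": score,
        # Keep the log small: store only the first 700 characters.
        "critique": (critique or "")[:CRITIQUE_MAX_CHARS],
        "decision": decision,
    })
    return eval_entry
```

Rejecting unknown labels early keeps the decision badges in the report limited
to the seven values listed above.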

web/routes/run.py:
- Imports story_eval_logger.
- New route GET /project/<run_id>/eval_report/<book_folder>: loads
  eval_log.json, calls generate_html_report(), returns the HTML as a
  downloadable attachment named eval_report_<title>.html.
  Returns a user-friendly "not yet available" page if no log exists.
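
The route's two outcomes can be sketched without any web framework. This is an
assumption-laden illustration, not the handler in web/routes/run.py: the
function name, the title sanitising, the 200 status on the fallback page, and
the injected render callable (standing in for generate_html_report) are all
hypothetical.

```python
import os
import re


def eval_report_response(folder, title, render):
    """Return (status, headers, body) for the eval report download.

    `render` is any callable taking the book folder and returning HTML;
    it stands in for generate_html_report() in this sketch.
    """
    html_type = {"Content-Type": "text/html; charset=utf-8"}
    if not os.path.exists(os.path.join(folder, "eval_log.json")):
        # No log yet: show a friendly page instead of a download.
        body = "<html><body><p>Evaluation report not yet available.</p></body></html>"
        return 200, html_type, body
    # Assumed sanitising so the title is safe inside a file name.
    safe_title = re.sub(r"[^A-Za-z0-9_-]+", "_", title).strip("_") or "book"
    headers = dict(html_type)
    headers["Content-Disposition"] = (
        f'attachment; filename="eval_report_{safe_title}.html"'
    )
    return 200, headers, render(folder)
```

The Content-Disposition header is what makes a browser save the report as a
file instead of displaying it inline.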

templates/run_details.html:
- Adds "Eval Report" (btn-outline-info) button next to "Check Consistency"
  in each book's artifact section.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -208,6 +208,9 @@
<a href="{{ url_for('run.check_consistency', run_id=run.id, book_folder=book.folder) }}" class="btn btn-outline-warning ms-2">
<i class="fas fa-search me-2"></i>Check Consistency
</a>
<a href="{{ url_for('run.eval_report', run_id=run.id, book_folder=book.folder) }}" class="btn btn-outline-info ms-2" title="Download evaluation report (scores, critiques, prompt tuning notes)">
<i class="fas fa-chart-bar me-2"></i>Eval Report
</a>
<button class="btn btn-warning ms-2" data-bs-toggle="modal" data-bs-target="#reviseBookModal{{ loop.index }}" title="Regenerate this book with changes, keeping others.">
<i class="fas fa-pencil-alt me-2"></i>Revise
</button>