- web/db.py: Add last_heartbeat column to Run model
- core/utils.py: Add set_heartbeat_callback() and send_heartbeat()
- web/tasks.py: Add _robust_update_run_status() with 5-retry exponential backoff;
add db_heartbeat_callback(); remove all bare except:pass on DB status updates;
set start_time + last_heartbeat when marking run as 'running'
- web/app.py: Add last_heartbeat column migration; add _stale_job_watcher()
background thread (checks every 5 min, 15-min heartbeat threshold, 2-hr start_time threshold)
- cli/engine.py: Add phase-level logging banners and try/except wrappers in
process_book(); add utils.send_heartbeat() after each chapter save;
add start/finish logging in run_generation()
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
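
The 5-retry exponential backoff described for _robust_update_run_status() could look roughly like the sketch below. This is an illustration only, not the code from web/tasks.py: the function name, the `run` table and `status` column, the 0.5 s base delay, and the injectable `sleep` parameter are all assumptions.

```python
import sqlite3
import time


def robust_update_run_status(db_path, run_id, status,
                             retries=5, base_delay=0.5, sleep=time.sleep):
    """Update a run's status, retrying on sqlite3.OperationalError.

    Hypothetical helper: table and column names are assumed.
    Returns True on success, False if every attempt failed.
    """
    for attempt in range(retries):
        try:
            conn = sqlite3.connect(db_path, timeout=30)
            try:
                with conn:  # commits on success, rolls back on error
                    conn.execute(
                        "UPDATE run SET status = ? WHERE id = ?",
                        (status, run_id),
                    )
                return True
            finally:
                conn.close()
        except sqlite3.OperationalError as exc:
            # Report the failure instead of swallowing it silently.
            print(f"[status-update] attempt {attempt + 1}/{retries} "
                  f"failed: {exc}", flush=True)
            if attempt < retries - 1:
                # Delay doubles on each failed attempt.
                sleep(base_delay * (2 ** attempt))
    return False
```

Passing `sleep` as a parameter is a design choice for this sketch so the backoff schedule can be checked without actually waiting.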
- web/tasks.py: db_log_callback now writes non-OperationalError exceptions to data/app.log for visibility
- web/tasks.py: generate_book_task restructured with try...finally to guarantee a final status update, so a run can never be left in the 'running' state if the worker crashes
- templates/project.html: added .catch() to fetchLog() with console.error + polling resume on failure; added manual Refresh button to status bar
- templates/run_details.html: improved .catch() in updateLog() with descriptive message + 5s retry; added manual Refresh button to status bar
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
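
A minimal sketch of the try...finally shape described for generate_book_task follows. The function name, the `do_work` and `set_status` callables, and the 'completed' status are hypothetical; only 'running' and 'failed' appear in these commit messages.

```python
def run_task_with_guaranteed_status(run_id, do_work, set_status):
    """Run do_work() and always write a final status afterwards.

    Hypothetical sketch: set_status(run_id, status) stands in for the
    real DB update, and 'completed' is an assumed status name.
    """
    final_status = "failed"  # pessimistic default
    try:
        set_status(run_id, "running")
        do_work()
        final_status = "completed"  # reached only if do_work() returned
    finally:
        # Runs whether do_work() returned or raised; the exception,
        # if any, still propagates to the caller afterwards.
        set_status(run_id, final_status)
```

Note that `finally` covers Python-level exceptions; a process killed outright (for example by SIGKILL) never reaches it.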
- web/tasks.py: db_log_callback bare `except: break` replaced with
explicit `except Exception as _e: print(...)` so insertion failures
are visible in Docker logs. Timestamps are now stored as
datetime.utcnow().isoformat() strings for clean storage in SQLite.
The same fixes were applied to db_progress_callback.
- web/routes/run.py (run_status): added db.session.expire_all() to
force fresh reads; raw sqlite3 bypass query when ORM returns no rows;
file fallback wrapped in try/except with stdout error reporting;
secondary check for web_console.log inside the run directory;
utf-8 encoding on all file opens.
- ai_blueprint.md: bumped to v2.11, documented root causes and fixes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
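
The db_log_callback change (explicit exception reporting plus ISO-format timestamp strings) might be sketched as below. The `log_entry` table, its columns, and the function signature are assumptions made for illustration. The sketch also swaps in `datetime.now(timezone.utc)` for `datetime.utcnow()`, which is deprecated as of Python 3.12.

```python
import sqlite3
from datetime import datetime, timezone


def db_log_callback_sketch(conn, run_id, message):
    """Insert one log row; report failures instead of hiding them.

    Hypothetical sketch: table and column names are assumed.
    Returns True if the row was written, False otherwise.
    """
    try:
        # Store the timestamp as an ISO-8601 string, which SQLite
        # keeps as plain TEXT.
        timestamp = datetime.now(timezone.utc).isoformat()
        with conn:
            conn.execute(
                "INSERT INTO log_entry (run_id, timestamp, message) "
                "VALUES (?, ?, ?)",
                (run_id, timestamp, message),
            )
        return True
    except Exception as _e:
        # flush=True so the line reaches container logs immediately.
        print(f"[db_log_callback] insert failed: "
              f"{type(_e).__name__}: {_e}", flush=True)
        return False
```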
- web/app.py: Startup banner now printed to Docker logs (Python version, platform,
Huey version, DB paths). All print() calls now use flush=True so Docker
captures them immediately. Output is emoji-free for robust stdout encoding.
Startup now detects orphaned queued runs (queue empty but DB status still 'queued')
and resets them to 'failed' so the UI does not stay stuck on reload.
Huey logging configured at INFO level so task pick-up/completion
appears in `docker logs`. Consumer skip reason logged explicitly.
- web/tasks.py: generate_book_task now emits [TASK run=N] lines to
stdout (docker logs) at pick-up, log-file creation, DB status update,
and on error (with full traceback) so failures are always visible.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
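
The orphaned-run check at startup could be approximated as follows. This is a sketch under assumptions: the `run` table and `status` column names are guessed, and `pending_task_count` stands for however the application measures the length of the task queue.

```python
import sqlite3


def reset_orphaned_queued_runs(conn, pending_task_count):
    """Mark 'queued' runs as 'failed' when the task queue is empty.

    Hypothetical sketch: a run that is 'queued' in the DB while no
    task is pending will never be picked up, so it is reset.
    Returns the number of rows changed.
    """
    if pending_task_count > 0:
        # Queued runs may still be waiting for a worker; leave them.
        return 0
    with conn:
        cur = conn.execute(
            "UPDATE run SET status = 'failed' WHERE status = 'queued'"
        )
    if cur.rowcount:
        print(f"[startup] reset {cur.rowcount} orphaned queued run(s) "
              f"to 'failed'", flush=True)
    return cur.rowcount
```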
- ai/setup.py: Added threading import; OAuth block now detects background/headless
threads and skips run_local_server to prevent indefinite blocking. Logs a clear
warning and falls back to ADC for Vertex AI. Token file only written when creds
are not None.
- web/tasks.py: All sqlite3.connect() calls now use timeout=30, check_same_thread=False.
OperationalError on the initial status update is caught and logged via utils.log.
generate_book_task now touches initial_log immediately so the UI polling endpoint
always finds an existing file even if the worker crashes on the next line.
- ai_blueprint.md: Bumped to v2.9; Section 12.D sub-items 1-3 marked ✅; item 13
added to summary.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
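
One way to detect a background thread before attempting an interactive OAuth flow is to compare the current thread with the main thread. The sketch below is an assumption about how such a check could work, not the code in ai/setup.py; both function names and the returned mode strings are hypothetical.

```python
import threading


def is_background_thread(thread=None):
    """Return True if the given (or current) thread is not the main thread."""
    thread = thread or threading.current_thread()
    return thread is not threading.main_thread()


def choose_auth_mode(thread=None):
    """Pick an auth path: interactive OAuth only on the main thread.

    Hypothetical sketch: 'oauth' stands for the browser-based flow,
    'adc' for falling back to Application Default Credentials.
    """
    if is_background_thread(thread):
        print("[auth] background thread detected; skipping interactive "
              "OAuth and falling back to ADC", flush=True)
        return "adc"
    return "oauth"
```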