Compare commits


3 Commits

Author SHA1 Message Date
493435e43c Auto-commit: v2.10 — Docker/compose hardening for Portainer on Pi
docker-compose.yml:
- Add PYTHONIOENCODING=utf-8 env var (guarantees UTF-8 stdout in all
  Python environments, including Docker slim images on ARM).
- Add logging driver section: json-file, max-size 10m, max-file 5.
  Without this, the container's json-file log grows unbounded on the
  Raspberry Pi's SD card and eventually fills the disk or kills the
  container.

web/requirements_web.txt:
- Pin huey==2.6.0 so a future pip upgrade cannot silently change the
  Consumer() API and re-introduce the loglevel= TypeError that caused
  all tasks to stay queued forever.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 12:07:27 -05:00
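The UTF-8 guarantee this commit adds via `PYTHONIOENCODING` can also be approximated at runtime. A minimal sketch with a hypothetical `ensure_utf8_stdout()` helper (not code from this repo) that falls back to `reconfigure()` on Python 3.7+ when the env var was not set:

```python
import sys

def ensure_utf8_stdout():
    """Return stdout's encoding, forcing UTF-8 first if possible.

    PYTHONIOENCODING=utf-8 takes effect at interpreter startup;
    TextIOWrapper.reconfigure() (Python 3.7+) is the runtime fallback
    when the env var was not set. Hypothetical helper for illustration.
    """
    current = (getattr(sys.stdout, 'encoding', None) or '').lower().replace('-', '')
    if current != 'utf8' and hasattr(sys.stdout, 'reconfigure'):
        sys.stdout.reconfigure(encoding='utf-8', errors='replace')
    return getattr(sys.stdout, 'encoding', None) or 'utf-8'

print(ensure_utf8_stdout())
```

Setting the env var in compose is still preferable, since it also covers subprocesses and output written before this code runs.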
0d4b9b761b Auto-commit: v2.10 — Docker diagnostic logging for consumer & task execution
- web/app.py: Startup banner to docker logs (Python version, platform,
  Huey version, DB paths). All print() calls now flush=True so Docker
  captures them immediately. Emoji-free for robust stdout encoding.
  Startup now detects orphaned queued runs (queue empty but DB queued)
  and resets them to 'failed' so the UI does not stay stuck on reload.
  Huey logging configured at INFO level so task pick-up/completion
  appears in `docker logs`. Consumer skip reason logged explicitly.
- web/tasks.py: generate_book_task now emits [TASK run=N] lines to
  stdout (docker logs) at pick-up, log-file creation, DB status update,
  and on error (with full traceback) so failures are always visible.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 12:05:07 -05:00
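The `flush=True` tagged-logging pattern this commit relies on can be wrapped in a tiny factory. A sketch with a hypothetical `make_task_logger()` (the actual code defines `_log` and `_task_log` inline instead):

```python
import sys
from datetime import datetime, timezone

def make_task_logger(run_id):
    """Build a tagged, flushing logger for one background run.

    flush=True matters under Docker: without it, block buffering can
    hold output back indefinitely, so `docker logs` looks empty while
    the task is actually running. Hypothetical factory; the real code
    defines _log/_task_log directly.
    """
    def log(msg):
        line = f"[{datetime.now(timezone.utc):%H:%M:%S}] [TASK run={run_id}] {msg}"
        print(line, flush=True, file=sys.stdout)
        return line  # returned for testability
    return log

log = make_task_logger(7)
log("picked up by worker")
```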
a324355cdf Auto-commit: v2.10 — Fix Huey consumer never starting (loglevel= TypeError)
Root cause: Consumer(huey, workers=1, worker_type='thread', loglevel=20)
raised TypeError on every app start because Huey 2.6.0 does not accept
a `loglevel` keyword argument. The exception was silently caught and only
printed to stdout, so the consumer never ran and all tasks stayed 'queued'
forever — causing the 'Preparing environment / Waiting for logs' hang.

Fixes:
- web/app.py: Remove invalid `loglevel=20` from Consumer(); configure
  Huey logging via logging.basicConfig(WARNING) instead. Add persistent
  error logging to data/consumer_error.log for future diagnosis.
- core/config.py: Replace emoji print() calls with ASCII-safe equivalents
  to prevent UnicodeEncodeError on Windows cp1252 terminals at import time.
- core/config.py: Update VERSION to 2.9 (was stale at 1.5.0).
- ai_blueprint.md: Bump to v2.10, document root cause and fixes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 12:02:18 -05:00
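The root cause here is a keyword argument the installed library version does not accept. One general defensive pattern (not how this commit fixed it; the commit simply removed the argument) is to filter kwargs through `inspect.signature` before calling. Sketch with a stand-in function, since Huey itself is not needed to show the idea:

```python
import inspect

def supported_kwargs(func, **kwargs):
    """Drop keyword arguments that `func` does not accept."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return kwargs  # func takes **kwargs, so everything is accepted
    return {k: v for k, v in kwargs.items() if k in params}

# Stand-in with the same shape as Consumer(huey, workers=, worker_type=);
# Huey 2.6.0's real Consumer takes no `loglevel` kwarg, so it gets dropped.
def fake_consumer(huey, workers=1, worker_type='thread'):
    return (workers, worker_type)

kwargs = supported_kwargs(fake_consumer, workers=1, worker_type='thread', loglevel=20)
print(sorted(kwargs))  # ['worker_type', 'workers']
```

The trade-off: silently dropping an argument hides intent, so a pattern like this should log what it discards rather than swallow it the way the original exception handler did.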
6 changed files with 108 additions and 22 deletions

ai_blueprint.md

@@ -1,4 +1,4 @@
-# AI Context Optimization Blueprint (v2.9)
+# AI Context Optimization Blueprint (v2.10)
 This blueprint outlines architectural improvements for how AI context is managed during the writing process. The goal is to provide the AI (Claude/Gemini) with **better, highly-targeted context upfront**, which will dramatically improve first-draft quality and reduce the reliance on expensive, time-consuming quality checks and rewrites (currently up to 5 attempts).
@@ -171,3 +171,4 @@ The UI gets stuck indefinitely because the background Huey worker thread hangs b
 11.**(v2.7)** Series Continuity Fix: `series_metadata` (is_series, series_title, book_number, total_books) injected as `SERIES_CONTEXT` into `story/planner.py` (`enrich`, `plan_structure`), `story/writer.py` (`write_chapter`), and `story/editor.py` (`evaluate_chapter_quality`) prompts with position-aware guidance per book number. *(Implemented v2.7)*
 12.**(v2.8)** Infrastructure & UI Bug Fixes: API timeouts (180s generation, 30s list_models) in `ai/models.py` + `ai/setup.py`; Huey consumer moved to module level with reloader guard in `web/app.py`; Jinja2 `UndefinedError` fix for `tropes`/`formatting_rules` in `project_setup.html`; `project_setup_wizard` now renders form instead of silent redirect when models fail; `create_project_final` `enrich()` call fixed to use correct per-book blueprint structure. *(Implemented v2.8)*
 13.**(v2.9)** Background Task Hang Fixes: OAuth headless guard in `ai/setup.py` (skips `run_local_server` in non-main threads, logs warning, falls back to ADC); SQLite `timeout=30, check_same_thread=False` on all connections in `web/tasks.py`; initial log file touched immediately in `generate_book_task` so UI polling never sees an empty/missing file. *(Implemented v2.9)*
+14.**(v2.10)** Huey Consumer Startup Fix: `Consumer.__init__()` in Huey 2.6.0 does NOT accept a `loglevel` keyword argument — the previous call `Consumer(huey, workers=1, worker_type='thread', loglevel=20)` raised `TypeError` on every app start, silently killing the consumer. All tasks stayed `queued` forever, causing the "Preparing environment / Waiting for logs" hang. Fixed by removing `loglevel=20`; Huey logging now configured via `logging.basicConfig`. Consumer startup errors now written to `data/consumer_error.log` for diagnosis. Also removed emoji characters from `print()` calls in `core/config.py` that caused `UnicodeEncodeError` on Windows `cp1252` terminals. Updated `VERSION` to `2.9` in `config.py`. *(Implemented v2.10)*

core/config.py

@@ -27,9 +27,9 @@ ADMIN_USER = get_clean_env("ADMIN_USERNAME")
 ADMIN_PASSWORD = get_clean_env("ADMIN_PASSWORD")
 if FLASK_SECRET == "dev-secret-key-change-this":
-    print("⚠️ WARNING: Using default FLASK_SECRET_KEY. This is insecure for production.")
+    print("WARNING: Using default FLASK_SECRET_KEY. This is insecure for production.")
-if not API_KEY: raise ValueError("CRITICAL ERROR: GEMINI_API_KEY not found.")
+if not API_KEY: raise ValueError("CRITICAL ERROR: GEMINI_API_KEY not found in environment or .env file.")
 # --- DATA DIRECTORIES ---
 DATA_DIR = os.path.join(BASE_DIR, "data")
@@ -51,7 +51,7 @@ if GOOGLE_CREDS:
     if os.path.exists(GOOGLE_CREDS):
        os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = GOOGLE_CREDS
     else:
-        print(f"⚠️ Warning: GOOGLE_APPLICATION_CREDENTIALS file not found at: {GOOGLE_CREDS}")
+        print(f"Warning: GOOGLE_APPLICATION_CREDENTIALS file not found at: {GOOGLE_CREDS}")
 # --- DEFINITIONS ---
 LENGTH_DEFINITIONS = {
@@ -65,4 +65,4 @@ LENGTH_DEFINITIONS = {
 }
 # --- SYSTEM ---
-VERSION = "1.5.0"
+VERSION = "2.9"

docker-compose.yml

@@ -28,6 +28,7 @@ services:
     # - ./templates:/app/templates
     environment:
       - PYTHONUNBUFFERED=1
+      - PYTHONIOENCODING=utf-8
       - GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json
       - PYTHONPATH=/app
       - FLASK_SECRET_KEY=${FLASK_SECRET_KEY:-change_this_to_a_random_string}
@@ -40,4 +41,10 @@ services:
       - MODEL_LOGIC=${MODEL_LOGIC:-AUTO}
       - MODEL_WRITER=${MODEL_WRITER:-AUTO}
       - MODEL_ARTIST=${MODEL_ARTIST:-AUTO}
-      - MODEL_IMAGE=${MODEL_IMAGE:-AUTO}
+      - MODEL_IMAGE=${MODEL_IMAGE:-AUTO}
+    # Keep Docker logs bounded so they don't fill the Pi's SD card.
+    logging:
+      driver: json-file
+      options:
+        max-size: "10m"
+        max-file: "5"

web/app.py

@@ -1,4 +1,6 @@
 import os
+import sys
+import platform
 from datetime import datetime
 from sqlalchemy import text
 from flask import Flask
@@ -8,6 +10,19 @@ from web.db import db, User, Run
 from web.tasks import huey
 from core import config
+# Ensure stdout is UTF-8 in all environments (Docker, Windows, Raspberry Pi)
+if hasattr(sys.stdout, 'reconfigure'):
+    try:
+        sys.stdout.reconfigure(encoding='utf-8', errors='replace')
+        sys.stderr.reconfigure(encoding='utf-8', errors='replace')
+    except Exception:
+        pass
+def _log(msg):
+    """Print to stdout with flush so Docker logs capture it immediately."""
+    print(msg, flush=True)
 # Calculate paths relative to this file (web/app.py -> project root is two levels up)
 BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
 TEMPLATE_DIR = os.path.join(BASE_DIR, 'templates')
@@ -48,6 +63,22 @@ app.register_blueprint(persona_bp)
 app.register_blueprint(admin_bp)
+# --- STARTUP DIAGNOSTIC BANNER ---
+_log("=" * 60)
+_log(f"BookApp v{config.VERSION} starting up")
+_log(f" Python : {sys.version}")
+_log(f" Platform : {platform.platform()}")
+_log(f" Data dir : {config.DATA_DIR}")
+_log(f" Queue db : {os.path.join(config.DATA_DIR, 'queue.db')}")
+_log(f" App db : {os.path.join(config.DATA_DIR, 'bookapp.db')}")
+try:
+    import huey as _huey_pkg
+    _log(f" Huey : {_huey_pkg.__version__}")
+except Exception:
+    _log(" Huey : (version unknown)")
+_log("=" * 60)
 # --- SETUP ---
 with app.app_context():
     db.create_all()
@@ -56,38 +87,50 @@ with app.app_context():
     if config.ADMIN_USER and config.ADMIN_PASSWORD:
         admin = User.query.filter_by(username=config.ADMIN_USER).first()
         if not admin:
-            print(f"🔐 System: Creating Admin User '{config.ADMIN_USER}' from environment variables.")
+            _log(f"System: Creating Admin User '{config.ADMIN_USER}' from environment variables.")
             admin = User(username=config.ADMIN_USER, password=generate_password_hash(config.ADMIN_PASSWORD, method='pbkdf2:sha256'), is_admin=True)
             db.session.add(admin)
             db.session.commit()
         else:
-            print(f"🔐 System: Syncing Admin User '{config.ADMIN_USER}' settings from environment.")
+            _log(f"System: Syncing Admin User '{config.ADMIN_USER}' settings from environment.")
            if not admin.is_admin: admin.is_admin = True
            admin.password = generate_password_hash(config.ADMIN_PASSWORD, method='pbkdf2:sha256')
            db.session.add(admin)
            db.session.commit()
     elif not User.query.filter_by(is_admin=True).first():
-        print(" System: No Admin credentials found in environment variables. Admin account not created.")
+        _log("System: No Admin credentials found in environment variables. Admin account not created.")
     # Migration: Add 'progress' column if missing
     try:
         with db.engine.connect() as conn:
             conn.execute(text("ALTER TABLE run ADD COLUMN progress INTEGER DEFAULT 0"))
             conn.commit()
-            print("System: Added 'progress' column to Run table.")
+            _log("System: Added 'progress' column to Run table.")
     except: pass
     # Reset stuck runs on startup
     try:
         stuck_runs = Run.query.filter_by(status='running').all()
         if stuck_runs:
-            print(f"⚠️ System: Found {len(stuck_runs)} stuck runs. Resetting to 'failed'.")
+            _log(f"System: Found {len(stuck_runs)} stuck run(s) — resetting to 'failed'.")
             for r in stuck_runs:
                 r.status = 'failed'
                 r.end_time = datetime.utcnow()
             db.session.commit()
+        # Also reset stuck 'queued' runs whose task entry was lost from queue.db
+        import sqlite3 as _sqlite3
+        _queue_path = os.path.join(config.DATA_DIR, 'queue.db')
+        if os.path.exists(_queue_path):
+            with _sqlite3.connect(_queue_path, timeout=5) as _qconn:
+                pending_count = _qconn.execute("SELECT COUNT(*) FROM task").fetchone()[0]
+            queued_runs = Run.query.filter_by(status='queued').count()
+            _log(f"System: Queue has {pending_count} pending task(s), DB has {queued_runs} queued run(s).")
+            if queued_runs > 0 and pending_count == 0:
+                _log("System: WARNING — queued runs exist but queue is empty (tasks lost). Resetting to 'failed'.")
+                Run.query.filter_by(status='queued').update({'status': 'failed', 'end_time': datetime.utcnow()})
+                db.session.commit()
     except Exception as e:
-        print(f"⚠️ System: Failed to clean up stuck runs: {e}")
+        _log(f"System: Startup cleanup error: {e}")
 # --- HUEY CONSUMER ---
@@ -97,20 +140,41 @@ with app.app_context():
 import threading as _threading
 def _start_huey_consumer():
     import logging as _logging
+    # INFO level so task pick-up/completion appears in docker logs
+    _logging.basicConfig(
+        level=_logging.INFO,
+        format='[%(asctime)s] HUEY %(levelname)s | %(message)s',
+        datefmt='%H:%M:%S',
+        stream=sys.stdout,
+        force=True,
+    )
     try:
         from huey.consumer import Consumer
-        consumer = Consumer(huey, workers=1, worker_type='thread', loglevel=20)
-        print("✅ System: Huey task consumer started.")
-        consumer.run()
+        # NOTE: Huey 2.6.0 does NOT accept a `loglevel` kwarg — omit it.
+        consumer = Consumer(huey, workers=1, worker_type='thread')
+        _log("System: Huey task consumer started successfully.")
+        consumer.run()  # blocks until app exits
     except Exception as e:
-        print(f"⚠️ System: Huey consumer failed to start: {e}")
+        msg = f"System: Huey consumer FAILED to start: {type(e).__name__}: {e}"
+        _log(msg)
+        # Also write to a persistent file for diagnosis when stdout is piped away
+        try:
+            _err_path = os.path.join(config.DATA_DIR, "consumer_error.log")
+            with open(_err_path, 'a', encoding='utf-8') as _f:
+                _f.write(f"[{datetime.utcnow().isoformat()}] {msg}\n")
+        except Exception:
+            pass
 _is_reloader_child = os.environ.get('WERKZEUG_RUN_MAIN') == 'true'
 _is_testing = os.environ.get('FLASK_TESTING') == '1'
 if not _is_reloader_child and not _is_testing:
+    _log("System: Launching Huey consumer thread...")
     _huey_thread = _threading.Thread(target=_start_huey_consumer, daemon=True, name="huey-consumer")
     _huey_thread.start()
+else:
+    _log(f"System: Skipping Huey consumer (WERKZEUG_RUN_MAIN={os.environ.get('WERKZEUG_RUN_MAIN')}, FLASK_TESTING={os.environ.get('FLASK_TESTING')}).")
 if __name__ == "__main__":
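The orphaned-run detection added in this diff (queue file empty but DB rows still 'queued') can be sketched in isolation. The table names `task` and `run` match what the startup code queries; the schemas below are minimal stand-ins, not the app's real schemas, and `find_orphaned_queued_runs` is a hypothetical helper:

```python
import sqlite3

def find_orphaned_queued_runs(queue_conn, app_conn):
    """Return queued run ids that have no backing task in the Huey queue.

    Conservative, like the startup check: only flags runs when the
    queue is completely empty, so a run whose task is merely slow is
    never reset.
    """
    pending = queue_conn.execute("SELECT COUNT(*) FROM task").fetchone()[0]
    if pending > 0:
        return []  # tasks still pending; nothing is provably orphaned
    return [row[0] for row in
            app_conn.execute("SELECT id FROM run WHERE status = 'queued'")]

queue = sqlite3.connect(':memory:')
queue.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, data BLOB)")
app_db = sqlite3.connect(':memory:')
app_db.execute("CREATE TABLE run (id INTEGER PRIMARY KEY, status TEXT)")
app_db.executemany("INSERT INTO run (id, status) VALUES (?, ?)",
                   [(1, 'completed'), (2, 'queued'), (3, 'queued')])
print(find_orphaned_queued_runs(queue, app_db))  # prints [2, 3]
```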

web/requirements_web.txt

@@ -1,7 +1,7 @@
 flask
 flask-login
 flask-sqlalchemy
-huey
+huey==2.6.0
 werkzeug
 google-generativeai
 python-dotenv

web/tasks.py

@@ -43,6 +43,13 @@ def generate_book_task(run_id, project_path, bible_path, allow_copy=True, feedba
     """
     Background task to run the book generation.
     """
+    import sys as _sys
+    def _task_log(msg):
+        """Print directly to stdout (docker logs) regardless of utils state."""
+        print(f"[TASK run={run_id}] {msg}", flush=True, file=_sys.stdout)
+    _task_log(f"Task picked up by Huey worker. project_path={project_path}")
     # 1. Setup Logging
     log_filename = f"system_log_{run_id}.txt"
@@ -54,8 +61,9 @@ def generate_book_task(run_id, project_path, bible_path, allow_copy=True, feedba
     try:
         with open(initial_log, 'a', encoding='utf-8') as _f:
             pass
-    except Exception:
-        pass
+        _task_log(f"Log file created: {initial_log}")
+    except Exception as _e:
+        _task_log(f"WARNING: Could not touch log file {initial_log}: {_e}")
     utils.set_log_file(initial_log)
@@ -68,10 +76,12 @@ def generate_book_task(run_id, project_path, bible_path, allow_copy=True, feedba
     try:
         with sqlite3.connect(db_path, timeout=30, check_same_thread=False) as conn:
             conn.execute("UPDATE run SET status = 'running' WHERE id = ?", (run_id,))
+        _task_log("Run status set to 'running' in DB.")
     except sqlite3.OperationalError as e:
-        utils.log("SYSTEM", f"⚠️ Database locked when setting run status (run {run_id}): {e}")
-    except Exception:
-        pass
+        _task_log(f"WARNING: DB locked when setting run status: {e}")
+        utils.log("SYSTEM", f"Database locked when setting run status (run {run_id}): {e}")
+    except Exception as e:
+        _task_log(f"WARNING: Could not set run status: {e}")
     utils.log("SYSTEM", f"Starting Job #{run_id}")
@@ -169,6 +179,9 @@ def generate_book_task(run_id, project_path, bible_path, allow_copy=True, feedba
         status = "completed"
     except Exception as e:
+        import traceback as _tb
+        _task_log(f"ERROR: Job failed — {type(e).__name__}: {e}")
+        _task_log(_tb.format_exc())
         utils.log("ERROR", f"Job Failed: {e}")
         status = "failed"
@@ -203,6 +216,7 @@ def generate_book_task(run_id, project_path, bible_path, allow_copy=True, feedba
     except Exception as e:
         print(f"Failed to update run status in DB: {e}")
+    _task_log(f"Task finished. status={status} cost=${total_cost:.4f}")
     return {"run_id": run_id, "status": status, "cost": total_cost, "final_log": final_log_path}
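The lock-tolerant status UPDATE shown above (timeout=30, check_same_thread=False, from the v2.9 fix this diff builds on) can be exercised standalone. A sketch with a hypothetical `set_run_status()` helper and a throwaway DB file:

```python
import os
import sqlite3
import tempfile

def set_run_status(db_path, run_id, status):
    """Update a run's status, waiting up to 30s if the DB is locked.

    timeout=30 makes sqlite3 retry on SQLITE_BUSY instead of failing
    immediately when another thread holds the write lock;
    check_same_thread=False allows use from a Huey worker thread.
    Hypothetical helper mirroring the UPDATE in web/tasks.py.
    """
    with sqlite3.connect(db_path, timeout=30, check_same_thread=False) as conn:
        conn.execute("UPDATE run SET status = ? WHERE id = ?", (status, run_id))
    # the connection's context manager commits the transaction on success

path = os.path.join(tempfile.mkdtemp(), 'bookapp.db')
with sqlite3.connect(path) as conn:
    conn.execute("CREATE TABLE run (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO run (id, status) VALUES (1, 'queued')")
set_run_status(path, 1, 'running')
```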