kjvstudy.org/kjvstudy_org/utils/pdf.py
kennethreitz 9ec7885dce Make PDF generation async to prevent event loop blocking
Convert all PDF generation endpoints from synchronous to async to
prevent blocking FastAPI's event loop during CPU-intensive operations.

Changes:
- Add render_html_to_pdf_async() using ThreadPoolExecutor (2 workers)
- Convert all PDF endpoints to async def
- Use await render_html_to_pdf_async() instead of blocking calls
- Keep render_html_to_pdf() for backward compatibility

Performance impact:
- Prevents event loop blocking during PDF generation
- Allows other requests to be processed while PDFs are rendering
- Limits concurrent PDF generation to 2 workers to control CPU usage
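
The effect described above can be illustrated with a small, standard-library-only sketch. It is not the project's code: `fake_render` is a hypothetical stand-in that just sleeps, and the `heartbeat` coroutine exists only to show that the event loop keeps running while renders are in flight. The 2-worker pool mirrors the limit named in this commit.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Same shape as the pool in this commit: at most 2 renders run at once.
executor = ThreadPoolExecutor(max_workers=2, thread_name_prefix="pdf_worker")

_lock = threading.Lock()
_active = 0
peak_active = 0


def fake_render(html: str) -> bytes:
    """Hypothetical stand-in for a blocking PDF render (it only sleeps)."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.05)  # simulate blocking work
    with _lock:
        _active -= 1
    return b"%PDF-stub " + html.encode("utf-8")


async def heartbeat(stop: asyncio.Event) -> int:
    """Count how often the event loop gets to run while renders are pending."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(0.005)
        ticks += 1
    return ticks


async def demo():
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(stop))
    # Four jobs are queued, but only two worker threads exist.
    jobs = [loop.run_in_executor(executor, fake_render, f"doc{i}") for i in range(4)]
    results = await asyncio.gather(*jobs)
    stop.set()
    ticks = await beat
    return ticks, peak_active, results


ticks, peak, results = asyncio.run(demo())
print(f"event loop ticks: {ticks}, peak concurrent renders: {peak}")
```

The heartbeat keeps ticking because the sleeping work happens in worker threads, and the peak number of simultaneous renders never exceeds the pool size. Real rendering that holds the GIL may still compete with the event loop for CPU time, so this sketch shows the scheduling behavior rather than a throughput guarantee.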

Files updated:
- kjvstudy_org/utils/pdf.py (new async implementation)
- kjvstudy_org/server.py (5 PDF endpoints)
- kjvstudy_org/routes/api.py (4 PDF endpoints)
- kjvstudy_org/routes/resources.py (7 PDF endpoints)
- kjvstudy_org/routes/stories.py (2 PDF endpoints)
- kjvstudy_org/routes/study_guides.py (1 PDF endpoint)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 11:44:35 -05:00

50 lines
1.7 KiB
Python

"""Utility helpers for HTML-to-PDF generation."""
import asyncio
import io
from concurrent.futures import ThreadPoolExecutor
from typing import BinaryIO

try:  # pragma: no cover - optional dependency
    from weasyprint import HTML  # type: ignore
    WEASYPRINT_AVAILABLE = True
except (ImportError, OSError):  # pragma: no cover - handled gracefully elsewhere
    HTML = None
    WEASYPRINT_AVAILABLE = False

# Thread pool for CPU-intensive PDF generation
_pdf_executor = ThreadPoolExecutor(max_workers=2, thread_name_prefix="pdf_worker")


def _render_pdf_sync(html_content: str) -> BinaryIO:
    """Internal synchronous PDF rendering function."""
    if not WEASYPRINT_AVAILABLE or HTML is None:
        raise RuntimeError("WeasyPrint is not available for PDF generation")
    pdf_buffer = io.BytesIO()
    HTML(string=html_content).write_pdf(pdf_buffer)
    pdf_buffer.seek(0)
    return pdf_buffer


def render_html_to_pdf(html_content: str) -> BinaryIO:
    """Synchronous wrapper for backward compatibility.

    NOTE: Use render_html_to_pdf_async() in async contexts to avoid blocking.

    Returns a BytesIO instance positioned at the beginning of the generated PDF.
    Raises RuntimeError if WeasyPrint isn't available at runtime.
    """
    return _render_pdf_sync(html_content)


async def render_html_to_pdf_async(html_content: str) -> BinaryIO:
    """Async-compatible PDF rendering that won't block the event loop.

    Runs PDF generation in a thread pool to prevent blocking FastAPI.

    Returns a BytesIO instance positioned at the beginning of the generated PDF.
    Raises RuntimeError if WeasyPrint isn't available at runtime.
    """
    # Inside a coroutine, get_running_loop() is preferred over get_event_loop().
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_pdf_executor, _render_pdf_sync, html_content)
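
A caller might consume the async helper roughly as follows. This is a sketch under stated assumptions, not the project's endpoint code: `_render_sync`, `render_async`, and `handle_request` are hypothetical names, the `available` flag stands in for `WEASYPRINT_AVAILABLE`, and the `(status_code, body)` tuple is a placeholder for whatever response object a real route returns. The point it illustrates is that an exception raised in the worker thread is re-raised at the `await`, so the caller can handle the documented `RuntimeError` in the usual way.

```python
import asyncio
import io
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=2)


def _render_sync(html_content: str, available: bool) -> io.BytesIO:
    """Hypothetical stand-in for _render_pdf_sync; no WeasyPrint involved."""
    if not available:
        raise RuntimeError("WeasyPrint is not available for PDF generation")
    buf = io.BytesIO(b"%PDF-stub " + html_content.encode("utf-8"))
    buf.seek(0)
    return buf


async def render_async(html_content: str, available: bool = True) -> io.BytesIO:
    """Same pattern as render_html_to_pdf_async, using the stand-in renderer."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, _render_sync, html_content, available)


async def handle_request(html_content: str, available: bool):
    """Hypothetical handler returning (status_code, body)."""
    try:
        pdf = await render_async(html_content, available)
    except RuntimeError as exc:
        # The worker thread's exception surfaces here, at the await.
        return 503, str(exc).encode("utf-8")
    return 200, pdf.read()


ok = asyncio.run(handle_request("<h1>Psalm 23</h1>", available=True))
unavailable = asyncio.run(handle_request("<h1>Psalm 23</h1>", available=False))
print(ok[0], unavailable[0])
```

Mapping the failure to a 503 is only one possible choice; the source says only that the missing dependency is "handled gracefully elsewhere".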