Files
kennethreitz 59de5e4d3a Sync CLAUDE.md infrastructure section with reality
The infra section had drifted: claimed two processes (web + worker)
and 4GB dedicated Postgres, neither of which has been true for weeks.
Also documents the Tigris CORS config and the daily-restart workflow
so future sessions know they exist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 21:00:05 -04:00

5.9 KiB

CLAUDE.md

Project

photos.kennethreitz.org (codename: ExifTree) — a personal photography portfolio organized by gear, places, and subjects. AI-powered metadata, EXIF-based discovery, infinite scroll.

Live: https://photos.kennethreitz.org Repo: github.com/kennethreitz/photos.kennethreitz.org Stack: Django 6.x · Python 3.14 · PostgreSQL · Celery · django-bolt · Tigris (S3) · OpenAI Deploy: Fly.io with GitHub Actions auto-deploy on push to main

Architecture

Single-tenant. One owner account. No public registration, no multi-user features.

Apps

  • core — All models (User, Image, ExifData, Camera, Lens, Tag, City, SiteConfig). Other apps import from here but never the reverse.
  • tree — Browse pages: cameras, lenses, tags, cities. No models, reads from core.
  • gallery — Collections for organizing photos into curated sets.
  • ingest — Upload pipeline, EXIF extraction, thumbnail generation, AI description, geocoding.
  • search — Full-text search across titles, descriptions, AI fields, and tags.

Image Pipeline (ingest)

  1. Validate format/size
  2. Extract EXIF
  3. Normalize camera/lens (deduplicate manufacturer strings)
  4. Compute perceptual hash (visual dedup)
  5. Generate thumbnails (small 300px, medium 800px, large 1600px)
  6. Create ExifData record
  7. Reverse geocode GPS to city (offline, blocks invalid countries)
  8. Apply cleanup rules (delete/fix based on date, country)
  9. Mark processed
  10. Dispatch AI description task (async via Celery)

Cleanup Rules

Defined in core/management/commands/cleanup.py and enforced inline in ingest/pipeline.py:

Delete: years 2008, 2019, 2020. Dates: Dec 26 2014, Dec 22 2017. Fix: clear date_taken for years before 2008 and 2021+ (incorrect EXIF dates). Cities: block CN, JP, KG, MN, RU entirely. India allows only Bangalore/Mysore.

Invalid countries are blocked at four levels: City.from_coordinates(), ingest/pipeline.py, geocode command, and cleanup command.

AI Metadata

GPT-4o-mini with structured output generates per-image:

  • Title — short, evocative (3-7 words)
  • Description — 2-3 sentences
  • Tags — 5-10 single-word lowercase tags

Configured via SiteConfig admin (OpenAI key + custom prompt).

Code Style

  • Python: PEP 8, type hints on function signatures
  • Django: fat models, thin views — logic lives on the model or in service functions
  • Imports: stdlib → third-party → django → local apps, separated by blank lines
  • Strings: double quotes for user-facing, single quotes for identifiers
  • Templates: HTMX for interactivity, vanilla JS only where required (upload drag-drop, manage multi-select)
  • Tests: use pytest + pytest-django

Models

  • UUIDField primary keys everywhere (not auto-increment)
  • created_at/updated_at timestamps on every model
  • SlugField on anything in a URL
  • ExifData stores raw EXIF as JSONField — never throw away the raw data
  • Camera/Lens are canonical: raw EXIF strings normalized via core/normalization.py

Frontend

Django templates + HTMX. No frontend framework. Session auth. Minimal vanilla JS.

  • Infinite scroll via HTMX hx-trigger="revealed" with stable shuffle per session
  • CSS cache-busting via content hash in context processor
  • Analytics snippet configurable in SiteConfig admin

URLs

  • / — home with infinite scroll, year filter
  • /cameras/, /cameras/<slug>/ — gear browsing
  • /lenses/, /lenses/<slug>/
  • /tags/, /tags/<slug>/ — AI-generated tag cloud
  • /cities/, /cities/<slug>/ — GPS-based location browsing
  • /collections/, /collections/<slug>/
  • /images/<uuid>/ — detail with EXIF bar, prev/next, keyboard nav
  • /manage/ — photo manager with multi-select, bulk actions, faceted filters
  • /upload/ — drag-drop upload with progress
  • /dashboard/ — owner dashboard
  • /search/ — full-text search with EXIF filters
  • /admin/ — Django admin (SiteConfig, models)

Infrastructure

  • Fly.io: single web process (django-bolt). The worker process group is currently absent in fly.toml — Celery tasks (AI metadata, etc.) have no consumer in production. Re-add when needed.
  • PostgreSQL: Fly Postgres on exiftree-db, single shared-cpu-1x:2048MB machine (bumped from 1GB after an OOM in mid-April 2026). Also Celery broker via sqla+postgresql://.
  • Tigris: S3-compatible object storage for all images (used locally and in prod). CORS is configured on the exiftree-media bucket for photos.kennethreitz.org + localhost:8000 + 127.0.0.1:8000.
  • Redis: local-only Celery broker (brew service). Not used in production.
  • GitHub Actions:
    • deploy.yml — auto-deploy on push to main via flyctl deploy --remote-only
    • daily-restart.ymlfly apps restart exiftree at 08:00 UTC. Originally added as a band-aid for an app wedge that surfaced after Postgres restarts; v118 (April 19 2026) appears to have fixed the wedge but the daily restart is still in place as a safety net.
  • python-dotenv: .env loaded automatically in manage.py

Management Commands

import_folder /path   # Bulk import with auto-seek, dedup, concurrent workers
import_flickr <user>  # Import from Flickr via API
ai_describe           # Backfill AI metadata (--tail for continuous watch)
geocode               # Batch reverse geocode GPS to cities
cleanup               # Run all cleanup rules
dedupe                # Remove visual duplicates via perceptual hash
reprocess             # Re-process stuck images

When Working on This

  • Don't add dependencies without discussing tradeoffs
  • Prefer Django builtins over third-party packages
  • Write reversible migrations
  • Keep core minimal — if logic could live in core or a feature app, default to the feature app
  • Cleanup rules must be mirrored in both the cleanup command and pipeline.py
  • Restart Celery workers after code changes (they cache old Python modules)
  • conn_max_age=60 and CELERY_BROKER_POOL_LIMIT=1 to prevent DB connection exhaustion