mirror of
https://github.com/kennethreitz/photos.kennethreitz.org.git
synced 2026-06-05 06:46:13 +00:00
Comprehensive CLAUDE.md update for current state of project
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -2,10 +2,12 @@

## Project

ExifTree — a personal photography portfolio organized by the gear, places, and subjects that define it. AI-powered metadata, EXIF-based discovery, infinite scroll.
**photos.kennethreitz.org** (codename: ExifTree) — a personal photography portfolio organized by gear, places, and subjects. AI-powered metadata, EXIF-based discovery, infinite scroll.

**Live:** photos.kennethreitz.org
**Live:** https://photos.kennethreitz.org
**Repo:** github.com/kennethreitz/photos.kennethreitz.org
**Stack:** Django 6.x · Python 3.14 · PostgreSQL · Celery · django-bolt · Tigris (S3) · OpenAI
**Deploy:** Fly.io with GitHub Actions auto-deploy on push to main

## Architecture

@@ -13,75 +15,111 @@ Single-tenant. One owner account. No public registration, no multi-user features

### Apps

- `core` — Models (User, Image, ExifData, Camera, Lens, Tag, City, SiteConfig). Other apps import from here.
- `tree` — Browse pages: cameras, lenses, tags, cities. No models.
- `gallery` — Collections for organizing photos.
- `core` — All models (User, Image, ExifData, Camera, Lens, Tag, City, SiteConfig). Other apps import from here but never the reverse.
- `tree` — Browse pages: cameras, lenses, tags, cities. No models, reads from core.
- `gallery` — Collections for organizing photos into curated sets.
- `ingest` — Upload pipeline, EXIF extraction, thumbnail generation, AI description, geocoding.
- `search` — EXIF-powered search across all metadata including AI fields.

### Key Models

- **Image** — photos with thumbnails, AI title/description, tags (M2M), city (FK), visibility
- **ExifData** — parsed EXIF + raw JSON blob, linked to Camera/Lens
- **Tag** — AI-generated, used for word cloud browsing
- **City** — reverse-geocoded from GPS, grouped by continent/country/state
- **SiteConfig** — singleton for site title, tagline, analytics code, OpenAI key, AI prompt
- `search` — Full-text search across titles, descriptions, AI fields, and tags.

### Image Pipeline (ingest)

1. Validate → 2. Extract EXIF → 3. Normalize camera/lens → 4. Perceptual hash → 5. Generate thumbnails → 6. Create ExifData → 7. Reverse geocode to city → 8. Apply cleanup rules → 9. Mark processed

AI description happens async after processing via Celery task.
1. Validate format/size
2. Extract EXIF
3. Normalize camera/lens (deduplicate manufacturer strings)
4. Compute perceptual hash (visual dedup)
5. Generate thumbnails (small 300px, medium 800px, large 1600px)
6. Create ExifData record
7. Reverse geocode GPS to city (offline, blocks invalid countries)
8. Apply cleanup rules (delete/fix based on date, country)
9. Mark processed
10. Dispatch AI description task (async via Celery)

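As an illustration of step 5, the following is a minimal sketch of how the three thumbnail sizes could be computed. It assumes the pixel values bound the longer edge of the image and that small originals are never upscaled; neither detail is stated in the diff, and the names `THUMBNAIL_SIZES` and `thumbnail_dimensions` are hypothetical rather than taken from `ingest/pipeline.py`.

```python
# Hypothetical helper; names and the "longer edge" interpretation are assumptions.
THUMBNAIL_SIZES = {"small": 300, "medium": 800, "large": 1600}


def thumbnail_dimensions(width: int, height: int, long_edge: int) -> tuple[int, int]:
    """Scale (width, height) so the longer side equals long_edge, without upscaling."""
    longest = max(width, height)
    if longest <= long_edge:
        # Image is already small enough; keep the original dimensions.
        return width, height
    new_width = max(1, round(width * long_edge / longest))
    new_height = max(1, round(height * long_edge / longest))
    return new_width, new_height


def all_thumbnail_dimensions(width: int, height: int) -> dict[str, tuple[int, int]]:
    """Compute the target dimensions for every named thumbnail size."""
    return {
        name: thumbnail_dimensions(width, height, edge)
        for name, edge in THUMBNAIL_SIZES.items()
    }
```

For a 6000x4000 original this yields a 300x200 small thumbnail, with the medium and large sizes scaled the same way.
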
### Cleanup Rules

Defined in `core/management/commands/cleanup.py` and `ingest/pipeline.py`:
- Delete: 2008, 2019, 2020, Dec 26 2014, Dec 22 2017
- Fix: clear dates before 2008 and 2021+
- Cities: block CN, JP, KG, MN, RU. India only allows Bangalore/Mysore.
Defined in `core/management/commands/cleanup.py` and enforced inline in `ingest/pipeline.py`:

**Delete:** years 2008, 2019, 2020. Dates: Dec 26 2014, Dec 22 2017.
**Fix:** clear `date_taken` for years before 2008 and 2021+ (incorrect EXIF dates).
**Cities:** block CN, JP, KG, MN, RU entirely. India allows only Bangalore/Mysore.

Invalid countries are blocked at four levels: `City.from_coordinates()`, `ingest/pipeline.py`, `geocode` command, and `cleanup` command.

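The cleanup rules above can be sketched as two small pure functions. This is only an illustration under stated assumptions: the function and constant names are hypothetical, country codes are assumed to be ISO 3166-1 alpha-2 (with `IN` for India), city names are assumed to match the spellings in the rules, and delete rules are assumed to take precedence over the date fix.

```python
from datetime import date, datetime
from typing import Optional

# Values copied from the rules above; the names themselves are hypothetical.
DELETE_YEARS = {2008, 2019, 2020}
DELETE_DATES = {date(2014, 12, 26), date(2017, 12, 22)}
BLOCKED_COUNTRIES = {"CN", "JP", "KG", "MN", "RU"}
INDIA_ALLOWED_CITIES = {"Bangalore", "Mysore"}


def date_action(date_taken: Optional[datetime]) -> str:
    """Return "delete", "clear_date", or "keep" for a photo's date_taken."""
    if date_taken is None:
        return "keep"
    # Assumption: delete rules are checked before the date fix.
    if date_taken.year in DELETE_YEARS or date_taken.date() in DELETE_DATES:
        return "delete"
    if date_taken.year < 2008 or date_taken.year >= 2021:
        return "clear_date"
    return "keep"


def city_allowed(country_code: str, city_name: str) -> bool:
    """Apply the country block list and the India allow list."""
    if country_code in BLOCKED_COUNTRIES:
        return False
    if country_code == "IN":  # assumed ISO code for India
        return city_name in INDIA_ALLOWED_CITIES
    return True
```

Keeping the rule data in one place like this is one way to satisfy the requirement that the cleanup command and the pipeline stay in sync, though the diff does not say how the project actually shares them.
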
### AI Metadata

GPT-4o-mini with structured output generates per-image:
- **Title** — short, evocative (3-7 words)
- **Description** — 2-3 sentences
- **Tags** — 5-10 single-word lowercase tags

Configured via SiteConfig admin (OpenAI key + custom prompt).

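The constraints on the generated metadata lend themselves to a simple post-check. The sketch below validates a response against the title and tag limits listed above; the `AIMetadata` dataclass and `validate_metadata` function are hypothetical and do not reflect how the project parses the model's structured output.

```python
from dataclasses import dataclass


@dataclass
class AIMetadata:
    """Hypothetical container for one image's AI-generated fields."""
    title: str
    description: str
    tags: list[str]


def validate_metadata(meta: AIMetadata) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if not 3 <= len(meta.title.split()) <= 7:
        problems.append("title should be 3-7 words")
    if not 5 <= len(meta.tags) <= 10:
        problems.append("expected 5-10 tags")
    for tag in meta.tags:
        # A tag must be exactly one whitespace-free token in lowercase.
        if tag != tag.lower() or len(tag.split()) != 1:
            problems.append(f"tag {tag!r} should be a single lowercase word")
    return problems
```
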
## Code Style

- Python: PEP 8, type hints on function signatures
- Django: fat models, thin views
- Imports: stdlib → third-party → django → local apps
- Django: fat models, thin views — logic lives on the model or in service functions
- Imports: stdlib → third-party → django → local apps, separated by blank lines
- Strings: double quotes for user-facing, single quotes for identifiers
- Templates: HTMX for interactivity, vanilla JS only where required (upload, manage multi-select)
- Templates: HTMX for interactivity, vanilla JS only where required (upload drag-drop, manage multi-select)
- Tests: use pytest + pytest-django

## Models

- UUIDField primary keys everywhere
- UUIDField primary keys everywhere (not auto-increment)
- created_at/updated_at timestamps on every model
- SlugField on anything in a URL
- ExifData keeps raw JSON — never discard it
- ExifData stores raw EXIF as JSONField — never throw away the raw data
- Camera/Lens are canonical: raw EXIF strings normalized via `core/normalization.py`

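To make the idea of canonical Camera names concrete, here is a minimal sketch of one kind of normalization: collapsing whitespace and avoiding a repeated manufacturer prefix when the EXIF model string already contains the make. The function `canonical_camera_name` is hypothetical; the actual rules in `core/normalization.py` are not shown in this diff and may differ.

```python
def canonical_camera_name(make: str, model: str) -> str:
    """Combine EXIF make and model into one display name without repeating the make.

    Hypothetical sketch; not the project's real normalization logic.
    """
    # Collapse runs of whitespace and trim the ends.
    make = " ".join(make.split())
    model = " ".join(model.split())
    if not make:
        return model
    # Many cameras already embed the make in the model string.
    if model.lower().startswith(make.lower()):
        return model
    return f"{make} {model}".strip()
```
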
## Frontend

Django templates + HTMX. No frontend framework. Minimal JS. Session auth (not JWT).
Django templates + HTMX. No frontend framework. Session auth. Minimal vanilla JS.

- Infinite scroll via HTMX `hx-trigger="revealed"` with stable shuffle per session
- CSS cache-busting via content hash in context processor
- Analytics snippet configurable in SiteConfig admin

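A "stable shuffle per session" means each visitor sees a random order that does not change between infinite-scroll page requests. One way this could work is to seed a random generator with a per-session value, as in the sketch below. The names `stable_shuffle` and `page_of`, the seed format, and the page size are assumptions for illustration, not the project's implementation.

```python
import random


def stable_shuffle(image_ids: list[str], session_seed: str) -> list[str]:
    """Return the ids in a pseudo-random order that depends only on the seed."""
    # Sort first so the result does not depend on the input order.
    ordered = sorted(image_ids)
    random.Random(session_seed).shuffle(ordered)
    return ordered


def page_of(image_ids: list[str], session_seed: str, page_number: int,
            page_size: int = 24) -> list[str]:
    """Slice one zero-indexed page out of the session's shuffled order."""
    shuffled = stable_shuffle(image_ids, session_seed)
    start = page_number * page_size
    return shuffled[start:start + page_size]
```

Because every request with the same seed reproduces the same order, consecutive pages never repeat or skip an image, provided the underlying set of ids does not change mid-session.
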
## URLs

- `/cameras/`, `/cameras/<slug>/`
- `/` — home with infinite scroll, year filter
- `/cameras/`, `/cameras/<slug>/` — gear browsing
- `/lenses/`, `/lenses/<slug>/`
- `/tags/`, `/tags/<slug>/`
- `/cities/`, `/cities/<slug>/`
- `/tags/`, `/tags/<slug>/` — AI-generated tag cloud
- `/cities/`, `/cities/<slug>/` — GPS-based location browsing
- `/collections/`, `/collections/<slug>/`
- `/images/<uuid>/`
- `/manage/`, `/upload/`, `/dashboard/`, `/search/`
- `/images/<uuid>/` — detail with EXIF bar, prev/next, keyboard nav
- `/manage/` — photo manager with multi-select, bulk actions, faceted filters
- `/upload/` — drag-drop upload with progress
- `/dashboard/` — owner dashboard
- `/search/` — full-text search with EXIF filters
- `/admin/` — Django admin (SiteConfig, models)

## Infrastructure

- **Fly.io**: web (runbolt) + worker (celery) processes
- **PostgreSQL**: Fly Postgres, also Celery broker via SQLAlchemy transport
- **Tigris**: S3-compatible object storage for images (used locally and in prod)
- **Redis**: local Celery broker (brew service)
- **python-dotenv**: .env loaded automatically via manage.py
- **Fly.io**: two processes — `web` (django-bolt) + `worker` (celery -c 2)
- **PostgreSQL**: Fly Postgres (4GB dedicated). Also Celery broker via `sqla+postgresql://`
- **Tigris**: S3-compatible object storage for all images (used locally and in prod)
- **Redis**: local-only Celery broker (brew service). Not used in production.
- **GitHub Actions**: auto-deploy on push to main via `flyctl deploy --remote-only`
- **python-dotenv**: `.env` loaded automatically in `manage.py`

## Management Commands

```
import_folder /path    # Bulk import with auto-seek, dedup, concurrent workers
import_flickr <user>   # Import from Flickr via API
ai_describe            # Backfill AI metadata (--tail for continuous watch)
geocode                # Batch reverse geocode GPS to cities
cleanup                # Run all cleanup rules
dedupe                 # Remove visual duplicates via perceptual hash
reprocess              # Re-process stuck images
```

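For readers unfamiliar with perceptual-hash deduplication as used by the `dedupe` command listed above, the sketch below shows the usual comparison step: two images are treated as visual duplicates when their hashes differ in only a few bits. It assumes hashes are stored as integers and uses an illustrative threshold of 4 bits; the hash algorithm, storage format, and threshold the project actually uses are not given in the diff, and both function names are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two integer hashes."""
    return bin(a ^ b).count("1")


def find_duplicates(hashes: dict[str, int], max_distance: int = 4) -> list[tuple[str, str]]:
    """Return pairs of image names whose hashes are within max_distance bits.

    Hypothetical sketch; compares every pair, which is fine for small sets.
    """
    names = sorted(hashes)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if hamming_distance(hashes[first], hashes[second]) <= max_distance:
                pairs.append((first, second))
    return pairs
```
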
## When Working on This

- Don't add dependencies without discussing tradeoffs
- Prefer Django builtins over third-party packages
- Write reversible migrations
- Keep cleanup rules in the cleanup command, mirrored in pipeline.py
- Invalid GPS countries are blocked in City.from_coordinates, pipeline, geocode command, AND cleanup
- The `ai_describe --tail` command watches for new images continuously
- Restart Celery workers after code changes (`kill` + re-launch)
- Keep core minimal — if logic could live in core or a feature app, default to the feature app
- Cleanup rules must be mirrored in both the cleanup command and pipeline.py
- Restart Celery workers after code changes (they cache old Python modules)
- `conn_max_age=60` and `CELERY_BROKER_POOL_LIMIT=1` to prevent DB connection exhaustion

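The connection-limit note above might translate into a settings fragment along these lines. This is a sketch only: the database name and credentials are placeholders, and it assumes the `conn_max_age=60` value ends up as Django's `CONN_MAX_AGE` database option. The diff does not show the project's real settings file.

```python
# settings.py fragment (illustrative; names and credentials are placeholders)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "photos",  # hypothetical database name
        # Reuse each connection for up to 60 seconds instead of reconnecting per request.
        "CONN_MAX_AGE": 60,
    }
}

# PostgreSQL doubles as the Celery broker through the SQLAlchemy transport.
CELERY_BROKER_URL = "sqla+postgresql://user:password@localhost/photos"  # placeholder URL
# Keep the broker connection pool to a single connection per process.
CELERY_BROKER_POOL_LIMIT = 1
```
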