# Server: mercury

*Last verified: 2026-06-04*

## Access

| | |
|---|---|
| Hostname | `mercury` |
| Domain | `mercury.kennethreitz.org` |
| IP | `5.161.122.181` (Hetzner) |
| SSH | `ssh root@mercury.kennethreitz.org` (key auth) |
| Dokploy UI | https://mercury.kennethreitz.org |

## Specs

| | |
|---|---|
| OS | Ubuntu 26.04 LTS |
| Kernel | 7.0.0-15-generic |
| Server type | Hetzner Cloud CPX41 (upgraded from CPX31 2026-06-05; 8 vCPU, 15.6 GiB; root disk kept at 150 GB for downgrade flexibility) (id `136742397`, dc `ash-dc1`) |
| CPU | 8 vCPU |
| RAM | 15.6 GiB |
| Disk | 150 GB (`/dev/sda1`) |
| Volume | `mercury-objects` (id `105925944`), 750 GB ext4 at `/mnt/objects` (fstab, nofail); holds MinIO data |
| Swap | 4 GB swapfile (`/swapfile`, swappiness 10); added 2026-06-05 after Immich's arrival OOM-killed the Dokploy service (exit 137 loop → bad gateway). Immich server/ML now carry 2g/1.5g mem_limits. |
| Firewall | Hetzner Cloud Firewall `mercury-web` (id `11085164`): inbound 22/80/443tcp+udp/2222 + ICMP only. Blocks the otherwise-public Dokploy :3000 and Traefik dashboard :8080 (api.insecure). Manage via Hetzner API. |

## Stack

Docker **29.5.3** running in single-node **Swarm** mode (node `mercury`, manager/leader).

### Core services (Dokploy platform)

| Service | Image | Notes |
|---|---|---|
| `dokploy` | `dokploy/dokploy:v0.29.7` | Swarm service, port 3000 |
| `dokploy-postgres` | `postgres:16` | Swarm service, Dokploy's own DB |
| `dokploy-redis` | `redis:7` | Swarm service |
| `dokploy-traefik` | `traefik:v3.6.7` | Plain container; ports 80/443 (+443/udp for HTTP/3), 8080 |

Traefik terminates TLS for `mercury.kennethreitz.org` and proxies to the Dokploy UI.

## Deployed applications

See [inventory.md](inventory.md).
Currently:

- **httpbin**: https://httpbin.kennethreitz.org (`kennethreitz/httpbin`)
- **poemsbysarah**: https://poemsbysarah.com (built from `kennethreitz/sarah-poems`)
- **kjvstudy**: https://kjvstudy.org (built from `kennethreitz/kjvstudy.org`)
- **kennethreitz.org**: https://kennethreitz.org (built from `kennethreitz/kennethreitz.org`)
- **interpretations**: https://interpretations.kennethreitz.org (built from `kennethreitz/interpretations`)
- **photos**: https://photos.kennethreitz.org (compose: web + celery worker + postgres17 db, from `kennethreitz/photos.kennethreitz.org`)

## Deploys

All Swarm applications use `start-first` update ordering with rollback on failure (set via `application.update` → `updateConfigSwarm`). kennethreitz.org and kjvstudy additionally have Swarm healthchecks on `/health` (60s start period), so traffic only moves to a warmed container. Deploys are zero-downtime (verified by probing during a live deploy). Durations in these API fields are nanoseconds.

## TLS / ACME

Traefik's `letsencrypt` resolver uses the **HTTP-01 challenge**. All certs issued.

Lessons from the 2026-06-05 Fly migration (cost ~1.5h of cert warnings):

- While DNS still pointed at Fly, every validation failed; 5 failed authorizations/hour/domain trips Let's Encrypt's rate limiter, and **each retry during the stale window extends it**. When this happens, stop retrying and wait out the window (exact expiry is in the 429 in Traefik logs).
- After a rate-limit stall, Traefik does not retry on its own; restart the `dokploy-traefik` container to trigger fresh orders.
- DNS-01 via DNSimple is **not possible on this account**: lego requires an account token (`dnsimple_a_…`) and those aren't available at the current DNSimple plan level. HTTP-01 is the permanent strategy.
- **Doctrine for new domains** (avoids every cert failure we've had): create the DNS record FIRST, verify all four `ns*.dnsimple-edge.*` nameservers serve it (their edge propagation can lag many minutes), and only THEN attach the domain in Dokploy. If a validation still fails, wait out any rate-limit window and restart `dokploy-traefik` exactly once.
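
The "verify all nameservers serve it" step of the doctrine can be scripted. What follows is a minimal sketch, not part of the existing tooling: it assumes `dig` is installed on the machine running it, the function names (`query_nameserver`, `lagging_nameservers`, `safe_to_attach`) are made up for illustration, and the caller must pass in the real hostnames of the four DNSimple edge nameservers, which this document does not spell out.

```python
import subprocess


def query_nameserver(name: str, nameserver: str, record_type: str = "A") -> set[str]:
    """Ask one nameserver directly for a record via `dig +short` (assumes dig is installed)."""
    result = subprocess.run(
        ["dig", "+short", name, record_type, f"@{nameserver}"],
        capture_output=True,
        text=True,
        timeout=10,
        check=False,
    )
    # One answer per line; error text from dig simply won't match the expected value.
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def lagging_nameservers(answers: dict[str, set[str]], expected: str) -> list[str]:
    """Return the nameservers whose answer does not yet include the expected value."""
    return sorted(ns for ns, values in answers.items() if expected not in values)


def safe_to_attach(name: str, nameservers: list[str], expected: str,
                   query=query_nameserver) -> bool:
    """True only when every listed nameserver already serves the expected record."""
    answers = {ns: query(name, ns) for ns in nameservers}
    lagging = lagging_nameservers(answers, expected)
    for ns in lagging:
        print(f"{ns} does not serve {name} -> {expected} yet")
    return not lagging
```

A call would look like `safe_to_attach("newsite.example.org", edge_nameservers, "5.161.122.181")`, where `edge_nameservers` is a list of the four authoritative nameserver hostnames and `newsite.example.org` is a placeholder. Only attach the domain in Dokploy once this returns `True`; a `False` result means an HTTP-01 validation could still hit a stale answer and count toward the rate limit.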