CLUSTER · prod-east · 24/7
k8s · autoscale · systemd timers · deployment tier
Web Users
browsers · HTTPS
Mobile App
iOS · Android · REST
Internal Admin
CLI · dashboards
CDN
Fastly · edge cache
TLS 1.3 · HTTP/3
Identity Provider
Auth0 · OIDC
JWT · PKCE
Observability
Datadog · Sentry
OTEL · traces · logs
Container Registry
GHCR · image pulls
deploy-time only
http-gateway proc
nginx + envoy
routing · rate-limit · TLS
3 replicas · HPA
websocket-gateway proc
sticky sessions
auth · fanout · heartbeat
2 replicas
api-gateway
GraphQL + REST
schema · dataloader
4 replicas
queue-listener
consumer group
delivery · retry · DLQ
2 replicas
Event Bus · Kafka
events.ingress.* · events.domain.* · events.outbound.* · 3-broker cluster · replication 3
service layer · hub
Router
request dispatch
circuit breaker
Validator
schema + RBAC
pydantic · zod
Executor
business logic
idempotency · retry
Domain Service
aggregate roots
tx · event-emit
Cache Layer
read-through + TTL
write-invalidate
Compute Pool
async workers
CPU-bound · queue
Auth · RBAC · trust
background workers
bg-workers
cron · delayed jobs
celery / sidekiq · circuit breaker
otel-agent · log-shipper
metrics · traces · logs
sidecar pattern
PostgreSQL
primary + 2 replicas
partitioned · WAL
Redis
cluster mode · AOF
cache · queue · locks
Object Storage
S3 · versioned
lifecycle · archival
ML cluster · GPU · async
Inference Service
Triton · TensorRT
Training Pipeline
Airflow · GPU-nodes
Runtime · Gateway Layer
- http-gateway, websocket-gateway, api-gateway as k8s deployments
- HPA scales on p95 latency + CPU
- nginx + envoy sidecar pattern
- queue-listener as a consumer group (delivery, retry, DLQ)
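The HPA bullet above can be illustrated with the replica formula from the Kubernetes documentation: each metric proposes ceil(current_replicas * current_value / target_value), and the largest proposal wins. This is a minimal sketch, not the cluster's actual configuration. The target values (250 ms p95, 70% CPU), the replica bounds, and the function name `desired_replicas` are assumptions for illustration, and the real controller's tolerance band and stabilization window are left out.

```python
import math

def desired_replicas(current_replicas, metrics, min_replicas=1, max_replicas=10):
    """Return the replica count an HPA-style controller would aim for.

    metrics maps a metric name to (current_value, target_value).
    Each metric proposes ceil(current_replicas * current / target);
    the largest proposal is used, then clamped to the allowed range.
    """
    proposals = [
        math.ceil(current_replicas * current / target)
        for current, target in metrics.values()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))

# Hypothetical readings for http-gateway running 3 replicas.
readings = {
    "p95_latency_ms": (400.0, 250.0),  # target value is an assumption
    "cpu_percent": (60.0, 70.0),       # target value is an assumption
}
print(desired_replicas(3, readings))  # prints 5
```

Here latency proposes 5 replicas and CPU proposes 3, so the latency metric drives the scale-up even though CPU is under its target.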
Data · Storage
- Postgres primary + 2 streaming replicas (WAL)
- Redis cluster: cache, rate-limit, distributed locks
- S3 versioned buckets with lifecycle rules
- Kafka 3-broker cluster, replication factor 3
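The cache role listed for Redis matches the Cache Layer labels in the diagram (read-through + TTL, write-invalidate). The sketch below shows that pattern with an in-memory dictionary standing in for Redis and another standing in for Postgres. The class name `ReadThroughCache`, the 30-second TTL, and the key format are assumptions for illustration, not details taken from the system.

```python
import time

class ReadThroughCache:
    """In-memory stand-in for a Redis cache in front of a database."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader          # called on a miss, e.g. a database read
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries = {}             # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]            # fresh hit
        value = self._loader(key)      # miss or expired: read through
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Call after a write so the next read reloads from the source."""
        self._entries.pop(key, None)

# Hypothetical backing store standing in for a Postgres table.
database = {"user:1": "alice"}
cache = ReadThroughCache(loader=database.__getitem__, ttl_seconds=30)

print(cache.get("user:1"))   # loads from the database, then caches
database["user:1"] = "alicia"
print(cache.get("user:1"))   # still the cached value until invalidated
cache.invalidate("user:1")
print(cache.get("user:1"))   # reloads the updated value
```

The TTL bounds how long a stale entry can survive if an invalidation is missed, which is why the two mechanisms are usually paired.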
Transport · Integrations
- Event topics: events.{ingress,domain,outbound}.*
- Auth0 OIDC / JWT + PKCE
- Datadog + Sentry for telemetry
- ML cluster reached asynchronously via a dedicated topic
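The topic naming scheme events.{ingress,domain,outbound}.* lends itself to a simple classifier that a consumer could use to decide which family a message belongs to. This is a sketch only. The helper `topic_family` and the example topic names are hypothetical, and the real system may route on different criteria.

```python
TOPIC_FAMILIES = ("ingress", "domain", "outbound")

def topic_family(topic):
    """Return the family of an events.<family>.<name> topic, or None."""
    parts = topic.split(".")
    if (
        len(parts) >= 3
        and parts[0] == "events"
        and parts[1] in TOPIC_FAMILIES
        and all(parts[2:])             # reject empty name segments
    ):
        return parts[1]
    return None

# Topic names below are hypothetical examples.
for name in ("events.ingress.http", "events.domain.order.created", "metrics.cpu"):
    print(name, "->", topic_family(name))
```

Topics outside the three families return None, so a caller can drop them or send them to a dead-letter path.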