Capacity Planning
Planning Dimensions
| Resource | Driven by | Main signals |
|---|---|---|
| HTTP/gRPC | Login, lists, Provider browse, polling | http_requests_total, latency, errors |
| WebSocket | Online users, rooms, multi-tab clients | websocket_connections_active, message rate |
| PostgreSQL | Rooms, members, playlists, permissions | Pool usage, query latency, waiting connections |
| Redis | Rate limits, OAuth2/WebAuthn state, cache, cluster | Latency, errors, pub/sub health |
| Provider/proxy | Upstream requests, Range, bandwidth, cache | Provider errors, proxy latency, cache hit ratio |
| Livestream | RTMP publish, HLS/FLV, storage | Publishers, viewers, bytes, pull errors |
Deployment Shapes
| Shape | Fits | Required action |
|---|---|---|
| Single SyncTV + PostgreSQL | Small evaluation or personal production | Persistent secrets, DB backups, TLS |
| Single node + Redis | Recommended production baseline | Redis HA or accepted short-state loss |
| Multi-replica SyncTV | Rolling update or horizontal scale | Shared PostgreSQL, Redis, cluster secret, drain |
| Multi-replica + livestream | Large live or highly available entry | Clear HLS backend and publisher proxy/storage validation |
Do not add replicas before database, Redis, secrets, and HLS storage boundaries are clear.
Estimate Capacity
- Estimate concurrent users and connections per user.
- Estimate active rooms, members per room, and message frequency.
- Estimate login, refresh, Provider browse, playback info, and list requests per minute.
- Estimate media mode: direct, proxy, livestream, average bitrate.
- Calculate proxy bandwidth: proxy viewers times average bitrate times peak factor.
- Calculate database pool total: replicas times database.max_connections.
- Set independent alerts for Redis, PostgreSQL, Ingress, and SyncTV.
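The first and sixth estimates above can be turned into a small calculation. The following is a minimal sketch, not part of SyncTV: the `LoadEstimate` class, its field names, and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
import math
from dataclasses import dataclass


@dataclass
class LoadEstimate:
    """Hypothetical planning inputs; none of these names come from SyncTV."""

    concurrent_users: int        # peak users online at the same time
    connections_per_user: float  # average, e.g. 1.5 for multi-tab clients
    replicas: int                # number of SyncTV replicas
    db_max_connections: int      # database.max_connections on each replica

    def websocket_connections(self) -> int:
        # Round up so a fractional average never under-counts connections.
        return math.ceil(self.concurrent_users * self.connections_per_user)

    def db_pool_total(self) -> int:
        # Every replica opens its own pool, so the pools add up.
        return self.replicas * self.db_max_connections


if __name__ == "__main__":
    estimate = LoadEstimate(
        concurrent_users=500,
        connections_per_user=1.5,
        replicas=3,
        db_max_connections=20,
    )
    print("WebSocket connections:", estimate.websocket_connections())
    print("Database pool total:", estimate.db_pool_total())
```

Comparing the first result with the configured WebSocket limits and the second with the database connection limit gives a quick sanity check before any load test.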
Important Configuration
| Configuration | Purpose |
|---|---|
| database.max_connections | Per-replica database pool limit |
| redis.* | L2 cache, rate limits, short-lived state, cluster coordination |
| connection_limits.* | WebSocket user, room, global, lifetime, and per-connection message limits |
| request_rate_limits.websocket_* | WebSocket connection-attempt limits |
| request_rate_limits.* | HTTP login, API, media, admin, and streaming limits |
| request_rate_limits.* | gRPC API and verification limits |
| messaging_rate_limits.* | Chat message limits |
| proxy_slice_cache.* | Range slice cache and file backend |
| server.shutdown_drain_timeout_seconds | Connection drain during rolling updates |
| cluster.* | Discovery, leader election, catch-up |
| livestream.* | RTMP/FLV/HLS and backend |
See Configuration Index.
Database and Redis
Database guidance:
- Keep total pool size below the database limit, leaving room for migrations and operations.
- In multi-replica mode, multiply pool size by replica count.
- Use pagination for large lists.
- Validate migrations against production-like data.
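The pool-size guidance above can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: the function names, the `reserved` parameter, and the sample values are hypothetical, and the database limit should be read from the actual PostgreSQL server rather than assumed.

```python
def pool_headroom(replicas: int, pool_per_replica: int, db_limit: int, reserved: int) -> int:
    """Connections left after all replica pools and the reserve.

    A negative result means the planned pools exceed the database limit.
    `reserved` is the room kept for migrations and operational access.
    """
    return db_limit - reserved - replicas * pool_per_replica


def max_replicas(pool_per_replica: int, db_limit: int, reserved: int) -> int:
    """Largest replica count whose combined pools still fit under the limit."""
    if pool_per_replica <= 0:
        raise ValueError("pool_per_replica must be positive")
    return max(0, (db_limit - reserved) // pool_per_replica)


if __name__ == "__main__":
    # Example values only: a database limit of 100 with 10 connections reserved.
    print(pool_headroom(replicas=3, pool_per_replica=20, db_limit=100, reserved=10))
    print(max_replicas(pool_per_replica=20, db_limit=100, reserved=10))
```

If old and new replicas overlap during a rolling update, the replica count used here should be the temporary peak, not the steady-state count.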
Redis guidance:
- Multi-replica mode requires shared Redis and a consistent redis.key_prefix.
- Redis loss affects OAuth2 state, WebAuthn challenges, email codes, rate limits, the token blacklist, and cluster short-lived state.
- If strong token revocation matters, use Redis HA and a shorter JWT access token lifetime.
Media Bandwidth
Proxy playback puts SyncTV in the media data path:

    proxy egress bandwidth = concurrent_proxy_viewers * average_bitrate * peak_factor

Example:

    40 * 6 Mbps * 1.3 = 312 Mbps

Slice cache can reduce upstream fetches for shared content, but it does not reduce SyncTV-to-client egress bandwidth.
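The bandwidth formula can be scripted so that different viewer counts and bitrates are easy to compare. This is a minimal sketch: both function names are hypothetical, and the upstream estimate assumes the cache hit ratio is measured by bytes and applies evenly across all traffic, which real workloads may not match.

```python
def proxy_egress_mbps(concurrent_proxy_viewers: int, average_bitrate_mbps: float, peak_factor: float) -> float:
    """SyncTV-to-client bandwidth for proxied playback, in Mbps."""
    return concurrent_proxy_viewers * average_bitrate_mbps * peak_factor


def upstream_fetch_mbps(egress_mbps: float, cache_hit_ratio: float) -> float:
    """Rough upstream bandwidth after slice cache hits (assumed byte-based ratio).

    Only the upstream side shrinks; client-facing egress stays the same.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return egress_mbps * (1.0 - cache_hit_ratio)


if __name__ == "__main__":
    egress = proxy_egress_mbps(40, 6.0, 1.3)  # the example from the text
    print(f"egress:   {egress:.0f} Mbps")
    print(f"upstream: {upstream_fetch_mbps(egress, 0.5):.0f} Mbps at a 50% hit ratio")
```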
Alert Starting Points
| Alert | Threshold idea |
|---|---|
| Sustained HTTP 5xx | Route-level growth for 5-10 minutes |
| p95 HTTP latency | Separate API, Provider, and proxy paths |
| WebSocket active near limit | Warn around 70%-80% |
| DB connection waiting | Sustained nonzero waiting is actionable |
| Redis pub/sub unhealthy | Multi-replica realtime risk |
| Provider timeout/5xx | Separate upstream and local network |
| Livestream pull errors | Check publisher, HLS backend, slow clients |
Metric names are listed in Metrics Catalog.
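Two of the threshold ideas in the table can be expressed as simple predicates. This is an illustrative sketch only: the function names, the 0.75 default, and the sample readings are hypothetical, and in practice these rules would normally live in the alerting system rather than in application code.

```python
from typing import Sequence


def websocket_near_limit(active: int, limit: int, warn_ratio: float = 0.75) -> bool:
    """True when active connections reach the warning share of the limit.

    The 0.75 default is an assumed midpoint of the 70%-80% range.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return active / limit >= warn_ratio


def sustained_nonzero(samples: Sequence[int], min_samples: int) -> bool:
    """True when the most recent `min_samples` readings are all above zero.

    Intended for the DB waiting-connections signal: one spike is ignored,
    a run of nonzero readings is treated as actionable.
    """
    if min_samples <= 0 or len(samples) < min_samples:
        return False
    return all(value > 0 for value in samples[-min_samples:])


if __name__ == "__main__":
    print(websocket_near_limit(active=800, limit=1000))
    print(sustained_nonzero([0, 2, 3, 1], min_samples=3))
```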
Validation Tests
Test real login, room lists, member lists, playback info, and Provider browse paths. Do not only load test /health/ready.
Simulate connections, heartbeat, chat, playback state, and resource observation.
Test direct, proxy, Range seek, and slice cache hits. Record upstream and SyncTV egress bandwidth.
Deploy while WebSocket and live connections exist; confirm readiness removes traffic and drain time is enough.