Capacity Planning

| Resource | Driven by | Main signals |
| --- | --- | --- |
| HTTP/gRPC | Login, lists, Provider browse, polling | http_requests_total, latency, errors |
| WebSocket | Online users, rooms, multi-tab clients | websocket_connections_active, message rate |
| PostgreSQL | Rooms, members, playlists, permissions | Pool usage, query latency, waiting connections |
| Redis | Rate limits, OAuth2/WebAuthn state, cache, cluster | Latency, errors, pub/sub health |
| Provider/proxy | Upstream requests, Range, bandwidth, cache | Provider errors, proxy latency, cache hit ratio |
| Livestream | RTMP publish, HLS/FLV, storage | Publishers, viewers, bytes, pull errors |

| Shape | Fits | Required action |
| --- | --- | --- |
| Single SyncTV + PostgreSQL | Small evaluation or personal production | Persistent secrets, DB backups, TLS |
| Single node + Redis | Recommended production baseline | Redis HA or accepted short-state loss |
| Multi-replica SyncTV | Rolling update or horizontal scale | Shared PostgreSQL, Redis, cluster secret, drain |
| Multi-replica + livestream | Large live or highly available entry | Clear HLS backend and publisher proxy/storage validation |

Do not add replicas until the boundaries for the database, Redis, secrets, and HLS storage are clear.

  1. Estimate concurrent users and connections per user.
  2. Estimate active rooms, members per room, and message frequency.
  3. Estimate login, refresh, Provider browse, playback info, and list requests per minute.
  4. Estimate the media mode mix (direct, proxy, livestream) and the average bitrate.
  5. Calculate proxy bandwidth: proxy viewers times average bitrate times peak factor.
  6. Calculate database pool total: replicas times database.max_connections.
  7. Set independent alerts for Redis, PostgreSQL, Ingress, and SyncTV.
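
Steps 1 to 3 above can be turned into a small worksheet. The following is a minimal sketch, not part of SyncTV: the class, its field names, and the sample numbers are all hypothetical planning inputs, and the message figure assumes each room message is delivered once to every member of that room.

```python
from dataclasses import dataclass


@dataclass
class LoadEstimate:
    # Every field is a planning input you supply; none is a SyncTV setting.
    concurrent_users: int
    connections_per_user: float        # e.g. multi-tab clients
    active_rooms: int
    members_per_room: float
    messages_per_room_per_sec: float
    requests_per_user_per_min: float   # login, refresh, browse, playback info, lists

    def websocket_connections(self) -> float:
        # Step 1: concurrent users times connections per user.
        return self.concurrent_users * self.connections_per_user

    def message_deliveries_per_sec(self) -> float:
        # Step 2: assumes each message fans out to every member of its room.
        return (self.active_rooms * self.members_per_room
                * self.messages_per_room_per_sec)

    def http_requests_per_sec(self) -> float:
        # Step 3: per-minute request estimate converted to per-second.
        return self.concurrent_users * self.requests_per_user_per_min / 60.0


estimate = LoadEstimate(
    concurrent_users=500,
    connections_per_user=1.2,
    active_rooms=50,
    members_per_room=10,
    messages_per_room_per_sec=0.5,
    requests_per_user_per_min=6,
)
print("WebSocket connections:", estimate.websocket_connections())
print("Message deliveries/s:", estimate.message_deliveries_per_sec())
print("HTTP requests/s:", estimate.http_requests_per_sec())
```

Compare the resulting connection count with the limits you configure, and use the request rate as the starting target for load tests.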

| Configuration | Purpose |
| --- | --- |
| database.max_connections | Per-replica database pool limit |
| redis.* | L2 cache, rate limits, short-lived state, cluster coordination |
| connection_limits.* | WebSocket user, room, global, lifetime, and per-connection message limits |
| request_rate_limits.websocket_* | WebSocket connection-attempt limits |
| request_rate_limits.* | HTTP login, API, media, admin, and streaming limits |
| request_rate_limits.* | gRPC API and verification limits |
| messaging_rate_limits.* | Chat message limits |
| proxy_slice_cache.* | Range slice cache and file backend |
| server.shutdown_drain_timeout_seconds | Connection drain during rolling updates |
| cluster.* | Discovery, leader election, catch-up |
| livestream.* | RTMP/FLV/HLS and backend |

See Configuration Index.

Database guidance:

  • Keep total pool size below the database limit, leaving room for migrations and operations.
  • In multi-replica mode, multiply pool size by replica count.
  • Use pagination for large lists.
  • Validate migrations against production-like data.
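
The pool rule above is simple arithmetic, and a sketch makes the headroom explicit. The function below is illustrative only: `pool_budget`, its parameters, and the sample values are hypothetical, and `reserved` stands for whatever connection headroom you keep for migrations and operator sessions.

```python
def pool_budget(replicas: int, max_connections_per_replica: int,
                database_limit: int, reserved: int) -> dict:
    """Compare total pool demand with the database connection limit."""
    if max_connections_per_replica <= 0:
        raise ValueError("max_connections_per_replica must be positive")
    # Total demand: every replica may open its full pool.
    total = replicas * max_connections_per_replica
    # Connections actually available to SyncTV after the reserve.
    usable = database_limit - reserved
    return {
        "total_pool": total,
        "usable_limit": usable,
        "fits": total <= usable,
        # Largest replica count that still fits at this pool size.
        "max_replicas": max(usable, 0) // max_connections_per_replica,
    }


budget = pool_budget(replicas=3, max_connections_per_replica=20,
                     database_limit=100, reserved=15)
print(budget)
```

With these sample numbers, three replicas need 60 connections against 85 usable ones, and a fifth replica would no longer fit without lowering the per-replica pool.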

Redis guidance:

  • Multi-replica mode requires shared Redis and a consistent redis.key_prefix.
  • Redis loss affects OAuth2 state, WebAuthn challenges, email codes, rate limits, the token blacklist, and short-lived cluster state.
  • If strong token revocation matters, use Redis HA and a shorter JWT access token lifetime.

Proxy playback puts SyncTV in the media data path:

proxy egress bandwidth = concurrent_proxy_viewers * average_bitrate * peak_factor

Example:

40 viewers * 6 Mbps * 1.3 = 312 Mbps
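
The same calculation as a small helper, which is convenient when comparing several viewer counts or bitrates. This is a sketch under the formula above; the function name is hypothetical, and the megabyte figure is a plain division by 8 that ignores protocol overhead.

```python
def proxy_egress_mbps(concurrent_proxy_viewers: int,
                      average_bitrate_mbps: float,
                      peak_factor: float) -> float:
    """Proxy egress bandwidth in Mbps, as defined by the formula above."""
    return concurrent_proxy_viewers * average_bitrate_mbps * peak_factor


egress = proxy_egress_mbps(40, 6.0, 1.3)
print(f"{egress:.0f} Mbps")        # 312 Mbps
print(f"{egress / 8:.0f} MB/s")    # 39 MB/s
```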

Slice cache can reduce upstream fetches for shared content, but it does not reduce SyncTV-to-client egress bandwidth.
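
A rough model shows why the cache helps only one side of the proxy. The sketch below assumes a byte-level cache hit ratio, which is a simplification; the function name and the 75% figure are hypothetical.

```python
def proxy_traffic_mbps(egress_mbps: float,
                       cache_hit_ratio: float) -> tuple[float, float]:
    """Return (upstream_mbps, client_egress_mbps) for a given hit ratio."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    # Only cache misses are fetched from the upstream Provider.
    upstream = egress_mbps * (1.0 - cache_hit_ratio)
    # Every byte still has to be sent to clients, hit or miss.
    return upstream, egress_mbps


upstream, client_egress = proxy_traffic_mbps(312.0, 0.75)
print(upstream, client_egress)
```

With a 75% hit ratio, upstream traffic drops to a quarter of the egress figure while the SyncTV-to-client link still carries the full 312 Mbps.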

| Alert | Threshold idea |
| --- | --- |
| Sustained HTTP 5xx | Route-level growth for 5-10 minutes |
| p95 HTTP latency | Separate API, Provider, and proxy paths |
| WebSocket active near limit | Warn around 70%-80% |
| DB connection waiting | Sustained nonzero waiting is actionable |
| Redis pub/sub unhealthy | Multi-replica realtime risk |
| Provider timeout/5xx | Separate upstream and local network |
| Livestream pull errors | Check publisher, HLS backend, slow clients |

Metric names are listed in Metrics Catalog.
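
Two of the threshold ideas, the WebSocket utilization warning and sustained nonzero DB waiting, reduce to short predicates. The sketch below only illustrates the logic; in practice these checks would live in your alerting system. The function names, the 0.75 default, and the sample values are assumptions, not SyncTV defaults.

```python
from typing import Sequence


def websocket_near_limit(active: int, limit: int,
                         warn_ratio: float = 0.75) -> bool:
    """True when active connections reach the warning share of the limit."""
    return limit > 0 and active / limit >= warn_ratio


def sustained_nonzero(samples: Sequence[float], window: int) -> bool:
    """True when the most recent `window` samples are all above zero."""
    if window <= 0 or len(samples) < window:
        return False
    return all(value > 0 for value in samples[-window:])


print(websocket_near_limit(active=800, limit=1000))   # True
print(sustained_nonzero([0, 2, 1, 3], window=3))      # True
print(sustained_nonzero([2, 0, 1], window=3))         # False
```

Requiring every sample in the window to be nonzero keeps a single brief spike in waiting connections from triggering the alert.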

Test real login, room list, member list, playback info, and Provider browse paths. Do not load test only /health/ready.
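
One way to plan such a test is to split a target request rate across the real paths by weight. This is a minimal sketch: the scenario names and weights are hypothetical placeholders, and the result is meant to feed whichever load testing tool you already use.

```python
def split_load(target_rps: float, weights: dict[str, float]) -> dict[str, float]:
    """Distribute a target request rate across scenarios by relative weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {name: target_rps * weight / total
            for name, weight in weights.items()}


# Hypothetical traffic mix; replace with ratios observed in your own metrics.
mix = {
    "login": 1,
    "room_list": 4,
    "member_list": 2,
    "playback_info": 2,
    "provider_browse": 1,
}
plan = split_load(50, mix)
for scenario, rps in plan.items():
    print(f"{scenario}: {rps:.1f} req/s")
```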