
Cluster Configuration

Single-process deployments usually do not need cluster mode.

Enable it when:

  • Multiple SyncTV replicas serve the same instance.
  • Multiple servers share one PostgreSQL and Redis backend.
  • Cross-node room sync, kicks, cache invalidation, livestream coordination, or leader election are required.
cluster:
  enabled: true

With cluster mode enabled, Redis and server.cluster_secret become required.
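
For orientation, a minimal sketch combining the two confirmed requirements; the Redis connection itself is configured separately and its keys are not shown on this page.

server:
  cluster_secret: "a-long-random-shared-string"  # identical on every replica, never sent to clients
cluster:
  enabled: true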

Cluster mode solves runtime consistency when multiple SyncTV processes serve the same instance. It is not just a load-balancing switch, and it does not replace PostgreSQL, Redis, Ingress, or a deliberate livestream storage/proxy model.

Single Source of Truth

PostgreSQL stores durable state: users, rooms, permissions, providers, preferences, and audit data. Every node must use the same database.

Runtime Coordination

Redis stores ephemeral shared state: node registration, pub/sub, Redis Stream catch-up, leader election, rate limits, and short-lived auth state.

Inter-Node Trust

server.cluster_secret authenticates inter-node gRPC calls. It must be identical across replicas and must not be exposed to clients.

Livestream Reachability

RTMP publishers can land on any node, and HLS segments can be requested from any node; livestreaming therefore needs a publisher registry and a deliberate choice between local and shared HLS backends.

[Figure: SyncTV cluster runtime architecture. Clients reach multiple nodes through HTTP/gRPC; nodes share PostgreSQL and Redis and read livestream segments through an HLS backend or the publisher-node proxy.]
In cluster mode, durable business state goes to PostgreSQL and cross-node runtime state goes to Redis. HLS can use local backends with publisher-node proxying, or shared filesystem/OSS backends for direct segment reads from every replica.
  1. Each node starts with the same PostgreSQL and Redis backends and derives or reads its node identity.
  2. Nodes register and discover peers through cluster.discovery_mode. The default redis mode fits most environments; Kubernetes deployments can use k8s_dns to assist Pod discovery.
  3. Room events, permission changes, cache invalidations, and kicks are distributed through Redis pub/sub. After a short disconnect, nodes replay recent Redis Stream events within cluster.catchup_window_secs.
  4. Background work uses cluster.leader_election_mode so only one replica performs global tasks at a time.
  5. Livestream publishers are recorded in a shared registry so another node can determine which node owns a room/media publisher.
  6. If an HLS request lands on a non-publisher node and the segment is not directly readable there, the node proxies playlist/segment reads to the publisher node through the HLS gRPC proxy.
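
The sketch below maps these steps to the configuration keys they name; the values shown are the documented defaults.

cluster:
  enabled: true
  discovery_mode: "redis"        # step 2: node registration and peer discovery
  catchup_window_secs: 300       # step 3: Redis Stream replay window after a short disconnect
  leader_election_mode: "redis"  # step 4: one replica runs global background tasks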
Shape | Use case | Recommended configuration
Single process | Small self-hosted instance, development, testing | cluster.enabled=false; Redis is optional but recommended in production
Fixed multi-node | Stable VM or bare-metal node count | cluster.enabled=true, discovery_mode=static or redis
Kubernetes replicas | Horizontal scaling, rolling updates, Ingress exposure | cluster.enabled=true, discovery_mode=redis or k8s_dns, separate HTTP/gRPC Services
Low-traffic multi-replica livestream | HLS/FLV playback must survive cross-node routing, but segment request volume is low | cluster.enabled=true; memory or local file can rely on publisher-node proxying
High-traffic multi-replica livestream | High HLS request volume or cleaner rolling-upgrade boundaries | Add file shared storage or the oss HLS backend on top of cluster configuration

All nodes must:

  • Connect to the same PostgreSQL database.
  • Connect to the same Redis deployment.
  • Use the same server.cluster_secret.
  • Be able to reach each other’s API/gRPC address.
Field | Default | Purpose
cluster.critical_channel_capacity | 1000 | High-priority events such as kicks and permission changes
cluster.publish_channel_capacity | 10000 | Normal Redis publish events

Critical events apply backpressure when their channel is full. Normal events may be dropped under extreme pressure, with a warning, to protect the main flow.
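
If critical events back up under load, both capacities can be raised; the doubled values below are illustrative, not tuned recommendations.

cluster:
  critical_channel_capacity: 2000   # default 1000; applies backpressure when full
  publish_channel_capacity: 20000   # default 10000; may drop events under extreme pressure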

redis is the default discovery mode: nodes register and discover each other through Redis.

Use it for most deployments because it works in Docker, servers, and Kubernetes.
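
Spelling the default out explicitly:

cluster:
  enabled: true
  discovery_mode: "redis"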

static mode uses explicit peer addresses:

cluster:
  discovery_mode: "static"
  peers:
    - "node2.example.com:8080"
    - "node3.example.com"

This works for small fixed-size clusters. If a peer omits the port, SyncTV tries server.port.

k8s_dns mode uses Kubernetes headless Service DNS to discover Pods.

Required environment variables:

  • HEADLESS_SERVICE_NAME
  • POD_NAMESPACE

Kubernetes DNS does not replace Redis. Redis is still required for health monitoring, load balancing state, pub/sub, and catch-up.
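
A sketch of how the two required variables might be injected in a Deployment Pod template; the Service name is an assumption, and POD_NAMESPACE uses the standard downward API.

env:
  - name: HEADLESS_SERVICE_NAME
    value: "synctv-headless"          # assumed headless Service name
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace # injected by the Kubernetes downward API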

cluster.leader_election_mode options:

Mode | Use case
redis | Default; works across Docker, servers, and Kubernetes
k8s_lease | Kubernetes-native Lease resource

k8s_lease requires:

  • POD_NAME
  • POD_NAMESPACE
  • RBAC permissions for coordination.k8s.io/v1 Lease resources
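
A sketch of a Role granting those Lease permissions; the verb set is an assumption based on typical Lease-based election.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: synctv-leader-election          # assumed name
rules:
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "create", "update"]  # typical for Lease election; confirm against the chart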

The Helm chart can render the required Kubernetes resources when configured.

cluster.catchup_window_secs default: 300.

When a node joins or reconnects, it replays recent Redis Stream events within this window. Increase it if nodes are slow to start or you want more conservative replay; decrease it if event volume is high and fast startup matters more.

cluster.stream_max_length default: 10000.

This controls approximate Redis Stream retention. If traffic is high and nodes disconnect, too small a value can trim events before a node catches up. Larger values use more Redis memory.
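
The two knobs interact: the stream must retain at least catchup_window_secs worth of events at your peak rate, or reconnecting nodes will find their window already trimmed. An illustrative sizing for a busier deployment:

cluster:
  catchup_window_secs: 600    # default 300; wider window for slow restarts
  stream_max_length: 50000    # default 10000; must cover the window at peak event rate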

Clustered HLS has two supported models. SyncTV records the publisher owner in the publisher registry and can proxy playlist/segment reads from non-publisher nodes to the publisher node through the HLS gRPC proxy.

Publisher-node proxy model. Applicable backends:

  • memory
  • file with hls_shared_storage=false

This model is simple and does not require a shared segment directory. The tradeoff is that remote HLS segment requests go through the publisher node; if that node restarts, becomes unreachable, or is partitioned, remote nodes may be unable to read the stream’s segments.

Example:

livestream:
  hls_storage_backend: "memory"

or:

livestream:
  hls_storage_backend: "file"
  hls_shared_storage: false
  hls_storage_path: "/var/lib/synctv/hls"

Shared-backend model. Filesystem option:

livestream:
  hls_storage_backend: "file"
  hls_shared_storage: true
  hls_storage_path: "/var/lib/synctv/hls"

All replicas must read and write the same path, for example through NFS, an RWX PVC, or a CSI volume.
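
On Kubernetes, that usually means an RWX PersistentVolumeClaim mounted at the same path in every replica; a minimal sketch, with name and size illustrative:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: synctv-hls              # assumed name
spec:
  accessModes:
    - ReadWriteMany             # every replica reads and writes the same segments
  resources:
    requests:
      storage: 20Gi             # illustrative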

Object storage option:

livestream:
  hls_storage_backend: "oss"
  hls_oss:
    endpoint: "https://s3.example.com"
    bucket: "synctv-hls"
    base_path: "synctv/hls/"

The oss backend uses S3-compatible object storage and does not use hls_shared_storage. hls_shared_storage=true is valid only with hls_storage_backend=file; configuration validation rejects it with memory or oss.

The Helm chart does not enable cluster mode by default. Before scaling replicas, explicitly set config.cluster.enabled=true. HLS can start with the publisher-node proxy model; for high-traffic production, configure the shared file backend or the OSS backend.
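
As a values-file sketch: config.cluster.enabled is the key named above, while replicaCount is the conventional chart key and an assumption here.

config:
  cluster:
    enabled: true   # set before scaling past one replica
replicaCount: 3     # assumed key name; scale only after cluster mode is on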

Before scaling replicas, verify that:

  • Redis is configured and reachable.
  • server.cluster_secret is stable and shared by every replica.
  • The HLS model is explicit: a local backend with publisher-node proxying for small deployments, or file with hls_shared_storage=true on RWX/shared storage, or the oss backend, for high-traffic production.
  • HTTP and gRPC Services/Ingresses match your network design.
  • Leader election mode matches your platform.
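
Putting the checklist together, a hedged end-to-end sketch for a high-traffic Kubernetes deployment, built only from keys documented on this page:

server:
  cluster_secret: "a-long-random-shared-string"  # identical on every replica
cluster:
  enabled: true
  discovery_mode: "k8s_dns"
  leader_election_mode: "k8s_lease"
livestream:
  hls_storage_backend: "file"
  hls_shared_storage: true
  hls_storage_path: "/var/lib/synctv/hls"        # backed by an RWX volume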