Relay Design

See Architecture Overview.

A relay receives frames from one or more upstream sources and distributes them to any number of outputs. Each output is independently configured with a delivery mode that determines how it handles the tension between latency and completeness.

Output Delivery Modes

Low-latency mode — minimize delay, accept loss

The output holds at most one pending frame. When a new frame arrives:

  • If the slot is empty, the frame occupies it and is sent as soon as the transport allows
  • If the slot is already occupied (transport not ready), the stale pending frame is dropped and replaced by the incoming one, so the slot always holds the newest frame

The consumer always receives the most recent frame the transport could deliver. Frame loss is expected and acceptable.
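
The single-slot rule can be sketched in a few lines. This is an illustrative model, not the implementation; the class and method names are invented here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LowLatencyOutput:
    """Holds at most one pending frame; a newer frame always supersedes a stale one."""
    pending: Optional[bytes] = None  # the single slot
    dropped: int = 0                 # frames discarded because a newer one arrived first

    def offer(self, frame: bytes) -> None:
        # Slot occupied means the transport was not ready: drop the stale
        # pending frame and keep the newest one.
        if self.pending is not None:
            self.dropped += 1
        self.pending = frame

    def transport_ready(self) -> Optional[bytes]:
        # Called when the transport can accept data; hands over the newest frame.
        frame, self.pending = self.pending, None
        return frame
```

Note that `offer` never blocks: the producer side is always absorbed, which is what makes frame loss the only cost of congestion in this mode.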

Completeness mode — minimize loss, accept delay

The output maintains a queue. When a new frame arrives it is enqueued. The transport drains the queue in order. When the queue is full, a drop policy is applied — either drop the oldest frame (preserve recency) or drop the newest (preserve continuity). Which policy fits depends on the consumer: an archiver may prefer continuity; a scrubber may prefer recency.
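
A minimal sketch of the bounded queue with both drop policies, assuming a simple frame-count limit (the byte budget from the memory model would add a second check):

```python
from collections import deque
from typing import Optional

class CompletenessOutput:
    """Bounded FIFO; on overflow applies the configured drop policy."""

    def __init__(self, max_frames: int, drop_oldest: bool = True):
        self.queue: deque[bytes] = deque()
        self.max_frames = max_frames
        self.drop_oldest = drop_oldest  # True preserves recency, False preserves continuity
        self.dropped = 0

    def offer(self, frame: bytes) -> None:
        if len(self.queue) >= self.max_frames:
            self.dropped += 1
            if self.drop_oldest:
                self.queue.popleft()   # make room by discarding the oldest frame
            else:
                return                 # discard the incoming (newest) frame
        self.queue.append(frame)

    def transport_ready(self) -> Optional[bytes]:
        # The transport drains the queue strictly in order.
        return self.queue.popleft() if self.queue else None
```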

Memory Model

Compressed frames have variable sizes (I-frames vs P-frames, quality settings, scene complexity), so fixed-slot buffers waste memory unpredictably. The preferred model is per-frame allocation with explicit bookkeeping.

Each allocated frame is tracked with at minimum:

  • Byte size
  • Sequence number or timestamp
  • Which outputs still hold a reference

Limits are enforced per output independently — not as a shared pool — so a slow completeness output cannot starve a low-latency output or exhaust global memory. Per-output limits have two axes:

  • Frame count — cap on number of queued frames
  • Byte budget — cap on total bytes in flight for that output

Both limits should be configurable. Either limit being reached triggers the drop policy.
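
The bookkeeping and the two-axis check could look like the following sketch (field and type names are illustrative, not prescribed by this design):

```python
from dataclasses import dataclass

@dataclass
class FrameRecord:
    seq: int         # sequence number or timestamp
    size: int        # byte size of the compressed frame
    refs: set[str]   # which outputs still hold a reference

@dataclass
class OutputLimits:
    max_frames: int  # cap on number of queued frames
    max_bytes: int   # cap on total bytes in flight for this output

def over_budget(queued: list[FrameRecord], limits: OutputLimits) -> bool:
    # Either axis being reached triggers the drop policy.
    return (len(queued) >= limits.max_frames
            or sum(f.size for f in queued) >= limits.max_bytes)
```

A frame's allocation is released once its `refs` set is empty, i.e. every output has either sent or dropped it.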

Congestion: Two Sides

Congestion can arise at both ends of the relay and must be handled explicitly on each.

Inbound congestion (upstream → relay)

If the upstream source produces frames faster than any output can dispatch them:

  • Low-latency outputs are unaffected by design — they always hold at most one frame
  • Completeness outputs will see their queues grow; limits and drop policy absorb the excess

The relay never signals backpressure to the upstream. It is the upstream's concern to produce frames at a sustainable rate; the relay's concern is only to handle whatever arrives without blocking.

Outbound congestion (relay → downstream transport)

If the transport layer cannot accept a frame immediately:

  • Low-latency mode: the pending frame is dropped when the next frame arrives; the transport sends the newest frame it can when it becomes ready
  • Completeness mode: the frame stays in the queue; the queue grows until the transport catches up or limits are reached

The interaction between outbound congestion and the byte budget is important: a transport that is consistently slow will fill the completeness queue to its byte budget limit, at which point the drop policy engages. This is the intended safety valve — the budget defines the maximum acceptable latency inflation before the system reverts to dropping.

Congestion Signals

Even though the relay does not apply backpressure, it should emit observable congestion signals — drop counts, queue depth, byte utilization — on the control plane so that the controller can make decisions: reduce upstream quality, reroute, alert, or adjust budgets dynamically.
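
As a sketch, the per-output signal could be a plain snapshot published to whatever control-plane transport exists (the `publish` callable here is a stand-in, not a defined API):

```python
from dataclasses import dataclass, asdict

@dataclass
class CongestionStats:
    output_id: str
    drop_count: int     # frames dropped since last report
    queue_depth: int    # frames currently queued
    bytes_queued: int   # bytes currently in flight for this output
    byte_budget: int    # configured byte limit

    @property
    def byte_utilization(self) -> float:
        return self.bytes_queued / self.byte_budget

def emit(stats: CongestionStats, publish) -> None:
    # publish() stands in for the control-plane transport.
    publish(asdict(stats) | {"byte_utilization": stats.byte_utilization})
```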

Multi-Input Scheduling

When a relay has multiple input sources feeding the same output, it needs a policy for which source's frame to forward next when the link is under pressure or when frames from multiple sources are ready simultaneously. This policy is the scheduler.

The scheduler is a separate concern from delivery mode (low-latency vs completeness) — delivery mode governs buffering and drop behaviour per output; the scheduler governs which input is served when multiple compete.

Candidate policies (not exhaustive — the design should keep the scheduler pluggable):

  • Strict priority: always prefer the highest-priority source; lower-priority sources are forwarded only when no higher-priority frame is pending
  • Round-robin: cycle evenly across all active inputs, one frame from each in turn
  • Weighted round-robin: each input has a weight; forwarding interleaves at the given ratio (e.g. 1:3 means one frame from source A per three from source B)
  • Deficit round-robin: a byte-fair rather than frame-fair variant of weighted round-robin; useful when sources have very different frame sizes
  • Source suppression: a congested or degraded link stops forwarding from a given input entirely until conditions improve

Priority remains a property of the path (set at connection time). The scheduler uses those priorities plus runtime state (queue depths, drop rates) to make per-frame decisions.

The relay module should expose a scheduler interface so policies are interchangeable without touching routing logic. Which policies to implement first is an open question — see Open Questions.
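
One way the pluggable interface could look, with weighted round-robin as a sample policy (interface and names are hypothetical; this assumes every input has a positive weight):

```python
from abc import ABC, abstractmethod

class Scheduler(ABC):
    """Pluggable policy: given the inputs with a frame ready, pick which to serve."""

    @abstractmethod
    def pick(self, ready: list[str]) -> str:
        ...

class WeightedRoundRobin(Scheduler):
    def __init__(self, weights: dict[str, int]):
        self.weights = weights
        self.credits = dict(weights)  # remaining turns in the current cycle

    def pick(self, ready: list[str]) -> str:
        # Serve any ready input that still has credit in this cycle.
        for src in ready:
            if self.credits.get(src, 0) > 0:
                self.credits[src] -= 1
                return src
        # Cycle exhausted for all ready inputs: refill and go again.
        self.credits = dict(self.weights)
        return self.pick(ready)
```

Swapping in strict priority or deficit round-robin would mean implementing only `pick`, leaving the routing logic untouched, which is the point of keeping the interface narrow.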

graph TD
    UP1[Upstream Source A] -->|encapsulated stream| RELAY[Relay]
    UP2[Upstream Source B] -->|encapsulated stream| RELAY

    RELAY --> LS[Low-latency Output<br>single-slot<br>drop on collision]
    RELAY --> CS[Completeness Output<br>queued<br>drop on budget exceeded]
    RELAY --> OB[Opaque Output<br>byte pipe<br>no frame awareness]

    LS -->|encapsulated| LC[Low-latency Consumer<br>eg. preview display]
    CS -->|encapsulated| CC[Completeness Consumer<br>eg. archiver]
    OB -->|opaque| RAW[Raw Consumer<br>eg. disk writer]

    RELAY -.->|drop count<br>queue depth<br>byte utilization| CTRL[Controller node]