# Relay Design

See [Architecture Overview](../architecture.md).

A relay receives frames from one or more upstream sources and distributes them to any number of outputs. Each output is independently configured with a **delivery mode** that determines how it handles the tension between latency and completeness.

## Output Delivery Modes

**Low-latency mode** — minimize delay, accept loss

The output holds at most one pending frame. When a new frame arrives:

- If the slot is empty, the frame occupies it and is sent as soon as the transport allows
- If the slot is already occupied (transport not ready), the pending frame is replaced — it is already stale, and the newest frame should win

The consumer always receives the most recent frame the transport could deliver. Frame loss is expected and acceptable.

**Completeness mode** — minimize loss, accept delay

The output maintains a queue. When a new frame arrives it is enqueued, and the transport drains the queue in order. When the queue is full, a drop policy is applied — either drop the oldest frame (preserve recency) or drop the newest (preserve continuity). Which policy fits depends on the consumer: an archiver may prefer continuity; a scrubber may prefer recency.

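The two modes can be sketched in a few lines of Python. This is an illustrative model rather than an implementation — the names (`LowLatencyOutput`, `CompletenessOutput`, `offer`, `take`) are hypothetical:

```python
from collections import deque

class LowLatencyOutput:
    """At most one pending frame; a newer frame replaces a stale pending one."""
    def __init__(self):
        self.pending = None
        self.dropped = 0

    def offer(self, frame):
        if self.pending is not None:
            self.dropped += 1      # stale pending frame is discarded
        self.pending = frame       # newest frame always wins

    def take(self):
        """Called when the transport becomes ready."""
        frame, self.pending = self.pending, None
        return frame

class CompletenessOutput:
    """Bounded queue; on overflow a drop policy picks the victim."""
    def __init__(self, max_frames, drop_oldest=True):
        self.queue = deque()
        self.max_frames = max_frames
        self.drop_oldest = drop_oldest
        self.dropped = 0

    def offer(self, frame):
        if len(self.queue) >= self.max_frames:
            self.dropped += 1
            if self.drop_oldest:
                self.queue.popleft()   # preserve recency
            else:
                return                 # drop newest: preserve continuity
        self.queue.append(frame)

    def take(self):
        return self.queue.popleft() if self.queue else None
```

The `drop_oldest` flag is where the per-consumer policy choice from above would plug in.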
## Memory Model

Compressed frames have variable sizes (I-frames vs P-frames, quality settings, scene complexity), so fixed-slot buffers waste memory unpredictably. The preferred model is **per-frame allocation** with explicit bookkeeping.

Each allocated frame is tracked with at minimum:

- Byte size
- Sequence number or timestamp
- Which outputs still hold a reference

Limits are enforced per output independently — not as a shared pool — so a slow completeness output cannot starve a low-latency output or exhaust global memory. Per-output limits have two axes:

- **Frame count** — cap on the number of queued frames
- **Byte budget** — cap on total bytes in flight for that output

Both limits should be configurable. Either limit being reached triggers the drop policy.
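A minimal sketch of that per-output bookkeeping, under hypothetical names (`OutputLimits`, `offer`); it enforces both axes and invokes the drop policy as soon as either would be exceeded:

```python
from collections import deque

class OutputLimits:
    """Per-output bookkeeping: frame-count cap plus byte budget."""
    def __init__(self, max_frames, max_bytes):
        self.max_frames = max_frames
        self.max_bytes = max_bytes
        self.queue = deque()           # (sequence number, byte size) entries
        self.bytes_in_flight = 0
        self.dropped = 0

    def offer(self, seq, size, drop_oldest=True):
        # Either limit being reached triggers the drop policy.
        while (len(self.queue) >= self.max_frames
               or self.bytes_in_flight + size > self.max_bytes):
            if drop_oldest and self.queue:
                _, old_size = self.queue.popleft()
                self.bytes_in_flight -= old_size
                self.dropped += 1
            else:
                self.dropped += 1      # drop the incoming (newest) frame
                return False
        self.queue.append((seq, size))
        self.bytes_in_flight += size
        return True
```

Because the limits are per output, a slow consumer only ever churns its own queue; no shared pool exists to exhaust.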

## Congestion: Two Sides

Congestion can arise at both ends of the relay and must be handled explicitly on each.

**Inbound congestion (upstream → relay)**

If the upstream source produces frames faster than any output can dispatch them:

- Low-latency outputs are unaffected by design — they always hold at most one frame
- Completeness outputs will see their queues grow; limits and the drop policy absorb the excess

The relay never signals backpressure to the upstream. It is the upstream's concern to produce frames at a sustainable rate; the relay's concern is only to handle whatever arrives without blocking.

**Outbound congestion (relay → downstream transport)**

If the transport layer cannot accept a frame immediately:

- Low-latency mode: the pending frame is dropped when the next frame arrives; the transport sends the newest frame available once it becomes ready
- Completeness mode: the frame stays in the queue; the queue grows until the transport catches up or limits are reached

The interaction between outbound congestion and the byte budget is important: a transport that is consistently slow will fill the completeness queue to its byte budget limit, at which point the drop policy engages. This is the intended safety valve — the budget defines the maximum acceptable latency inflation before the system reverts to dropping.
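The safety-valve arithmetic is worth making explicit. With assumed illustrative numbers — an 8 MiB byte budget and a 2 MiB/s sustained transport — a full queue drains in budget divided by rate seconds, which is the worst-case latency inflation before dropping resumes:

```python
# Illustrative numbers, not prescriptions.
byte_budget = 8 * 1024 * 1024       # completeness-output byte budget: 8 MiB
transport_rate = 2 * 1024 * 1024    # sustained transport throughput: 2 MiB/s

# A full queue drains in budget / rate seconds; beyond this the
# drop policy engages, so it bounds latency inflation.
max_latency_inflation = byte_budget / transport_rate
print(max_latency_inflation)        # 4.0 (seconds)
```

Sizing the budget is therefore a latency decision as much as a memory one.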

## Congestion Signals

Even though the relay does not apply backpressure, it should emit **observable congestion signals** — drop counts, queue depth, byte utilization — on the control plane so that the controller can make decisions: reduce upstream quality, reroute, alert, or adjust budgets dynamically.
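One possible shape for such a signal — the field schema here is illustrative, not specified anywhere in the design:

```python
import json
from dataclasses import dataclass

@dataclass
class OutputStats:
    """Counters an output would maintain anyway (names are illustrative)."""
    dropped_frames: int
    queue_depth: int
    bytes_in_flight: int
    byte_budget: int

def congestion_signal(output_name, stats):
    """Serialize one output's congestion state for the control plane."""
    return json.dumps({
        "output": output_name,
        "dropped_frames": stats.dropped_frames,
        "queue_depth": stats.queue_depth,
        "byte_utilization": round(stats.bytes_in_flight / stats.byte_budget, 3),
    })
```

The controller consumes these to reduce upstream quality, reroute, alert, or resize budgets; the relay itself never acts on them.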

## Multi-Input Scheduling

When a relay has multiple input sources feeding the same output, it needs a policy for which source's frame to forward next when the link is under pressure or when frames from multiple sources are ready simultaneously. This policy is the **scheduler**.

The scheduler is a separate concern from delivery mode (low-latency vs completeness) — delivery mode governs buffering and drop behaviour per output; the scheduler governs which input is served when multiple compete.

Candidate policies (not exhaustive — the design should keep the scheduler pluggable):

| Policy | Behaviour |
|---|---|
| **Strict priority** | Always prefer the highest-priority source; lower-priority sources are only forwarded when no higher-priority frame is pending |
| **Round-robin** | Cycle evenly across all active inputs — one frame from each in turn |
| **Weighted round-robin** | Each input has a weight; forwarding interleaves at the given ratio (e.g. 1:3 means one frame from source A per three from source B) |
| **Deficit round-robin** | Byte-fair rather than frame-fair variant of weighted round-robin; useful when sources have very different frame sizes |
| **Source suppression** | A congested or degraded link simply stops forwarding from a given input entirely until conditions improve |


Priority remains a property of the path (set at connection time). The scheduler uses those priorities plus runtime state (queue depths, drop rates) to make per-frame decisions.

The `relay` module should expose a scheduler interface so policies are interchangeable without touching routing logic. Which policies to implement first is an open question — see [Open Questions](../architecture.md#open-questions).
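As one example of what a pluggable policy might look like, here is a sketch of deficit round-robin under assumed names (`DeficitRoundRobin`, `enqueue`, `next_frame`): each input accrues a byte quantum per round and may forward frames while it has credit, making the schedule byte-fair rather than frame-fair:

```python
from collections import deque

class DeficitRoundRobin:
    """Byte-fair input scheduling. One candidate policy; the two-method
    surface (enqueue / next_frame) is what should stay interchangeable."""
    def __init__(self, quantum=4096):
        self.quantum = quantum   # bytes of credit granted per round
        self.queues = {}         # input name -> deque of pending frame sizes
        self.deficit = {}        # input name -> accumulated byte credit
        self.active = deque()    # round-robin order of inputs with frames

    def enqueue(self, name, frame_bytes):
        if name not in self.queues:
            self.queues[name] = deque()
            self.deficit[name] = 0
        if not self.queues[name]:
            self.active.append(name)
        self.queues[name].append(frame_bytes)

    def next_frame(self):
        """Return (input name, frame size) to forward next, or None if idle."""
        if not self.active:
            return None
        while True:
            name = self.active[0]
            q = self.queues[name]
            if self.deficit[name] < q[0]:
                self.deficit[name] += self.quantum  # new round for this input
                self.active.rotate(-1)              # move it to the back
                continue
            size = q.popleft()
            self.deficit[name] -= size
            if not q:
                self.deficit[name] = 0              # idle inputs keep no credit
                self.active.popleft()
            return name, size
```

A strict-priority or weighted variant would implement the same two-method surface, which is what keeps the routing logic untouched when policies are swapped.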

```mermaid
graph TD
    UP1[Upstream Source A] -->|encapsulated stream| RELAY[Relay]
    UP2[Upstream Source B] -->|encapsulated stream| RELAY

    RELAY --> LS[Low-latency Output<br>single-slot<br>drop on collision]
    RELAY --> CS[Completeness Output<br>queued<br>drop on budget exceeded]
    RELAY --> OB[Opaque Output<br>byte pipe<br>no frame awareness]

    LS -->|encapsulated| LC[Low-latency Consumer<br>eg. preview display]
    CS -->|encapsulated| CC[Completeness Consumer<br>eg. archiver]
    OB -->|opaque| RAW[Raw Consumer<br>eg. disk writer]

    RELAY -.->|drop count<br>queue depth<br>byte utilization| CTRL[Controller node]
```