video-setup/architecture.md
mikael-lovqvists-claude-agent 8579bece57 Plan multi-site support; add site_id to discovery announcement
site_id (u16) is reserved in the announcement payload from day one,
always 0 in single-site deployments. Documents the site gateway node
concept and fully-qualified addressing (site_id:namespace:instance) so
multi-site can be added later without wire format changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:33:30 +00:00


Video Routing System — Architecture

Concept

A graph-based multi-peer video routing system where nodes are media processes and edges are transport connections. The graph carries video streams between sources, relay nodes, and sinks, with priority as a first-class property on paths — so that a low-latency monitoring feed and a high-quality archival feed can coexist and be treated differently by the system.


Design Rationale

Get It on the Wire First

A key principle driving the architecture is that capture devices should not be burdened with processing.

A Raspberry Pi attached to a camera (V4L2 source) is capable of pulling raw or MJPEG frames off the device, but it is likely too resource-constrained to also transcode, mux, or perform any non-trivial stream manipulation. Doing so would add latency and compete with the capture process itself.

The preferred model is:

  1. Pi captures and transmits raw — reads frames directly from V4L2 (MJPEG or raw Bayer/YUV) and puts them on the wire over TCP as fast as possible, with no local transcoding
  2. A more capable machine receives and defines the stream — a downstream node with proper CPU/GPU resources receives the raw feed and produces well-formed, containerized, or re-encoded output appropriate for the intended consumers (display, archive, relay)

This separation means the Pi's job is purely ingestion and forwarding. It keeps the capture loop tight and latency minimal. The downstream node then becomes the "source" of record for the rest of the graph.

This is also why the V4L2 remote control protocol is useful — the Pi doesn't need to run any control logic locally. It exposes its camera parameters over TCP, and the controlling machine adjusts exposure, white balance, codec settings, etc. remotely. The Pi just acts on the commands.


Graph Model

Nodes

Each node is a named process instance, identified by a namespace and name (e.g. v4l2:microscope, ffmpeg:ingest1, mpv:preview, archiver:main).

Node types:

| Type | Role |
| --- | --- |
| Source | Produces video — V4L2 camera, file, test signal |
| Relay | Receives one or more input streams and distributes to one or more outputs, each with its own delivery mode and buffer; never blocks upstream |
| Sink | Consumes video — display, archiver, encoder output |

A relay with multiple inputs is what would traditionally be called a mux — it combines streams from several sources and forwards them, possibly over a single transport. The dispatch and buffering logic is the same regardless of input count.

Edges

An edge is a transport connection between two nodes. Edges carry:

  • The video stream itself (TCP, pipe, or other transport)
  • A priority value
  • A transport mode — opaque or encapsulated (see Transport Protocol)

Priority

Priority governs how the system allocates resources and makes trade-offs when paths compete:

  • High priority (low latency) — frames are forwarded immediately; buffering is minimized; if a downstream node is slow it gets dropped frames, not delayed ones; quality may be lower
  • Low priority (archival) — frames may be buffered; quality should be maximized; latency is acceptable; dropped frames are undesirable

Priority is a property of the path, not of the source. The same source can feed a high-priority monitoring path and a low-priority archival path simultaneously.


Control Plane

A central message hub coordinates the graph. Nodes self-register with an identity (origin, function, description) and communicate via unicast or broadcast messages.

The hub does not dictate topology — nodes announce their capabilities and the controller assembles connections. This keeps the hub stateless with respect to stream routing; it only routes control messages.

The hub protocol is newline-delimited JSON over TCP. V4L2 device control and enumeration are carried as control messages within the encapsulated transport rather than on a separate connection — see Transport Protocol.


Ingestion Pipeline (Raspberry Pi Example)

graph LR
    CAM[V4L2 Camera\n/dev/video0] -->|raw MJPEG| PI[Pi: forward over TCP\nno processing]
    PI -->|TCP stream| INGEST[Ingest Node\nmore capable machine]
    INGEST -->|well-formed stream| RELAY[Relay]
    RELAY -->|high priority| DISPLAY[Display / Preview\nlow latency]
    RELAY -->|low priority| ARCHIVE[Archiver\nhigh quality]
    INGEST -.->|V4L2 control\nvia transport| PI
    CTRL[Control Plane\nmessage hub] -.->|node control| INGEST
    CTRL -.->|node control| RELAY

The Pi runs only:

  • A forwarding process (e.g. dd if=/dev/video0 | nc <host> <port> or equivalent)
  • The V4L2 control endpoint (receives parameter change commands)

Everything else happens on machines with adequate resources.


Transport Protocol

Transport between nodes operates in one of two modes. The choice is per-edge and has direct implications for what the relay on that edge can do.

Opaque Binary Stream

The transport forwards bytes as they arrive with no understanding of frame boundaries. The relay acts as a pure byte pipe.

  • Zero framing overhead
  • Cannot drop frames (frame boundaries are unknown)
  • Cannot multiplex multiple streams (no way to distinguish them)
  • Cannot do per-frame accounting (byte budgets become byte-rate estimates only)
  • Low-latency output is not available — the relay cannot discard a partial frame

This mode is appropriate for simple point-to-point forwarding where the consumer handles all framing, and where the relay has no need for frame-level intelligence.

Frame-Encapsulated Stream

Each message is prefixed with a small fixed-size header. This applies to both video frames and control messages — the transport is unified.

Header fields:

| Field | Size | Purpose |
| --- | --- | --- |
| message_type | 2 bytes | Distinguishes video frame, control request, control response |
| channel_id | 2 bytes | For video: identifies the stream. For control: identifies the request/response pair (correlation ID) |
| payload_length | 4 bytes | Byte length of the following payload |

Message types:

| Value | Meaning |
| --- | --- |
| 0x0001 | Video frame |
| 0x0002 | Control request |
| 0x0003 | Control response |

Video frame payloads are raw compressed frames. Control payloads are binary-serialized structures — see Protocol Serialization.
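Decoding the 8-byte header can be sketched with the little-endian `get_*` helpers described under Protocol Serialization. The `parse_header` function and `frame_header` struct below are illustrative names, not the project's actual code:

```c
#include <stddef.h>
#include <stdint.h>

#define MSG_VIDEO_FRAME      0x0001u
#define MSG_CONTROL_REQUEST  0x0002u
#define MSG_CONTROL_RESPONSE 0x0003u

/* Little-endian field reads, in the spirit of the serial module's get_*. */
static uint16_t get_u16(const uint8_t *b, size_t p) {
    return (uint16_t)(b[p] | (b[p + 1] << 8));
}

static uint32_t get_u32(const uint8_t *b, size_t p) {
    return (uint32_t)b[p] | ((uint32_t)b[p + 1] << 8) |
           ((uint32_t)b[p + 2] << 16) | ((uint32_t)b[p + 3] << 24);
}

typedef struct {
    uint16_t message_type;
    uint16_t channel_id;
    uint32_t payload_length;
} frame_header;

/* Decode the 8-byte header field by field, never by casting the buffer
   to a struct, which would bake in padding/alignment assumptions. */
static frame_header parse_header(const uint8_t buf[8]) {
    frame_header h;
    h.message_type   = get_u16(buf, 0);
    h.channel_id     = get_u16(buf, 2);
    h.payload_length = get_u32(buf, 4);
    return h;
}
```

A receiver reads exactly 8 bytes, decodes them, then reads `payload_length` more bytes; the same loop handles video frames and control messages.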

Unified Control and Video on One Connection

By carrying control messages on the same transport as video frames, the system avoids managing separate connections per peer. A node that receives a video stream can be queried or commanded over the same socket.

This directly enables remote device enumeration: a connecting node can issue a control request asking what V4L2 devices the remote host exposes, and receive the list in a control response — before any video streams are established. Discovery and streaming share the same channel.

The V4L2 control operations map naturally to control request/response pairs:

| Operation | Direction |
| --- | --- |
| Enumerate devices | request → response |
| Get device controls (parameters, ranges, menus) | request → response |
| Get control values | request → response |
| Set control values | request → response (ack/fail) |

Control messages are low-volume and can be interleaved with the video frame stream without meaningful overhead.

Capability Implications

| Feature | Opaque | Encapsulated |
| --- | --- | --- |
| Simple forwarding | yes | yes |
| Low-latency drop | no | yes |
| Per-frame byte accounting | no | yes |
| Multi-stream over one transport | no | yes |
| Sequence numbers / timestamps | no | yes (via extension) |
| Control / command channel | no | yes |
| Remote device enumeration | no | yes |

The most important forcing function is low-latency relay: to drop a pending frame when a newer one arrives, the relay must know where frames begin and end. An opaque stream cannot support this, so any edge that requires low-latency output must use encapsulation.

Opaque streams are a valid optimization for leaf edges where the downstream consumer (e.g. an archiver writing raw bytes to disk) does its own framing, requires no relay intelligence, and has no need for remote control.


Relay Design

A relay receives frames from one or more upstream sources and distributes them to any number of outputs. Each output is independently configured with a delivery mode that determines how it handles the tension between latency and completeness.

Output Delivery Modes

Low-latency mode — minimize delay, accept loss

The output holds at most one pending frame. When a new frame arrives:

  • If the slot is empty, the frame occupies it and is sent as soon as the transport allows
  • If the slot is already occupied (transport not ready), the pending frame is dropped and replaced by the incoming one; the stale frame is no longer worth sending

The consumer always receives the most recent frame the transport could deliver. Frame loss is expected and acceptable.
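The single-slot behaviour can be sketched as follows. The `ll_output` type and function names are illustrative; a real output would also track the sequence number and report drops on the control plane:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-slot low-latency output: at most one pending frame. */
typedef struct {
    uint8_t *pending;      /* NULL when the slot is empty */
    size_t   pending_len;
    uint64_t dropped;      /* congestion signal for the control plane */
} ll_output;

/* A new frame arrives from upstream: any pending frame is stale, drop it. */
static void ll_submit(ll_output *out, const uint8_t *frame, size_t len) {
    if (out->pending) {            /* transport wasn't ready in time */
        free(out->pending);
        out->dropped++;
    }
    out->pending = malloc(len);
    memcpy(out->pending, frame, len);
    out->pending_len = len;
}

/* The transport became writable: hand over the newest frame, empty the slot. */
static uint8_t *ll_take(ll_output *out, size_t *len) {
    uint8_t *f = out->pending;
    *len = out->pending_len;
    out->pending = NULL;
    return f;
}
```

Note that `ll_submit` never blocks and never queues: the consumer either keeps up or sees only the most recent frame.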

Completeness mode — minimize loss, accept delay

The output maintains a queue. When a new frame arrives it is enqueued. The transport drains the queue in order. When the queue is full, a drop policy is applied — either drop the oldest frame (preserve recency) or drop the newest (preserve continuity). Which policy fits depends on the consumer: an archiver may prefer continuity; a scrubber may prefer recency.

Memory Model

Compressed frames have variable sizes (I-frames vs P-frames, quality settings, scene complexity), so fixed-slot buffers waste memory unpredictably. The preferred model is per-frame allocation with explicit bookkeeping.

Each allocated frame is tracked with at minimum:

  • Byte size
  • Sequence number or timestamp
  • Which outputs still hold a reference

Limits are enforced per output independently — not as a shared pool — so a slow completeness output cannot starve a low-latency output or exhaust global memory. Per-output limits have two axes:

  • Frame count — cap on number of queued frames
  • Byte budget — cap on total bytes in flight for that output

Both limits should be configurable. Either limit being reached triggers the drop policy.
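A completeness output's limit enforcement might look like the sketch below, assuming a drop-oldest policy. `cm_output`, `cm_enqueue`, and the field names are illustrative, not the project's actual code:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct qframe {
    struct qframe *next;
    uint8_t       *data;
    size_t         len;
} qframe;

/* Hypothetical completeness output with independent per-output limits. */
typedef struct {
    qframe  *head, *tail;
    size_t   count, bytes;   /* current usage */
    size_t   max_frames;     /* frame-count cap */
    size_t   max_bytes;      /* byte budget */
    uint64_t dropped;        /* congestion signal */
} cm_output;

static void cm_drop_oldest(cm_output *q) {
    qframe *f = q->head;
    if (!f) return;
    q->head = f->next;
    if (!q->head) q->tail = NULL;
    q->count--;
    q->bytes -= f->len;
    q->dropped++;
    free(f->data);
    free(f);
}

static void cm_enqueue(cm_output *q, const uint8_t *data, size_t len) {
    /* Either limit being reached triggers the drop policy (drop-oldest
       here). A frame larger than the whole budget is still accepted once
       the queue is empty, so the stream never stalls entirely. */
    while (q->head &&
           (q->count + 1 > q->max_frames || q->bytes + len > q->max_bytes))
        cm_drop_oldest(q);
    qframe *f = malloc(sizeof *f);
    f->next = NULL;
    f->data = malloc(len);
    memcpy(f->data, data, len);
    f->len = len;
    if (q->tail) q->tail->next = f; else q->head = f;
    q->tail = f;
    q->count++;
    q->bytes += len;
}
```

Drop-newest would instead reject the incoming frame when a limit is hit, preserving continuity at the cost of recency.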

Congestion: Two Sides

Congestion can arise at both ends of the relay and must be handled explicitly on each.

Inbound congestion (upstream → relay)

If the upstream source produces frames faster than any output can dispatch them:

  • Low-latency outputs are unaffected by design — they always hold at most one frame
  • Completeness outputs will see their queues grow; limits and drop policy absorb the excess

The relay never signals backpressure to the upstream. It is the upstream's concern to produce frames at a sustainable rate; the relay's concern is only to handle whatever arrives without blocking.

Outbound congestion (relay → downstream transport)

If the transport layer cannot accept a frame immediately:

  • Low-latency mode: the pending frame is dropped when the next frame arrives; the transport sends the newest frame it can when it becomes ready
  • Completeness mode: the frame stays in the queue; the queue grows until the transport catches up or limits are reached

The interaction between outbound congestion and the byte budget is important: a transport that is consistently slow will fill the completeness queue to its byte budget limit, at which point the drop policy engages. This is the intended safety valve — the budget defines the maximum acceptable latency inflation before the system reverts to dropping.

Congestion Signals

Even though the relay does not apply backpressure, it should emit observable congestion signals — drop counts, queue depth, byte utilization — on the control plane so that the controller can make decisions: reduce upstream quality, reroute, alert, or adjust budgets dynamically.

graph TD
    UP1[Upstream Source A] -->|encapsulated stream| RELAY[Relay]
    UP2[Upstream Source B] -->|encapsulated stream| RELAY

    RELAY --> LS[Low-latency Output\nsingle-slot\ndrop on collision]
    RELAY --> CS[Completeness Output\nqueued\ndrop on budget exceeded]
    RELAY --> OB[Opaque Output\nbyte pipe\nno frame awareness]

    LS -->|encapsulated| LC[Low-latency Consumer\neg. preview display]
    CS -->|encapsulated| CC[Completeness Consumer\neg. archiver]
    OB -->|opaque| RAW[Raw Consumer\neg. disk writer]

    RELAY -.->|drop count\nqueue depth\nbyte utilization| CTRL[Control Plane]

Implementation Approach

The system is built module by module in C11. Each translation unit is developed and validated independently before being integrated. See planning.md for current status and module order, and conventions.md for code and project conventions.

The final deliverable is a single configurable node binary. During development, each module is exercised through small driver programs that live in the development tree, not in the module directories.


Protocol Serialization

Control message payloads use a compact binary format. The wire encoding is little-endian throughout — all target platforms (Raspberry Pi ARM, x86 laptop) are little-endian, and little-endian is the convention of most modern protocols (USB, Bluetooth LE, etc.).

Serialization Layer

A serial module provides the primitive read/write operations on byte buffers:

  • put_u8, put_u16, put_u32, put_i32, put_u64 — write a value at a position in a buffer
  • get_u8, get_u16, get_u32, get_i32, get_u64 — read a value from a position in a buffer

These are pure buffer operations with no I/O. Fields are never written by casting a struct to bytes — each field is placed explicitly, which eliminates struct padding and alignment assumptions.

Protocol Layer

A protocol module builds on serial and the transport to provide typed message functions:

write_v4l2_set_control(stream, id, value);
write_v4l2_get_control(stream, id);
write_v4l2_enumerate_controls(stream);

Each write_* function knows the exact wire layout of its message, packs the full frame (header + payload) into a stack buffer using put_*, then issues a single write to the stream. The corresponding read_* functions unpack responses using get_*.

This gives a clean two-layer separation: serial handles byte layout, protocol handles message semantics and I/O.
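A `write_v4l2_set_control` along these lines might pack its frame as below. The payload layout (control id as u32, value as i32) is an assumption, and the sketch packs into a caller-supplied buffer rather than writing to a stream:

```c
#include <stddef.h>
#include <stdint.h>

#define MSG_CONTROL_REQUEST 0x0002u

/* Serial-style little-endian put helpers (illustrative). */
static void put_u16(uint8_t *b, size_t p, uint16_t v) {
    b[p] = (uint8_t)v; b[p + 1] = (uint8_t)(v >> 8);
}
static void put_u32(uint8_t *b, size_t p, uint32_t v) {
    for (int i = 0; i < 4; i++) b[p + i] = (uint8_t)(v >> (8 * i));
}
static void put_i32(uint8_t *b, size_t p, int32_t v) {
    put_u32(b, p, (uint32_t)v);
}

/* Pack a "set control" request: 8-byte header plus an assumed 8-byte
   payload of control id (u32) and value (i32). The real write_* function
   would then issue this buffer to the stream in a single write. */
static size_t pack_v4l2_set_control(uint8_t *buf, uint16_t channel,
                                    uint32_t id, int32_t value) {
    put_u16(buf, 0, MSG_CONTROL_REQUEST);
    put_u16(buf, 2, channel);
    put_u32(buf, 4, 8);        /* payload_length */
    put_u32(buf, 8, id);
    put_i32(buf, 12, value);
    return 16;
}
```

Because every field is placed at an explicit offset, the wire layout is identical regardless of compiler padding or host struct layout.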

Web Interface as a Protocol Peer

The web interface (Node.js/Express) participates in the graph as a first-class protocol peer — it speaks the same binary protocol as any C node. There is no JSON bridge or special C code to serve the web layer. The boundary is:

  • Socket side: binary protocol, framed messages, little-endian fields read with DataView (dataView.getUint32(offset, true) maps directly to get_u32)
  • Browser side: HTTP/WebSocket, JSON, standard web APIs

A protocol.mjs module in the web layer mirrors the C protocol module — same message types, same wire layout, different language. This lets the web interface connect to any video node, send control requests (V4L2 enumeration, parameter get/set, device discovery), and receive structured responses.

Treating the web node as a peer also means it exercises the real protocol, which surfaces bugs that a JSON bridge would hide.

Future: Single Source of Truth via Preprocessor

The C protocol module and the JavaScript protocol.mjs currently encode the same wire format in two languages. This duplication is a drift risk — a change to a message layout must be applied in both places.

A future preprocessor will eliminate this. Protocol messages will be defined once in a language-agnostic schema, and the preprocessor will emit both:

  • C source — put_*/get_* calls, struct definitions, write_*/read_* functions
  • ESM JavaScript — DataView-based encode/decode, typed constants

The preprocessor is the same tool planned for generating error location codes (see common/error). The protocol schema becomes a single source of truth, and both the C and JavaScript implementations are derived artifacts.


Node Discovery

Standard mDNS (RFC 6762) uses UDP multicast over 224.0.0.251:5353 with DNS-SD service records. The wire protocol is well-defined and the multicast group is already in active use on most LANs. The standard service discovery stack (Avahi, Bonjour, nss-mdns) provides that transport but brings significant overhead: persistent daemons, D-Bus dependencies, complex configuration surface, and substantial resident memory. None of that is needed here.

The approach: reuse the multicast transport, define our own wire format.

Rather than DNS wire format, node announcements are encoded as binary frames using the same serialization layer (serial) and frame header used for video transport. A node joins the multicast group, broadcasts periodic announcements, and listens for announcements from peers.

Announcement Frame

| Field | Size | Purpose |
| --- | --- | --- |
| message_type | 2 bytes | Discovery message type (e.g. 0x0010 for node announcement) |
| channel_id | 2 bytes | Reserved / zero |
| payload_length | 4 bytes | Byte length of payload |
| Payload | variable | Encoded node identity and capabilities |

Payload fields:

| Field | Type | Purpose |
| --- | --- | --- |
| protocol_version | u8 | Wire format version |
| site_id | u16 | Site this node belongs to (0 = local / unassigned) |
| tcp_port | u16 | Port where this node accepts transport connections |
| function_flags | u16 | Bitfield declaring node capabilities (see below) |
| name_len | u8 | Length of name string |
| name | bytes | Node name (namespace:instance, e.g. v4l2:microscope) |

function_flags bits:

| Bit | Mask | Meaning |
| --- | --- | --- |
| 0 | 0x0001 | Source — produces video |
| 1 | 0x0002 | Relay — receives and distributes streams |
| 2 | 0x0004 | Sink — consumes video (display, archiver, etc.) |
| 3 | 0x0008 | Controller — participates in control plane coordination |

A node may set multiple bits — a relay that also archives sets both RELAY and SINK.
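The payload can be packed with the same serial-style helpers used for control messages. `pack_announcement_payload` and the version constant below are illustrative:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FN_SOURCE 0x0001u
#define FN_RELAY  0x0002u
#define FN_SINK   0x0004u
#define FN_CTRL   0x0008u

/* Serial-style little-endian put helper (illustrative). */
static void put_u16(uint8_t *b, size_t p, uint16_t v) {
    b[p] = (uint8_t)v; b[p + 1] = (uint8_t)(v >> 8);
}

/* Encode the announcement payload; the name must fit in a u8 length.
   Returns the payload length to place in the frame header. */
static size_t pack_announcement_payload(uint8_t *buf, uint16_t site_id,
                                        uint16_t tcp_port, uint16_t flags,
                                        const char *name) {
    size_t name_len = strlen(name);   /* caller must ensure <= 255 */
    size_t p = 0;
    buf[p++] = 1;                     /* protocol_version */
    put_u16(buf, p, site_id);  p += 2;
    put_u16(buf, p, tcp_port); p += 2;
    put_u16(buf, p, flags);    p += 2;
    buf[p++] = (uint8_t)name_len;
    memcpy(buf + p, name, name_len);
    return p + name_len;
}
```

A single-site node passes `site_id = 0`, keeping the field reserved but present on the wire from the first release.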

Behaviour

  • Nodes send announcements periodically (e.g. every 5 s) and immediately on startup
  • No daemon — the node process itself sends and listens; no background service required
  • On receiving an announcement, the control plane records the peer (address, port, name, function) and can initiate a transport connection if needed
  • A node going silent for a configured number of announcement intervals is considered offline
  • Announcements are informational only — the hub validates identity at connection time

No Avahi/Bonjour Dependency

The system does not link against, depend on, or interact with Avahi or Bonjour. It opens a raw UDP multicast socket directly, which requires only standard POSIX socket APIs. This keeps the runtime dependency footprint minimal and the behaviour predictable.


Multi-Site (Forward Compatibility)

The immediate use case is a single LAN. A planned future use case is site-to-site linking — two independent networks (e.g. a lab and a remote location) connected by a tunnel (SSH port-forward, WireGuard, etc.), where nodes on both sites are reachable from either side.

Site Identity

Every node carries a site_id (u16) in its announcement. In a single-site deployment this is always 0. When sites are joined, each site is assigned a distinct non-zero ID; nodes retain their names across the join and are fully addressable by (site_id, name) from anywhere in the combined network.

This field is reserved from day one so that multi-site never requires a wire format change or a rename of existing identifiers.

Site Gateway Node

A site gateway is a node that participates in both networks simultaneously — it has a connection on the local transport and a connection over the inter-site tunnel. It:

  • Bridges discovery announcements between sites (rewriting site_id appropriately)
  • Forwards encapsulated transport frames across the tunnel on behalf of cross-site edges
  • Is itself a named node, so the control plane can see and reason about it

The tunnel transport is out of scope for now. The gateway is a node type, not a special infrastructure component — it uses the same wire protocol as everything else.

Addressing

A fully-qualified node address is site_id:namespace:instance. Within a single site, site_id is implicit and can be omitted. The control plane and discovery layer must store site_id alongside every peer record from the start, even if it is always 0, so that the upgrade to multi-site addressing requires only configuration and a gateway node — not code changes.


Open Questions

  • What is the graph's representation format — in-memory object graph, serialized config, or both?
  • How are connections established — does the controller push connection instructions to nodes, or do nodes pull from a known address?
  • Drop policy for completeness queues: drop oldest (recency) or drop newest (continuity)? Should be per-output configurable.
  • When a relay has multiple inputs on an encapsulated transport, how are streams tagged on the outbound side — same channel_id passthrough, or remapped?
  • What transport is used for relay edges — TCP, UDP, shared memory for local hops?
  • Should per-output byte budgets be hard limits or soft limits with hysteresis?