Compare commits

...

10 Commits

Author SHA1 Message Date
51e2a3e79e Make V4L2 dequeue the primary ingest path; demote EOI scanner
ingest module: dequeue V4L2 buffers, emit one encapsulated frame per buffer.
Driver guarantees per-buffer framing for V4L2_PIX_FMT_MJPEG; no scanning needed.

mjpeg_scan: future optional module for non-compliant hardware only.
Explicitly a workaround, not part of the primary pipeline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:55:44 +00:00
caad1565b8 Clarify ingest scanner scope: opaque pipe vs well-formed V4L2 vs weird sources
V4L2 with proper node: driver guarantees per-buffer framing, no scan needed.
Opaque pipe (dd|nc): buffer boundaries lost, EOI scanner is the correct tool.
Weird containers (RIFF-wrapped USB cams, IP cameras, RTSP): route via ffmpeg,
not a custom parser. Scanner is an option only for constrained raw-stream cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:53:34 +00:00
ffaa66ab96 Redesign stream metadata: separate format, pixel_format, and origin
format (u16): what the bytes are — drives decode, stable across encoder changes
pixel_format (u16): layout for raw formats, ignored otherwise
origin (u16): how it was produced — informational only, no effect on decode

Eliminates numerical range assumptions (0x01xx ffmpeg range). A camera
outputting MJPEG natively and libjpeg-turbo encoding MJPEG are the same
format with different origins; receiver handles both identically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:49:57 +00:00
8260d456aa Add ffmpeg as codec backend; extend codec ID table with archival formats
0x01xx range reserved for ffmpeg-backed formats (H.265, AV1, FFV1,
ProRes). Documents libavcodec vs subprocess trade-offs: subprocess suits
archival completeness paths, libavcodec suits low-latency encode. Receiver
only cares about wire format, not which encoder produced it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:47:06 +00:00
44a3326a76 Add codec module: per-frame encode/decode for screen grabs
Documents codec identification (u16 per channel, set at stream open),
four initial candidates: MJPEG/libjpeg-turbo, QOI, ZSTD-raw, VA-API
H.264 intra. Screen grab source calls codec before transport; relay and
archive remain payload-agnostic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:45:18 +00:00
5cea34caf5 Add xorg module plan and audio forward-compatibility note
xorg module: XRandR geometry queries, screen grab source (XShmGetImage),
frame viewer sink (XShmPutImage, fullscreen per monitor). All exposed as
standard source/sink node roles on the existing transport.

Audio: deferred but transport is already compatible — channel_id mux,
audio_frame message type slot reserved, relay/allocator are payload-agnostic.

Also marks serial as done in planning.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:42:19 +00:00
c58c211fee Document device resilience and stream lifecycle signals
Adds stream_event message type (0x0004) with interrupted/resumed codes
for encapsulated edges. Documents per-layer implications: opaque stream
limitation, ingest parser reset requirement, frame allocator abandon
operation, and source node recovery loop structure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:37:22 +00:00
8579bece57 Plan multi-site support; add site_id to discovery announcement
site_id (u16) is reserved in the announcement payload from day one,
always 0 in single-site deployments. Documents the site gateway node
concept and fully-qualified addressing (site_id:namespace:instance) so
multi-site can be added later without wire format changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:33:30 +00:00
03fe0ba806 Change discovery function field to u16 bitfield
A node can declare multiple roles simultaneously (e.g. relay + sink).
Replaces the function string with a fixed-size flags field; keeps the
payload layout simple and fixed-width up to the name.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:30:56 +00:00
5dc28890f0 Document custom mDNS-inspired discovery in architecture
Reuse UDP multicast transport (224.0.0.251:5353) with our own binary
wire format — no Avahi, no Bonjour, no daemon dependency.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 22:28:56 +00:00
2 changed files with 293 additions and 10 deletions

View File

@@ -35,9 +35,9 @@ Node types:
| Type | Role |
|---|---|
| **Source** | Produces video — V4L2 camera, screen grab, file, test signal |
| **Relay** | Receives one or more input streams and distributes to one or more outputs, each with its own delivery mode and buffer; never blocks upstream |
| **Sink** | Consumes video — display window, archiver, encoder output |

A relay with multiple inputs is what would traditionally be called a mux — it combines streams from several sources and forwards them, possibly over a single transport. The dispatch and buffering logic is the same regardless of input count.
@@ -84,12 +84,22 @@ graph LR
CTRL -.->|node control| RELAY
```
The Pi runs a node process that dequeues V4L2 buffers and forwards each buffer as an encapsulated frame over TCP. It also exposes the V4L2 control endpoint for remote parameter adjustment.

Everything else happens on machines with adequate resources.
### V4L2 Buffer Dequeuing
When a V4L2 device is configured for `V4L2_PIX_FMT_MJPEG`, the driver delivers one complete MJPEG frame per dequeued buffer — frame boundaries are guaranteed at the source. The ingest module dequeues these buffers and emits each one as an encapsulated frame directly into the transport. No scanning or frame boundary detection is needed.
This is the primary capture path. It is clean, well-defined, and relies on standard V4L2 kernel behaviour rather than heuristics.
### Misbehaving Hardware: `mjpeg_scan` (Future)
Some hardware does not honour the per-buffer framing contract — cheap USB webcams or cameras with unusual firmware may concatenate multiple partial frames into a single buffer, or split one frame across multiple buffers. For these cases a separate optional `mjpeg_scan` module provides a fallback: it scans the incoming byte stream for JPEG SOI (`0xFF 0xD8`) and EOI (`0xFF 0xD9`) markers to recover frame boundaries heuristically.
This module is explicitly a workaround for non-compliant hardware. It is not part of the primary pipeline and will be implemented only if a specific device requires it. For sources with unusual container formats (AVI-wrapped MJPEG, HTTP multipart, RTSP with quirky packetisation), the preferred approach is to route through ffmpeg rather than write a custom parser.
---

## Transport Protocol
@@ -127,8 +137,9 @@ Header fields:
| `0x0001` | Video frame |
| `0x0002` | Control request |
| `0x0003` | Control response |
| `0x0004` | Stream event |

Video frame payloads are raw compressed frames. Control payloads are binary-serialized structures — see [Protocol Serialization](#protocol-serialization). Stream events carry lifecycle signals for a channel — see [Device Resilience](#device-resilience).
### Unified Control and Video on One Connection
@@ -158,6 +169,7 @@ Control messages are low-volume and can be interleaved with the video frame stre
| Sequence numbers / timestamps | **no** | yes (via extension) |
| Control / command channel | **no** | yes |
| Remote device enumeration | **no** | yes |
| Stream lifecycle signals | **no** | yes |

The most important forcing function is **low-latency relay**: to drop a pending frame when a newer one arrives, the relay must know where frames begin and end. An opaque stream cannot support this, so any edge that requires low-latency output must use encapsulation.
@@ -240,6 +252,196 @@ graph TD
---
## Codec Module
A `codec` module provides per-frame encode and decode operations for pixel data. It sits between raw pixel buffers and the transport — sources call encode before sending, sinks call decode after receiving. The relay and transport layers never need to understand pixel formats; they carry opaque payloads.
### Stream Metadata
Receivers must know what format a frame payload is in before they can decode it. This is communicated once at stream setup via a `stream_open` control message rather than tagging every frame header. The message carries three fields:
**`format` (u16)** — the wire format of the payload bytes; determines how the receiver decodes the frame:
| Value | Format |
|---|---|
| `0x0001` | MJPEG |
| `0x0002` | H.264 |
| `0x0003` | H.265 / HEVC |
| `0x0004` | AV1 |
| `0x0005` | FFV1 |
| `0x0006` | ProRes |
| `0x0007` | QOI |
| `0x0008` | Raw pixels (see `pixel_format`) |
| `0x0009` | Raw pixels + ZSTD (see `pixel_format`) |
**`pixel_format` (u16)** — pixel layout for raw formats; zero and ignored for compressed formats:
| Value | Layout |
|---|---|
| `0x0001` | BGRA 8:8:8:8 |
| `0x0002` | RGBA 8:8:8:8 |
| `0x0003` | BGR 8:8:8 |
| `0x0004` | YUV 4:2:0 planar |
| `0x0005` | YUV 4:2:2 packed |
**`origin` (u16)** — how the frame was produced; informational only, does not affect decoding; useful for diagnostics, quality inference, and routing decisions:
| Value | Origin |
|---|---|
| `0x0001` | Device native — camera or capture card encoded it directly |
| `0x0002` | libjpeg-turbo |
| `0x0003` | ffmpeg (libavcodec) |
| `0x0004` | ffmpeg (subprocess) |
| `0x0005` | VA-API direct |
| `0x0006` | NVENC direct |
| `0x0007` | Software (other) |
A V4L2 camera outputting MJPEG has `format=MJPEG, origin=device_native`. The same format re-encoded in process has `format=MJPEG, origin=libjpeg-turbo`. The receiver decodes both identically; the distinction is available for logging and diagnostics without polluting the format identifier.
### Format Negotiation
When a source node opens a stream channel it sends a `stream_open` control message that includes the codec identifier. The receiver can reject the codec if it has no decoder for it. This keeps codec knowledge at the edges — relay nodes are unaffected.
### libjpeg-turbo
JPEG is the natural first codec: libjpeg-turbo provides SIMD-accelerated encode on both x86 and ARM, the output format is identical to what V4L2 cameras already produce (so the ingest and archive paths treat them the same), and it is universally decodable including in browsers via `<img>` or `createImageBitmap`. Lossy, but quality is configurable.
### QOI
QOI (Quite OK Image Format) is a strong candidate for lossless screen grabs: it encodes and decodes in a single pass with no external dependencies, performs well on content with large uniform regions (UIs, text, diagrams), and the reference implementation is a single `.h` file. Output is larger than JPEG but decode is simpler and there is no quality loss. Worth benchmarking against JPEG at high quality settings for screen content.
### ZSTD over Raw Pixels
ZSTD at compression level 1 is extremely fast and can achieve meaningful ratios on screen content (which tends to be repetitive). No pixel format conversion is needed — capture raw, compress raw, decompress raw, display raw. This avoids any colour space or chroma subsampling decisions and is entirely lossless. The downside is that even compressed, the payload is larger than JPEG for photographic content; for UI-heavy screens it can be competitive.
### VA-API (Hardware H.264 Intra)
Intra-only H.264 via VA-API gives very high compression with GPU offload. This is the most complex option to set up and introduces a GPU dependency, but may be worthwhile for high-resolution grabs over constrained links. Deferred until simpler codecs are validated.
### ffmpeg Backend
ffmpeg (via libavcodec or subprocess) is a practical escape hatch that gives access to a large number of codecs, container formats, and hardware acceleration paths without implementing them from scratch. It is particularly useful for archival formats where the encode latency of a more complex codec is acceptable.
**Integration options:**
- **libavcodec** — link directly against the library; programmatic API, tight integration, same process; introduces a large build dependency but gives full control over codec parameters and hardware acceleration (NVENC, VA-API, VideoToolbox, etc.)
- **subprocess pipe** — spawn `ffmpeg`, pipe raw frames to stdin, read encoded output from stdout; simpler, no build dependency, more isolated from the rest of the node process; latency is higher due to process overhead but acceptable for archival paths where real-time delivery is not required
The subprocess approach fits naturally into the completeness output path of the relay: frames arrive in order, there is no real-time drop pressure, and the ffmpeg process can be restarted independently if it crashes without taking down the node. libavcodec is the better fit for low-latency encoding (e.g. screen grab over a constrained link).
**Archival formats of interest:**
| Format | Notes |
|---|---|
| H.265 / HEVC | ~50% better compression than H.264 at same quality; NVENC and VA-API hardware support widely available |
| AV1 | Best open-format compression; software encode is slow, hardware encode (AV1 NVENC on RTX 30+) is fast |
| FFV1 | Lossless, designed for archival; good compression for video content; the format used by film archives |
| ProRes | Near-lossless, widely accepted in post-production toolchains; large files but easy to edit downstream |
The encoder backend is recorded in the `origin` field of `stream_open` — the receiver cares only about `format`, not how the bytes were produced. Switching from a subprocess encode to libavcodec, or from software to hardware, requires no protocol change.
---
## X11 / Xorg Integration
An `xorg` module provides two capabilities that complement the V4L2 camera pipeline: screen geometry queries and an X11-based video feed viewer. Both operate as first-class node roles.
### Screen Geometry Queries (XRandR)
Using the XRandR extension, the module can enumerate connected outputs and retrieve their geometry — resolution, position within the desktop coordinate space, physical size, and refresh rate. This is useful for:
- **Routing decisions**: knowing the resolution of the target display before deciding how to scale or crop an incoming stream
- **Screen grab source**: determining the exact rectangle to capture for a given monitor
- **Multi-monitor layouts**: placing viewer windows correctly in a multi-head setup without guessing offsets
Queries are exposed as control request/response pairs on the standard transport, so a remote node can ask "what monitors does this machine have?" and receive structured geometry data without any X11 code on the asking side.
### Screen Grab Source
The module can act as a video source by capturing the contents of a screen region using `XShmGetImage` (MIT-SHM extension) for zero-copy capture within the same machine. The captured region is a configurable rectangle — typically one full monitor by its XRandR geometry, but can be any sub-region.
Raw captured pixels are uncompressed — 1920×1080 at 32 bpp is ~8 MB per frame. Before the frame enters the transport it must be encoded. The grab loop calls the `codec` module to compress each frame, then encapsulates the result. The codec is configured per stream; see [Codec Module](#codec-module).
The grab loop produces frames at a configured rate, encapsulates them, and feeds them into the transport like any other video source. Combined with geometry queries, a remote controller can enumerate monitors, select one, and start a screen grab stream without manual coordinate configuration.
### Frame Viewer Sink
The module can act as a video sink by creating an X11 window and rendering the latest received frame into it. The window:
- Can be placed on a specific monitor using XRandR geometry
- Can be made fullscreen on a chosen output
- Renders using `XShmPutImage` (MIT-SHM) when the source is local, or `XPutImage` otherwise
- Displays the most recently received frame — it is driven by the low-latency output mode of the relay feeding it; it never buffers for completeness
This makes it the display-side counterpart of the V4L2 capture source: the same frame that was grabbed from a camera on a Pi can be viewed on any machine in the network that runs an xorg sink node, with the relay handling the path and delivery mode between them.
Scale and crop are applied at render time — the incoming frame is stretched or cropped to fill the window. This allows a high-resolution screen grab from one machine to be displayed scaled-down on a smaller physical monitor elsewhere in the network.
---
## Audio (Future)
Audio streams are not in scope for the initial implementation but the transport is designed to accommodate them without structural changes.
The `channel_id` field already provides stream multiplexing on a single connection. A future audio channel is just another channel on an existing transport connection — no new connection type is needed. The message type table has room for an `audio_frame` type alongside `video_frame`.
The main open question is codec and container: raw PCM is trivial to handle but large; compressed formats (Opus, AAC) need framing conventions. This is deferred until video is solid.
The frame allocator, relay, and archive modules should not make assumptions that `channel_id` implies video — they operate on opaque byte payloads with a message type and length, so audio frames will pass through the same infrastructure unchanged.
---
## Device Resilience
Nodes that read from hardware devices (V4L2 cameras, media devices) must handle transient device loss — a USB camera that disconnects and reconnects, a device node that briefly disappears during a mode switch, or a stream that errors out and can be retried. This is not an early implementation concern but has structural implications that should be respected from the start.
### The Problem by Layer
**Source node / device reader**
A device is opened by fd. On a transient disconnect, the fd becomes invalid — reads return errors or short counts. The device may reappear under the same path after some time. Recovery requires closing the bad fd, waiting or polling for the device to reappear, reopening, and restarting the capture loop. Any state tied to the old fd (ioctl configuration, stream-on status) must be re-established.
**Opaque stream edge**
The downstream receiver sees bytes stop. There is no mechanism in an opaque stream to distinguish "slow source", "dead source", or "recovered source". A reconnection produces a new byte stream that appears continuous to the receiver — but contains a hard discontinuity. The receiver has no way to know it should reset state. This is a known limitation of opaque mode. If the downstream consumer is sensitive to stream discontinuities (e.g. a frame parser), it must use encapsulated mode on that edge.
**Encapsulated stream edge**
The source node sends a `stream_event` message (`0x0004`) on the affected `channel_id` before the bytes stop (if possible) or as the first message when the stream resumes. The payload carries an event code:
| Code | Meaning |
|---|---|
| `0x01` | Stream interrupted — device lost, bytes will stop |
| `0x02` | Stream resumed — device recovered, frames will follow |
On receiving `stream_interrupted`, downstream nodes know to discard any partial frame being assembled and reset parser state. On `stream_resumed`, they know a clean frame boundary follows and can restart cleanly.
**`mjpeg_scan` parser (when in use)**
The two-pass EOI state machine is stateful per stream. It must expose an explicit reset operation that discards any partial frame in progress and returns the parser to a clean initial state. This reset is triggered by a `stream_interrupted` event, or by any read error from the device. Any frame allocation begun for the discarded partial frame must be released before the reset completes. (The primary V4L2 dequeue path carries no parser state; this applies only when the fallback scanner is active.)
**Frame allocator**
A partial frame that was being assembled when the device dropped must be explicitly abandoned. The allocator must support an `abandon` operation distinct from a normal `release` — abandon means the allocation is invalid and any reference tracking for it should be unwound immediately. This prevents a partial allocation from sitting in the accounting tables and consuming budget.
### Source Node Recovery Loop
The general structure for a resilient device reader (not yet implemented, for design awareness):
1. Open device, configure, start capture
2. On read error: emit `stream_interrupted` on the transport, close fd, enter retry loop
3. Poll for device reappearance (inotify on `/dev`, or timed retry)
4. On device back: reopen, reconfigure (ioctl state is lost), emit `stream_resumed`, resume capture
5. Log reconnection events to the control plane as observable signals
The retry loop must be bounded — a device that never returns should eventually cause the node to report a permanent failure rather than loop indefinitely.
### Implications for Opaque Streams
If a source node is producing an opaque stream and the device drops, the TCP connection itself may remain open while bytes stop flowing. The downstream node only learns something is wrong via a timeout or its own read error. For this reason, **opaque streams should only be used on edges where the downstream consumer either does not care about discontinuities or has its own out-of-band mechanism to detect them**. Edges into an ingest node must use encapsulated mode.
---
## Implementation Approach

The system is built module by module in C11. Each translation unit is developed and validated independently before being integrated. See [planning.md](planning.md) for current status and module order, and [conventions.md](conventions.md) for code and project conventions.
@@ -298,6 +500,85 @@ The preprocessor is the same tool planned for generating error location codes (s
---
## Node Discovery
Standard mDNS (RFC 6762) uses UDP multicast over `224.0.0.251:5353` with DNS-SD service records. The wire protocol is well-defined and the multicast group is already in active use on most LANs. The standard service discovery stack (Avahi, Bonjour, `nss-mdns`) provides that transport but brings significant overhead: persistent daemons, D-Bus dependencies, complex configuration surface, and substantial resident memory. None of that is needed here.
The approach: **reuse the multicast transport, define our own wire format**.
Rather than DNS wire format, node announcements are encoded as binary frames using the same serialization layer (`serial`) and frame header used for video transport. A node joins the multicast group, broadcasts periodic announcements, and listens for announcements from peers.
### Announcement Frame
| Field | Size | Purpose |
|---|---|---|
| `message_type` | 2 bytes | Discovery message type (e.g. `0x0010` for node announcement) |
| `channel_id` | 2 bytes | Reserved / zero |
| `payload_length` | 4 bytes | Byte length of payload |
| Payload | variable | Encoded node identity and capabilities |
Payload fields:
| Field | Type | Purpose |
|---|---|---|
| `protocol_version` | u8 | Wire format version |
| `site_id` | u16 | Site this node belongs to (`0` = local / unassigned) |
| `tcp_port` | u16 | Port where this node accepts transport connections |
| `function_flags` | u16 | Bitfield declaring node capabilities (see below) |
| `name_len` | u8 | Length of name string |
| `name` | bytes | Node name (`namespace:instance`, e.g. `v4l2:microscope`) |
`function_flags` bits:
| Bit | Mask | Meaning |
|---|---|---|
| 0 | `0x0001` | Source — produces video |
| 1 | `0x0002` | Relay — receives and distributes streams |
| 2 | `0x0004` | Sink — consumes video (display, archiver, etc.) |
| 3 | `0x0008` | Controller — participates in control plane coordination |
A node may set multiple bits — a relay that also archives sets both `RELAY` and `SINK`.
### Behaviour
- Nodes send announcements periodically (e.g. every 5 s) and immediately on startup
- No daemon — the node process itself sends and listens; no background service required
- On receiving an announcement, the control plane records the peer (address, port, name, function) and can initiate a transport connection if needed
- A node going silent for a configured number of announcement intervals is considered offline
- Announcements are informational only — the hub validates identity at connection time
### No Avahi/Bonjour Dependency
The system does not link against, depend on, or interact with Avahi or Bonjour. It opens a raw UDP multicast socket directly, which requires only standard POSIX socket APIs. This keeps the runtime dependency footprint minimal and the behaviour predictable.
---
## Multi-Site (Forward Compatibility)
The immediate use case is a single LAN. A planned future use case is **site-to-site linking** — two independent networks (e.g. a lab and a remote location) connected by a tunnel (SSH port-forward, WireGuard, etc.), where nodes on both sites are reachable from either side.
### Site Identity
Every node carries a `site_id` (`u16`) in its announcement. In a single-site deployment this is always `0`. When sites are joined, each site is assigned a distinct non-zero ID; nodes retain their IDs across the join and are fully addressable by `(site_id, name)` from anywhere in the combined network.
This field is reserved from day one so that multi-site never requires a wire format change or a rename of existing identifiers.
### Site Gateway Node
A site gateway is a node that participates in both networks simultaneously — it has a connection on the local transport and a connection over the inter-site tunnel. It:
- Bridges discovery announcements between sites (rewriting `site_id` appropriately)
- Forwards encapsulated transport frames across the tunnel on behalf of cross-site edges
- Is itself a named node, so the control plane can see and reason about it
The tunnel transport is out of scope for now. The gateway is a node type, not a special infrastructure component — it uses the same wire protocol as everything else.
### Addressing
A fully-qualified node address is `site_id:namespace:instance`. Within a single site, `site_id` is implicit and can be omitted. The control plane and discovery layer must store `site_id` alongside every peer record from the start, even if it is always `0`, so that the upgrade to multi-site addressing requires only configuration and a gateway node — not code changes.
---
## Open Questions

- What is the graph's representation format — in-memory object graph, serialized config, or both?
@@ -305,5 +586,4 @@ The preprocessor is the same tool planned for generating error location codes (s
- Drop policy for completeness queues: drop oldest (recency) or drop newest (continuity)? Should be per-output configurable.
- When a relay has multiple inputs on an encapsulated transport, how are streams tagged on the outbound side — same stream_id passthrough, or remapped?
- What transport is used for relay edges — TCP, UDP, shared memory for local hops?
- Should per-output byte budgets be hard limits or soft limits with hysteresis?

View File

@@ -45,14 +45,17 @@ Modules are listed in intended build order. Each depends only on modules above i
| 1 | `common` | done | Error types, base definitions — no dependencies |
| 2 | `media_ctrl` | done | Media Controller API — device and topology enumeration, pad format config |
| 3 | `v4l2_ctrl` | done | V4L2 controls — enumerate, get, set camera parameters |
| 4 | `serial` | done | `put`/`get` primitives for little-endian binary serialization into byte buffers |
| 5 | `transport` | not started | Encapsulated transport — frame header, TCP stream abstraction, single-write send |
| 6 | `protocol` | not started | Typed `write_*`/`read_*` functions for all message types; builds on serial + transport |
| 7 | `frame_alloc` | not started | Per-frame allocation with bookkeeping (byte budget, ref counting) |
| 8 | `relay` | not started | Input dispatch to output queues (low-latency and completeness modes) |
| 9 | `ingest` | not started | V4L2 capture loop — dequeue buffers, emit one encapsulated frame per buffer |
| 10 | `archive` | not started | Write frames to disk, control messages to binary log |
| 11 | `codec` | not started | Per-frame encode/decode — MJPEG (libjpeg-turbo), QOI, ZSTD-raw, VA-API H.264 intra; used by screen grab source and archive |
| 12 | `xorg` | not started | X11 screen geometry queries (XRandR), screen grab source (calls codec), frame viewer sink — see architecture.md |
| 13 | `web node` | not started | Node.js/Express peer — speaks binary protocol on socket side, HTTP/WebSocket to browser; `protocol.mjs` mirrors C protocol module |
| — | `mjpeg_scan` | future | EOI marker scanner for misbehaving hardware that does not deliver clean per-buffer frames; not part of the primary pipeline |
---