diff --git a/architecture.md b/architecture.md
index dbea1c6..ea60d88 100644
--- a/architecture.md
+++ b/architecture.md
@@ -35,9 +35,9 @@ Node types:
 
 | Type | Role |
 |---|---|
-| **Source** | Produces video — V4L2 camera, file, test signal |
+| **Source** | Produces video — V4L2 camera, screen grab, file, test signal |
 | **Relay** | Receives one or more input streams and distributes to one or more outputs, each with its own delivery mode and buffer; never blocks upstream |
-| **Sink** | Consumes video — display, archiver, encoder output |
+| **Sink** | Consumes video — display window, archiver, encoder output |
 
 A relay with multiple inputs is what would traditionally be called a mux — it combines streams from several sources and forwards them, possibly over a single transport. The dispatch and buffering logic is the same regardless of input count.
 
@@ -242,6 +242,53 @@ graph TD
 
 ---
 
+## X11 / Xorg Integration
+
+An `xorg` module provides two capabilities that complement the V4L2 camera pipeline: screen geometry queries and an X11-based video feed viewer. Both operate as first-class node roles.
+
+### Screen Geometry Queries (XRandR)
+
+Using the XRandR extension, the module can enumerate connected outputs and retrieve their geometry — resolution, position within the desktop coordinate space, physical size, and refresh rate. This is useful for:
+
+- **Routing decisions**: knowing the resolution of the target display before deciding how to scale or crop an incoming stream
+- **Screen grab source**: determining the exact rectangle to capture for a given monitor
+- **Multi-monitor layouts**: placing viewer windows correctly in a multi-head setup without guessing offsets
+
+Queries are exposed as control request/response pairs on the standard transport, so a remote node can ask "what monitors does this machine have?" and receive structured geometry data without any X11 code on the asking side.
+
+### Screen Grab Source
+
+The module can act as a video source by capturing the contents of a screen region using `XShmGetImage` (MIT-SHM extension), which avoids sending pixel data over the X socket when the client and X server share a machine. The captured region is a configurable rectangle — typically one full monitor by its XRandR geometry, but it can be any sub-region.
+
+The grab loop produces frames at a configured rate, encapsulates them, and feeds them into the transport like any other video source. Combined with geometry queries, a remote controller can enumerate monitors, select one, and start a screen grab stream without manual coordinate configuration.
+
+### Frame Viewer Sink
+
+The module can act as a video sink by creating an X11 window and rendering the latest received frame into it. The window:
+
+- Can be placed on a specific monitor using XRandR geometry
+- Can be made fullscreen on a chosen output
+- Renders using `XShmPutImage` (MIT-SHM) when the X server is local to the sink node, or `XPutImage` otherwise
+- Displays the most recently received frame — it is driven by the low-latency output mode of the relay feeding it; it never buffers for completeness
+
+This makes it the display-side counterpart of the V4L2 capture source: the same frame that was grabbed from a camera on a Pi can be viewed on any machine in the network that runs an xorg sink node, with the relay handling the path and delivery mode between them.
+
+Scale and crop are applied at render time — the incoming frame is stretched or cropped to fill the window. This allows a high-resolution screen grab from one machine to be displayed scaled-down on a smaller physical monitor elsewhere in the network.
+
+---
+
+## Audio (Future)
+
+Audio streams are not in scope for the initial implementation but the transport is designed to accommodate them without structural changes.
+
+The `channel_id` field already provides stream multiplexing on a single connection. A future audio channel is just another channel on an existing transport connection — no new connection type is needed. The message type table has room for an `audio_frame` type alongside `video_frame`.
+
+The main open question is codec and container: raw PCM is trivial to handle but large; compressed formats (Opus, AAC) need framing conventions. This is deferred until video is solid.
+
+The frame allocator, relay, and archive modules should not make assumptions that `channel_id` implies video — they operate on opaque byte payloads with a message type and length, so audio frames will pass through the same infrastructure unchanged.
+
+---
+
 ## Device Resilience
 
 Nodes that read from hardware devices (V4L2 cameras, media devices) must handle transient device loss — a USB camera that disconnects and reconnects, a device node that briefly disappears during a mode switch, or a stream that errors out and can be retried. This is not an early implementation concern but has structural implications that should be respected from the start.
diff --git a/planning.md b/planning.md
index 512cd56..e66d229 100644
--- a/planning.md
+++ b/planning.md
@@ -45,14 +45,15 @@ Modules are listed in intended build order. Each depends only on modules above i
 | 1 | `common` | done | Error types, base definitions — no dependencies |
 | 2 | `media_ctrl` | done | Media Controller API — device and topology enumeration, pad format config |
 | 3 | `v4l2_ctrl` | done | V4L2 controls — enumerate, get, set camera parameters |
-| 4 | `serial` | not started | `put`/`get` primitives for little-endian binary serialization into byte buffers |
+| 4 | `serial` | done | `put`/`get` primitives for little-endian binary serialization into byte buffers |
 | 5 | `transport` | not started | Encapsulated transport — frame header, TCP stream abstraction, single-write send |
 | 6 | `protocol` | not started | Typed `write_*`/`read_*` functions for all message types; builds on serial + transport |
 | 7 | `frame_alloc` | not started | Per-frame allocation with bookkeeping (byte budget, ref counting) |
 | 8 | `relay` | not started | Input dispatch to output queues (low-latency and completeness modes) |
 | 9 | `ingest` | not started | MJPEG frame parser (two-pass EOI state machine, opaque stream → discrete frames) |
 | 10 | `archive` | not started | Write frames to disk, control messages to binary log |
-| 11 | `web node` | not started | Node.js/Express peer — speaks binary protocol on socket side, HTTP/WebSocket to browser; `protocol.mjs` mirrors C protocol module |
+| 11 | `xorg` | not started | X11 screen geometry queries (XRandR), screen grab source, frame viewer sink — see architecture.md |
+| 12 | `web node` | not started | Node.js/Express peer — speaks binary protocol on socket side, HTTP/WebSocket to browser; `protocol.mjs` mirrors C protocol module |
 
 ---
 
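
Row 4 of the planning.md table describes the `serial` module as `put`/`get` primitives for little-endian serialization into byte buffers. As an illustration only, a pair of such primitives might look roughly like the sketch below. The `struct serial_buf` cursor type and the names `put_u32_le` and `get_u32_le` are hypothetical and are not taken from the repository; the actual module may use a different interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a caller-owned byte buffer (assumed, not from the repo). */
struct serial_buf {
    uint8_t *data; /* start of the buffer */
    size_t   cap;  /* total bytes available */
    size_t   pos;  /* next read/write offset, always <= cap */
};

/* Append a 32-bit value, least significant byte first.
 * Returns 0 on success, -1 if fewer than 4 bytes remain. */
static int put_u32_le(struct serial_buf *b, uint32_t v)
{
    assert(b != NULL && b->data != NULL && b->pos <= b->cap);
    if (b->cap - b->pos < 4)
        return -1;
    b->data[b->pos + 0] = (uint8_t)(v & 0xffu);
    b->data[b->pos + 1] = (uint8_t)((v >> 8) & 0xffu);
    b->data[b->pos + 2] = (uint8_t)((v >> 16) & 0xffu);
    b->data[b->pos + 3] = (uint8_t)((v >> 24) & 0xffu);
    b->pos += 4;
    return 0;
}

/* Read a 32-bit little-endian value at the cursor and advance it.
 * Returns 0 on success, -1 if fewer than 4 bytes remain. */
static int get_u32_le(struct serial_buf *b, uint32_t *out)
{
    assert(b != NULL && b->data != NULL && out != NULL && b->pos <= b->cap);
    if (b->cap - b->pos < 4)
        return -1;
    *out = (uint32_t)b->data[b->pos + 0]
         | ((uint32_t)b->data[b->pos + 1] << 8)
         | ((uint32_t)b->data[b->pos + 2] << 16)
         | ((uint32_t)b->data[b->pos + 3] << 24);
    b->pos += 4;
    return 0;
}
```

Assembling the value byte by byte, rather than casting the buffer pointer to `uint32_t *`, keeps the wire format independent of host endianness and avoids unaligned access, which matters when the same code runs on both a Pi and a desktop peer.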