Add xorg module plan and audio forward-compatibility note

xorg module: XRandR geometry queries, screen grab source (XShmGetImage), frame viewer sink (XShmPutImage, fullscreen per monitor). All exposed as standard source/sink node roles on the existing transport.

Audio: deferred but transport is already compatible (channel_id mux, audio_frame message type slot reserved, relay/allocator are payload-agnostic).

Also marks serial as done in planning.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -35,9 +35,9 @@ Node types:
| Type | Role |
|---|---|
-| **Source** | Produces video — V4L2 camera, file, test signal |
+| **Source** | Produces video — V4L2 camera, screen grab, file, test signal |
| **Relay** | Receives one or more input streams and distributes to one or more outputs, each with its own delivery mode and buffer; never blocks upstream |
-| **Sink** | Consumes video — display, archiver, encoder output |
+| **Sink** | Consumes video — display window, archiver, encoder output |

A relay with multiple inputs is what would traditionally be called a mux — it combines streams from several sources and forwards them, possibly over a single transport. The dispatch and buffering logic is the same regardless of input count.
@@ -242,6 +242,53 @@ graph TD
---
## X11 / Xorg Integration
An `xorg` module provides two capabilities that complement the V4L2 camera pipeline: screen geometry queries and an X11-based video feed viewer. Both operate as first-class node roles.
### Screen Geometry Queries (XRandR)
Using the XRandR extension, the module can enumerate connected outputs and retrieve their geometry — resolution, position within the desktop coordinate space, physical size, and refresh rate. This is useful for:
- **Routing decisions**: knowing the resolution of the target display before deciding how to scale or crop an incoming stream
- **Screen grab source**: determining the exact rectangle to capture for a given monitor
- **Multi-monitor layouts**: placing viewer windows correctly in a multi-head setup without guessing offsets
Queries are exposed as control request/response pairs on the standard transport, so a remote node can ask "what monitors does this machine have?" and receive structured geometry data without any X11 code on the asking side.
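
One way a geometry response could be packed for the transport is sketched below. It uses little-endian `put`/`get` helpers in the spirit of the `serial` module, but the struct, its field widths, the wire order, and every function name here are assumptions made for illustration, not the project's actual message layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one connected output. The field set follows the
   text (resolution, position, physical size, refresh rate); names, widths,
   and wire order are assumptions. */
struct monitor_geometry {
    uint16_t width_px, height_px;  /* resolution */
    int16_t  x, y;                 /* position in desktop coordinates */
    uint16_t width_mm, height_mm;  /* physical size */
    uint32_t refresh_mhz;          /* refresh rate in millihertz */
};

#define MONITOR_GEOMETRY_WIRE_SIZE 16u

/* Little-endian primitives, written out here so the sketch is self-contained. */
static void put_u16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void put_u32(uint8_t *p, uint32_t v)
{
    put_u16(p, (uint16_t)(v & 0xffff));
    put_u16(p + 2, (uint16_t)(v >> 16));
}

static uint16_t get_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_u32(const uint8_t *p)
{
    return (uint32_t)get_u16(p) | ((uint32_t)get_u16(p + 2) << 16);
}

/* Encode one record. Returns bytes written, or 0 if the buffer is too small. */
size_t monitor_geometry_encode(const struct monitor_geometry *g,
                               uint8_t *buf, size_t cap)
{
    if (cap < MONITOR_GEOMETRY_WIRE_SIZE)
        return 0;
    put_u16(buf + 0, g->width_px);
    put_u16(buf + 2, g->height_px);
    put_u16(buf + 4, (uint16_t)g->x);
    put_u16(buf + 6, (uint16_t)g->y);
    put_u16(buf + 8, g->width_mm);
    put_u16(buf + 10, g->height_mm);
    put_u32(buf + 12, g->refresh_mhz);
    return MONITOR_GEOMETRY_WIRE_SIZE;
}

/* Decode one record. Returns bytes consumed, or 0 if the input is too short. */
size_t monitor_geometry_decode(struct monitor_geometry *g,
                               const uint8_t *buf, size_t len)
{
    if (len < MONITOR_GEOMETRY_WIRE_SIZE)
        return 0;
    g->width_px    = get_u16(buf + 0);
    g->height_px   = get_u16(buf + 2);
    g->x           = (int16_t)get_u16(buf + 4);
    g->y           = (int16_t)get_u16(buf + 6);
    g->width_mm    = get_u16(buf + 8);
    g->height_mm   = get_u16(buf + 10);
    g->refresh_mhz = get_u32(buf + 12);
    return MONITOR_GEOMETRY_WIRE_SIZE;
}
```

Because the record is a fixed-size byte layout, the asking side needs only the decoder and no X11 headers, which matches the intent stated above. Positions are signed here on the assumption that a monitor may sit left of or above the primary output.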
### Screen Grab Source
The module can act as a video source by capturing the contents of a screen region using `XShmGetImage` (MIT-SHM extension). The X server writes the pixels directly into a shared memory segment, which avoids sending image data over the X connection when client and server run on the same machine. The captured region is a configurable rectangle: typically one full monitor as reported by XRandR, but any sub-region is allowed.
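
X image requests generally fail with `BadMatch` when the requested rectangle is not fully inside the drawable, so a configurable sub-region is worth clipping against the screen bounds before the grab. The helper below is a minimal sketch of that step; `struct rect` and `rect_clip` are illustrative names, not part of the module.

```c
/* Axis-aligned rectangle in desktop pixel coordinates (hypothetical type). */
struct rect {
    int x, y, w, h;
};

/* Clip r against bounds. Returns 1 and stores the intersection in *out when
   the two rectangles overlap, or returns 0 and leaves *out untouched when
   there is nothing left to capture. */
int rect_clip(struct rect r, struct rect bounds, struct rect *out)
{
    int x0 = r.x > bounds.x ? r.x : bounds.x;
    int y0 = r.y > bounds.y ? r.y : bounds.y;
    int rx1 = r.x + r.w, bx1 = bounds.x + bounds.w;
    int ry1 = r.y + r.h, by1 = bounds.y + bounds.h;
    int x1 = rx1 < bx1 ? rx1 : bx1;
    int y1 = ry1 < by1 ? ry1 : by1;

    if (x1 <= x0 || y1 <= y0)
        return 0;

    out->x = x0;
    out->y = y0;
    out->w = x1 - x0;
    out->h = y1 - y0;
    return 1;
}
```

A rectangle taken directly from XRandR geometry should pass through unchanged; only hand-configured sub-regions are expected to be trimmed.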
The grab loop produces frames at a configured rate, encapsulates them, and feeds them into the transport like any other video source. Combined with geometry queries, a remote controller can enumerate monitors, select one, and start a screen grab stream without manual coordinate configuration.
### Frame Viewer Sink
The module can act as a video sink by creating an X11 window and rendering the latest received frame into it. The window:
- Can be placed on a specific monitor using XRandR geometry
- Can be made fullscreen on a chosen output
- Renders using `XShmPutImage` (MIT-SHM) when the X server runs on the same machine as the sink node, or `XPutImage` otherwise
- Displays the most recently received frame — it is driven by the low-latency output mode of the relay feeding it; it never buffers for completeness
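
The "latest frame only" behaviour in the last bullet can be pictured as a one-element mailbox in which a new arrival overwrites a frame that has not been rendered yet. The sketch below assumes a single thread and a borrowed frame pointer; `struct frame`, `struct latest_slot`, and the function names are hypothetical, and a real sink would also need locking and a way to release the overwritten frame.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame handle: a borrowed pointer plus a sequence number. */
struct frame {
    const uint8_t *data;
    size_t len;
    uint64_t seq;
};

/* One-element mailbox: holds at most the newest unrendered frame. */
struct latest_slot {
    struct frame pending;
    int has_pending;
    uint64_t dropped;   /* frames overwritten before they were rendered */
};

void latest_slot_init(struct latest_slot *s)
{
    s->pending.data = NULL;
    s->pending.len = 0;
    s->pending.seq = 0;
    s->has_pending = 0;
    s->dropped = 0;
}

/* Receive side: store the frame, replacing any frame not yet rendered. */
void latest_slot_put(struct latest_slot *s, struct frame f)
{
    if (s->has_pending)
        s->dropped++;
    s->pending = f;
    s->has_pending = 1;
}

/* Render side: take the newest frame if one is waiting.
   Returns 1 and fills *out, or returns 0 when nothing new has arrived. */
int latest_slot_take(struct latest_slot *s, struct frame *out)
{
    if (!s->has_pending)
        return 0;
    *out = s->pending;
    s->has_pending = 0;
    return 1;
}
```

If three frames arrive between two redraws, the render side sees only the third and the `dropped` counter advances by two, which is the trade the low-latency mode makes.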
This makes it the display-side counterpart of the V4L2 capture source: the same frame that was grabbed from a camera on a Pi can be viewed on any machine in the network that runs an xorg sink node, with the relay handling the path and delivery mode between them.
Scale and crop are applied at render time — the incoming frame is stretched or cropped to fill the window. This allows a high-resolution screen grab from one machine to be displayed scaled-down on a smaller physical monitor elsewhere in the network.
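
For the crop case, one plausible policy is to pick the largest centered region of the incoming frame that has the same aspect ratio as the window, then scale that region to fit. The function below sketches only the rectangle arithmetic; the policy itself, `struct rect`, and `crop_to_fill` are assumptions, since the text does not say how the crop is chosen.

```c
/* Axis-aligned rectangle in source-frame pixel coordinates (hypothetical type). */
struct rect {
    int x, y, w, h;
};

/* Return the largest centered sub-rectangle of a src_w x src_h frame whose
   aspect ratio matches dst_w : dst_h. Scaling that sub-rectangle to the
   window fills it without distortion. Invalid sizes yield the full frame. */
struct rect crop_to_fill(int src_w, int src_h, int dst_w, int dst_h)
{
    struct rect r = { 0, 0, src_w, src_h };
    long long lhs, rhs;

    if (src_w <= 0 || src_h <= 0 || dst_w <= 0 || dst_h <= 0)
        return r;

    /* Compare src_w/src_h with dst_w/dst_h by cross-multiplying,
       which keeps the arithmetic in integers. */
    lhs = (long long)src_w * dst_h;
    rhs = (long long)dst_w * src_h;

    if (lhs > rhs) {
        /* Source is wider than the window: trim left and right. */
        r.w = (int)(rhs / dst_h);
        r.x = (src_w - r.w) / 2;
    } else if (lhs < rhs) {
        /* Source is taller than the window: trim top and bottom. */
        r.h = (int)(lhs / dst_w);
        r.y = (src_h - r.h) / 2;
    }
    return r;
}
```

For a 3840x2160 grab shown in a 1280x1024 window this selects a 2700x2160 region starting at x = 570. Plain stretching corresponds to skipping the crop and scaling the full frame.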
---
## Audio (Future)
Audio streams are not in scope for the initial implementation but the transport is designed to accommodate them without structural changes.
The `channel_id` field already provides stream multiplexing on a single connection. A future audio channel is just another channel on an existing transport connection — no new connection type is needed. The message type table has room for an `audio_frame` type alongside `video_frame`.
The main open question is codec and container: raw PCM is trivial to handle but large; compressed formats (Opus, AAC) need framing conventions. This is deferred until video is solid.
The frame allocator, relay, and archive modules should not make assumptions that `channel_id` implies video — they operate on opaque byte payloads with a message type and length, so audio frames will pass through the same infrastructure unchanged.
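
The payload-agnostic property can be illustrated with a routing step that looks only at `channel_id` and never branches on the message type. The header layout, the numeric type values, `MAX_CHANNELS`, and the routing table below are all assumptions made for the sketch; the real frame header is defined by the transport module.

```c
#include <stdint.h>

/* Hypothetical message type values; only the names come from the text. */
enum msg_type {
    MSG_VIDEO_FRAME = 1,
    MSG_AUDIO_FRAME = 2
};

/* Hypothetical in-memory view of a frame header. */
struct msg_header {
    uint16_t type;        /* enum msg_type */
    uint16_t channel_id;  /* stream multiplexing key */
    uint32_t length;      /* payload size in bytes */
};

#define MAX_CHANNELS 8

/* Maps each channel to an output index; -1 means the channel is unrouted. */
struct route_table {
    int output_for_channel[MAX_CHANNELS];
};

void route_table_init(struct route_table *rt)
{
    for (int i = 0; i < MAX_CHANNELS; i++)
        rt->output_for_channel[i] = -1;
}

/* Pick the output for a message. The type field is deliberately ignored,
   so an audio_frame follows the same path as a video_frame.
   Returns the output index, or -1 for unknown or unrouted channels. */
int relay_route(const struct route_table *rt, const struct msg_header *h)
{
    if (h->channel_id >= MAX_CHANNELS)
        return -1;
    return rt->output_for_channel[h->channel_id];
}
```

With channel 0 carrying video and channel 1 carrying audio, both routed to the same output, the relay forwards the two streams over one connection without knowing which is which.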
---
## Device Resilience
Nodes that read from hardware devices (V4L2 cameras, media devices) must handle transient device loss — a USB camera that disconnects and reconnects, a device node that briefly disappears during a mode switch, or a stream that errors out and can be retried. This is not an early implementation concern but has structural implications that should be respected from the start.
@@ -45,14 +45,15 @@ Modules are listed in intended build order. Each depends only on modules above i
| 1 | `common` | done | Error types, base definitions — no dependencies |
| 2 | `media_ctrl` | done | Media Controller API — device and topology enumeration, pad format config |
| 3 | `v4l2_ctrl` | done | V4L2 controls — enumerate, get, set camera parameters |
-| 4 | `serial` | not started | `put`/`get` primitives for little-endian binary serialization into byte buffers |
+| 4 | `serial` | done | `put`/`get` primitives for little-endian binary serialization into byte buffers |
| 5 | `transport` | not started | Encapsulated transport — frame header, TCP stream abstraction, single-write send |
| 6 | `protocol` | not started | Typed `write_*`/`read_*` functions for all message types; builds on serial + transport |
| 7 | `frame_alloc` | not started | Per-frame allocation with bookkeeping (byte budget, ref counting) |
| 8 | `relay` | not started | Input dispatch to output queues (low-latency and completeness modes) |
| 9 | `ingest` | not started | MJPEG frame parser (two-pass EOI state machine, opaque stream → discrete frames) |
| 10 | `archive` | not started | Write frames to disk, control messages to binary log |
-| 11 | `web node` | not started | Node.js/Express peer — speaks binary protocol on socket side, HTTP/WebSocket to browser; `protocol.mjs` mirrors C protocol module |
+| 11 | `xorg` | not started | X11 screen geometry queries (XRandR), screen grab source, frame viewer sink — see architecture.md |
+| 12 | `web node` | not started | Node.js/Express peer — speaks binary protocol on socket side, HTTP/WebSocket to browser; `protocol.mjs` mirrors C protocol module |

---