Add xorg module plan and audio forward-compatibility note
xorg module: XRandR geometry queries, screen grab source (XShmGetImage), frame viewer sink (XShmPutImage, fullscreen per monitor). All exposed as standard source/sink node roles on the existing transport.

Audio: deferred but transport is already compatible (channel_id mux, audio_frame message type slot reserved, relay/allocator are payload-agnostic).

Also marks serial as done in planning.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -35,9 +35,9 @@ Node types:
| Type | Role |
|---|---|
| **Source** | Produces video: V4L2 camera, file, test signal |
| **Source** | Produces video: V4L2 camera, screen grab, file, test signal |
| **Relay** | Receives one or more input streams and distributes to one or more outputs, each with its own delivery mode and buffer; never blocks upstream |
| **Sink** | Consumes video: display, archiver, encoder output |
| **Sink** | Consumes video: display window, archiver, encoder output |
A relay with multiple inputs is what would traditionally be called a mux — it combines streams from several sources and forwards them, possibly over a single transport. The dispatch and buffering logic is the same regardless of input count.
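
The fan-out behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the names `Relay` and `Output`, the mode strings `"latest"` and `"queued"`, and the drop-oldest policy are all hypothetical.

```python
from collections import deque


class Output:
    """One relay output with its own bounded buffer (hypothetical)."""

    def __init__(self, mode, capacity=8):
        # "latest" keeps only the newest frame; "queued" keeps up to
        # `capacity` frames. Both mode names are illustrative assumptions.
        if mode not in ("latest", "queued"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.buffer = deque(maxlen=1 if mode == "latest" else capacity)
        self.dropped = 0

    def offer(self, frame):
        # Never blocks: a full deque evicts its oldest entry on append.
        if len(self.buffer) == self.buffer.maxlen:
            self.dropped += 1
        self.buffer.append(frame)

    def take(self):
        # Next frame for this output, or None when the buffer is empty.
        return self.buffer.popleft() if self.buffer else None


class Relay:
    """Fans every input frame out to all outputs."""

    def __init__(self):
        self.outputs = []

    def add_output(self, output):
        self.outputs.append(output)
        return output

    def push(self, source_id, frame):
        # The same dispatch path serves one input or many (the "mux" case).
        for out in self.outputs:
            out.offer((source_id, frame))


relay = Relay()
viewer = relay.add_output(Output("latest"))
archiver = relay.add_output(Output("queued", capacity=4))
for i in range(3):
    relay.push("cam0", f"frame{i}")
    relay.push("screen0", f"grab{i}")
```

Because each output owns its buffer, a slow archiver only loses its own oldest frames; the viewer and the upstream sources are unaffected.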
@@ -242,6 +242,53 @@ graph TD
---
## X11 / Xorg Integration
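An `xorg` module provides three capabilities that complement the V4L2 camera pipeline: screen geometry queries, a screen grab source, and an X11-based frame viewer. The grab source and the viewer operate as standard source and sink node roles; geometry queries are exposed as control requests on the same transport.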
### Screen Geometry Queries (XRandR)
Using the XRandR extension, the module can enumerate connected outputs and retrieve their geometry — resolution, position within the desktop coordinate space, physical size, and refresh rate. This is useful for:
- **Routing decisions**: knowing the resolution of the target display before deciding how to scale or crop an incoming stream
- **Screen grab source**: determining the exact rectangle to capture for a given monitor
- **Multi-monitor layouts**: placing viewer windows correctly in a multi-head setup without guessing offsets
Queries are exposed as control request/response pairs on the standard transport, so a remote node can ask "what monitors does this machine have?" and receive structured geometry data without any X11 code on the asking side.
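
As an illustration of what "structured geometry data" could look like, here is a minimal sketch. The field names, the `OutputGeometry` record, and the JSON encoding are assumptions made for the example; the actual control message format is not specified here. The refresh-rate formula uses the mode timings that XRandR reports (pixel clock divided by total pixels per frame) and ignores interlaced and doublescan modes.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class OutputGeometry:
    # Field names are illustrative, not the project's wire schema.
    name: str
    x: int            # position in desktop coordinates (pixels)
    y: int
    width: int        # resolution (pixels)
    height: int
    mm_width: int     # physical size (millimetres)
    mm_height: int
    refresh_hz: float


def refresh_rate(dot_clock_hz, h_total, v_total):
    """Refresh rate from mode timings: pixel clock / total pixels per frame."""
    if h_total == 0 or v_total == 0:
        return 0.0
    return dot_clock_hz / (h_total * v_total)


def geometry_response(outputs):
    """Encode a hypothetical 'list outputs' response as JSON bytes."""
    return json.dumps({"outputs": [asdict(o) for o in outputs]}).encode("utf-8")


# Common 1920x1080 timing: 148.5 MHz pixel clock, 2200 x 1125 total.
rate = refresh_rate(148_500_000, 2200, 1125)
left = OutputGeometry("HDMI-1", 0, 0, 1920, 1080, 527, 296, rate)
right = OutputGeometry("DP-1", 1920, 0, 1920, 1080, 527, 296, rate)
payload = geometry_response([left, right])
```

The asking node only needs to decode `payload`; it never links against Xlib or XRandR.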
### Screen Grab Source
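The module can act as a video source by capturing the contents of a screen region using `XShmGetImage` (MIT-SHM extension). The X server writes the pixels directly into a shared-memory segment, so image data does not travel over the X connection; this requires the grab node to run on the same machine as the X server. The captured region is a configurable rectangle, typically one full monitor by its XRandR geometry, but it can be any sub-region.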
The grab loop produces frames at a configured rate, encapsulates them, and feeds them into the transport like any other video source. Combined with geometry queries, a remote controller can enumerate monitors, select one, and start a screen grab stream without manual coordinate configuration.
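
A rough sketch of the two pieces described above: clipping a requested rectangle to a monitor, and pacing captures at a fixed rate. All names are hypothetical. The `capture` callable stands in for the real `XShmGetImage` call, and `clock` and `sleep` are injected so the loop can be exercised without a display.

```python
def clip_to_monitor(region, monitor):
    """Intersect a requested (x, y, w, h) region with a monitor rectangle.

    Returns the clipped rectangle, or None if the two do not overlap.
    """
    rx, ry, rw, rh = region
    mx, my, mw, mh = monitor
    x0, y0 = max(rx, mx), max(ry, my)
    x1, y1 = min(rx + rw, mx + mw), min(ry + rh, my + mh)
    if x1 <= x0 or y1 <= y0:
        return None
    return (x0, y0, x1 - x0, y1 - y0)


def frame_deadlines(start, fps, count):
    """Absolute capture times, derived from the start time to avoid drift."""
    return [start + i / fps for i in range(count)]


def grab_loop(capture, emit, region, fps, count, clock, sleep):
    """Capture `count` frames of `region` at `fps` and hand each to `emit`."""
    for deadline in frame_deadlines(clock(), fps, count):
        delay = deadline - clock()
        if delay > 0:
            sleep(delay)
        emit(capture(region))


# Simulated run: a fake clock that only advances when we "sleep".
now = [0.0]
frames = []
monitor = (1920, 0, 1920, 1080)          # second monitor in a two-head layout
region = clip_to_monitor((1800, -100, 400, 400), monitor)
grab_loop(
    capture=lambda r: (r, now[0]),       # stand-in for the real grab call
    emit=frames.append,
    region=region,
    fps=4,
    count=3,
    clock=lambda: now[0],
    sleep=lambda d: now.__setitem__(0, now[0] + d),
)
```

In a real node, `clock` and `sleep` would be `time.monotonic` and `time.sleep`, and `emit` would encapsulate the frame for the transport.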
### Frame Viewer Sink
The module can act as a video sink by creating an X11 window and rendering the latest received frame into it. The window:
- Can be placed on a specific monitor using XRandR geometry
- Can be made fullscreen on a chosen output
- Renders using `XShmPutImage` (MIT-SHM) when the X server is local to the sink node, or `XPutImage` otherwise
- Displays the most recently received frame — it is driven by the low-latency output mode of the relay feeding it; it never buffers for completeness
This makes it the display-side counterpart of the V4L2 capture source: the same frame that was grabbed from a camera on a Pi can be viewed on any machine in the network that runs an xorg sink node, with the relay handling the path and delivery mode between them.
Scale and crop are applied at render time — the incoming frame is stretched or cropped to fill the window. This allows a high-resolution screen grab from one machine to be displayed scaled-down on a smaller physical monitor elsewhere in the network.
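
The crop-then-scale step can be sketched as below. This is an illustration only: the function names are hypothetical, the crop is assumed to be centered, and nearest-neighbour sampling is an assumption chosen for brevity rather than the method the viewer uses.

```python
def crop_to_fill(src_w, src_h, dst_w, dst_h):
    """Centered source rectangle (x, y, w, h) matching the window's aspect ratio."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("dimensions must be positive")
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the window: trim left and right.
        w = src_h * dst_w // dst_h
        return ((src_w - w) // 2, 0, w, src_h)
    # Source is taller, or the aspect ratios match: trim top and bottom.
    h = src_w * dst_h // dst_w
    return (0, (src_h - h) // 2, src_w, h)


def scale_nearest(pixels, src_rect, dst_w, dst_h):
    """Nearest-neighbour scale of `src_rect` from a list of pixel rows."""
    x, y, w, h = src_rect
    return [
        [pixels[y + (j * h) // dst_h][x + (i * w) // dst_w] for i in range(dst_w)]
        for j in range(dst_h)
    ]


# A 1920x1080 grab shown in a 1024x768 (4:3) window loses 240 px per side.
rect = crop_to_fill(1920, 1080, 1024, 768)

# Tiny 4x2 "frame" scaled down to 2x1.
tiny = [[0, 1, 2, 3], [4, 5, 6, 7]]
scaled = scale_nearest(tiny, (0, 0, 4, 2), 2, 1)
```

Pure stretching corresponds to skipping `crop_to_fill` and passing the full source rectangle to the scaler.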
---
## Audio (Future)
Audio streams are not in scope for the initial implementation, but the transport is designed to accommodate them without structural changes.
The `channel_id` field already provides stream multiplexing on a single connection. A future audio channel is just another channel on an existing transport connection — no new connection type is needed. The message type table has room for an `audio_frame` type alongside `video_frame`.
The main open question is codec and container: raw PCM is trivial to handle but large; compressed formats (Opus, AAC) need framing conventions. This is deferred until video is solid.
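
To put a number on "large", here is a small worked calculation. The sample parameters (48 kHz, stereo, 16-bit) and the 96 kbit/s compressed target are illustrative assumptions, not decisions recorded in this document.

```python
def pcm_bytes_per_second(sample_rate_hz, channels, bytes_per_sample):
    """Raw PCM data rate: every sample of every channel is stored in full."""
    return sample_rate_hz * channels * bytes_per_sample


raw_bytes = pcm_bytes_per_second(48_000, 2, 2)   # 48 kHz, stereo, 16-bit
raw_kbit = raw_bytes * 8 / 1000                  # kilobits per second
compressed_kbit = 96                             # assumed compressed target
ratio = raw_kbit / compressed_kbit
```

Under these assumptions raw PCM costs 192,000 bytes per second per stream, 16 times the compressed rate, which is the trade-off against the extra framing work a codec requires.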
The frame allocator, relay, and archive modules should not assume that `channel_id` implies video. They operate on opaque byte payloads with a message type and length, so audio frames will pass through the same infrastructure unchanged.
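
A minimal sketch of payload-agnostic multiplexing follows. The header layout (16-bit message type, 16-bit `channel_id`, 32-bit payload length, big-endian) and the numeric type codes are hypothetical, chosen only for the example; the real wire format is defined elsewhere.

```python
import struct
from collections import defaultdict

# Hypothetical header: message type (u16), channel_id (u16), length (u32).
HEADER = struct.Struct("!HHI")
VIDEO_FRAME = 1   # illustrative type codes, not the real table
AUDIO_FRAME = 2


def pack_message(msg_type, channel_id, payload):
    """Prefix an opaque payload with the fixed-size header."""
    return HEADER.pack(msg_type, channel_id, len(payload)) + payload


def iter_messages(stream):
    """Yield (msg_type, channel_id, payload) from concatenated messages."""
    offset = 0
    while offset + HEADER.size <= len(stream):
        msg_type, channel_id, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        if offset + length > len(stream):
            raise ValueError("truncated payload")
        yield msg_type, channel_id, stream[offset:offset + length]
        offset += length


def demux(stream):
    """Group payloads by channel without inspecting their contents."""
    channels = defaultdict(list)
    for msg_type, channel_id, payload in iter_messages(stream):
        channels[channel_id].append((msg_type, payload))
    return dict(channels)


wire = (
    pack_message(VIDEO_FRAME, 0, b"\x00" * 6)
    + pack_message(AUDIO_FRAME, 1, b"pcm!")
    + pack_message(VIDEO_FRAME, 0, b"\xff" * 6)
)
by_channel = demux(wire)
```

Nothing in `demux` branches on the message type, which is the property that lets a later audio channel reuse the same path.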
---
## Device Resilience
Nodes that read from hardware devices (V4L2 cameras, media devices) must handle transient device loss — a USB camera that disconnects and reconnects, a device node that briefly disappears during a mode switch, or a stream that errors out and can be retried. This is not an early implementation concern but has structural implications that should be respected from the start.
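
One possible shape for handling transient device loss is a bounded reopen loop with backoff, sketched below. The function name, the doubling backoff, and the use of `OSError` as the failure signal are assumptions for illustration; `open_device` and `sleep` are injected so the logic can be exercised without hardware.

```python
def open_with_retry(open_device, attempts, base_delay, sleep, max_delay=5.0):
    """Call open_device() until it succeeds or the attempts run out.

    Waits base_delay after the first failure and doubles the wait each
    time, capped at max_delay. Re-raises the last OSError on final failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return open_device()
        except OSError:
            if attempt == attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Simulated camera that is missing for the first two open attempts.
calls = {"n": 0}
waits = []


def flaky_open():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("device not present")
    return "handle"


handle = open_with_retry(flaky_open, attempts=5, base_delay=0.5, sleep=waits.append)
```

In a real node `sleep` would be `time.sleep`, and the structural point is that the node keeps its place in the graph while the loop runs instead of tearing down its connections.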