docs: redesign frame viewer sink — GLFW+OpenGL now, Vulkan as future alt

Replace XShmPutImage approach with GLFW+OpenGL as the initial renderer.
Documents the two-renderer plan: GLFW handles window/input for both;
only the rendering backend differs. Notes that both renderers should
conform to the same internal interface for swappability.

Adds input event forwarding (keyboard/mouse → INPUT_EVENT upstream)
as a first-class capability of the viewer sink.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 20:35:49 +00:00
parent 24d031d42b
commit 14926f5421


@@ -389,16 +389,38 @@ The grab loop produces frames at a configured rate, encapsulates them, and feeds
### Frame Viewer Sink

The module can act as a video sink by creating a window and rendering the latest received frame into it. The window:

- Geometry (size and monitor placement) is specified at stream open time, using XRandR data when targeting a specific output
- Can be made fullscreen on a chosen output
- Displays the most recently received frame — driven by the low-latency output mode of the relay; never buffers for completeness
- Forwards keyboard and mouse events back upstream as `INPUT_EVENT` protocol messages, enabling remote control use cases

Scale and crop are applied in the renderer — the incoming frame is stretched or letterboxed to fill the window. This allows a high-resolution source (Pi camera, screen grab) to be displayed scaled-down on a different machine.

This makes it the display-side counterpart of the V4L2 capture source: a frame grabbed from a camera on a Pi can be viewed on any machine in the network running a viewer sink node, with the relay handling the path and delivery mode.
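The letterboxing described above reduces to computing a destination rectangle that preserves the frame's aspect ratio inside the window. A minimal sketch, with illustrative names not taken from the actual module:

```c
#include <assert.h>

typedef struct { int x, y, w, h; } rect;

/* Compute the destination rectangle that letterboxes a src_w x src_h
 * frame inside a win_w x win_h window, preserving aspect ratio. Bars
 * appear top/bottom (letterbox) or left/right (pillarbox) as needed. */
static rect letterbox(int src_w, int src_h, int win_w, int win_h)
{
    rect r;
    /* Compare aspect ratios without floating point:
     * src_w/src_h >= win_w/win_h  <=>  src_w*win_h >= win_w*src_h */
    if ((long)src_w * win_h >= (long)win_w * src_h) {
        /* Frame is relatively wider: fit width, bars top and bottom */
        r.w = win_w;
        r.h = (int)((long)win_w * src_h / src_w);
        r.x = 0;
        r.y = (win_h - r.h) / 2;
    } else {
        /* Frame is relatively taller: fit height, bars left and right */
        r.h = win_h;
        r.w = (int)((long)win_h * src_w / src_h);
        r.y = 0;
        r.x = (win_w - r.w) / 2;
    }
    return r;
}
```

The stretch-to-fill case is simpler still (the destination rect is the whole window); which behaviour applies would be a per-stream option.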
#### Renderer: GLFW + OpenGL
The initial implementation uses **GLFW** for window and input management and **OpenGL** for rendering.
GLFW handles window creation, the event loop, resize, and input callbacks — it also supports Vulkan surface creation using the same API, which makes a future renderer swap straightforward. Input events (keyboard, mouse) are normalised by GLFW before being encoded as protocol messages.
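The normalisation step might look like the following sketch. The `input_event` layout is an assumption for illustration; the `GLFW_*` constants are defined locally with the values from `<GLFW/glfw3.h>` so the example stays self-contained:

```c
#include <stdint.h>

/* Values mirror <GLFW/glfw3.h>; defined here only for self-containment */
#define GLFW_RELEASE 0
#define GLFW_PRESS   1
#define GLFW_REPEAT  2

/* Hypothetical wire-level representation of an INPUT_EVENT payload */
typedef struct {
    uint8_t type;    /* 0 = keyboard, 1 = mouse button, 2 = mouse move */
    uint8_t pressed; /* 1 on press/repeat, 0 on release */
    int32_t code;    /* GLFW keycode or button index, passed through */
    int32_t x, y;    /* pointer position for mouse events, else 0 */
} input_event;

/* Mirrors the argument shape of a GLFWkeyfun callback */
static input_event normalise_key(int key, int scancode, int action, int mods)
{
    input_event ev = {0};
    ev.type = 0; /* keyboard */
    ev.pressed = (action == GLFW_PRESS || action == GLFW_REPEAT) ? 1 : 0;
    ev.code = key;
    (void)scancode; (void)mods; /* could be carried in extra fields */
    return ev;
}
```

The real callback registered via `glfwSetKeyCallback` would build this struct and hand it to the protocol layer for encoding as an `INPUT_EVENT` message.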
The OpenGL renderer:
1. Receives a decoded frame as a pixel buffer (libjpeg-turbo for MJPEG, raw for uncompressed formats)
2. Uploads it as a 2D texture
3. Runs a fragment shader that handles YUV→RGB conversion (where needed) and scaling/filtering
4. Presents via GLFW's swap-buffers call
This keeps CPU load low — chroma conversion and scaling happen on the GPU — while keeping the implementation simple relative to a full Vulkan pipeline.
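The chroma conversion in step 3 is a per-fragment matrix multiply in the shader; shown here as a CPU reference, assuming BT.601 limited-range YUV (the common V4L2/MJPEG case):

```c
/* Clamp a float to the 0..255 byte range with rounding */
static unsigned char clamp255(float v)
{
    return v < 0.0f ? 0 : v > 255.0f ? 255 : (unsigned char)(v + 0.5f);
}

/* BT.601 limited-range YUV -> RGB, one pixel. In the fragment shader the
 * same coefficients appear as a mat3 applied to the sampled texel. */
static void yuv_to_rgb(unsigned char y, unsigned char u, unsigned char v,
                       unsigned char rgb[3])
{
    float c = 1.164f * ((int)y - 16);
    float d = (float)((int)u - 128);
    float e = (float)((int)v - 128);
    rgb[0] = clamp255(c + 1.596f * e);              /* R */
    rgb[1] = clamp255(c - 0.392f * d - 0.813f * e); /* G */
    rgb[2] = clamp255(c + 2.017f * d);              /* B */
}
```

For RGB sources the shader skips this step and samples the texture directly; the scaling/filtering in step 3 falls out of the texture sampler's bilinear filter.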
#### Renderer: Vulkan (future alternative)
A Vulkan renderer is planned as an alternative to the OpenGL one. GLFW's surface creation API is renderer-agnostic, so the window management and input handling code is shared. Only the renderer backend changes.
Vulkan offers more explicit control over presentation timing, multi-queue workloads, and compute shaders (e.g. on-GPU MJPEG decode via a compute pass if a suitable library is available). It is not needed for the initial viewer but worth having for high-frame-rate or multi-stream display scenarios.
The renderer selection should be a compile-time or runtime option — both implementations conform to the same internal interface (`render_frame(pixel_buffer, width, height, format)`).
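That internal interface could be a vtable of function pointers, one filled in per backend. A sketch under assumed names (the real module may shape this differently), with a stub standing in for the OpenGL backend:

```c
#include <stddef.h>

typedef enum { PIXFMT_RGB24, PIXFMT_YUYV } pixel_format;

/* Shared renderer interface: OpenGL now, Vulkan later, same shape */
typedef struct renderer {
    int  (*init)(struct renderer *self, void *window);
    void (*render_frame)(struct renderer *self, const void *pixel_buffer,
                         int width, int height, pixel_format format);
    void (*destroy)(struct renderer *self);
    int frames_rendered; /* backend-private state would live here */
} renderer;

/* Stub backend: counts frames instead of drawing them */
static int  stub_init(renderer *r, void *win) { (void)win; r->frames_rendered = 0; return 0; }
static void stub_render(renderer *r, const void *buf, int w, int h, pixel_format f)
{ (void)buf; (void)w; (void)h; (void)f; r->frames_rendered++; }
static void stub_destroy(renderer *r) { (void)r; }

static renderer make_stub_renderer(void)
{
    renderer r = { stub_init, stub_render, stub_destroy, 0 };
    return r;
}
```

Runtime selection then reduces to choosing which `make_*_renderer` constructor to call at stream open time; the frame loop only ever sees the `renderer` struct.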
---