docs: redesign frame viewer sink — GLFW+OpenGL now, Vulkan as future alt
Replace the XShmPutImage approach with GLFW+OpenGL as the initial renderer. Documents the two-renderer plan: GLFW handles window/input for both; only the rendering backend differs. Notes that both renderers should conform to the same internal interface for swappability. Adds input event forwarding (keyboard/mouse → INPUT_EVENT upstream) as a first-class capability of the viewer sink.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -389,16 +389,38 @@ The grab loop produces frames at a configured rate, encapsulates them, and feeds
### Frame Viewer Sink
The module can act as a video sink by creating a window and rendering the latest received frame into it. The window:
- Has its geometry (size and monitor placement) specified at stream open time, using XRandR data when targeting a specific output
- Can be made fullscreen on a chosen output
- Displays the most recently received frame — driven by the low-latency output mode of the relay; never buffers for completeness
- Forwards keyboard and mouse events back upstream as `INPUT_EVENT` protocol messages, enabling remote control use cases
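
The open-time geometry could travel as an X11-style geometry string (the `WxH+X+Y` shape XRandR tooling prints); a minimal parsing sketch, where both the string format and the `parse_geometry` name are assumptions rather than something the design above specifies:

```c
#include <stdio.h>
#include <stdbool.h>

/* Hypothetical open-time geometry spec: window size plus the
 * top-left corner on the virtual screen (e.g. from XRandR data). */
struct geometry {
    int width, height;  /* window size in pixels */
    int x, y;           /* placement on the virtual screen */
};

/* Parse an X11-style "WxH+X+Y" string, e.g. "1920x1080+3840+0"
 * to place a 1080p window to the right of two 1080p monitors.
 * Returns false if the string does not match the expected shape. */
static bool parse_geometry(const char *spec, struct geometry *out)
{
    return sscanf(spec, "%dx%d+%d+%d",
                  &out->width, &out->height, &out->x, &out->y) == 4;
}
```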
Scale and crop are applied at render time — the incoming frame is stretched or cropped to fill the window. This allows a high-resolution screen grab from one machine to be displayed scaled-down on a smaller physical monitor elsewhere in the network.
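
The stretch-or-crop decision is a pure aspect-ratio fit; a sketch of the arithmetic (the struct and function names are hypothetical):

```c
struct rect { int x, y, w, h; };

/* Fit a src_w x src_h frame into a win_w x win_h window, preserving
 * aspect ratio and centring the result.  fill == 0 letterboxes (the
 * frame fits entirely inside the window); fill != 0 scales to cover
 * the window and lets the renderer crop the overflow. */
static struct rect fit_rect(int src_w, int src_h,
                            int win_w, int win_h, int fill)
{
    double sx = (double)win_w / src_w;
    double sy = (double)win_h / src_h;
    double s  = fill ? (sx > sy ? sx : sy) : (sx < sy ? sx : sy);
    struct rect r;
    r.w = (int)(src_w * s + 0.5);
    r.h = (int)(src_h * s + 0.5);
    r.x = (win_w - r.w) / 2;   /* negative in fill mode: cropped */
    r.y = (win_h - r.h) / 2;
    return r;
}
```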
This makes it the display-side counterpart of the V4L2 capture source: a frame grabbed from a camera on a Pi can be viewed on any machine in the network running a viewer sink node, with the relay handling the path and delivery mode.
#### Renderer: GLFW + OpenGL
The initial implementation uses **GLFW** for window and input management and **OpenGL** for rendering.
GLFW handles window creation, the event loop, resizing, and input callbacks. It also supports Vulkan surface creation through the same API, which makes a future renderer swap straightforward. Input events (keyboard, mouse) are normalised by GLFW before being encoded as protocol messages.
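
One way the forwarded events could be serialised: the actual `INPUT_EVENT` wire format is not specified here, so the 16-byte layout below is purely illustrative. GLFW already hands the callback normalised integer key/button, action, and modifier values, so packing is all that remains:

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative event kinds; the real INPUT_EVENT enumeration is not
 * defined in this document. */
enum input_kind { INPUT_KEY = 1, INPUT_MOUSE_BUTTON = 2, INPUT_MOUSE_MOVE = 3 };

/* Pack one event as four little-endian int32 fields:
 * kind, code (key or button), action (press/release), modifiers.
 * Returns the number of bytes written (always 16 here). */
static size_t encode_input_event(uint8_t *buf, int kind,
                                 int32_t code, int32_t action, int32_t mods)
{
    int32_t f[4] = { (int32_t)kind, code, action, mods };
    for (int i = 0; i < 4; i++) {
        buf[i * 4 + 0] = (uint8_t)( f[i]        & 0xff);
        buf[i * 4 + 1] = (uint8_t)((f[i] >> 8)  & 0xff);
        buf[i * 4 + 2] = (uint8_t)((f[i] >> 16) & 0xff);
        buf[i * 4 + 3] = (uint8_t)((f[i] >> 24) & 0xff);
    }
    return 16;
}
```

A GLFW key callback would call this with the key, action, and mods arguments it receives and hand the buffer to the upstream connection.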
The OpenGL renderer:
1. Receives a decoded frame as a pixel buffer (libjpeg-turbo for MJPEG, raw for uncompressed formats)
2. Uploads it as a 2D texture
3. Runs a fragment shader that handles YUV→RGB conversion (where needed) and scaling/filtering
4. Presents via GLFW's swap-buffers call
This keeps CPU load low — chroma conversion and scaling happen on the GPU — while keeping the implementation simple relative to a full Vulkan pipeline.
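
The YUV→RGB step the shader performs is the standard conversion matrix. As a CPU reference for the same per-texel arithmetic, assuming BT.601 full range (the text does not pin down the coefficients or range):

```c
#include <stdint.h>

/* BT.601 full-range YUV -> RGB, the arithmetic the fragment shader
 * runs per texel.  Coefficients are an assumption; the design text
 * does not name a matrix. */
static void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v, uint8_t rgb[3])
{
    double Y = y, U = u - 128.0, V = v - 128.0;
    double r = Y + 1.402    * V;
    double g = Y - 0.344136 * U - 0.714136 * V;
    double b = Y + 1.772    * U;
    /* Clamp to [0, 255] and round to the nearest integer. */
    rgb[0] = (uint8_t)(r < 0 ? 0 : r > 255 ? 255 : r + 0.5);
    rgb[1] = (uint8_t)(g < 0 ? 0 : g > 255 ? 255 : g + 0.5);
    rgb[2] = (uint8_t)(b < 0 ? 0 : b > 255 ? 255 : b + 0.5);
}
```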
#### Renderer: Vulkan (future alternative)
A Vulkan renderer is planned as an alternative to the OpenGL one. GLFW's surface creation API is renderer-agnostic, so the window management and input handling code is shared. Only the renderer backend changes.
Vulkan offers more explicit control over presentation timing, multi-queue workloads, and compute shaders (e.g. on-GPU MJPEG decode via a compute pass if a suitable library is available). It is not needed for the initial viewer but worth having for high-frame-rate or multi-stream display scenarios.
The renderer selection should be a compile-time or runtime option — both implementations conform to the same internal interface (`render_frame(pixel_buffer, width, height, format)`).
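
The shared interface could be a small table of function pointers; a sketch assuming a per-backend context pointer is enough, with every name beyond the quoted `render_frame` signature being hypothetical:

```c
#include <stdint.h>

/* Illustrative pixel formats the sink might hand to a renderer. */
enum pixel_format { PIXFMT_RGB24, PIXFMT_YUYV, PIXFMT_NV12 };

/* Both the OpenGL backend and a future Vulkan backend fill in this
 * table; the viewer core calls through it and never touches GL or
 * Vulkan directly. */
struct renderer {
    void *ctx;  /* backend-private state (GL handles, Vk objects) */
    int  (*init)(void *ctx, void *native_window);
    void (*render_frame)(void *ctx, const uint8_t *pixel_buffer,
                         int width, int height, enum pixel_format format);
    void (*destroy)(void *ctx);
};

/* Minimal stub backend, here only to show the call pattern. */
static int  stub_frames = 0;
static int  stub_init(void *ctx, void *win) { (void)ctx; (void)win; return 0; }
static void stub_render(void *ctx, const uint8_t *px, int w, int h,
                        enum pixel_format f)
{ (void)ctx; (void)px; (void)w; (void)h; (void)f; stub_frames++; }
static void stub_destroy(void *ctx) { (void)ctx; }

static const struct renderer stub_renderer = {
    .ctx = 0, .init = stub_init,
    .render_frame = stub_render, .destroy = stub_destroy,
};
```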
---