From 14926f5421b2123b620abd4ca859a1eb0e6dcf6d Mon Sep 17 00:00:00 2001
From: mikael-lovqvists-claude-agent <mikaels.claude.agent@efforting.tech>
Date: Sat, 28 Mar 2026 20:35:49 +0000
Subject: [PATCH] =?UTF-8?q?docs:=20redesign=20frame=20viewer=20sink=20?=
 =?UTF-8?q?=E2=80=94=20GLFW+OpenGL=20now,=20Vulkan=20as=20future=20alt?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replace the XShmPutImage approach with GLFW+OpenGL as the initial renderer.
Document the two-renderer plan: GLFW handles window/input for both;
only the rendering backend differs. Note that both renderers should
conform to the same internal interface for swappability.

Add input event forwarding (keyboard/mouse → INPUT_EVENT upstream)
as a first-class capability of the viewer sink.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 architecture.md | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/architecture.md b/architecture.md
index bb1b584..7d9ed9e 100644
--- a/architecture.md
+++ b/architecture.md
@@ -389,16 +389,38 @@ The grab loop produces frames at a configured rate, encapsulates them, and feeds
 
 ### Frame Viewer Sink
 
-The module can act as a video sink by creating an X11 window and rendering the latest received frame into it. The window:
+The module can act as a video sink by creating a window and rendering the latest received frame into it. The window:
 
-- Can be placed on a specific monitor using XRandR geometry
+- Takes its geometry (size and monitor placement) at stream open time, using XRandR data when targeting a specific output
 - Can be made fullscreen on a chosen output
-- Renders using `XShmPutImage` (MIT-SHM) when the source is local, or `XPutImage` otherwise
-- Displays the most recently received frame — it is driven by the low-latency output mode of the relay feeding it; it never buffers for completeness
+- Displays the most recently received frame — driven by the low-latency output mode of the relay; never buffers for completeness
+- Forwards keyboard and mouse events back upstream as `INPUT_EVENT` protocol messages, enabling remote control use cases
 
-This makes it the display-side counterpart of the V4L2 capture source: the same frame that was grabbed from a camera on a Pi can be viewed on any machine in the network that runs an xorg sink node, with the relay handling the path and delivery mode between them.
+Scale and crop are applied in the renderer: the incoming frame is stretched, cropped, or letterboxed to fit the window. This allows a high-resolution source (Pi camera, screen grab) to be displayed scaled-down on a different machine.
 
-Scale and crop are applied at render time — the incoming frame is stretched or cropped to fill the window. This allows a high-resolution screen grab from one machine to be displayed scaled-down on a smaller physical monitor elsewhere in the network.
+This makes it the display-side counterpart of the V4L2 capture source: a frame grabbed from a camera on a Pi can be viewed on any machine in the network running a viewer sink node, with the relay handling the path and delivery mode.
+
+#### Renderer: GLFW + OpenGL
+
+The initial implementation uses **GLFW** for window and input management and **OpenGL** for rendering.
+
+GLFW handles window creation, the event loop, resize, and input callbacks. It can also create a Vulkan surface for the same window (`glfwCreateWindowSurface`), which makes a future renderer swap straightforward. Input events (keyboard, mouse) are normalised by GLFW before being encoded as protocol messages.
+
+The OpenGL renderer:
+1. Receives a decoded frame as a pixel buffer (decoded with libjpeg-turbo for MJPEG, passed through as-is for uncompressed formats)
+2. Uploads it as a 2D texture
+3. Runs a fragment shader that handles YUV→RGB conversion (where needed) and scaling/filtering
+4. Presents via GLFW's swap-buffers call (`glfwSwapBuffers`)
+
+This keeps CPU load low, since colour conversion and scaling happen on the GPU, while remaining simple relative to a full Vulkan pipeline.
+
+#### Renderer: Vulkan (future alternative)
+
+A Vulkan renderer is planned as an alternative to the OpenGL one. GLFW's window and input APIs do not depend on the rendering API (a window can be created without an OpenGL context and given a Vulkan surface instead), so the window management and input handling code is shared. Only the renderer backend changes.
+
+Vulkan offers more explicit control over presentation timing, multi-queue workloads, and compute shaders (e.g. on-GPU MJPEG decode via a compute pass if a suitable library is available). It is not needed for the initial viewer but worth having for high-frame-rate or multi-stream display scenarios.
+
+The renderer selection should be a compile-time or runtime option — both implementations conform to the same internal interface (`render_frame(pixel_buffer, width, height, format)`).
 
 ---