video-setup/planning.md
mikael-lovqvists-claude-agent 61c81398bb feat: node-to-node MJPEG streaming CLIs and shared V4L2 format header
Add stream_send_cli (V4L2 capture → TCP → VIDEO_FRAME) and
stream_recv_cli (TCP → threaded frame slot → GLFW display) to
exercise end-to-end streaming between two nodes on the same machine
or across the network.

Add include/stream_stats.h (header-only rolling-window fps/Mbps tracker)
and include/v4l2_fmt.h (header-only V4L2 format enumeration shared between
v4l2_view_cli and stream_send_cli). Refactor v4l2_view_cli to use the
shared header.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 22:31:54 +00:00


Planning

Approach

Build the system module by module in C11. Each module is a header plus one source file (.h + .c), compiled as its own translation unit, with a clearly defined API. Modules are exercised by small driver programs in dev/ before anything depends on them. This keeps each unit independently testable and prevents architectural decisions from being made prematurely in code.

The final binary is a single configurable node program. That integration work comes after the modules are solid.


Directory Structure

video-setup/
  src/
    modules/
      common/        - shared definitions (error types, base types)
      media_ctrl/    - Linux Media Controller API (topology, pad formats, links)
      v4l2_ctrl/     - V4L2 camera controls (enumerate, get, set)
      serial/        - little-endian binary serialization primitives
      transport/     - framed TCP stream, single-write send
      protocol/      - typed write_*/read_* message functions
      test_image/    - test pattern generator (colour bars, ramp, grid; YUV420/BGRA)
      xorg/          - GLFW+OpenGL viewer sink; stub for headless builds
    node/            - video node entry point and top-level integration (later)
  include/           - public headers
  dev/
    cli/             - exploratory CLI drivers, one per module
    web/             - development web UI (Node.js/Express); browser-side equivalent
                       of the CLI tools; depends on protocol being finalised
    experiments/     - freeform experiments
  tools/
    gen_font_atlas/  - build-time bitmap font atlas generator (Python/Pillow);
                       outputs build/gen/font_atlas.h consumed by xorg module
  tests/             - automated tests (later)
  Makefile
  architecture.md
  planning.md
  conventions.md
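
As an illustration of what the serial/ entry above describes, here is a minimal sketch of little-endian put/get primitives operating on a byte buffer. The function names (put_u16_le, get_u32_le, and so on) are hypothetical and chosen for this example only; the real module's API may look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the real serial module's API may differ. */

/* Store a 16-bit value at buf[0..1], least significant byte first. */
static inline void put_u16_le(uint8_t *buf, uint16_t v)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(v & 0xFF);
    buf[1] = (uint8_t)(v >> 8);
}

/* Store a 32-bit value at buf[0..3], least significant byte first. */
static inline void put_u32_le(uint8_t *buf, uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(v & 0xFF);
    buf[1] = (uint8_t)((v >> 8) & 0xFF);
    buf[2] = (uint8_t)((v >> 16) & 0xFF);
    buf[3] = (uint8_t)(v >> 24);
}

/* Load a 16-bit little-endian value from buf[0..1]. */
static inline uint16_t get_u16_le(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint16_t)(buf[0] | ((uint16_t)buf[1] << 8));
}

/* Load a 32-bit little-endian value from buf[0..3]. */
static inline uint32_t get_u32_le(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}
```

Writing bytes individually, rather than casting the buffer to an integer pointer, keeps the encoding independent of host byte order and avoids unaligned access.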

Module Order

Modules are listed in intended build order. Each depends only on modules above it.

| #  | Module      | Status      | Notes |
|----|-------------|-------------|-------|
| 1  | common      | done        | Error types, base definitions; no dependencies |
|    | config      | done        | INI file loader with schema-driven defaults, typed getters, FLAGS type for bitmask values |
| 2  | media_ctrl  | done        | Media Controller API: device and topology enumeration, pad format config |
| 3  | v4l2_ctrl   | done        | V4L2 controls: enumerate, get, set camera parameters |
| 4  | serial      | done        | put/get primitives for little-endian binary serialization into byte buffers |
| 5  | transport   | done        | Encapsulated transport: frame header, TCP stream abstraction, single-write send |
| 6  | discovery   | done        | UDP multicast announcements, peer table, found/lost callbacks |
| 7  | protocol    | done        | Typed write_*/read_* functions for all message types; builds on serial + transport |
|    | node        | done        | Video node binary: config, discovery, transport server, V4L2/media control request handlers |
| 8  | test_image  | done        | Test pattern generator: colour bars, luminance ramp, grid crosshatch; YUV420/BGRA output |
| 9  | xorg        | done        | GLFW+OpenGL viewer sink: YUV420/BGRA/MJPEG display, all scale/anchor modes, bitmap font atlas text overlays; XRandR queries and screen grab not yet implemented |
| 10 | frame_alloc | not started | Per-frame allocation with bookkeeping (byte budget, ref counting) |
| 11 | relay       | not started | Input dispatch to output queues (low-latency and completeness modes) |
| 12 | ingest      | not started | V4L2 capture loop: dequeue buffers, emit one encapsulated frame per buffer |
| 13 | archive     | not started | Write frames to disk, control messages to binary log |
| 14 | codec       | not started | Per-frame encode/decode: MJPEG (libjpeg-turbo), QOI, ZSTD-raw, VA-API H.264 intra; used by screen grab source and archive |
| 15 | web node    | not started | Node.js/Express peer: speaks binary protocol on socket side, HTTP/WebSocket to browser; protocol.mjs mirrors C protocol module |
|    | mjpeg_scan  | future      | EOI marker scanner for misbehaving hardware that does not deliver clean per-buffer frames; not part of the primary pipeline |
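
The transport row above mentions a frame header and a single-write send. The sketch below shows one way that idea can work: the header and payload are encoded into one contiguous buffer and handed to the kernel with one write() call, so a frame is never split across separate header and payload writes. The 6-byte header layout (u16 type, u32 length, little-endian) and the names frame_encode and frame_send are assumptions made for this example, not the project's actual wire format or API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical 6-byte header: u16 message type, u32 payload length,
 * both little-endian. The real transport header may differ. */
#define FRAME_HEADER_SIZE ((size_t)6)

/* Encode header + payload contiguously into out.
 * Returns the total frame size, or 0 if out is too small. */
static size_t frame_encode(uint8_t *out, size_t cap, uint16_t type,
                           const void *payload, uint32_t len)
{
    assert(out != NULL);
    assert(payload != NULL || len == 0);
    if (cap < FRAME_HEADER_SIZE || cap - FRAME_HEADER_SIZE < len)
        return 0;
    out[0] = (uint8_t)(type & 0xFF);
    out[1] = (uint8_t)(type >> 8);
    out[2] = (uint8_t)(len & 0xFF);
    out[3] = (uint8_t)((len >> 8) & 0xFF);
    out[4] = (uint8_t)((len >> 16) & 0xFF);
    out[5] = (uint8_t)(len >> 24);
    if (len > 0)
        memcpy(out + FRAME_HEADER_SIZE, payload, len);
    return FRAME_HEADER_SIZE + len;
}

/* Pass the whole frame to write() in one call where possible; loop
 * only to finish a short write or to retry after EINTR.
 * Returns 0 on success, -1 on error (errno is set by write). */
static int frame_send(int fd, const uint8_t *frame, size_t size)
{
    size_t off = 0;
    assert(frame != NULL);
    while (off < size) {
        ssize_t n = write(fd, frame + off, size - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        off += (size_t)n;
    }
    return 0;
}
```

A receiver under the same assumptions would read exactly 6 header bytes, decode the length, and then read that many payload bytes.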

Dev Tools

CLI (dev/cli/)

Each module gets a corresponding CLI driver that exercises its API and serves as both an integration check and a useful development tool.

| Driver          | Exercises                   | Notes |
|-----------------|-----------------------------|-------|
| media_ctrl_cli  | media_ctrl                  | List media devices, show topology, configure pad formats |
| v4l2_ctrl_cli   | v4l2_ctrl                   | List controls, get/set values; lightweight v4l2-ctl equivalent |
| transport_cli   | transport                   | Send/receive framed messages, inspect headers |
| test_image_cli  | test_image                  | Generate test patterns, write PPM for visual inspection |
| xorg_cli        | xorg                        | Display test pattern in viewer window; exercises scale/anchor modes and text overlays |
| v4l2_view_cli   | V4L2 + xorg                 | Live camera viewer: auto-selects highest-FPS format, FPS/format overlay; bypasses node system |
| stream_send_cli | V4L2 + transport + protocol | Capture MJPEG from V4L2, connect to receiver, send VIDEO_FRAME messages; prints fps/Mbps stats |
| stream_recv_cli | transport + protocol + xorg | Listen for incoming VIDEO_FRAME stream, display in viewer; fps/Mbps overlay; threaded transport→GL handoff |
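
The streaming drivers above report fps and Mbps. One plausible way to compute such figures is a rolling window over recent frame arrivals, sketched below. This is not the contents of the project's stats header; the struct, the function names, and the fixed ring capacity are all assumptions for illustration. Timestamps are passed in explicitly (in seconds) so the logic can be tested without a real clock.

```c
#include <assert.h>
#include <stddef.h>

#define STATS_CAP 256  /* assumed maximum samples kept in the window */

struct stats_sample { double t; size_t bytes; };

struct stream_stats {
    struct stats_sample ring[STATS_CAP];
    size_t head;    /* index of the oldest sample */
    size_t count;   /* number of valid samples */
    double window;  /* window length in seconds */
};

static void stats_init(struct stream_stats *s, double window_s)
{
    assert(s != NULL && window_s > 0.0);
    s->head = 0;
    s->count = 0;
    s->window = window_s;
}

/* Record one frame of `bytes` bytes arriving at time `now`. */
static void stats_add(struct stream_stats *s, double now, size_t bytes)
{
    /* Evict samples that have fallen out of the window. */
    while (s->count > 0 && now - s->ring[s->head].t > s->window) {
        s->head = (s->head + 1) % STATS_CAP;
        s->count--;
    }
    /* If the ring is still full, overwrite the oldest sample. */
    if (s->count == STATS_CAP) {
        s->head = (s->head + 1) % STATS_CAP;
        s->count--;
    }
    size_t tail = (s->head + s->count) % STATS_CAP;
    s->ring[tail].t = now;
    s->ring[tail].bytes = bytes;
    s->count++;
}

/* Time between the oldest and newest sample, or 0 with < 2 samples. */
static double stats_span(const struct stream_stats *s)
{
    if (s->count < 2)
        return 0.0;
    size_t newest = (s->head + s->count - 1) % STATS_CAP;
    return s->ring[newest].t - s->ring[s->head].t;
}

/* Frames per second: intervals between samples divided by the span. */
static double stats_fps(const struct stream_stats *s)
{
    double span = stats_span(s);
    return span > 0.0 ? (double)(s->count - 1) / span : 0.0;
}

/* Megabits per second. The oldest sample only marks the start of the
 * span, so its bytes are excluded from the total. */
static double stats_mbps(const struct stream_stats *s)
{
    double span = stats_span(s);
    size_t total = 0;
    for (size_t i = 1; i < s->count; i++)
        total += s->ring[(s->head + i) % STATS_CAP].bytes;
    return span > 0.0 ? (double)total * 8.0 / span / 1e6 : 0.0;
}
```

In a real driver, the timestamp would typically come from clock_gettime with CLOCK_MONOTONIC so that wall-clock adjustments do not distort the rates.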

Web UI (dev/web/)

A Node.js/Express development web UI that connects to running video nodes as a binary protocol peer. It exposes V4L2 control inspection and adjustment, a media topology view, and stream state through a browser interface. Live peer discovery is pushed to the browser via Server-Sent Events (SSE), so discovered nodes appear automatically in the UI.

This is a development aid, not the production dashboard. The production dashboard (full stream configuration UI) is a later, separate project.


Future: Protocol Preprocessor

The C protocol module and the JavaScript protocol.mjs will eventually be generated from a single schema by a preprocessor, which eliminates drift between the two implementations. The preprocessor will also handle error location codes (see common/error). Neither the schema format nor the preprocessor tool exists yet; the hand-written implementations are the interim state.


Deferred Decisions

These are open questions tracked in architecture.md that do not need to be resolved before module work begins:

  • Graph representation format
  • Connection establishment model (push vs pull)
  • Completeness queue drop policy (oldest vs newest, per-output config)
  • Stream ID remapping across relay hops
  • Transport for relay edges (TCP / UDP / shared memory)
  • Node discovery mechanism
  • Hard vs soft byte budget limits
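
To make the drop-policy question above concrete, the sketch below shows the two candidate behaviours on a small bounded queue. It is purely illustrative: no policy has been chosen, and the queue layout, the names, and the use of int as a stand-in for a frame handle are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum drop_policy { DROP_OLDEST, DROP_NEWEST };

#define QUEUE_CAP 4  /* deliberately tiny for illustration */

struct frame_queue {
    int items[QUEUE_CAP];     /* stand-in for frame handles */
    size_t head;              /* index of the oldest queued frame */
    size_t count;             /* number of queued frames */
    size_t dropped;           /* frames discarded so far */
    enum drop_policy policy;  /* could be set per output */
};

static void queue_init(struct frame_queue *q, enum drop_policy policy)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    q->dropped = 0;
    q->policy = policy;
}

/* Returns true if `frame` was enqueued, false if it was rejected. */
static bool queue_push(struct frame_queue *q, int frame)
{
    if (q->count == QUEUE_CAP) {
        q->dropped++;
        if (q->policy == DROP_NEWEST)
            return false;  /* keep the backlog, reject the new frame */
        /* DROP_OLDEST: discard the head to make room. */
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
    }
    q->items[(q->head + q->count) % QUEUE_CAP] = frame;
    q->count++;
    return true;
}

/* Returns true and stores the oldest frame in *out, or false if empty. */
static bool queue_pop(struct frame_queue *q, int *out)
{
    assert(out != NULL);
    if (q->count == 0)
        return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return true;
}
```

With a capacity of 4 and frames 1 to 6 pushed without any pops, DROP_OLDEST leaves 3, 4, 5, 6 in the queue, while DROP_NEWEST leaves 1, 2, 3, 4. The first favours freshness, the second favours an unbroken prefix.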