From 00560591e24de46402f35987f11247b8be0c6162 Mon Sep 17 00:00:00 2001 From: mikael-lovqvists-claude-agent Date: Wed, 25 Mar 2026 20:40:00 +0000 Subject: [PATCH] Add architecture, planning, and conventions documents Initial documentation for the multi-peer video routing system: - architecture.md covers graph model, transport protocol, relay design, and the Pi get-it-on-the-wire-first rationale - planning.md defines module build order and directory structure - conventions.md captures C11 code style, naming, error handling approach, and directory layout rules Co-Authored-By: Claude Sonnet 4.6 --- .gitignore | 1 + architecture.md | 259 ++++++++++++++++++++++++++++++++++++++++++++++++ conventions.md | 171 ++++++++++++++++++++++++++++++++ planning.md | 70 +++++++++++++ 4 files changed, 501 insertions(+) create mode 100644 .gitignore create mode 100644 architecture.md create mode 100644 conventions.md create mode 100644 planning.md diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..feead5b --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +reference/ diff --git a/architecture.md b/architecture.md new file mode 100644 index 0000000..37694b8 --- /dev/null +++ b/architecture.md @@ -0,0 +1,259 @@ +# Video Routing System — Architecture + +## Concept + +A graph-based multi-peer video routing system where nodes are media processes and edges are transport connections. The graph carries video streams between sources, relay nodes, and sinks, with **priority** as a first-class property on paths — so that a low-latency monitoring feed and a high-quality archival feed can coexist and be treated differently by the system. + +--- + +## Design Rationale + +### Get It on the Wire First + +A key principle driving the architecture is that **capture devices should not be burdened with processing**. 
+ +A Raspberry Pi attached to a camera (V4L2 source) is capable of pulling raw or MJPEG frames off the device, but it is likely too resource-constrained to also transcode, mux, or perform any non-trivial stream manipulation. Doing so would add latency and compete with the capture process itself. + +The preferred model is: + +1. **Pi captures and transmits raw** — reads frames directly from V4L2 (MJPEG or raw Bayer/YUV) and puts them on the wire over TCP as fast as possible, with no local transcoding +2. **A more capable machine receives and defines the stream** — a downstream node with proper CPU/GPU resources receives the raw feed and produces well-formed, containerized, or re-encoded output appropriate for the intended consumers (display, archive, relay) + +This separation means the Pi's job is purely ingestion and forwarding. It keeps the capture loop tight and latency minimal. The downstream node then becomes the "source" of record for the rest of the graph. + +This is also why the V4L2 remote control protocol is useful — the Pi doesn't need to run any control logic locally. It exposes its camera parameters over TCP, and the controlling machine adjusts exposure, white balance, codec settings, etc. remotely. The Pi just acts on the commands. + +--- + +## Graph Model + +### Nodes + +Each node is a named process instance, identified by a namespace and name (e.g. `v4l2:microscope`, `ffmpeg:ingest1`, `mpv:preview`, `archiver:main`). + +Node types: + +| Type | Role | +|---|---| +| **Source** | Produces video — V4L2 camera, file, test signal | +| **Relay** | Receives one or more input streams and distributes to one or more outputs, each with its own delivery mode and buffer; never blocks upstream | +| **Sink** | Consumes video — display, archiver, encoder output | + +A relay with multiple inputs is what would traditionally be called a mux — it combines streams from several sources and forwards them, possibly over a single transport. 
The dispatch and buffering logic is the same regardless of input count. + +### Edges + +An edge is a transport connection between two nodes. Edges carry: + +- The video stream itself (TCP, pipe, or other transport) +- A **priority** value +- A **transport mode** — opaque or encapsulated (see [Transport Protocol](#transport-protocol)) + +### Priority + +Priority governs how the system allocates resources and makes trade-offs when paths compete: + +- **High priority (low latency)** — frames are forwarded immediately; buffering is minimized; if a downstream node is slow it gets dropped frames, not delayed ones; quality may be lower +- **Low priority (archival)** — frames may be buffered, quality should be maximized; latency is acceptable; dropped frames are undesirable + +Priority is a property of the *path*, not of the source. The same source can feed a high-priority monitoring path and a low-priority archival path simultaneously. + +--- + +## Control Plane + +A central **message hub** coordinates the graph. Nodes self-register with an identity (`origin`, `function`, `description`) and communicate via unicast or broadcast messages. + +The hub does not dictate topology — nodes announce their capabilities and the controller assembles connections. This keeps the hub stateless with respect to stream routing; it only routes control messages. + +The hub protocol is newline-delimited JSON over TCP. V4L2 device control and enumeration are carried as control messages within the encapsulated transport rather than on a separate connection — see [Transport Protocol](#transport-protocol). 
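Because the hub protocol is newline-delimited JSON over TCP, a receiver has to cope with messages that arrive split across reads or several to a read. The following is a minimal sketch of that line reassembly, not the project's implementation: `struct Hub_Line_Buffer`, `hub_line_buffer_append()`, `hub_line_buffer_take()` and the 4096-byte cap are all hypothetical names and values chosen for illustration, and JSON parsing is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HUB_LINE_BUFFER_SIZE 4096 /* hypothetical cap on buffered bytes */

/* Accumulates raw socket bytes until at least one full line is present. */
struct Hub_Line_Buffer {
	char data[HUB_LINE_BUFFER_SIZE];
	size_t used;
};

/* Append bytes as they arrive. Returns 1 on success, 0 if they do not fit. */
int hub_line_buffer_append(struct Hub_Line_Buffer *buffer, const char *bytes, size_t length) {
	assert(buffer != NULL && bytes != NULL);
	if (length > HUB_LINE_BUFFER_SIZE - buffer->used) {
		return 0;
	}
	memcpy(buffer->data + buffer->used, bytes, length);
	buffer->used += length;
	return 1;
}

/*
 * Move the next complete message (without its '\n') into out as a C string.
 * Returns 1 if a message was produced, 0 if no complete line is buffered yet,
 * and -1 if the line does not fit in out (the line stays in the buffer).
 */
int hub_line_buffer_take(struct Hub_Line_Buffer *buffer, char *out, size_t out_size) {
	assert(buffer != NULL && out != NULL);
	char *newline = memchr(buffer->data, '\n', buffer->used);
	if (newline == NULL) {
		return 0;
	}
	size_t line_length = (size_t)(newline - buffer->data);
	if (line_length + 1 > out_size) {
		return -1;
	}
	memcpy(out, buffer->data, line_length);
	out[line_length] = '\0';
	/* Shift any bytes after the newline to the front of the buffer. */
	size_t consumed = line_length + 1;
	memmove(buffer->data, buffer->data + consumed, buffer->used - consumed);
	buffer->used -= consumed;
	return 1;
}
```

A caller would append whatever each read returns and then call `hub_line_buffer_take()` in a loop until it returns 0, handing each extracted line to the JSON parser.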
+ +--- + +## Ingestion Pipeline (Raspberry Pi Example) + +```mermaid +graph LR + CAM[V4L2 Camera\ndev/video0] -->|raw MJPEG| PI[Pi: forward over TCP\nno processing] + PI -->|TCP stream| INGEST[Ingest Node\nmore capable machine] + INGEST -->|well-formed stream| RELAY[Relay] + RELAY -->|high priority| DISPLAY[Display / Preview\nlow latency] + RELAY -->|low priority| ARCHIVE[Archiver\nhigh quality] + INGEST -.->|V4L2 control\nvia transport| PI + CTRL[Control Plane\nmessage hub] -.->|node control| INGEST + CTRL -.->|node control| RELAY +``` + +The Pi runs only: +- A forwarding process (e.g. `dd if=/dev/video0 | nc ` or equivalent) +- The V4L2 control endpoint (receives parameter change commands) + +Everything else happens on machines with adequate resources. + +--- + +## Transport Protocol + +Transport between nodes operates in one of two modes. The choice is per-edge and has direct implications for what the relay on that edge can do. + +### Opaque Binary Stream + +The transport forwards bytes as they arrive with no understanding of frame boundaries. The relay acts as a pure byte pipe. + +- Zero framing overhead +- Cannot drop frames (frame boundaries are unknown) +- Cannot multiplex multiple streams (no way to distinguish them) +- Cannot do per-frame accounting (byte budgets become byte-rate estimates only) +- Low-latency output is not available — the relay cannot discard a partial frame + +This mode is appropriate for simple point-to-point forwarding where the consumer handles all framing, and where the relay has no need for frame-level intelligence. + +### Frame-Encapsulated Stream + +Each message is prefixed with a small fixed-size header. This applies to both video frames and control messages — the transport is unified. + +Header fields: + +| Field | Size | Purpose | +|---|---|---| +| `message_type` | 2 bytes | Distinguishes video frame, control request, control response | +| `channel_id` | 2 bytes | For video: identifies the stream. 
For control: identifies the request/response pair (correlation ID) | +| `payload_length` | 4 bytes | Byte length of the following payload | + +**Message types:** + +| Value | Meaning | +|---|---| +| `0x0001` | Video frame | +| `0x0002` | Control request (JSON) | +| `0x0003` | Control response (JSON) | + +Video frame payloads are raw compressed frames. Control payloads are JSON — the same request/response structure as the V4L2 RPC protocol, but carried inline on the same connection rather than on a separate port. + +### Unified Control and Video on One Connection + +By carrying control messages on the same transport as video frames, the system avoids managing separate connections per peer. A node that receives a video stream can be queried or commanded over the same socket. + +This directly enables **remote device enumeration**: a connecting node can issue a control request asking what V4L2 devices the remote host exposes, and receive the list in a control response — before any video streams are established. Discovery and streaming share the same channel. + +The V4L2 control operations map naturally to control request/response pairs: + +| Operation | Direction | +|---|---| +| Enumerate devices | request → response | +| Get device controls (parameters, ranges, menus) | request → response | +| Get control values | request → response | +| Set control values | request → response (ack/fail) | + +Control messages are low-volume and can be interleaved with the video frame stream without meaningful overhead. 
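The fixed-size header described above could be encoded and decoded roughly as follows. This is a sketch under stated assumptions: the document gives the field sizes and message type values, but not the byte order or the field order on the wire, so big-endian encoding in table order is assumed here. The function names, the `MSG_*` enum values and the plain `int` return (rather than `struct App_Error`) are hypothetical and used only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TRANSPORT_HEADER_SIZE = 8 }; /* 2 + 2 + 4 bytes, per the header table */

typedef enum Message_Type {
	MSG_VIDEO_FRAME      = 0x0001,
	MSG_CONTROL_REQUEST  = 0x0002,
	MSG_CONTROL_RESPONSE = 0x0003,
} Message_Type;

struct Frame_Header {
	uint16_t message_type;
	uint16_t channel_id;
	uint32_t payload_length;
};

/* Write the header into out[0..7]. Big-endian byte order is an assumption. */
void transport_header_encode(const struct Frame_Header *header, uint8_t *out) {
	assert(header != NULL && out != NULL);
	out[0] = (uint8_t)(header->message_type >> 8);
	out[1] = (uint8_t)(header->message_type & 0xff);
	out[2] = (uint8_t)(header->channel_id >> 8);
	out[3] = (uint8_t)(header->channel_id & 0xff);
	out[4] = (uint8_t)(header->payload_length >> 24);
	out[5] = (uint8_t)((header->payload_length >> 16) & 0xff);
	out[6] = (uint8_t)((header->payload_length >> 8) & 0xff);
	out[7] = (uint8_t)(header->payload_length & 0xff);
}

/* Read a header from in[0..7]. Returns 1 on success, 0 on unknown message type. */
int transport_header_decode(const uint8_t *in, struct Frame_Header *header) {
	assert(in != NULL && header != NULL);
	uint16_t message_type = (uint16_t)((in[0] << 8) | in[1]);
	if (message_type < MSG_VIDEO_FRAME || message_type > MSG_CONTROL_RESPONSE) {
		return 0;
	}
	header->message_type = message_type;
	header->channel_id = (uint16_t)((in[2] << 8) | in[3]);
	header->payload_length = ((uint32_t)in[4] << 24)
		| ((uint32_t)in[5] << 16)
		| ((uint32_t)in[6] << 8)
		| (uint32_t)in[7];
	return 1;
}
```

A reader on an encapsulated edge would read exactly `TRANSPORT_HEADER_SIZE` bytes, decode them, and then read `payload_length` further bytes. Writing the bytes out one at a time avoids depending on struct padding or host endianness.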
+ +### Capability Implications + +| Feature | Opaque | Encapsulated | +|---|---|---| +| Simple forwarding | yes | yes | +| Low-latency drop | **no** | yes | +| Per-frame byte accounting | **no** | yes | +| Multi-stream over one transport | **no** | yes | +| Sequence numbers / timestamps | **no** | yes (via extension) | +| Control / command channel | **no** | yes | +| Remote device enumeration | **no** | yes | + +The most important forcing function is **low-latency relay**: to drop a pending frame when a newer one arrives, the relay must know where frames begin and end. An opaque stream cannot support this, so any edge that requires low-latency output must use encapsulation. + +Opaque streams are a valid optimization for leaf edges where the downstream consumer (e.g. an archiver writing raw bytes to disk) does its own framing, requires no relay intelligence, and has no need for remote control. + +--- + +## Relay Design + +A relay receives frames from one or more upstream sources and distributes them to any number of outputs. Each output is independently configured with a **delivery mode** that determines how it handles the tension between latency and completeness. + +### Output Delivery Modes + +**Low-latency mode** — minimize delay, accept loss + +The output holds at most one pending frame. When a new frame arrives: +- If the slot is empty, the frame occupies it and is sent as soon as the transport allows +- If the slot is already occupied (transport not ready), the pending frame is dropped and the incoming frame takes its place, because the pending frame is already stale + +The consumer always receives the most recent frame the transport could deliver. Frame loss is expected and acceptable. + +**Completeness mode** — minimize loss, accept delay + +The output maintains a queue. When a new frame arrives it is enqueued. The transport drains the queue in order. When the queue is full, a drop policy is applied — either drop the oldest frame (preserve recency) or drop the newest (preserve continuity). 
Which policy fits depends on the consumer: an archiver may prefer continuity; a scrubber may prefer recency. + +### Memory Model + +Compressed frames have variable sizes (I-frames vs P-frames, quality settings, scene complexity), so fixed-slot buffers waste memory unpredictably. The preferred model is **per-frame allocation** with explicit bookkeeping. + +Each allocated frame is tracked with at minimum: +- Byte size +- Sequence number or timestamp +- Which outputs still hold a reference + +Limits are enforced per output independently — not as a shared pool — so a slow completeness output cannot starve a low-latency output or exhaust global memory. Per-output limits have two axes: +- **Frame count** — cap on number of queued frames +- **Byte budget** — cap on total bytes in flight for that output + +Both limits should be configurable. Either limit being reached triggers the drop policy. + +### Congestion: Two Sides + +Congestion can arise at both ends of the relay and must be handled explicitly on each. + +**Inbound congestion (upstream → relay)** + +If the upstream source produces frames faster than any output can dispatch them: +- Low-latency outputs are unaffected by design — they always hold at most one frame +- Completeness outputs will see their queues grow; limits and drop policy absorb the excess + +The relay never signals backpressure to the upstream. It is the upstream's concern to produce frames at a sustainable rate; the relay's concern is only to handle whatever arrives without blocking. 
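The per-output limits and drop policy for a completeness output can be sketched as below. This is an illustration under assumptions, not the planned `relay` or `frame_alloc` design: it tracks bookkeeping only (sequence number and byte size), leaves out payload storage and reference counting, and uses a fixed ring of 64 entries. All identifiers (`struct Output_Queue`, `output_queue_push()`, `DROP_OLDEST`, `DROP_NEWEST`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OUTPUT_QUEUE_CAPACITY 64 /* hypothetical ring size for this sketch */

typedef enum Drop_Policy {
	DROP_OLDEST, /* preserve recency */
	DROP_NEWEST, /* preserve continuity */
} Drop_Policy;

struct Queued_Frame {
	uint64_t sequence;
	size_t byte_size;
};

struct Output_Queue {
	struct Queued_Frame frames[OUTPUT_QUEUE_CAPACITY];
	size_t head;         /* index of the oldest queued frame */
	size_t count;        /* frames currently queued */
	size_t bytes_queued; /* total bytes currently queued */
	size_t max_frames;   /* frame count limit */
	size_t max_bytes;    /* byte budget */
	Drop_Policy policy;
	uint64_t dropped;    /* congestion signal: frames dropped so far */
};

void output_queue_init(struct Output_Queue *queue, size_t max_frames,
		size_t max_bytes, Drop_Policy policy) {
	assert(queue != NULL);
	assert(max_frames >= 1 && max_frames <= OUTPUT_QUEUE_CAPACITY);
	queue->head = 0;
	queue->count = 0;
	queue->bytes_queued = 0;
	queue->max_frames = max_frames;
	queue->max_bytes = max_bytes;
	queue->policy = policy;
	queue->dropped = 0;
}

static void output_queue_drop_oldest(struct Output_Queue *queue) {
	queue->bytes_queued -= queue->frames[queue->head].byte_size;
	queue->head = (queue->head + 1) % OUTPUT_QUEUE_CAPACITY;
	queue->count -= 1;
	queue->dropped += 1;
}

/* Returns 1 if the frame was enqueued, 0 if it was dropped. Never blocks. */
int output_queue_push(struct Output_Queue *queue, uint64_t sequence, size_t byte_size) {
	assert(queue != NULL);
	if (byte_size > queue->max_bytes) {
		/* A frame larger than the whole budget can never be queued. */
		queue->dropped += 1;
		return 0;
	}
	/* Either limit being reached triggers the drop policy. */
	while (queue->count >= queue->max_frames
			|| byte_size > queue->max_bytes - queue->bytes_queued) {
		if (queue->policy == DROP_NEWEST) {
			queue->dropped += 1;
			return 0;
		}
		output_queue_drop_oldest(queue);
	}
	size_t tail = (queue->head + queue->count) % OUTPUT_QUEUE_CAPACITY;
	queue->frames[tail] = (struct Queued_Frame){ .sequence = sequence, .byte_size = byte_size };
	queue->count += 1;
	queue->bytes_queued += byte_size;
	return 1;
}
```

Because each output owns its own `struct Output_Queue`, a slow output only ever consumes its own budget, and the `dropped` counter is one of the values that could be reported on the control plane.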
+ +**Outbound congestion (relay → downstream transport)** + +If the transport layer cannot accept a frame immediately: +- Low-latency mode: the pending frame is dropped when the next frame arrives; the transport sends the newest frame it can when it becomes ready +- Completeness mode: the frame stays in the queue; the queue grows until the transport catches up or limits are reached + +The interaction between outbound congestion and the byte budget is important: a transport that is consistently slow will fill the completeness queue to its byte budget limit, at which point the drop policy engages. This is the intended safety valve — the budget defines the maximum acceptable latency inflation before the system reverts to dropping. + +### Congestion Signals + +Even though the relay does not apply backpressure, it should emit **observable congestion signals** — drop counts, queue depth, byte utilization — on the control plane so that the controller can make decisions: reduce upstream quality, reroute, alert, or adjust budgets dynamically. + +```mermaid +graph TD + UP1[Upstream Source A] -->|encapsulated stream| RELAY[Relay] + UP2[Upstream Source B] -->|encapsulated stream| RELAY + + RELAY --> LS[Low-latency Output\nsingle-slot\ndrop on collision] + RELAY --> CS[Completeness Output\nqueued\ndrop on budget exceeded] + RELAY --> OB[Opaque Output\nbyte pipe\nno frame awareness] + + LS -->|encapsulated| LC[Low-latency Consumer\neg. preview display] + CS -->|encapsulated| CC[Completeness Consumer\neg. archiver] + OB -->|opaque| RAW[Raw Consumer\neg. disk writer] + + RELAY -.->|drop count\nqueue depth\nbyte utilization| CTRL[Control Plane] +``` + +--- + +## Implementation Approach + +The system is built module by module in C11. Each translation unit is developed and validated independently before being integrated. See [planning.md](planning.md) for current status and module order, and [conventions.md](conventions.md) for code and project conventions. 
+ +The final deliverable is a single configurable node binary. During development, each module is exercised through small driver programs that live in the development tree, not in the module directories. + +--- + +## Open Questions + +- What is the graph's representation format — in-memory object graph, serialized config, or both? +- How are connections established — does the controller push connection instructions to nodes, or do nodes pull from a known address? +- Drop policy for completeness queues: drop oldest (recency) or drop newest (continuity)? Should be per-output configurable. +- When a relay has multiple inputs on an encapsulated transport, how are streams tagged on the outbound side — same `channel_id` passthrough, or remapped? +- What transport is used for relay edges — TCP, UDP, shared memory for local hops? +- How are nodes discovered — static config, mDNS, manual registration? +- Should per-output byte budgets be hard limits or soft limits with hysteresis? diff --git a/conventions.md b/conventions.md new file mode 100644 index 0000000..00b3c67 --- /dev/null +++ b/conventions.md @@ -0,0 +1,171 @@ +# Conventions and Preferences + +## Language + +- **C11** throughout +- Target platform: Linux (V4L2, epoll, etc. are Linux-specific) + +--- + +## Naming + +| Kind | Convention | Example | +|---|---|---| +| Structs | `Title_Snek_Case` | `struct App_Error`, `struct Frame_Header` | +| Enums (type) | `Title_Snek_Case` | `enum Error_Code` | +| Enum values | `CAPITAL_SNEK_CASE` | `ERR_NONE`, `ERR_SYSCALL` | +| Functions | `lower_snek_case` | `v4l2_ctrl_open()`, `app_error_print()` | +| Local variables | `lower_snek_case` | `frame_count`, `device_fd` | +| Constants (`#define`, `const`) | `CAPITAL_SNEK_CASE` | `MAX_FRAME_SIZE` | +| Module-level singletons | `lower_snek_case` | (treated as functions) | + +Identifiers are prefixed with their module name where disambiguation is needed (e.g. `v4l2_ctrl_`, `media_ctrl_`, `transport_`). 
+ +--- + +## Types + +### No typedefs for structs + +Always use the `struct` keyword at the call site. This makes composite types unambiguous without inspecting declarations. + +```c +/* correct */ +struct App_Error err = app_error_ok(); +void process(struct Frame_Header *header); + +/* wrong */ +typedef struct { ... } App_Error; +``` + +### typedefs are acceptable for + +- Enums: `typedef enum Error_Code { ... } Error_Code;` — allows using the name without `enum` keyword +- Scalar aliases where the underlying type is genuinely irrelevant to the caller (e.g. `typedef uint32_t Frame_Id;`) + +--- + +## Formatting + +- **Indentation**: tabs (display width 4) +- **Braces**: always use block braces, including single-statement bodies + +```c +/* correct */ +if (err.code != ERR_NONE) { + return err; +} + +/* wrong */ +if (err.code != ERR_NONE) + return err; +``` + +- **Quotes**: double quotes for string literals (C standard) + +--- + +## Error Handling + +Errors are returned as `struct App_Error` values. Functions that can fail return `struct App_Error` directly, or take an out-parameter for the result alongside it. + +### Structure + +```c +/* modules/common/error.h */ + +typedef enum Error_Code { + ERR_NONE = 0, + ERR_SYSCALL = 1, /* errno is meaningful */ + ERR_INVALID = 2, +} Error_Code; + +struct Syscall_Error_Detail { + int err_no; +}; + +struct Invalid_Error_Detail { + /* fields added as concrete cases arise */ +}; + +struct App_Error { + Error_Code code; + const char *file; /* __FILE__ — future: replaced by generated location code */ + int line; /* __LINE__ — future: replaced by generated location code */ + union { + struct Syscall_Error_Detail syscall; + struct Invalid_Error_Detail invalid; + } detail; +}; +``` + +### Macros + +```c +#define APP_OK \ + ((struct App_Error){ .code = ERR_NONE }) + +#define APP_IS_OK(e) \ + ((e).code == ERR_NONE) + +#define APP_ERROR(error_code, detail_field, ...) 
\ + ((struct App_Error){ \ + .code = (error_code), \ + .file = __FILE__, \ + .line = __LINE__, \ + .detail = { .detail_field = { __VA_ARGS__ } } \ + }) + +#define APP_SYSCALL_ERROR() \ + APP_ERROR(ERR_SYSCALL, syscall, .err_no = errno) +``` + +### Presentation + +Each error kind is handled in `app_error_print()`, which writes a human-readable description to stderr. A JSON writer (`app_error_write_json()`) will be added later and will use the same structured fields. + +### Future upgrade + +When the custom preprocessor is available, `__FILE__` and `__LINE__` will be replaced by a generated numeric location code that uniquely identifies the error site in the project. JSON consumers will see the same struct shape either way. + +--- + +## Directory Structure + +``` +video-setup/ + modules/ - translation units; each has a .h, a .c, and a Makefile + common/ - shared types (error, base definitions); no external dependencies + / + dev/ - development aids; not part of the final deliverable + cli/ - exploratory CLI drivers, one per module + experiments/ - freeform experiments + tests/ - automated tests (later) + Makefile - top-level build +``` + +Modules live only in `modules/`. CLI drivers and experiments live in `dev/`. Nothing in `dev/` is a dependency of anything in `modules/`. + +--- + +## Module Structure + +Each module directory contains: + +``` +modules// + .h - public API; minimal includes, no implementation details + .c - implementation + Makefile - builds a static object; links against common only +``` + +The corresponding CLI driver lives at `dev/cli/_cli.c`. 
+ +--- + +## Build + +- **GNU make** with tabs for recipe indentation +- Each module builds to a static object (`.o`) +- CLI drivers link the module object(s) they need +- No CDN or vendored sources; dependencies are system libraries (libc, Linux kernel headers) diff --git a/planning.md b/planning.md new file mode 100644 index 0000000..6f1f116 --- /dev/null +++ b/planning.md @@ -0,0 +1,70 @@ +# Planning + +## Approach + +Build the system module by module in C11. Each module is a translation unit (`.h` + `.c`) with a clearly defined API. Modules are exercised by small driver programs in `dev/` before anything depends on them. This keeps each unit independently testable and prevents architectural decisions from being made prematurely in code. + +The final binary is a single configurable node program. That integration work comes after the modules are solid. + +--- + +## Directory Structure + +``` +video-setup/ + modules/ + common/ - shared definitions (error types, base types) + media_ctrl/ - Linux Media Controller API (topology, pad formats, links) + v4l2_ctrl/ - V4L2 camera controls (enumerate, get, set) + dev/ + cli/ - exploratory CLI drivers, one per module + experiments/ - freeform experiments + tests/ - automated tests (later) + Makefile + architecture.md + planning.md + conventions.md +``` + +--- + +## Module Order + +Modules are listed in intended build order. Each depends only on modules above it. 
+ +| # | Module | Status | Notes | +|---|---|---|---| +| 1 | `common` | not started | Error types, base definitions — no dependencies | +| 2 | `media_ctrl` | not started | Media Controller API — device and topology enumeration, pad format config | +| 3 | `v4l2_ctrl` | not started | V4L2 controls — enumerate, get, set camera parameters | +| 4 | `transport` | not started | Encapsulated transport — header encode/decode, framed TCP read/write | +| 5 | `frame_alloc` | not started | Per-frame allocation with bookkeeping (byte budget, ref counting) | +| 6 | `relay` | not started | Input dispatch to output queues (low-latency and completeness modes) | +| 7 | `ingest` | not started | MJPEG frame parser (two-pass EOI state machine, opaque stream → discrete frames) | +| 8 | `archive` | not started | Write frames to disk, control messages to JSON log | + +--- + +## Dev CLI Drivers + +Each module gets a corresponding CLI driver in `dev/cli/` that exercises its API and serves as both an integration check and a useful development tool. + +| Driver | Exercises | Notes | +|---|---|---| +| `media_ctrl_cli` | `media_ctrl` | List media devices, show topology, configure pad formats | +| `v4l2_ctrl_cli` | `v4l2_ctrl` | List controls, get/set values — lightweight `v4l2-ctl` equivalent | +| `transport_cli` | `transport` | Send/receive framed messages, inspect headers | + +--- + +## Deferred Decisions + +These are open questions tracked in `architecture.md` that do not need to be resolved before module work begins: + +- Graph representation format +- Connection establishment model (push vs pull) +- Completeness queue drop policy (oldest vs newest, per-output config) +- Stream ID remapping across relay hops +- Transport for relay edges (TCP / UDP / shared memory) +- Node discovery mechanism +- Hard vs soft byte budget limits