docs: fix architecture drift — declarative model, resolved questions, corrections
- Add 'Declarative, Not Imperative' section near top explaining why the control model is wanted-state-based rather than imperative commands
- Update Control Plane section: remove 'connection instructions' language, replace with wanted state; note CLI controller comes before web UI
- Fix node naming example: xorg:preview instead of mpv:preview
- Update ingestion diagram: 'wanted state' instead of 'connection config'
- Add Per-Stream Stats note (stream_stats.h) to Node State Model
- Mark GET_CONFIG_STATE / GET_RUNTIME_STATE as planned, not yet implemented
- Split Open Questions: add Decided section for resolved questions (connection direction, stream ID assignment, single port, first delivery mode)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -8,6 +8,14 @@ A graph-based multi-peer video routing system where nodes are media processes an
## Design Rationale

+### Declarative, Not Imperative

+The control model is **declarative**: the controller sets *wanted state* on each node ("you should be ingesting /dev/video0 and sending stream 3 to node B"), and each node is responsible for reconciling its current state toward that goal autonomously. The controller does not issue step-by-step commands like "open device", "connect to peer", "start sending".

+This is a deliberate architectural decision. Imperative orchestration — where the controller drives each resource transition directly — is fragile: the controller must track the state of every remote resource, handle every failure sequence, and re-issue commands when things go wrong. Declarative orchestration pushes that responsibility to the node, which is the only place with direct access to its own resources and the ability to respond to local failures (device disconnect, transport drop, process crash) without round-tripping through the controller.

+The practical effect: the controller writes wanted state and the node's reconciler does the rest. The controller can query both wanted and current state at any time to understand the topology and health of the network — see [Node State Model](#node-state-model).
### Get It on the Wire First

A key principle driving the architecture is that **capture devices should not be burdened with processing**.
@@ -29,7 +37,7 @@ This is also why the V4L2 remote control protocol is useful — the Pi doesn't n
### Nodes

-Each node is a named process instance, identified by a namespace and name (e.g. `v4l2:microscope`, `ffmpeg:ingest1`, `mpv:preview`, `archiver:main`).
+Each node is a named process instance, identified by a namespace and name (e.g. `v4l2:microscope`, `xorg:preview`, `archiver:main`).

Node types:
@@ -66,7 +74,9 @@ There is no central hub or broker. Nodes communicate directly with each other ov
The controller role is a capability, not a singleton. Multiple nodes could hold it simultaneously; which one a user interacts with is a matter of which one they connect to. A node that is purely a source or relay with no UI holds no controller bits.

-The practical flow is: a user starts a node with the controller role and a web interface, discovers the other nodes on the network via the multicast announcement layer, and uses the UI to configure how streams are routed between them. The controller node issues connection instructions directly to the relevant peers over the binary protocol — there is no intermediary.
+The practical flow is: a user starts a node with the controller role, discovers the other nodes on the network via the multicast announcement layer, and uses the interface to configure how streams are routed between them. The controller writes wanted state to the relevant peers over the binary protocol — each peer then reconciles its own resources autonomously. There is no intermediary and no imperative step-by-step orchestration.

+The first controller interface is a CLI tool (`controller_cli`), which exercises the same protocol that the eventual web UI will use. The web UI is a later addition — the protocol and node behaviour are identical either way.
V4L2 device control and enumeration are carried as control messages within the encapsulated transport on the same connection as video — see [Transport Protocol](#transport-protocol).
@@ -80,8 +90,8 @@ graph LR
PI -->|encapsulated stream| RELAY[Relay]
RELAY -->|high priority| DISPLAY[Display / Preview<br>low latency]
RELAY -->|low priority| ARCHIVE[Archiver<br>high quality]
-CTRL[Controller node<br>web UI] -.->|V4L2 control<br>via transport| PI
-CTRL -.->|connection config| RELAY
+CTRL[Controller node<br>CLI or web UI] -.->|V4L2 control<br>via transport| PI
+CTRL -.->|wanted state| RELAY
```
The Pi runs a node process that dequeues V4L2 buffers and forwards each buffer as an encapsulated frame over TCP. It also exposes the V4L2 control endpoint for remote parameter adjustment.
@@ -760,13 +770,17 @@ typedef struct {
This makes it reusable across any node component in the project — not just video ingest. The video node registers its resources (device, transport connection, display sink) and their dependencies, then calls `reconciler_tick()` on events and periodically.
+### Per-Stream Stats

+Live fps and throughput are tracked per stream using a header-only rolling-window tracker (`include/stream_stats.h`). It maintains a 0.5s window and recomputes `fps` and `mbps` each time `stream_stats_tick()` returns true. Stats are recorded by calling `stream_stats_record_frame()` on each frame. The tracker is used directly in the ingest and sink paths and feeds the runtime state reported to the controller.
### Node State Queries

-Two protocol commands allow the controller to query a node's state:
+Two protocol commands allow the controller to query a node's state (planned — not yet implemented in the protocol module):

**`GET_CONFIG_STATE`** — returns the wanted state: which streams the node is configured to produce or consume, their destinations/sources, format, stream ID. This is the topology view — what is configured regardless of whether it is currently active.

-**`GET_RUNTIME_STATE`** — returns the current state: which resources are in which state, live fps/mbps per stream (from `stream_stats`), error codes for any failed resources.
+**`GET_RUNTIME_STATE`** — returns the current state: which resources are in which reconciler state, live fps/mbps per stream (from `stream_stats`), error codes for any failed resources.
The controller queries all discovered nodes, correlates streams by ID and peer address, and reconstructs the full graph from the union of responses. No central authority is needed — the graph emerges from the node state reports.
@@ -780,10 +794,20 @@ The source node connects outbound to the sink's transport server port. A single
---

+## Decided

+These were previously open questions and are now resolved:

+- **Connection direction**: the source node connects outbound to the sink's transport server. The controller writes wanted state to the source node including the destination host:port; the source's reconciler establishes the connection.
+- **Stream ID assignment**: stream IDs are assigned by the controller, not generated locally by nodes. This ensures both ends of a stream report the same ID and the graph can be reconstructed by correlating node state reports.
+- **Single port per node**: one TCP listening port handles all traffic — video frames, control messages, state queries — in both directions. Dedicated per-stream ports on separate sockets are a future option but not the default.
+- **First delivery mode**: low-latency (no-buffer) mode is implemented first. No frame queue anywhere in the pipeline — V4L2 dequeue goes directly to transport send; received frames render immediately and are dropped if the display is behind.

---
## Open Questions

- What is the graph's representation format — in-memory object graph, serialized config, or both?
-- How are connections established — does the controller push connection instructions to nodes, or do nodes pull from a known address?
- Drop policy for completeness queues: drop oldest (recency) or drop newest (continuity)? Should be per-output configurable.
- When a relay has multiple inputs on an encapsulated transport, how are streams tagged on the outbound side — same stream_id passthrough, or remapped?
- What transport is used for relay edges — TCP, UDP, shared memory for local hops?