docs: node state model, reconciler design, graph introspection
Document the wanted/current state separation, generic resource state machine reconciler (BFS pathfinding, event + periodic tick), node state queries (GET_CONFIG_STATE / GET_RUNTIME_STATE), stream ID assignment by controller, and connection direction model. Add reconciler module to module order and reconciler_cli experiment to CLI tools table in planning.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -690,6 +690,96 @@ A fully-qualified node address is `site_id:namespace:instance`. Within a single
|
||||
|
||||
---
|
||||
|
||||
## Node State Model
|
||||
|
||||
### Wanted vs Current State
|
||||
|
||||
Each node maintains two independent views of its configuration:
|
||||
|
||||
**Wanted state** — the declared intent for this node. Set by the controller via protocol commands and persisted independently of whether the underlying resources are actually running. Examples: "ingest /dev/video0 as stream 3, send to 192.168.1.2:8001", "display stream 3 in a window". Wanted state survives connection drops, device loss, and restarts — it represents what the node *should* be doing.
|
||||
|
||||
**Current state** — what the node is actually doing right now. Derived from which file descriptors are open, which transport connections are established, which processes are running. Changes as resources are acquired or released.
|
||||
|
||||
The controller queries both views to construct the graph. Wanted state gives the topology (what is configured). Current state gives the runtime overlay (what is live, with stats).
|
||||
|
||||
This separation means the web UI can show an edge as grey (configured but not connected), green (connected and streaming), or red (configured but failed) without any special-casing: green means wanted and current state agree, while grey and red both mean they disagree and differ only in whether current state reports an error for the resource.
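As a rough illustration of the colour rule, the sketch below derives an edge colour from the two views. `EdgeView`, `EdgeColour`, and `edge_colour` are hypothetical names, and using a non-zero error code to separate red from grey is an assumption based on the error codes that `GET_RUNTIME_STATE` is described as returning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration; not part of the documented protocol. */
typedef enum { EDGE_GREY, EDGE_GREEN, EDGE_RED } EdgeColour;

typedef struct {
    bool wanted;      /* edge appears in wanted state (configured) */
    bool live;        /* edge appears in current state (connected and streaming) */
    int  error_code;  /* assumption: 0 means no error reported for the resource */
} EdgeView;

/* Only configured edges are drawn, so `wanted` is treated as a precondition. */
EdgeColour edge_colour(const EdgeView *e)
{
    assert(e->wanted);
    if (e->live)            return EDGE_GREEN;  /* wanted and current agree */
    if (e->error_code != 0) return EDGE_RED;    /* configured but failed */
    return EDGE_GREY;                           /* configured, not connected */
}
```

The function has no per-resource branches; it only compares the two views, which is the property the paragraph above relies on.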
|
||||
|
||||
### Reconciler
|
||||
|
||||
A generic reconciler closes the gap between wanted and current state. It is invoked:
|
||||
|
||||
- **On event** — transport disconnect, device error, process exit, `STREAM_OPEN` received: fast response to state changes
- **On periodic tick** — safety net; catches external failures that produce no callback (e.g. a device that silently disappears and reappears)
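The two triggers can be folded into a single "is a reconcile pass due?" check. This is a minimal sketch under assumed names (`ReconcileTrigger`, `reconcile_due`) and an assumed millisecond clock; the document does not specify how the node schedules the reconciler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical trigger state; event callbacks set event_pending. */
typedef struct {
    uint64_t last_run_ms;    /* time of the previous reconcile pass */
    uint64_t period_ms;      /* periodic tick interval */
    bool     event_pending;  /* set on disconnect, device error, process exit, ... */
} ReconcileTrigger;

/* Returns true when a pass should run now, and records that it ran. */
bool reconcile_due(ReconcileTrigger *t, uint64_t now_ms)
{
    bool due = t->event_pending || (now_ms - t->last_run_ms >= t->period_ms);
    if (due) {
        t->event_pending = false;
        t->last_run_ms = now_ms;
    }
    return due;
}
```

An event resets the periodic timer in this sketch, so a busy node does not run redundant ticks right after an event-driven pass. That is a design choice of the example, not something the source states.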
|
||||
|
||||
The reconciler does not know what a "stream" or a "V4L2 device" is. It operates on abstract state machines, each representing one resource. Resources declare their states, transitions, and dependencies; the reconciler finds the path from current to wanted state and executes the transitions in order.
|
||||
|
||||
### Resource State Machines
|
||||
|
||||
Each managed resource is described as a directed graph:
|
||||
|
||||
- **Nodes** are states (e.g. `CLOSED`, `OPEN`, `STREAMING`)
- **Edges** are transitions with associated actions (e.g. `open_fd`, `close_fd`, `connect_transport`, `spawn_process`)
- **Dependencies** between resources constrain ordering (e.g. transport connection requires device open)
|
||||
|
||||
The state graphs are small and defined at compile time. Pathfinding is BFS — with 3–5 states per resource the cost is negligible. The benefit is that adding a new resource type (e.g. an ffmpeg subprocess for codec work) requires only defining its state graph and declaring its dependencies; the reconciler's core logic is unchanged.
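A breadth-first search over such a graph could look like the sketch below. `StateGraph`, `find_path`, and `MAX_STATES` are illustrative names, and the adjacency-matrix layout is an assumption chosen for brevity rather than the module's actual representation.

```c
#include <stdbool.h>

#define MAX_STATES 8  /* assumption: comfortably above the 3 to 5 states expected */

/* edge[from][to] is true when a direct transition exists. */
typedef struct {
    int  state_count;
    bool edge[MAX_STATES][MAX_STATES];
} StateGraph;

/*
 * Breadth-first search from `from` to `to`.
 * Fills `path` with the states to enter after `from` (the last entry is `to`)
 * and returns the number of transitions: 0 if already there, -1 if unreachable.
 */
int find_path(const StateGraph *g, int from, int to, int path[MAX_STATES])
{
    int prev[MAX_STATES];   /* predecessor of each discovered state, -1 = unseen */
    int queue[MAX_STATES];
    int head = 0, tail = 0;

    for (int i = 0; i < MAX_STATES; i++) prev[i] = -1;
    prev[from] = from;
    queue[tail++] = from;

    while (head < tail) {
        int s = queue[head++];
        if (s == to) break;
        for (int n = 0; n < g->state_count; n++) {
            if (g->edge[s][n] && prev[n] == -1) {
                prev[n] = s;
                queue[tail++] = n;
            }
        }
    }
    if (prev[to] == -1) return -1;

    /* Count the hops by walking back from `to`, then fill `path` front to back. */
    int steps = 0;
    for (int s = to; s != from; s = prev[s]) steps++;
    int i = steps;
    for (int s = to; s != from; s = prev[s]) path[--i] = s;
    return steps;
}
```

Because each state is enqueued at most once, the queue never exceeds `MAX_STATES` entries and no dynamic allocation is needed.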
|
||||
|
||||
**Example resource state graphs:**
|
||||
|
||||
V4L2 capture device:
```
CLOSED → OPEN → STREAMING
```
Transitions: `open_fd` (CLOSED→OPEN), `start_capture` (OPEN→STREAMING), `stop_capture` (STREAMING→OPEN), `close_fd` (OPEN→CLOSED).
|
||||
|
||||
Outbound transport connection:
```
DISCONNECTED → CONNECTING → CONNECTED
```
Transitions: `connect` (DISCONNECTED→CONNECTING), `connected_cb` (CONNECTING→CONNECTED), `disconnect` (CONNECTED→DISCONNECTED).
|
||||
|
||||
External codec process:
```
STOPPED → STARTING → RUNNING
```
Transitions: `spawn` (STOPPED→STARTING), `ready_cb` (STARTING→RUNNING), `kill` (RUNNING→STOPPED).
|
||||
|
||||
Dependency example: "outbound transport connection" requires "V4L2 device open". The reconciler will not attempt to connect the transport until the device is in state `OPEN` or `STREAMING`.
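One way to express "requires the device in `OPEN` or `STREAMING`" is a bitmask of acceptable states. The `Dependency` struct and `dependency_met` below are hypothetical, as is the numeric encoding of the device states.

```c
#include <stdbool.h>

/* Assumed numbering of the V4L2 device states, for illustration only. */
enum { DEV_CLOSED, DEV_OPEN, DEV_STREAMING };

/* The dependency is met when the other resource is in any allowed state. */
typedef struct {
    const int *dep_state;     /* points at the dependency's current state */
    unsigned   allowed_mask;  /* bit n set: state n satisfies the dependency */
} Dependency;

bool dependency_met(const Dependency *d)
{
    return ((d->allowed_mask >> *d->dep_state) & 1u) != 0;
}
```

With this shape, a transition is attempted only when every entry in its dependency list reports `true`, and the check itself stays independent of the resource type.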
|
||||
|
||||
### Generic Implementation
|
||||
|
||||
The reconciler is implemented as a standalone module (`reconciler`) that is not specific to video. It operates on:
|
||||
|
||||
```c
typedef struct {
    int state_count;
    int current_state;
    int wanted_state;
    /* transition table: [from][to] → action fn + dependency list */
} Resource;
```
|
||||
|
||||
This makes it reusable across any node component in the project — not just video ingest. The video node registers its resources (device, transport connection, display sink) and their dependencies, then calls `reconciler_tick()` on events and periodically.
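To make the tick concrete, here is a hedged sketch of one resource advancing a single transition per call. `ResourceSketch`, `ActionFn`, `resource_tick`, and `count_action` are invented for this example; the real `reconciler_tick()` signature is not given in this document, and dependency checks are omitted.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_STATES 4  /* assumption: small, fixed-size graphs */

typedef bool (*ActionFn)(void *ctx);  /* returns true on success */

typedef struct {
    int      state_count;
    int      current_state;
    int      wanted_state;
    ActionFn action[MAX_STATES][MAX_STATES];  /* NULL: no direct transition */
    void    *ctx;                             /* passed to every action */
} ResourceSketch;

/* BFS from current_state; returns the first hop toward wanted_state, or -1. */
static int next_hop(const ResourceSketch *r)
{
    int first[MAX_STATES];  /* first hop used to reach each state, -1 = unseen */
    int queue[MAX_STATES], head = 0, tail = 0;

    for (int i = 0; i < MAX_STATES; i++) first[i] = -1;
    for (int n = 0; n < r->state_count; n++) {
        if (n != r->current_state && r->action[r->current_state][n] != NULL) {
            first[n] = n;
            queue[tail++] = n;
        }
    }
    while (head < tail) {
        int s = queue[head++];
        if (s == r->wanted_state) return first[s];
        for (int n = 0; n < r->state_count; n++) {
            if (n != r->current_state && first[n] == -1 && r->action[s][n] != NULL) {
                first[n] = first[s];
                queue[tail++] = n;
            }
        }
    }
    return -1;
}

/* Takes at most one transition per call; returns true if the state changed. */
bool resource_tick(ResourceSketch *r)
{
    if (r->current_state == r->wanted_state) return false;
    int hop = next_hop(r);
    if (hop < 0) return false;                        /* wanted state unreachable */
    if (!r->action[r->current_state][hop](r->ctx)) return false;  /* retry later */
    r->current_state = hop;
    return true;
}

/* Fake action for experiments: counts invocations and always succeeds. */
bool count_action(void *ctx)
{
    ++*(int *)ctx;
    return true;
}
```

Taking one step per call keeps a failed action from blocking the caller; the next event or periodic tick simply retries from the state actually reached.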
|
||||
|
||||
### Node State Queries
|
||||
|
||||
Two protocol commands allow the controller to query a node's state:
|
||||
|
||||
**`GET_CONFIG_STATE`** — returns the wanted state: which streams the node is configured to produce or consume, their destinations/sources, format, stream ID. This is the topology view — what is configured regardless of whether it is currently active.
|
||||
|
||||
**`GET_RUNTIME_STATE`** — returns the current state: which resources are in which state, live fps/mbps per stream (from `stream_stats`), error codes for any failed resources.
|
||||
|
||||
The controller queries all discovered nodes, correlates streams by ID and peer address, and reconstructs the full graph from the union of responses. No central copy of the topology is needed; the graph emerges from the node state reports.
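The correlation step might be sketched as follows. `StreamReport` and `same_edge` are hypothetical, and the fixed-size address strings are an assumption; the actual response layout of `GET_CONFIG_STATE` is not defined here.

```c
#include <stdbool.h>
#include <string.h>

/* One stream entry as a node might report it (illustrative fields only). */
typedef struct {
    char node[32];   /* address of the reporting node */
    char peer[32];   /* address of the other end of the stream */
    int  stream_id;
    bool sending;    /* true: producer side, false: consumer side */
} StreamReport;

/* Two reports describe the same edge when the stream IDs match, the
 * directions are opposite, and each names the other as its peer. */
bool same_edge(const StreamReport *a, const StreamReport *b)
{
    return a->stream_id == b->stream_id
        && a->sending != b->sending
        && strcmp(a->node, b->peer) == 0
        && strcmp(a->peer, b->node) == 0;
}
```

A controller could run this over every pair of reports and emit one graph edge per match; a report with no match would then show up as a half-configured edge.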
|
||||
|
||||
### Stream ID Assignment
|
||||
|
||||
Stream IDs are assigned by the controller, not by individual nodes. This ensures that when node A reports "I am sending stream 3 to B" and node B reports "I am receiving stream 3 from A", the IDs match and the edge can be reconstructed. Each `START_INGEST` or `START_SINK` command from the controller includes the stream ID to use.
|
||||
|
||||
### Connection Direction
|
||||
|
||||
The source node connects outbound to the sink's transport server port. A single TCP port per node is the default: all traffic (video frames, control messages, state queries) flows through it in both directions. Dedicated per-stream ports on separate listening sockets are a future option for high-bandwidth links. If they are added, state reporting must include the port so that the graph reconstructs correctly regardless of which port a connection uses.
|
||||
|
||||
---
|
||||
|
||||
## Open Questions
|
||||
|
||||
- What is the graph's representation format — in-memory object graph, serialized config, or both?
|
||||
|
||||
14
planning.md
@@ -58,12 +58,13 @@ Modules are listed in intended build order. Each depends only on modules above i
|
||||
 | — | `node` | done | Video node binary — config, discovery, transport server, V4L2/media control request handlers |
 | 8 | `test_image` | done | Test pattern generator — colour bars, luminance ramp, grid crosshatch; YUV420/BGRA output |
 | 9 | `xorg` | done | GLFW+OpenGL viewer sink — YUV420/BGRA/MJPEG display, all scale/anchor modes, bitmap font atlas text overlays; XRandR queries and screen grab not yet implemented |
-| 10 | `frame_alloc` | not started | Per-frame allocation with bookkeeping (byte budget, ref counting) |
-| 11 | `relay` | not started | Input dispatch to output queues (low-latency and completeness modes) |
-| 12 | `ingest` | not started | V4L2 capture loop — dequeue buffers, emit one encapsulated frame per buffer |
-| 13 | `archive` | not started | Write frames to disk, control messages to binary log |
-| 14 | `codec` | not started | Per-frame encode/decode — MJPEG (libjpeg-turbo), QOI, ZSTD-raw, VA-API H.264 intra; used by screen grab source and archive |
-| 15 | `web node` | not started | Node.js/Express peer — speaks binary protocol on socket side, HTTP/WebSocket to browser; `protocol.mjs` mirrors C protocol module |
+| 10 | `reconciler` | not started | Generic wanted/current state machine reconciler — resource state graphs, BFS pathfinding, event + periodic tick; used by node to manage V4L2 devices, transport connections, and future resources (codec processes etc.) |
+| 11 | `frame_alloc` | not started | Per-frame allocation with bookkeeping (byte budget, ref counting) |
+| 12 | `relay` | not started | Input dispatch to output queues (low-latency and completeness modes) |
+| 13 | `ingest` | not started | V4L2 capture loop — dequeue buffers, emit one encapsulated frame per buffer |
+| 14 | `archive` | not started | Write frames to disk, control messages to binary log |
+| 15 | `codec` | not started | Per-frame encode/decode — MJPEG (libjpeg-turbo), QOI, ZSTD-raw, VA-API H.264 intra; used by screen grab source and archive |
+| 16 | `web node` | not started | Node.js/Express peer — speaks binary protocol on socket side, HTTP/WebSocket to browser; `protocol.mjs` mirrors C protocol module |
 | — | `mjpeg_scan` | future | EOI marker scanner for misbehaving hardware that does not deliver clean per-buffer frames; not part of the primary pipeline |
|
||||
|
||||
---
|
||||
@@ -84,6 +85,7 @@ Each module gets a corresponding CLI driver that exercises its API and serves as
|
||||
 | `v4l2_view_cli` | V4L2 + `xorg` | Live camera viewer — auto-selects highest-FPS format, FPS/format overlay; bypasses node system |
 | `stream_send_cli` | V4L2 + `transport` + `protocol` | Capture MJPEG from V4L2, connect to receiver, send VIDEO_FRAME messages; prints fps/Mbps stats |
 | `stream_recv_cli` | `transport` + `protocol` + `xorg` | Listen for incoming VIDEO_FRAME stream, display in viewer; fps/Mbps overlay; threaded transport→GL handoff |
+| `reconciler_cli` | `reconciler` | Simulated state machine experiment — define resources with fake transitions, drive reconciler via CLI commands; validates the generic reconciler before wiring into the node |
|
||||
|
||||
### Web UI (`dev/web/`)
|
||||
|
||||
|
||||