diff --git a/README.md b/README.md
index 6733603..7307837 100644
--- a/README.md
+++ b/README.md
@@ -7,6 +7,7 @@ Designed to run on resource-constrained hardware (Raspberry Pi capturing raw MJP
 ## Documentation
 
 - [architecture.md](architecture.md) — system design: graph model, transport protocol, relay design, codec layer, discovery, multi-site plan, device resilience, X11 integration
+- [docs/protocol.md](docs/protocol.md) — wire protocol reference: frame format, all message types, payload schemas, stream lifecycle, discovery
 - [planning.md](planning.md) — module build order and current status
 - [conventions.md](conventions.md) — C code and project conventions
 
diff --git a/architecture.md b/architecture.md
index e635e2a..1bd4c57 100644
--- a/architecture.md
+++ b/architecture.md
@@ -386,11 +386,11 @@ Scale and crop are applied at render time — the incoming frame is stretched or
 
 Audio streams are not in scope for the initial implementation but the transport is designed to accommodate them without structural changes.
 
-The `channel_id` field already provides stream multiplexing on a single connection. A future audio channel is just another channel on an existing transport connection — no new connection type is needed. The message type table has room for an `audio_frame` type alongside `video_frame`.
+A future audio stream is just another message type on an existing transport connection — no new connection type or header field is needed. `stream_id` in the payload already handles multiplexing. The message type table has room for an `audio_frame` type alongside `video_frame`.
 
 The main open question is codec and container: raw PCM is trivial to handle but large; compressed formats (Opus, AAC) need framing conventions. This is deferred until video is solid.
 
-The frame allocator, relay, and archive modules should not make assumptions that `channel_id` implies video — they operate on opaque byte payloads with a message type and length, so audio frames will pass through the same infrastructure unchanged.
+The frame allocator, relay, and archive modules should not assume that a frame implies video — they operate on opaque byte payloads with a message type and length, so audio frames will pass through the same infrastructure unchanged.
 
 ---
 
diff --git a/docs/protocol.md b/docs/protocol.md
new file mode 100644
index 0000000..a9348b7
--- /dev/null
+++ b/docs/protocol.md
@@ -0,0 +1,262 @@
+# Protocol Reference
+
+This document describes the full wire protocol used between nodes. All multi-byte integers are **little-endian**. Serialization primitives (`put_u16`, `get_u32`, etc.) are defined in [`serial.h`](../../include/serial.h).
+
+---
+
+## Layers
+
+```mermaid
+flowchart TD
+    A[TCP connection] --> B[Transport layer<br/>frame header + length-prefixed payload]
+    B --> C[Message type dispatcher<br/>routes payload to handler]
+    C --> D1[Video frame handler<br/>stream_id + compressed data]
+    C --> D2[Control request/response handler<br/>request_id + command fields]
+    C --> D3[Stream event handler<br/>stream_id + event_code]
+```
+
+- **TCP** — reliable byte stream; provides ordering and delivery but no message boundaries
+- **Transport layer** — adds a 6-byte header to every message, giving a length so any node can skip or forward unknown messages
+- **Message type dispatcher** — reads `message_type` and routes the payload to the appropriate handler
+- **Handlers** — interpret the payload according to their own schema; the transport layer has no knowledge of handler internals
+
+---
+
+## Transport Layer
+
+### Frame Format
+
+Every message on the wire is a frame:
+
+```
++-------------------+---------------------+--------------------+
+| message_type: u16 | payload_length: u32 | payload: bytes ... |
++-------------------+---------------------+--------------------+
+  2 bytes             4 bytes               payload_length bytes
+```
+
+Total header size: **6 bytes**.
+
+`payload_length` is the byte count of the payload only — it does not include the 6-byte header itself.
+
+A node that does not recognise `message_type` can skip the frame by consuming exactly `payload_length` bytes and discarding them. This allows relays and future nodes to forward or ignore unknown message types without understanding their structure.
+
+### Byte Order
+
+All fields are little-endian. This applies to the header and to all fields within payloads.
+
+### Connection Model
+
+Each TCP connection carries a single logical channel between two peers. Multiple streams (video, control, events) are multiplexed on the same connection by `message_type` and by stream/request identifiers inside payloads. There is no connection-level stream identifier in the header — that information belongs to the payload.
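
The framing rules above (6-byte little-endian header, length covering the payload only, unknown types skipped by length) can be sketched in C as follows. This is a minimal illustration, not the project's implementation: `struct frame_header`, `frame_header_encode` and `frame_next` are hypothetical names, and the byte packing is written out by hand rather than using the `serial.h` helpers, whose signatures are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_HEADER_SIZE 6u

/* Hypothetical in-memory form of the 6-byte wire header. */
struct frame_header {
    uint16_t message_type;
    uint32_t payload_length; /* payload bytes only, header excluded */
};

/* Serialise the header into out[0..5] in little-endian byte order. */
void frame_header_encode(const struct frame_header *h, uint8_t *out)
{
    assert(h != NULL && out != NULL);
    out[0] = (uint8_t)(h->message_type & 0xffu);
    out[1] = (uint8_t)(h->message_type >> 8);
    out[2] = (uint8_t)(h->payload_length & 0xffu);
    out[3] = (uint8_t)((h->payload_length >> 8) & 0xffu);
    out[4] = (uint8_t)((h->payload_length >> 16) & 0xffu);
    out[5] = (uint8_t)((h->payload_length >> 24) & 0xffu);
}

/*
 * Try to take one complete frame from the front of buf.
 * Returns the number of bytes the frame occupies (header + payload),
 * or 0 if buf does not yet hold a complete frame.
 * On success *payload points into buf; nothing is copied.
 */
size_t frame_next(const uint8_t *buf, size_t len,
                  struct frame_header *h, const uint8_t **payload)
{
    if (len < FRAME_HEADER_SIZE)
        return 0;
    h->message_type = (uint16_t)(buf[0] | (buf[1] << 8));
    h->payload_length = (uint32_t)buf[2] | ((uint32_t)buf[3] << 8) |
                        ((uint32_t)buf[4] << 16) | ((uint32_t)buf[5] << 24);
    if (h->payload_length > len - FRAME_HEADER_SIZE)
        return 0;
    *payload = buf + FRAME_HEADER_SIZE;
    return FRAME_HEADER_SIZE + (size_t)h->payload_length;
}
```

A reader that does not recognise `h.message_type` simply advances by the returned byte count, which is the skip behaviour the transport layer is designed to allow.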
+
+---
+
+## Message Types
+
+| Value | Name | Description |
+|---|---|---|
+| `0x0001` | `VIDEO_FRAME` | One compressed video frame for a stream |
+| `0x0002` | `CONTROL_REQUEST` | Request from one node to another |
+| `0x0003` | `CONTROL_RESPONSE` | Response to a prior control request |
+| `0x0004` | `STREAM_EVENT` | Lifecycle signal for a stream (interrupted, resumed) |
+| `0x0010` | `DISCOVERY_ANNOUNCE` | UDP multicast node announcement (see [Discovery](#discovery)) |
+
+Values not listed are reserved. A node receiving an unknown type must skip the payload (`payload_length` bytes) and continue reading.
+
+---
+
+## Payload Schemas
+
+### `VIDEO_FRAME` (0x0001)
+
+```
++----------------+------------------------------------------------+
+| stream_id: u16 | frame data (compressed, codec per stream_open) |
++----------------+------------------------------------------------+
+  2 bytes          payload_length - 2 bytes
+```
+
+`stream_id` identifies which video stream this frame belongs to. The codec is established at stream open time (see [Stream Lifecycle](#stream-lifecycle)) and does not appear in every frame.
+
+### `CONTROL_REQUEST` (0x0002)
+
+```
++-----------------+--------------+-------------------------+
+| request_id: u16 | command: u16 | command-specific fields |
++-----------------+--------------+-------------------------+
+  2 bytes           2 bytes        remaining bytes
+```
+
+`request_id` is chosen by the sender and echoed in the matching `CONTROL_RESPONSE`. It is used to correlate responses to requests when multiple requests are in flight simultaneously.
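
One way a sender might track several in-flight requests and match each echoed `request_id` is a small fixed-size table. The sketch below is only an illustration under assumptions: `struct pending_table`, `pending_add`, `pending_complete` and the limit of 16 slots are hypothetical and not part of the protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 16 /* assumed limit on simultaneous requests */

/* One in-flight CONTROL_REQUEST awaiting its CONTROL_RESPONSE. */
struct pending_request {
    int      in_use;
    uint16_t request_id;
    uint16_t command;
};

struct pending_table {
    struct pending_request slots[MAX_PENDING];
    uint16_t next_id; /* wraps at 65535; this sketch does not guard against
                         colliding with an id that is still pending */
};

/*
 * Record a new request. Stores the chosen id in *request_id and returns 0,
 * or returns -1 if every slot is occupied.
 */
int pending_add(struct pending_table *t, uint16_t command, uint16_t *request_id)
{
    assert(t != NULL && request_id != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!t->slots[i].in_use) {
            t->slots[i].in_use = 1;
            t->slots[i].request_id = t->next_id;
            t->slots[i].command = command;
            *request_id = t->next_id++;
            return 0;
        }
    }
    return -1;
}

/*
 * Match a response by its echoed request_id. On a hit, stores the original
 * command in *command, frees the slot and returns 0. Returns -1 for an id
 * that is not pending (late, duplicate or unexpected response).
 */
int pending_complete(struct pending_table *t, uint16_t request_id,
                     uint16_t *command)
{
    assert(t != NULL && command != NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (t->slots[i].in_use && t->slots[i].request_id == request_id) {
            *command = t->slots[i].command;
            t->slots[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}
```

Remembering the command alongside the id lets the response handler know which response-specific fields to expect, since the response payload itself carries only `request_id` and `status`.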
+
+`command` values:
+
+| Value | Command | Description |
+|---|---|---|
+| `0x0001` | `STREAM_OPEN` | Open a new video stream on this connection |
+| `0x0002` | `STREAM_CLOSE` | Close a video stream |
+| `0x0003` | `ENUM_DEVICES` | List V4L2 devices on the remote node |
+| `0x0004` | `ENUM_CONTROLS` | List V4L2 controls for a device |
+| `0x0005` | `GET_CONTROL` | Get a V4L2 control value |
+| `0x0006` | `SET_CONTROL` | Set a V4L2 control value |
+| `0x0007` | `ENUM_MONITORS` | List X11 monitors (XRandR) on the remote node |
+
+### `CONTROL_RESPONSE` (0x0003)
+
+```
++-----------------+-------------+--------------------------+
+| request_id: u16 | status: u16 | response-specific fields |
++-----------------+-------------+--------------------------+
+  2 bytes           2 bytes       remaining bytes
+```
+
+`request_id` matches the originating request. `status` values:
+
+| Value | Meaning |
+|---|---|
+| `0x0000` | OK |
+| `0x0001` | Error — generic failure |
+| `0x0002` | Error — unknown command |
+| `0x0003` | Error — invalid parameters |
+| `0x0004` | Error — resource not found |
+
+### `STREAM_EVENT` (0x0004)
+
+```
++----------------+----------------+-----------------------+
+| stream_id: u16 | event_code: u8 | event-specific fields |
++----------------+----------------+-----------------------+
+  2 bytes          1 byte           remaining bytes
+```
+
+`event_code` values:
+
+| Value | Name | Meaning |
+|---|---|---|
+| `0x01` | `STREAM_INTERRUPTED` | Device lost; frames will stop. Receiver should reset parser state and discard any partial frame. |
+| `0x02` | `STREAM_RESUMED` | Device recovered; a clean frame follows. |
+
+---
+
+## Stream Lifecycle
+
+Before video frames can flow, a stream must be opened. This establishes the codec and pixel format so receivers can decode frames without per-frame metadata.
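
Because the codec travels once at open time rather than in every frame, a receiver needs some per-stream state keyed by `stream_id`. The following is a rough sketch of such a table, assuming a small fixed number of streams; `struct stream_table`, `stream_table_open`, `stream_table_close` and `video_frame_route` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_STREAMS 8 /* assumed per-connection limit */

/* What the receiver remembers from a successful STREAM_OPEN. */
struct stream_info {
    int      open;
    uint16_t stream_id;
    uint16_t format;
    uint16_t pixel_format;
};

struct stream_table {
    struct stream_info slots[MAX_STREAMS];
};

/* Find an open stream by id, or return NULL. */
const struct stream_info *stream_lookup(const struct stream_table *t,
                                        uint16_t stream_id)
{
    assert(t != NULL);
    for (int i = 0; i < MAX_STREAMS; i++)
        if (t->slots[i].open && t->slots[i].stream_id == stream_id)
            return &t->slots[i];
    return NULL;
}

/* Register a stream. Returns -1 if the id is already open or the table is full. */
int stream_table_open(struct stream_table *t, uint16_t stream_id,
                      uint16_t format, uint16_t pixel_format)
{
    if (stream_lookup(t, stream_id) != NULL)
        return -1;
    for (int i = 0; i < MAX_STREAMS; i++) {
        if (!t->slots[i].open) {
            t->slots[i].open = 1;
            t->slots[i].stream_id = stream_id;
            t->slots[i].format = format;
            t->slots[i].pixel_format = pixel_format;
            return 0;
        }
    }
    return -1;
}

/* Forget a stream. Returns -1 if it was not open. */
int stream_table_close(struct stream_table *t, uint16_t stream_id)
{
    for (int i = 0; i < MAX_STREAMS; i++) {
        if (t->slots[i].open && t->slots[i].stream_id == stream_id) {
            t->slots[i].open = 0;
            return 0;
        }
    }
    return -1;
}

/*
 * Resolve a VIDEO_FRAME payload: read the little-endian stream_id prefix,
 * look the stream up, and expose the frame data that follows it.
 * Returns NULL for a short payload or a stream that is not open.
 */
const struct stream_info *video_frame_route(const struct stream_table *t,
                                            const uint8_t *payload, size_t len,
                                            const uint8_t **data, size_t *data_len)
{
    if (len < 2)
        return NULL;
    uint16_t stream_id = (uint16_t)(payload[0] | (payload[1] << 8));
    const struct stream_info *s = stream_lookup(t, stream_id);
    if (s == NULL)
        return NULL;
    *data = payload + 2;
    *data_len = len - 2;
    return s;
}
```

The returned `format` and `pixel_format` are the values recorded at open time, which is what lets the frame payload stay free of per-frame codec metadata.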
+
+### Opening a Stream (`STREAM_OPEN` request)
+
+The sender issues a `CONTROL_REQUEST` with command `STREAM_OPEN`:
+
+```
++-----------------+--------------+----------------+-------------+-------------------+-------------+
+| request_id: u16 | command: u16 | stream_id: u16 | format: u16 | pixel_format: u16 | origin: u16 |
++-----------------+--------------+----------------+-------------+-------------------+-------------+
+```
+
+| Field | Description |
+|---|---|
+| `stream_id` | Chosen by the sender; identifies this stream on subsequent `VIDEO_FRAME` and `STREAM_EVENT` messages |
+| `format` | Wire format of the compressed frame data (see [Codec Formats](#codec-formats)) |
+| `pixel_format` | Pixel layout for raw formats; zero for compressed formats (see [Pixel Formats](#pixel-formats)) |
+| `origin` | How the frames were produced; informational only, does not affect decoding (see [Origins](#origins)) |
+
+The receiver responds with `CONTROL_RESPONSE`. On `status = OK` the stream is open and `VIDEO_FRAME` messages with that `stream_id` may follow immediately.
+
+### Closing a Stream (`STREAM_CLOSE` request)
+
+```
++-----------------+--------------+----------------+
+| request_id: u16 | command: u16 | stream_id: u16 |
++-----------------+--------------+----------------+
+```
+
+After a close response, no further `VIDEO_FRAME` messages should be sent for that `stream_id`.
+
+---
+
+## Codec Formats
+
+Carried in the `format` field of `STREAM_OPEN`.
+
+| Value | Format |
+|---|---|
+| `0x0001` | MJPEG |
+| `0x0002` | H.264 |
+| `0x0003` | H.265 / HEVC |
+| `0x0004` | AV1 |
+| `0x0005` | FFV1 |
+| `0x0006` | ProRes |
+| `0x0007` | QOI |
+| `0x0008` | Raw pixels (requires `pixel_format`) |
+| `0x0009` | Raw pixels + ZSTD (requires `pixel_format`) |
+
+## Pixel Formats
+
+Carried in the `pixel_format` field of `STREAM_OPEN`. Zero and ignored for compressed formats.
+
+| Value | Layout |
+|---|---|
+| `0x0001` | BGRA 8:8:8:8 |
+| `0x0002` | RGBA 8:8:8:8 |
+| `0x0003` | BGR 8:8:8 |
+| `0x0004` | YUV 4:2:0 planar |
+| `0x0005` | YUV 4:2:2 packed |
+
+## Origins
+
+Carried in the `origin` field of `STREAM_OPEN`. Informational — does not affect decoding.
+
+| Value | Origin |
+|---|---|
+| `0x0001` | Device native (camera or capture card encoded it directly) |
+| `0x0002` | libjpeg-turbo |
+| `0x0003` | ffmpeg (libavcodec) |
+| `0x0004` | ffmpeg (subprocess) |
+| `0x0005` | VA-API direct |
+| `0x0006` | NVENC direct |
+| `0x0007` | Software (other) |
+
+---
+
+## Discovery
+
+Node discovery uses UDP multicast. The wire format is a standard transport frame (same 6-byte header) sent to the multicast group rather than a TCP peer.
+
+### Announcement Frame (`DISCOVERY_ANNOUNCE`, 0x0010)
+
+Sent periodically by every node and immediately on startup.
+
+```
++----------------------+--------------+---------------+---------------------+--------------+-------------+
+| protocol_version: u8 | site_id: u16 | tcp_port: u16 | function_flags: u16 | name_len: u8 | name: bytes |
++----------------------+--------------+---------------+---------------------+--------------+-------------+
+```
+
+| Field | Description |
+|---|---|
+| `protocol_version` | Wire format version; currently `1` |
+| `site_id` | Site this node belongs to; `0` = local / unassigned |
+| `tcp_port` | Port where this node accepts transport connections |
+| `function_flags` | Bitfield of node capabilities (see below) |
+| `name_len` | Byte length of the following name string |
+| `name` | Node name in `namespace:instance` form, e.g. `v4l2:microscope` |
+
+`function_flags` bits:
+
+| Bit | Mask | Role |
+|---|---|---|
+| 0 | `0x0001` | Source — produces video |
+| 1 | `0x0002` | Relay — receives and distributes streams |
+| 2 | `0x0004` | Sink — consumes video |
+| 3 | `0x0008` | Controller — has a user-facing control interface |
+
+A node may set multiple bits.
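
The announcement payload described above has an 8-byte fixed part followed by the name bytes. A sketch of encoding and decoding it in C is shown below. The names `struct announce`, `announce_encode`, `announce_decode` and the `FN_*` constants are hypothetical, and the extra NUL terminator kept in memory is a convenience of this sketch, not something sent on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ANNOUNCE_FIXED_SIZE 8u /* 1 + 2 + 2 + 2 + 1 bytes before the name */

/* function_flags masks from the table above (constant names are assumed). */
enum {
    FN_SOURCE     = 0x0001,
    FN_RELAY      = 0x0002,
    FN_SINK       = 0x0004,
    FN_CONTROLLER = 0x0008
};

struct announce {
    uint8_t  protocol_version;
    uint16_t site_id;
    uint16_t tcp_port;
    uint16_t function_flags;
    uint8_t  name_len;
    char     name[256]; /* name_len bytes plus a local NUL terminator */
};

/* Encode into out. Returns bytes written, or 0 if cap is too small. */
size_t announce_encode(const struct announce *a, uint8_t *out, size_t cap)
{
    size_t total = ANNOUNCE_FIXED_SIZE + a->name_len;
    assert(out != NULL);
    if (cap < total)
        return 0;
    out[0] = a->protocol_version;
    out[1] = (uint8_t)(a->site_id & 0xffu);
    out[2] = (uint8_t)(a->site_id >> 8);
    out[3] = (uint8_t)(a->tcp_port & 0xffu);
    out[4] = (uint8_t)(a->tcp_port >> 8);
    out[5] = (uint8_t)(a->function_flags & 0xffu);
    out[6] = (uint8_t)(a->function_flags >> 8);
    out[7] = a->name_len;
    memcpy(out + ANNOUNCE_FIXED_SIZE, a->name, a->name_len);
    return total;
}

/* Decode from buf. Returns 0 on success, -1 if the payload is truncated. */
int announce_decode(const uint8_t *buf, size_t len, struct announce *a)
{
    if (len < ANNOUNCE_FIXED_SIZE)
        return -1;
    a->protocol_version = buf[0];
    a->site_id = (uint16_t)(buf[1] | (buf[2] << 8));
    a->tcp_port = (uint16_t)(buf[3] | (buf[4] << 8));
    a->function_flags = (uint16_t)(buf[5] | (buf[6] << 8));
    a->name_len = buf[7];
    if (len - ANNOUNCE_FIXED_SIZE < a->name_len)
        return -1;
    memcpy(a->name, buf + ANNOUNCE_FIXED_SIZE, a->name_len);
    a->name[a->name_len] = '\0';
    return 0;
}
```

A node acting as both source and relay would set `function_flags` to `FN_SOURCE | FN_RELAY`, and a listener tests roles with a bitwise AND against the relevant mask.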
+
+### Multicast Parameters
+
+| Parameter | Value |
+|---|---|
+| Group | `224.0.0.251` |
+| Port | `5353` |
+| TTL | `1` (LAN only) |
+
+No Avahi or Bonjour dependency: nodes open a plain UDP socket and join the multicast group directly using standard POSIX socket APIs.
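
Joining the group with nothing but socket calls might look like the sketch below. It is an illustration under assumptions rather than the project's code: `discovery_group_addr` and `discovery_open_socket` are hypothetical names, the IPv4 membership option (`IP_ADD_MEMBERSHIP` with `struct ip_mreq`) is the widely available BSD-sockets form, and `SO_REUSEADDR` is set because this group and port are also the ones mDNS responders use, so another process on the host may already be bound to them.

```c
#define _DEFAULT_SOURCE /* expose struct ip_mreq under strict -std= modes (glibc) */

#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define DISCOVERY_GROUP "224.0.0.251"
#define DISCOVERY_PORT  5353

/* Fill addr with the discovery group and port. Returns 0 on success. */
int discovery_group_addr(struct sockaddr_in *addr)
{
    assert(addr != NULL);
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_port = htons(DISCOVERY_PORT);
    return inet_pton(AF_INET, DISCOVERY_GROUP, &addr->sin_addr) == 1 ? 0 : -1;
}

/*
 * Open a UDP socket bound to the discovery port and joined to the group
 * on the default interface. Returns the descriptor, or -1 on any failure.
 */
int discovery_open_socket(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    int one = 1;
    unsigned char ttl = 1; /* TTL 1 keeps announcements on the local network */
    struct sockaddr_in local;
    struct ip_mreq mreq;

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_port = htons(DISCOVERY_PORT);
    local.sin_addr.s_addr = htonl(INADDR_ANY);

    memset(&mreq, 0, sizeof mreq);
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);

    if (inet_pton(AF_INET, DISCOVERY_GROUP, &mreq.imr_multiaddr) != 1 ||
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one) < 0 ||
        bind(fd, (struct sockaddr *)&local, sizeof local) < 0 ||
        setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof ttl) < 0 ||
        setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof mreq) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Announcements would then be sent with `sendto` to the address from `discovery_group_addr` and received with `recvfrom` on the same descriptor; a receiver sharing the port with an mDNS responder should expect to see and skip datagrams that are not valid transport frames.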