# Node Discovery and Multi-Site

See [Architecture Overview](../architecture.md).

## Node Discovery

Standard mDNS (RFC 6762) uses UDP multicast over `224.0.0.251:5353` with DNS-SD service records. The wire protocol is well-defined and the multicast group is already in active use on most LANs.

The standard service discovery stack (Avahi, Bonjour, `nss-mdns`) provides that transport but brings significant overhead: persistent daemons, D-Bus dependencies, a complex configuration surface, and substantial resident memory. None of that is needed here.

The approach: **reuse the multicast transport, define our own wire format**.

Rather than DNS wire format, node announcements are encoded as binary frames using the same serialization layer (`serial`) and frame header used for video transport. A node joins the multicast group, sends periodic announcements, and listens for announcements from peers.

### Announcement Frame

| Field | Size | Purpose |
|---|---|---|
| `message_type` | 2 bytes | Discovery message type (e.g. `0x0010` for node announcement) |
| `channel_id` | 2 bytes | Reserved / zero |
| `payload_length` | 4 bytes | Byte length of payload |
| Payload | variable | Encoded node identity and capabilities |

Payload fields:

| Field | Type | Purpose |
|---|---|---|
| `protocol_version` | u8 | Wire format version |
| `site_id` | u16 | Site this node belongs to (`0` = local / unassigned) |
| `tcp_port` | u16 | Port where this node accepts transport connections |
| `function_flags` | u16 | Bitfield declaring node capabilities (see below) |
| `name_len` | u8 | Length of name string |
| `name` | bytes | Node name (`namespace:instance`, e.g. `v4l2:microscope`) |

`function_flags` bits:

| Bit | Mask | Meaning |
|---|---|---|
| 0 | `0x0001` | Source: produces video |
| 1 | `0x0002` | Relay: receives and distributes streams |
| 2 | `0x0004` | Sink: consumes video (display, archiver, etc.) |
| 3 | `0x0008` | Controller: participates in control plane coordination |

A node may set multiple bits: a relay that also archives sets both `RELAY` and `SINK`.

### Behaviour

- Nodes send announcements via multicast periodically (e.g. every 5 s) and immediately on startup
- No daemon: the node process itself sends and listens; no background service is required
- On receiving an announcement, the control plane records the peer (address, port, name, function) and can initiate a transport connection if needed
- A node that stays silent for a configured number of announcement intervals is considered offline
- Announcements are informational only; the hub validates identity at connection time

#### Targeted replies

Multicast is used only for the startup and periodic keepalive announcements. When a node receives an announcement from a peer it does not yet know, or detects that a known peer has restarted (its `site_id` changed for the same address and port), it sends an **immediate unicast reply** directly back to that peer's IP address.

This ensures the new or restarted peer learns about this node quickly, without waiting up to one announcement interval (`interval_ms`), while avoiding a multicast blast that would unnecessarily wake every other node on the subnet. Steady-state keepalive packets from already-known peers do not trigger any reply.

### No Avahi/Bonjour Dependency

The system does not link against, depend on, or interact with Avahi or Bonjour. It opens a raw UDP multicast socket directly, which requires only standard POSIX socket APIs. This keeps the runtime dependency footprint minimal and the behaviour predictable.

---

## Multi-Site (Forward Compatibility)

The immediate use case is a single LAN. A planned future use case is **site-to-site linking**: two independent networks (e.g. a lab and a remote location) connected by a tunnel (SSH port-forward, WireGuard, etc.), where nodes on both sites are reachable from either side.

### Site Identity

Every node carries a `site_id` (`u16`) in its announcement.
In a single-site deployment this is always `0`. When sites are joined, each site is assigned a distinct non-zero ID; nodes retain their IDs across the join and are fully addressable by `(site_id, name)` from anywhere in the combined network.

This field is reserved from day one so that multi-site never requires a wire format change or a rename of existing identifiers.

### Site Gateway Node

A site gateway is a node that participates in both networks simultaneously: it has a connection on the local transport and a connection over the inter-site tunnel. It:

- Bridges discovery announcements between sites (rewriting `site_id` appropriately)
- Forwards encapsulated transport frames across the tunnel on behalf of cross-site edges
- Is itself a named node, so the control plane can see and reason about it

The tunnel transport is out of scope for now. The gateway is a node type, not a special infrastructure component; it uses the same wire protocol as everything else.

### Site ID Translation

Both sides of a site-to-site link will independently default to `site_id = 0`. A gateway cannot simply forward announcements across the boundary: every node on both sides would appear as site 0 and be indistinguishable.

The gateway is responsible for **site ID translation**: it assigns a distinct non-zero `site_id` to each side of the link and rewrites the `site_id` field in all announcements and any protocol messages that carry a `site_id` as they cross the boundary. From each side's perspective, remote nodes appear with the translated ID assigned by the gateway; local nodes retain their own IDs.

This means `site_id = 0` should be treated as "local / unassigned" and never forwarded across a site boundary without translation. A node that receives an announcement with `site_id = 0` on a cross-site link should treat it as a protocol error from the gateway.

### Addressing

A fully-qualified node address is `site_id:namespace:instance`.
Within a single site, `site_id` is implicit and can be omitted. The control plane and discovery layer must store `site_id` alongside every peer record from the start, even if it is always `0`, so that the upgrade to multi-site addressing requires only configuration and a gateway node — not code changes.
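
To make the payload layout and the addressing rule above concrete, here is a minimal Python sketch. It rests on assumptions this document does not state: fields are packed big-endian with no padding, and names are UTF-8. The real `serial` layer may differ on both points. `Announcement`, `parse_address`, `format_address`, and `record_peer` are hypothetical names used only for illustration.

```python
import struct
from dataclasses import dataclass
from typing import Dict, Tuple

# Masks from the `function_flags` table.
SOURCE, RELAY, SINK, CONTROLLER = 0x0001, 0x0002, 0x0004, 0x0008

# ASSUMPTION: big-endian, unpadded packing. The byte order used by the
# real `serial` layer is not specified in this document.
# Fields: protocol_version, site_id, tcp_port, function_flags, name_len
_FIXED = struct.Struct(">BHHHB")


@dataclass(frozen=True)
class Announcement:
    protocol_version: int
    site_id: int
    tcp_port: int
    function_flags: int
    name: str  # "namespace:instance"

    def encode(self) -> bytes:
        raw = self.name.encode("utf-8")  # ASSUMPTION: names are UTF-8
        if len(raw) > 255:
            raise ValueError("name does not fit a u8 length prefix")
        return _FIXED.pack(self.protocol_version, self.site_id,
                           self.tcp_port, self.function_flags, len(raw)) + raw

    @classmethod
    def decode(cls, payload: bytes) -> "Announcement":
        if len(payload) < _FIXED.size:
            raise ValueError("payload shorter than the fixed fields")
        version, site_id, port, flags, name_len = _FIXED.unpack_from(payload)
        raw = payload[_FIXED.size:_FIXED.size + name_len]
        if len(raw) != name_len:
            raise ValueError("truncated name")
        return cls(version, site_id, port, flags, raw.decode("utf-8"))


def format_address(site_id: int, name: str) -> str:
    """Build the fully-qualified form `site_id:namespace:instance`."""
    return f"{site_id}:{name}"


def parse_address(text: str, local_site: int = 0) -> Tuple[int, str]:
    """Accept `namespace:instance` (site implicit) or the qualified form."""
    parts = text.split(":")
    if len(parts) == 2:
        return local_site, text
    if len(parts) == 3:
        return int(parts[0]), f"{parts[1]}:{parts[2]}"
    raise ValueError(f"not a node address: {text!r}")


# Peer records are keyed by (site_id, name) from the start, even when
# site_id is always 0, so multi-site needs no change to the key shape.
PeerTable = Dict[Tuple[int, str], Tuple[str, Announcement]]


def record_peer(peers: PeerTable, ip: str, ann: Announcement) -> None:
    peers[(ann.site_id, ann.name)] = (ip, ann)
```

For example, a single-site relay that also archives would be built as `Announcement(1, 0, 9000, RELAY | SINK, "v4l2:microscope")`. After `record_peer`, it can be looked up with the key returned by `parse_address("v4l2:microscope")`, which fills in site `0` for the omitted `site_id`.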