Document that immediate re-announcements go directly to the triggering peer
(unicast) rather than to the multicast group, and explain the two conditions
that trigger a reply: new peer and restarted peer (site_id change).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When we see a new peer or detect a restart (site_id change for known addr+port),
send the announcement directly to that host via unicast instead of broadcasting
to the multicast group. This avoids waking every other node on the subnet for a
reply that is only relevant to one machine.
The periodic multicast announcements continue unchanged for initial discovery.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The original unconditional cond_signal (every received packet) caused a
multicast storm: each node instantly reflected every announcement back as its
own, creating a tight loop at wire speed.
The previous fix (gate on is_new only) broke the restart case: a peer that
restarts with the same addr+port is already in the table so is_new stays 0,
meaning we'd wait up to interval_ms before that peer learned about us.
Correct fix: also signal when site_id changes for a known addr+port entry,
which reliably indicates a restart. Steady-state keepalive packets (same
site_id) no longer trigger re-announcement.
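
A minimal sketch of the gating rule described above, not the actual discovery code. The Peer and Peer_Table types and peer_table_observe() are hypothetical names, and the sketch assumes site_id is regenerated on every start of a peer:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PEERS 16

/* Hypothetical table entry: identity is addr+port; site_id is assumed to
 * change whenever the peer restarts. */
typedef struct {
    uint32_t addr;     /* IPv4 address, network byte order */
    uint16_t port;     /* port announced by the peer */
    uint32_t site_id;
    bool     used;
} Peer;

typedef struct {
    Peer peers[MAX_PEERS];
} Peer_Table;

/* Record one received announcement. Returns true when an immediate
 * re-announcement is warranted: a new peer, or a known addr+port whose
 * site_id differs from the stored one. */
static bool peer_table_observe(Peer_Table *t, uint32_t addr, uint16_t port,
                               uint32_t site_id)
{
    Peer *free_slot = NULL;
    for (int i = 0; i < MAX_PEERS; i++) {
        Peer *p = &t->peers[i];
        if (!p->used) {
            if (!free_slot) free_slot = p;
            continue;
        }
        if (p->addr == addr && p->port == port) {
            bool restarted = (p->site_id != site_id);
            p->site_id = site_id;
            return restarted;   /* steady-state keepalive: false */
        }
    }
    if (!free_slot) return false;   /* table full: ignored in this sketch */
    *free_slot = (Peer){ .addr = addr, .port = port,
                         .site_id = site_id, .used = true };
    return true;                    /* is_new */
}
```

The caller would signal the announce thread only when this returns true, so a repeated keepalive with an unchanged site_id never wakes it.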
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The announce thread was being woken (triggering a multicast send) on every
received announcement packet, including ones from already-known peers. With
two or more nodes this created a feedback loop: each incoming packet triggered
an outbound multicast which triggered another incoming packet on the peer, and
so on at full CPU/network speed.
Gate the cond_signal on is_new so we only fast-announce when a genuinely new
peer is seen. The periodic interval-based announcement continues to handle
keepalives and reconnections for existing peers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
scale selects the scale_mode; zoom_factor is a separate multiplier applied
on top of it, so the two compose rather than conflict.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The current anchor system only handles fixed alignment; a free mode is needed
for an arbitrary pan offset plus zoom level, e.g. microscope inspection.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add docs/cli/ entries for:
transport_cli, discovery_cli, config_cli, protocol_cli, query_cli,
test_image_cli, xorg_cli, v4l2_view_cli, stream_send_cli,
stream_recv_cli, reconciler_cli, controller_cli
Each doc covers: description, build instructions, full usage with all
options and defaults, example output, and a relationship note pointing
to related tools. controller_cli includes the display control IDs table
and notes its temporary status.
README.md: convert all CLI tool entries to links.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Display controls (enum/get/set):
- Add PROTO_DISPLAY_CTRL_SCALE/ANCHOR/NO_SIGNAL_FPS constants to protocol.h
- handle_enum_controls: if device index maps to an active display slot,
return the three display controls (scale, anchor, no_signal_fps)
- handle_get_control: read display control values from slot under mutex
- handle_set_control: write display control values to slot under mutex;
scale/anchor are applied to the viewer by display_loop_tick each tick
Device IDs in enum-devices output:
- Proto_Display_Device_Info gains device_id field (wire format +2 bytes)
- handle_enum_devices computes device_id = total_v4l2 + display_index
- on_video_node/on_standalone callbacks take int* userdata to print [idx]
- on_display prints [device_id] from the wire field
Bug fix — protocol error on invalid device index:
- proto_read_enum_controls_response: early-return APP_OK after reading
status if status != OK; error responses have no count/data fields, so
the CUR_CHECK on count was failing with "payload too short"
Helpers added to main.c:
- count_v4l2_devices(): sum of media vnodes + standalone
- find_display_by_device_idx(): maps flat index to Display_Slot
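
A hedged sketch of the early-return fix in the reader. The Cursor type, cur_read_u16(), the 16-bit big-endian field widths and the error code values are assumptions for illustration; the real wire layout is defined in protocol.h:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the project's codes. */
enum { APP_OK = 0, APP_ERR_SHORT = 1 };
enum { STATUS_OK = 0 };

typedef struct {
    const uint8_t *p;
    size_t         remaining;
} Cursor;

/* Read one big-endian u16 (byte order assumed), failing on a short payload. */
static int cur_read_u16(Cursor *c, uint16_t *out)
{
    if (c->remaining < 2) return APP_ERR_SHORT;
    *out = (uint16_t)((c->p[0] << 8) | c->p[1]);
    c->p += 2;
    c->remaining -= 2;
    return APP_OK;
}

/* Error responses carry only the status field, so stop after reading it. */
static int read_enum_controls_response(const uint8_t *buf, size_t len,
                                       uint16_t *status, uint16_t *count)
{
    Cursor c = { buf, len };
    *count = 0;
    int rc = cur_read_u16(&c, status);
    if (rc != APP_OK) return rc;
    if (*status != STATUS_OK) return APP_OK;   /* no count/data follows */
    return cur_read_u16(&c, count);
}
```

With this shape, a two-byte error response parses cleanly and the caller sees the status, while a truncated success response still reports a short payload.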
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Makefile:
- Add reconciler and ingest to the `modules` target; they were only built
as side-effects of `make node`, making `make modules` incomplete
planning.md:
- Add 4 missing CLI drivers: discovery_cli, config_cli, protocol_cli,
query_cli (all existed in code and dev/cli/Makefile but were absent)
- Add header-only utilities table: stream_stats.h, v4l2_fmt.h
README.md:
- Add transport_cli, discovery_cli, config_cli, protocol_cli, query_cli
to CLI tools list
conventions.md:
- Add ERR_NOT_FOUND to Error_Code enum example
- Replace placeholder Invalid_Error_Detail with actual fields
(config_line, message) that have been in use since config module
- Add missing error macros: APP_INVALID_ERROR, APP_INVALID_ERROR_MSG,
APP_NOT_FOUND_ERROR
- Update directory structure: node/ description (was "later"), add web/
and tools/ entries
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The long-term replacement is a dedicated controller binary outside dev/cli
that maintains simultaneous connections to all discovered nodes and addresses
commands by peer index — mirroring the web UI model rather than the current
single-active-connection design.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
transport_conn_close previously called close(conn->fd), but the detached
read thread also calls close(conn->fd) when it exits. If the kernel reused
the fd number before the read thread ran, the thread's close() would hit
the new connection, which explains connections that appeared not to terminate.
Fix: use shutdown(SHUT_RDWR) instead. This signals EOF to the remote end
and unblocks the blocked read() without releasing the fd. The read thread
remains the sole owner of the fd and is the only one to call close().
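
The ownership rule can be sketched as follows. The function names are hypothetical and the real transport code does more per read; the sketch only shows that shutdown() wakes the reader with EOF while the descriptor stays allocated until the reader closes it:

```c
#include <sys/socket.h>
#include <unistd.h>

/* Called from any thread: end the connection without releasing the fd. */
static int conn_request_close(int fd)
{
    return shutdown(fd, SHUT_RDWR);
}

/* Tail of the read thread: drain until EOF, then close. This is the only
 * place that calls close(), so the fd number cannot be reused early.
 * Returns 0 on a clean EOF, -1 if read() failed. */
static int reader_drain_and_close(int fd)
{
    char buf[256];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        /* a real reader would dispatch the bytes here */
    }
    close(fd);
    return n == 0 ? 0 : -1;
}
```

Because only one thread ever closes the descriptor, a newly accepted connection that happens to receive the same fd number cannot be closed by accident.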
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- controller_cli: drain semaphore and reset pending_cmd in do_connect
so stale posts from old connection don't unblock the next command
- protocol: add Proto_Display_Device_Info; extend
proto_write_enum_devices_response and proto_read_enum_devices_response
with display section; backward-compatible (absent in older messages)
- node: handle_enum_devices snapshots active Display_Slots under mutex
and includes them in the response
- controller_cli: on_display callback prints display window info in
enum-devices output
- query_cli: updated to pass NULL on_display (no display interest)
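
For the semaphore drain in do_connect, a minimal sketch could look like this. sem_drain() is a hypothetical helper name, not necessarily what controller_cli uses:

```c
#include <errno.h>
#include <semaphore.h>

/* Consume every pending post without blocking, so that responses posted
 * for an old connection cannot satisfy the next command's wait.
 * Returns the number of stale posts that were dropped. */
static int sem_drain(sem_t *sem)
{
    int drained = 0;
    for (;;) {
        if (sem_trywait(sem) == 0) {
            drained++;
            continue;
        }
        if (errno == EINTR) continue;   /* interrupted: retry */
        break;                          /* EAGAIN: count is already zero */
    }
    return drained;
}
```

sem_trywait() never blocks, so the loop terminates as soon as the count reaches zero.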
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace double wall-time with uint64_t monotonic milliseconds for
last_frame_ms and last_no_signal_ms. Integer ms is the right type
for a threshold comparison — no floating point needed.
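
A sketch of the integer timestamp and the threshold test it feeds. monotonic_ms() and stream_is_live() are illustrative names; the 1000 ms window mirrors the "< 1s since last frame" rule, and treating 0 as "no frame received yet" is an assumption:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock as whole milliseconds. */
static uint64_t monotonic_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

#define STREAM_LIVE_WINDOW_MS 1000u

/* True while a frame has arrived within the live window.
 * last_frame_ms == 0 is assumed to mean "never received a frame". */
static bool stream_is_live(uint64_t now_ms, uint64_t last_frame_ms)
{
    return last_frame_ms != 0 &&
           now_ms - last_frame_ms < STREAM_LIVE_WINDOW_MS;
}
```

The comparison is a plain unsigned subtraction, which is the reason an integer type is sufficient here.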
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The loop runs at ~200Hz; frames arrive at ~30fps. Most iterations have no
pending frame even during active streaming, so no-signal was rendering
between real frames. Fix: track last_frame_t and suppress no-signal while
a live stream is present (< 1s since last frame).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CLOCK_MONOTONIC counts seconds from an arbitrary start point, typically boot
(~50000+ s on a running system).
At that magnitude, float32 loses fractional precision in the hash function
and all cells evaluate to near-zero, producing a black screen instead of noise.
Wrapping to fmod(now, 1000.0) keeps the value small enough for the shader.
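
The precision loss can be reproduced on the CPU. In the sketch below the function names are hypothetical, and the wrap uses integer modulo on tv_sec, which has the same effect as fmod(now, 1000.0) for non-negative times while avoiding a libm dependency:

```c
#include <time.h>

#define SHADER_TIME_WRAP_S 1000   /* wrap period from the message above */

/* Unwrapped: near 50000 s, float32 values are spaced 2^-8 s (~3.9 ms)
 * apart, so sub-millisecond changes disappear. */
static float shader_time_unwrapped(struct timespec ts)
{
    return (float)((double)ts.tv_sec + (double)ts.tv_nsec / 1e9);
}

/* Wrapped: stays below 1000 s, where float32 spacing is at most 2^-14 s. */
static float shader_time_wrapped(struct timespec ts)
{
    double wrapped = (double)(ts.tv_sec % SHADER_TIME_WRAP_S)
                   + (double)ts.tv_nsec / 1e9;
    return (float)wrapped;
}
```

Two timestamps 0.5 ms apart at 50000 s collapse to the same float when unwrapped but remain distinct after wrapping.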
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two nodes on the same host with the same name (e.g. unnamed:0) would
collide — the second announcement just updated the first entry's port.
Peers are now matched on addr+port; the name is metadata, not identity.
Same fix applied to the self-skip check.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- readline replaces fgets — line editing and command history
- Discovery runs at startup (always); discovered peers print inline as they appear
- --host is now optional; without it, the tool starts in discovery-only mode
- New REPL commands:
    peers                 list discovered nodes with index
    connect               connect to first discovered peer
    connect <idx>         connect to peer by index
    connect <host:port>   connect directly
- connect switching closes the old connection before opening the new one
- Commands that require a connection print "not connected" when conn is NULL
- Makefile: add $(DISCOVERY_OBJ) and -lreadline to controller_cli link
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a viewer window has no incoming stream, renders animated analog-TV
noise (hash-based, scanlines, phosphor tint) at configurable fps (default
15) with a centred "NO SIGNAL" text overlay.
- xorg: FRAG_NOSIGNAL_SRC shader + xorg_viewer_render_no_signal(v, time, noise_res)
- main: Display_Slot gains no_signal_fps + last_no_signal_t; display_loop_tick
drives no-signal render on idle slots via clock_gettime rate limiting
- protocol: START_DISPLAY extended by 2 bytes — no_signal_fps (0=default 15)
+ reserved; reader is backward-compatible (defaults 0 if length < 18)
- controller_cli: no_signal_fps optional arg on start-display
- docs: protocol.md updated with new field
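
A sketch of the rate limiting for the no-signal render, assuming millisecond timestamps and a setting where 0 means "use the default of 15". The helper names are hypothetical, and the real loop lives in display_loop_tick:

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_SIGNAL_DEFAULT_FPS 15u

/* Map the configured value to an effective rate (0 selects the default). */
static uint32_t no_signal_effective_fps(uint32_t fps_setting)
{
    return fps_setting == 0 ? NO_SIGNAL_DEFAULT_FPS : fps_setting;
}

/* Returns true when enough time has passed to render another noise frame,
 * and records the render time in *last_ms. */
static bool no_signal_due(uint64_t now_ms, uint64_t *last_ms,
                          uint32_t fps_setting)
{
    uint64_t interval_ms = 1000u / no_signal_effective_fps(fps_setting);
    if (now_ms - *last_ms < interval_ms) return false;
    *last_ms = now_ms;
    return true;
}
```

At the default rate the interval is 66 ms (integer division), so a loop running much faster than that renders noise only on a fraction of its iterations.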
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add missing modules (config, discovery, reconciler, ingest) and update
node description from "later" to reflect its current done state.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- protocol.md: add START_DISPLAY (0x000A) and STOP_DISPLAY (0x000B) wire
schemas and field descriptions; add both to command table
- xorg.md: add 'Multiple windows' section covering glfwPollEvents global
behaviour, per-context glfwMakeContextCurrent requirement, and
glfwInit/glfwTerminate ref-counting; includes the gotcha that
short-circuiting the event loop can starve non-polled windows
- planning.md: add cooperative capture release deferred decision;
add xorg viewer remote controls (zoom, pan, scale, future shader
post-processing) to deferred decisions; note xorg viewer controls
not yet exposed remotely in module table
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Protocol:
- Add PROTO_CMD_START_DISPLAY (0x000A) and PROTO_CMD_STOP_DISPLAY (0x000B)
with write/read functions; Proto_Start_Display carries stream_id, window
position/size, scale and anchor; PROTO_DISPLAY_SCALE_*/ANCHOR_* constants
Node display sink:
- Display_Slot struct with wanted_state/current_state (DISP_CLOSED/DISP_OPEN);
handlers set wanted state, display_loop_tick on main thread reconciles
- Up to MAX_DISPLAYS (4) simultaneous viewer windows
- on_frame routes incoming VIDEO_FRAME messages to matching display slot;
transport thread deposits payload, main thread consumes without holding lock
during JPEG decode/upload
- Main thread runs GL event loop when xorg is available; headless fallback
joins reconciler timer thread as before
Xorg multi-window:
- Ref-count glfwInit/glfwTerminate via glfw_acquire/glfw_release so closing
one viewer does not terminate GLFW for remaining windows
- Add glfwMakeContextCurrent before GL calls in push_yuv420, push_bgra,
push_mjpeg and poll so each viewer uses its own GL context correctly
Transport random port:
- Binding to port 0 lets the OS assign a free port; getsockname reads it
back into server->bound_port after bind
- Add transport_server_get_port() accessor
- Default tcp_port changed from 8000 to 0 (random); node prints actual
port after server start so it is always visible in output
- Add --port PORT CLI override (before config-file argument)
controller_cli:
- Add start-display and stop-display commands
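
The random-port behaviour relies on standard socket calls. A minimal sketch, with listen_on_free_port() as a hypothetical name and a backlog of 4 chosen arbitrarily:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a TCP listener to an OS-assigned port and report which port was
 * chosen. Returns the listening fd, or -1 on failure. */
static int listen_on_free_port(uint16_t *bound_port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);              /* 0 = let the OS pick */

    socklen_t len = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 4) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);    /* the port actually assigned */
    return fd;
}
```

Printing the value read back by getsockname() is what makes the assigned port visible to the operator.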
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add 'force' phony prerequisite to all sub-make delegation rules in
dev/cli/Makefile and src/node/Makefile so the sub-make is always
invoked and can check source timestamps itself; previously a stale
.o would never be rebuilt by a dependent Makefile
- Move stream_stats_record_frame inside the successful send branch in
on_ingest_frame so stats reflect actual delivered frames rather than
capture throughput; avoids misleading Mbps readings when the
transport is disconnected
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Connects to a running video node by host:port. Supports:
enum-devices, enum-controls, get-control, set-control,
start-ingest, stop-ingest
Uses semaphore-based request/response synchronisation (same pattern as
query_cli). start-ingest maps directly to the new START_INGEST protocol
command with optional format/size/fps args; defaults to auto-select.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each ingest stream gets two reconciler resources (device, transport) with
dependencies: transport waits for device OPEN (needs format for STREAM_OPEN),
device waits for transport CONNECTED before starting capture.
START_INGEST sets wanted state and triggers a tick; the reconciler drives
device CLOSED→OPEN→STREAMING and transport DISCONNECTED→CONNECTED over
subsequent ticks. STOP_INGEST reverses both.
External events (transport drop, ingest thread error) use
reconciler_force_current to push state backward; the periodic 500ms timer
thread re-drives toward wanted state automatically.
All 8 stream slots are pre-allocated at startup. on_ingest_frame sends
VIDEO_FRAME messages over the outbound transport connection, protected by
a per-stream conn_mutex.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
START_INGEST carries stream_id, format/width/height/fps, dest_host:port,
transport_mode (encapsulated or opaque), and device_path. All format fields
default to 0 (auto-select). STOP_INGEST carries stream_id only.
Both commands set wanted state on the node; reconciliation is asynchronous.
Protocol doc updated with wire schemas for both commands.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
reconciler: generic resource state machine — BFS pathfinding from current
to wanted state, dependency constraints, event/periodic tick model.
reconciler_cli exercises it with simulated device/transport/stream resources.
ingest: V4L2 capture module — open device, negotiate MJPEG format, MMAP
buffer pool, capture thread with on_frame callback. start/stop lifecycle
designed for reconciler management. Transport-agnostic: caller wires
on_frame to proto_write_video_frame.
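
To illustrate the BFS pathfinding, here is a small sketch that returns the next state to move to on the shortest path toward the wanted state. State_Graph and next_state_toward() are hypothetical; the actual reconciler also applies dependency constraints, which are omitted here:

```c
#include <stdbool.h>

#define MAX_STATES 8

/* Hypothetical graph: edge[a][b] is true when a -> b is a legal transition. */
typedef struct {
    int  n_states;
    bool edge[MAX_STATES][MAX_STATES];
} State_Graph;

/* Breadth-first search from current to wanted. Returns the first state to
 * step to, current itself when already there, or -1 when unreachable. */
static int next_state_toward(const State_Graph *g, int current, int wanted)
{
    int parent[MAX_STATES];
    int queue[MAX_STATES];
    int head = 0, tail = 0;

    if (current == wanted) return current;
    for (int i = 0; i < MAX_STATES; i++) parent[i] = -1;
    parent[current] = current;
    queue[tail++] = current;

    while (head < tail) {
        int s = queue[head++];
        for (int t = 0; t < g->n_states; t++) {
            if (!g->edge[s][t] || parent[t] != -1) continue;
            parent[t] = s;
            if (t == wanted) {
                /* Walk back to the state directly after current. */
                while (parent[t] != current) t = parent[t];
                return t;
            }
            queue[tail++] = t;
        }
    }
    return -1;
}
```

A tick would call this once per resource and perform only the returned single transition, leaving the rest of the path to later ticks.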
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
architecture.md is now a concise overview (~155 lines) with a
Documentation section linking to all sub-docs.
New sub-docs in docs/:
transport.md — wire modes, frame header, serialization, web peer
relay.md — delivery modes, memory model, congestion, scheduler
codec.md — stream metadata, format negotiation, codec backends
xorg.md — screen grab, viewer sink, render loop, overlays
discovery.md — multicast announcements, multi-site, site gateways
node-state.md — wanted/current state, reconciler, stats, queries
device-resilience.md — device loss handling, stream events, audio (future)
All cross-references updated to file links. Every sub-doc links back
to architecture.md. docs/transport.md links to docs/protocol.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Graph representation is plain ESM objects in the web interface.
No special format needed. Graph reconstruction, topology diffing,
and layout logic belong in ESM rather than C. Future TUI/CLI tools
reuse the same ESM libraries via Node.js.
No open questions remain.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Drop policy (per-output configurable), stream ID passthrough at relay,
TCP-only transport for now, soft byte budget limits with hysteresis,
and relay scheduler (strict priority first, pluggable interface) were
all already decided — move them out of Open Questions.
Only genuinely open question remaining: graph representation format.
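
As an illustration of a soft byte budget with hysteresis, the sketch below uses two watermarks. The Byte_Budget type and its field names are assumptions, not the relay's actual interface:

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t high_water;    /* start dropping once queued bytes reach this */
    size_t low_water;     /* stop dropping once queued bytes fall to this */
    bool   over_budget;   /* current hysteresis state */
} Byte_Budget;

/* Update the state for the current queue depth and return whether the
 * output should currently apply its drop policy. */
static bool byte_budget_update(Byte_Budget *b, size_t queued_bytes)
{
    if (!b->over_budget && queued_bytes >= b->high_water)
        b->over_budget = true;
    else if (b->over_budget && queued_bytes <= b->low_water)
        b->over_budget = false;
    return b->over_budget;
}
```

The gap between the two watermarks keeps the output from flapping between dropping and not dropping when the queue hovers near the limit.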
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add 'Declarative, Not Imperative' section near top explaining why the
control model is wanted-state-based rather than imperative commands
- Update Control Plane section: remove 'connection instructions' language,
replace with wanted state; note CLI controller comes before web UI
- Fix node naming example: xorg:preview instead of mpv:preview
- Update ingestion diagram: 'wanted state' instead of 'connection config'
- Add Per-Stream Stats note (stream_stats.h) to Node State Model
- Mark GET_CONFIG_STATE / GET_RUNTIME_STATE as planned, not yet implemented
- Split Open Questions: add Decided section for resolved questions
(connection direction, stream ID assignment, single port, first delivery mode)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Document the wanted/current state separation, generic resource state
machine reconciler (BFS pathfinding, event + periodic tick), node state
queries (GET_CONFIG_STATE / GET_RUNTIME_STATE), stream ID assignment
by controller, and connection direction model.
Add reconciler module to module order and reconciler_cli experiment
to CLI tools table in planning.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add stream_send_cli (V4L2 capture → TCP → VIDEO_FRAME) and
stream_recv_cli (TCP → threaded frame slot → GLFW display) to
exercise end-to-end streaming between two nodes on the same machine
or across the network.
Add include/stream_stats.h (header-only rolling-window fps/Mbps tracker)
and include/v4l2_fmt.h (header-only V4L2 format enumeration shared between
v4l2_view_cli and stream_send_cli). Refactor v4l2_view_cli to use the
shared header.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- tools/gen_font_atlas: Python/Pillow build tool — skyline packs DejaVu
Sans glyphs 32-255 into a grayscale atlas, emits build/gen/font_atlas.h
with pixel data and Font_Glyph[256] metrics table
- xorg: bitmap font atlas text overlay rendering (GL_R8 atlas texture,
alpha-blended glyph quads, dark background rect per overlay)
- xorg: add xorg_viewer_set_overlay_text / clear_overlays API
- xorg: add xorg_viewer_handle_events for streaming use (events only,
no redundant render)
- xorg_cli: show today's date as white text overlay
- v4l2_view_cli: new tool — V4L2 capture with format auto-selection
(highest FPS then largest resolution), MJPEG/YUYV, measured FPS overlay
- docs: update README, planning, architecture to reflect current status
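
The format auto-selection rule (highest FPS, then largest resolution) can be sketched as a comparison function. Fmt_Candidate is a hypothetical type, and "largest resolution" is interpreted here as largest pixel area, which is an assumption:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t width, height;
    uint32_t fps_num, fps_den;   /* frames per second as a fraction */
} Fmt_Candidate;

/* Returns >0 when a is preferred over b, <0 when b is preferred, 0 on a tie. */
static int fmt_compare(const Fmt_Candidate *a, const Fmt_Candidate *b)
{
    /* Cross-multiply to compare the fps fractions without floating point. */
    uint64_t lhs = (uint64_t)a->fps_num * b->fps_den;
    uint64_t rhs = (uint64_t)b->fps_num * a->fps_den;
    if (lhs != rhs) return lhs > rhs ? 1 : -1;

    uint64_t area_a = (uint64_t)a->width * a->height;
    uint64_t area_b = (uint64_t)b->width * b->height;
    if (area_a != area_b) return area_a > area_b ? 1 : -1;
    return 0;
}

/* Pick the preferred candidate; NULL when the list is empty. */
static const Fmt_Candidate *fmt_pick_best(const Fmt_Candidate *list, size_t n)
{
    const Fmt_Candidate *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (!best || fmt_compare(&list[i], best) > 0)
            best = &list[i];
    return best;
}
```

With this ordering a 1280x720 mode at 60 fps wins over 1920x1080 at 30 fps, and over 640x480 at 60 fps.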
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
make — builds with glfw, vulkan, turbojpeg, xorg, vaapi
make FEATURES= — headless build with no optional dependencies
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
libjpeg-turbo can decompress directly to planar YUV, bypassing CPU-side
color conversion entirely. Document the precise pipeline: separate Y/Cb/Cr
GL_RED textures, BT.601 matrix in fragment shader, SIMD Huffman+DCT on CPU only.
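
For reference, a CPU-side equivalent of the per-pixel conversion is sketched below. It assumes the full-range BT.601 coefficients used by JFIF; the real conversion runs in the fragment shader, and the function names are hypothetical:

```c
#include <stdint.h>

/* Clamp to [0, 255] and round to nearest. */
static uint8_t clamp_u8(float v)
{
    if (v < 0.0f) return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* Full-range BT.601 YCbCr -> RGB for a single pixel. */
static void ycbcr_to_rgb_bt601(uint8_t y, uint8_t cb, uint8_t cr,
                               uint8_t rgb[3])
{
    float fy  = (float)y;
    float fcb = (float)cb - 128.0f;   /* chroma is centred on 128 */
    float fcr = (float)cr - 128.0f;
    rgb[0] = clamp_u8(fy + 1.402f * fcr);
    rgb[1] = clamp_u8(fy - 0.344136f * fcb - 0.714136f * fcr);
    rgb[2] = clamp_u8(fy + 1.772f * fcb);
}
```

A function like this is useful mainly as a test oracle for the shader output, since the point of the pipeline is to keep this arithmetic off the CPU.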
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace XShmPutImage approach with GLFW+OpenGL as the initial renderer.
Documents the two-renderer plan: GLFW handles window/input for both;
only the rendering backend differs. Notes that both renderers should
conform to the same internal interface for swappability.
Adds input event forwarding (keyboard/mouse → INPUT_EVENT upstream)
as a first-class capability of the viewer sink.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents multi-input scheduling as a distinct concern from delivery
mode. Covers strict priority, round-robin, weighted round-robin, deficit
round-robin, and source suppression policies. Notes that the relay module
should expose a pluggable scheduler interface. Adds scheduler policy
selection to Open Questions.
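
One possible shape for the pluggable scheduler interface, shown with two of the listed policies. All names are hypothetical, and "larger priority value wins" is an assumed convention:

```c
#include <stddef.h>

typedef struct {
    int    priority;   /* larger = more urgent (assumed convention) */
    size_t queued;     /* frames waiting on this input */
} Sched_Input;

/* Pluggable policy: return the index of the input to serve next, or -1
 * when nothing is queued. state is policy-private. */
typedef int (*Sched_Pick_Fn)(const Sched_Input *in, size_t n, void *state);

/* Strict priority: always serve the most urgent non-empty input. */
static int sched_pick_strict_priority(const Sched_Input *in, size_t n,
                                      void *state)
{
    (void)state;
    int best = -1;
    for (size_t i = 0; i < n; i++)
        if (in[i].queued > 0 &&
            (best < 0 || in[i].priority > in[best].priority))
            best = (int)i;
    return best;
}

/* Round-robin: state points to a size_t cursor holding the index after
 * the input served last. */
static int sched_pick_round_robin(const Sched_Input *in, size_t n,
                                  void *state)
{
    size_t *cursor = state;
    for (size_t k = 0; k < n; k++) {
        size_t i = (*cursor + k) % n;
        if (in[i].queued > 0) {
            *cursor = (i + 1) % n;
            return (int)i;
        }
    }
    return -1;
}
```

Keeping the policy behind a single function pointer lets the relay swap strict priority for a weighted or deficit variant without touching the delivery path.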
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
README status table was showing transport/discovery/protocol/node as
"not started" when all are done. Added summary sentence, notes column,
config and dev/web rows. Fixed dev/web structure description.
planning.md: removed stale "prerequisite" note about web UI — it is
already implemented and working.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>