From 8260d456aa37968f7c7317e63d180a94ed87fed7 Mon Sep 17 00:00:00 2001
From: mikael-lovqvists-claude-agent
Date: Wed, 25 Mar 2026 22:47:06 +0000
Subject: [PATCH] Add ffmpeg as codec backend; extend codec ID table with archival formats

0x01xx range reserved for ffmpeg-backed formats (H.265, AV1, FFV1, ProRes).
Documents libavcodec vs subprocess trade-offs: subprocess suits archival
completeness paths, libavcodec suits low-latency encode. Receiver only cares
about wire format, not which encoder produced it.

Co-Authored-By: Claude Sonnet 4.6
---
 architecture.md | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/architecture.md b/architecture.md
index 5519ba5..823fb2c 100644
--- a/architecture.md
+++ b/architecture.md
@@ -256,8 +256,12 @@ Receivers must know what format a frame payload is in. This is communicated at s
 | `0x0002` | QOI — lossless, single-header implementation, fast; good for screen content |
 | `0x0003` | Raw pixels + ZSTD — lossless; raw BGRA/RGBA compressed with ZSTD at a low level |
 | `0x0004` | H.264 intra — single I-frames via VA-API hardware encode; high compression, GPU required |
+| `0x0100` | H.265 / HEVC — via ffmpeg (libavcodec or subprocess); hardware or software encode |
+| `0x0101` | AV1 — via ffmpeg; best compression, hardware encode on modern GPUs |
+| `0x0102` | FFV1 — via ffmpeg; lossless archival format |
+| `0x0103` | ProRes — via ffmpeg; near-lossless, post-production compatible |
 
-V4L2 camera streams typically arrive pre-encoded as MJPEG from hardware; no encode step is needed on that path. The codec module is primarily used by the screen grab source.
+V4L2 camera streams typically arrive pre-encoded as MJPEG from hardware; no encode step is needed on that path. The `0x01xx` range is reserved for ffmpeg-backed formats; the receiver cares only about the wire format, not which encoder produced it.
 
 ### Format Negotiation
 
@@ -279,6 +283,28 @@ ZSTD at compression level 1 is extremely fast and can achieve meaningful ratios
 
 Intra-only H.264 via VA-API gives very high compression with GPU offload. This is the most complex option to set up and introduces a GPU dependency, but may be worthwhile for high-resolution grabs over constrained links. Deferred until simpler codecs are validated.
 
+### ffmpeg Backend
+
+ffmpeg (via libavcodec or subprocess) is a practical escape hatch that gives access to a large number of codecs, container formats, and hardware acceleration paths without implementing them from scratch. It is particularly useful for archival formats where the encode latency of a more complex codec is acceptable.
+
+**Integration options:**
+
+- **libavcodec** — link directly against the library; programmatic API, tight integration, same process; introduces a large build dependency but gives full control over codec parameters and hardware acceleration (NVENC, VA-API, VideoToolbox, etc.)
+- **subprocess pipe** — spawn `ffmpeg`, pipe raw frames to stdin, read encoded output from stdout; simpler, no build dependency, more isolated from the rest of the node process; latency is higher due to process overhead but acceptable for archival paths where real-time delivery is not required
+
+The subprocess approach fits naturally into the completeness output path of the relay: frames arrive in order, there is no real-time drop pressure, and the ffmpeg process can be restarted independently if it crashes without taking down the node. libavcodec is the better fit for low-latency encoding (e.g. screen grab over a constrained link).
+
+**Archival formats of interest:**
+
+| Format | Notes |
+|---|---|
+| H.265 / HEVC | Roughly 50% lower bitrate than H.264 at comparable quality; NVENC and VA-API hardware support widely available |
+| AV1 | Best open-format compression; software encode is slow, hardware encode (e.g. AV1 NVENC on RTX 40-series and newer) is fast |
+| FFV1 | Lossless, designed for archival; good compression for video content; widely used by film and broadcast archives |
+| ProRes | Near-lossless, widely accepted in post-production toolchains; large files but easy to edit downstream |
+
+The codec identifier table uses the `0x01xx` range for ffmpeg-backed formats to distinguish them from native implementations. The actual format is fixed at stream open time via `stream_open`; the receiver does not need to know whether the encoder is libavcodec or a native implementation, only what the wire format is.
+
 ---
 
 ## X11 / Xorg Integration
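
The subprocess pipe option described in the new `ffmpeg Backend` section can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not the project's implementation. Only the `0x01xx` codec IDs and the "raw frames to stdin, encoded output from stdout" idea come from the patch. The helper names (`is_ffmpeg_backed`, `build_ffmpeg_cmd`, `encode_frames`), the mapping from codec IDs to ffmpeg encoder names, and the choice of Matroska as a pipe-friendly output container are all assumptions made for illustration.

```python
import subprocess
import threading

# Codec IDs are taken from the table in the patch. The ffmpeg encoder chosen
# for each ID is an ASSUMPTION; the named encoders exist in ffmpeg but may not
# be compiled into every build (libx265 and libsvtav1 in particular).
FFMPEG_ENCODERS = {
    0x0100: "libx265",    # H.265 / HEVC
    0x0101: "libsvtav1",  # AV1
    0x0102: "ffv1",       # FFV1
    0x0103: "prores_ks",  # ProRes
}


def is_ffmpeg_backed(codec_id: int) -> bool:
    """True if the ID falls in the 0x01xx range reserved for ffmpeg formats."""
    return (codec_id & 0xFF00) == 0x0100


def build_ffmpeg_cmd(codec_id, width, height, fps, pixel_format="bgra"):
    """Build an argv that reads raw frames on stdin and writes to stdout."""
    if not is_ffmpeg_backed(codec_id) or codec_id not in FFMPEG_ENCODERS:
        raise ValueError(f"codec 0x{codec_id:04x} is not an ffmpeg-backed format")
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        # Input: headerless raw video, so geometry and rate must be stated.
        "-f", "rawvideo",
        "-pixel_format", pixel_format,
        "-video_size", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "pipe:0",
        "-c:v", FFMPEG_ENCODERS[codec_id],
        # Output container is an assumption; Matroska can be written to a
        # non-seekable pipe and can carry all four codecs above.
        "-f", "matroska", "pipe:1",
    ]


def encode_frames(cmd, frames):
    """Feed raw frames to ffmpeg and return the encoded bytes.

    stdout is drained on a separate thread so that ffmpeg never blocks on a
    full output pipe while this function is still writing input.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    chunks = []
    reader = threading.Thread(target=lambda: chunks.append(proc.stdout.read()))
    reader.start()
    for frame in frames:
        proc.stdin.write(frame)
    proc.stdin.close()  # EOF tells ffmpeg to flush and exit
    reader.join()
    if proc.wait() != 0:
        raise RuntimeError("ffmpeg exited with an error")
    return b"".join(chunks)
```

For example, `build_ffmpeg_cmd(0x0102, 1920, 1080, 30)` yields a command that encodes 1920x1080 BGRA frames to FFV1. Because the encoder lives in its own process, a crash surfaces as a non-zero exit status that the caller can handle by restarting ffmpeg, which matches the isolation argument made for the completeness path.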