feat: xorg text overlays, font atlas generator, v4l2_view_cli

- tools/gen_font_atlas: Python/Pillow build tool; skyline packs DejaVu Sans glyphs 32-255 into a grayscale atlas, emits build/gen/font_atlas.h with pixel data and Font_Glyph[256] metrics table
- xorg: bitmap font atlas text overlay rendering (GL_R8 atlas texture, alpha-blended glyph quads, dark background rect per overlay)
- xorg: add xorg_viewer_set_overlay_text / clear_overlays API
- xorg: add xorg_viewer_handle_events for streaming use (events only, no redundant render)
- xorg_cli: show today's date as white text overlay
- v4l2_view_cli: new tool; V4L2 capture with format auto-selection (highest FPS, then largest resolution), MJPEG/YUYV, measured FPS overlay
- docs: update README, planning, architecture to reflect current status

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -418,7 +418,7 @@ The initial implementation uses **GLFW** for window and input management and **O
GLFW handles window creation, the event loop, resize, and input callbacks — it also supports Vulkan surface creation using the same API, which makes a future renderer swap straightforward. Input events (keyboard, mouse) are normalised by GLFW before being encoded as protocol messages.
The OpenGL renderer:
1. For **MJPEG**: calls `tjDecompressToYUV2` (libjpeg-turbo) to decompress directly to planar YUV — no CPU-side color conversion. JPEG stores YCbCr internally so this is the minimal decode path: Huffman + DCT output lands directly in YUV planes.
1. For **MJPEG**: calls `tjDecompressToYUVPlanes` (libjpeg-turbo) to decompress directly to planar YUV — no CPU-side color conversion. JPEG stores YCbCr internally so this is the minimal decode path: Huffman + DCT output lands directly in YUV planes.
2. Uploads Y, Cb, Cr as separate `GL_RED` textures (chroma at half resolution for 4:2:0 / 4:2:2 as delivered by most V4L2 cameras).
3. Fragment shader samples the three planes and applies the BT.601 matrix to produce RGB — a few lines of GLSL.
4. Scaling and filtering happen in the same shader pass.
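
As a CPU-side reference for step 3, the conversion the fragment shader performs per pixel can be sketched in C. This is a minimal sketch that assumes the full-range (JFIF) form of BT.601, which is what MJPEG streams normally carry; the document does not say which range the shader uses, and the names `clamp_u8` and `ycbcr_to_rgb_bt601` are hypothetical.

```c
#include <stdint.h>

/* Clamp a float to [0, 255] and round to the nearest integer. */
static uint8_t clamp_u8(float v)
{
    if (v < 0.0f)   return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* Full-range BT.601 YCbCr -> RGB (JFIF convention, assumed here).
 * Chroma is stored with a +128 bias, so it is re-centred first. */
static void ycbcr_to_rgb_bt601(uint8_t y, uint8_t cb, uint8_t cr,
                               uint8_t rgb[3])
{
    float yf = (float)y;
    float u  = (float)cb - 128.0f;
    float v  = (float)cr - 128.0f;

    rgb[0] = clamp_u8(yf + 1.402f * v);                    /* R */
    rgb[1] = clamp_u8(yf - 0.344136f * u - 0.714136f * v); /* G */
    rgb[2] = clamp_u8(yf + 1.772f * u);                    /* B */
}
```

A GLSL version would apply the same three rows as a matrix multiply on normalised samples; a helper like this is mainly useful for checking shader output against known pixel values.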
@@ -428,17 +428,19 @@ For **raw pixel formats** (BGRA, YUV planar from the wire): uploaded directly wi
This keeps CPU load minimal — the only CPU work for MJPEG is Huffman decode and DCT, which libjpeg-turbo runs with SIMD. All color conversion and scaling is on the GPU.
#### Text overlays (future)
#### Text overlays
Two tiers are planned, implemented in order:
Two tiers, implemented in order:
**Tier 1 — bitmap font atlas (initial)**
**Tier 1 — bitmap font atlas (done)**
A build-time script (Python Pillow) renders glyphs from a TTF font into a packed PNG atlas and emits a metadata file (JSON or generated C header) with per-glyph UV rects and advance widths. At runtime the atlas is uploaded as a `GL_RGBA` texture and each character is rendered as a small quad, alpha-blended over the frame. Simple skyline packing keeps the atlas compact.
`tools/gen_font_atlas/gen_font_atlas.py` (Python/Pillow) renders glyphs 32–255 from DejaVu Sans at 16pt into a packed grayscale atlas using a skyline bin packer and emits `build/gen/font_atlas.h` — a C header with the pixel data as a `static const uint8_t` array and a `Font_Glyph[256]` metrics table indexed by codepoint.
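
To illustrate the skyline idea, here is a minimal sketch in C (the actual generator is Python and its packer is not shown in this diff, so this is not its implementation). It uses the simplest form of the technique: one height value per atlas column, with each rectangle placed at the leftmost position where its top edge ends up lowest. `ATLAS_W`, `Skyline`, `skyline_init` and `skyline_place` are hypothetical names.

```c
#include <stdint.h>

#define ATLAS_W 256 /* hypothetical atlas width in pixels */

typedef struct {
    int height[ATLAS_W]; /* current skyline height of each column */
    int atlas_h;         /* total atlas height in pixels */
} Skyline;

static void skyline_init(Skyline *s, int atlas_h)
{
    for (int i = 0; i < ATLAS_W; i++) s->height[i] = 0;
    s->atlas_h = atlas_h;
}

/* Place a w x h rectangle. Returns 1 and writes its top-left corner to
 * (*out_x, *out_y), or returns 0 if it does not fit. */
static int skyline_place(Skyline *s, int w, int h, int *out_x, int *out_y)
{
    if (w <= 0 || h <= 0 || w > ATLAS_W) return 0;

    int best_x = -1, best_y = 0;
    for (int x = 0; x + w <= ATLAS_W; x++) {
        /* The rectangle rests on the tallest column it spans. */
        int y = 0;
        for (int i = 0; i < w; i++)
            if (s->height[x + i] > y) y = s->height[x + i];
        /* Keep the lowest resting position; ties go to the leftmost x. */
        if (best_x < 0 || y < best_y) { best_x = x; best_y = y; }
    }
    if (best_y + h > s->atlas_h) return 0;

    /* Raise the skyline under the placed rectangle. */
    for (int i = 0; i < w; i++) s->height[best_x + i] = best_y + h;
    *out_x = best_x;
    *out_y = best_y;
    return 1;
}
```

Sorting glyphs by descending height before packing usually gives a tighter result; a production packer would also track the skyline as segments rather than per-column heights.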
The generator lives in `tools/gen_font_atlas/` and runs as part of `make build`. Sufficient for ASCII overlays: timestamps, stream labels, debug info.
At runtime the atlas is uploaded as a `GL_R8` texture. Each overlay is rendered as a batch of alpha-blended glyph quads preceded by a semi-transparent dark background rect (using a separate minimal screen-space rect shader driven by `gl_VertexID`). The public API is `xorg_viewer_set_overlay_text(v, idx, x, y, text, r, g, b)` and `xorg_viewer_clear_overlays(v)`. Up to 8 independent overlays are supported.
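
The glyph-quad step can be sketched as a small layout function that walks the string and emits one screen-space rectangle plus atlas UVs per visible character. This is a sketch under stated assumptions: the field layout of `Font_Glyph` below is hypothetical (the real struct in `font_atlas.h` is not shown in this diff), as are `Glyph_Quad` and `layout_text`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real Font_Glyph in font_atlas.h may differ. */
typedef struct {
    uint16_t x, y;      /* top-left of the glyph bitmap in the atlas */
    uint16_t w, h;      /* glyph bitmap size in pixels */
    int16_t  bearing_x; /* pen position to left edge of the bitmap */
    int16_t  bearing_y; /* baseline up to top edge of the bitmap */
    uint16_t advance;   /* horizontal pen advance in pixels */
} Font_Glyph;

typedef struct {
    float x0, y0, x1, y1; /* screen rectangle, y grows downward */
    float u0, v0, u1, v1; /* atlas texture coordinates in [0, 1] */
} Glyph_Quad;

/* Lay out text starting at (pen_x, baseline_y). Returns the number of
 * quads written; glyphs with an empty bitmap only advance the pen. */
static size_t layout_text(const Font_Glyph glyphs[256],
                          int atlas_w, int atlas_h, const char *text,
                          float pen_x, float baseline_y,
                          Glyph_Quad *out, size_t max_quads)
{
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)text; *p; p++) {
        const Font_Glyph *g = &glyphs[*p]; /* table is indexed by codepoint */
        if (g->w > 0 && g->h > 0 && n < max_quads) {
            Glyph_Quad *q = &out[n++];
            q->x0 = pen_x + (float)g->bearing_x;
            q->y0 = baseline_y - (float)g->bearing_y;
            q->x1 = q->x0 + (float)g->w;
            q->y1 = q->y0 + (float)g->h;
            q->u0 = (float)g->x / (float)atlas_w;
            q->v0 = (float)g->y / (float)atlas_h;
            q->u1 = (float)(g->x + g->w) / (float)atlas_w;
            q->v1 = (float)(g->y + g->h) / (float)atlas_h;
        }
        pen_x += (float)g->advance;
    }
    return n;
}
```

Because the atlas is single-channel, a fragment shader would typically read the sampled red value as coverage and multiply it into the alpha of the overlay colour.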
**Tier 2 — HarfBuzz + FreeType (later)**
The generator runs automatically as a `make` dependency before compiling `xorg.c`. The Pillow build tool is the only Python dependency; there are no runtime font deps.
**Tier 2 — HarfBuzz + FreeType (future)**
A proper runtime font stack for full typography: correct shaping, kerning, ligatures, bidirectional text, non-Latin scripts. Added as a feature flag with its own runtime deps alongside the blit path.
@@ -446,20 +448,27 @@ When Tier 2 is implemented, the Pillow build dependency may be replaced by a pur
#### Render loop
The viewer is driven by incoming frames rather than a fixed-rate loop. The intended pattern for callers:
The viewer is driven by incoming frames rather than a fixed-rate loop. Two polling functions are provided depending on the use case:
**Static image / test tool** — `xorg_viewer_poll(v)` processes events then re-renders from existing textures:
```c
while (xorg_viewer_poll(v)) {
if (new_frame_available()) {
xorg_viewer_push_yuv420(v, ...); /* upload + render */
while (xorg_viewer_poll(v)) { /* wait for close */ }
```
**Live stream** — the push functions (`push_yuv420`, `push_mjpeg`, etc.) already upload and render. Use `xorg_viewer_handle_events(v)` to process window events without an extra render:
```c
while (1) {
/* block on V4L2/network fd until frame or timeout */
if (frame_available) {
xorg_viewer_push_mjpeg(v, data, size); /* upload + render */
}
/* no new frame → no redundant GPU work */
if (!xorg_viewer_handle_events(v)) { break; }
}
```
`xorg_viewer_poll` calls `glfwPollEvents` which dispatches input and resize events. A `framebuffer_size_callback` registered on the window calls `render()` synchronously during the resize, so the image tracks the window edge without a one-frame lag. This avoids both a busy render loop and the latency of waiting for the next poll iteration.
For a static image (test tool, paused stream), `glfwWaitEventsTimeout(interval)` is a better substitute for `glfwPollEvents` — it sleeps until an event arrives or the timeout expires, eliminating idle CPU usage.
A `framebuffer_size_callback` registered on the window calls `render()` synchronously during resize, so the image tracks the window edge without a one-frame lag.
Threading note: the GL context must be used from the thread that created it. In the video node, incoming frames arrive on a network receive thread. A frame queue between the receive thread and the render thread (which owns the GL context) is the correct model — the render thread drains the queue each poll iteration rather than having the network thread call push functions directly.
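
One way to realise the frame queue described in the threading note is a single-producer, single-consumer ring buffer built on C11 atomics. The sketch below assumes exactly one receive thread and one render thread; `Frame`, `Frame_Queue`, `FQ_CAP` and the `fq_*` functions are hypothetical names, and buffer ownership (who frees `data`) is left out.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define FQ_CAP 4 /* hypothetical queue depth */

typedef struct {
    const uint8_t *data; /* encoded frame bytes, owned by the caller */
    size_t         size;
} Frame;

typedef struct {
    Frame         slots[FQ_CAP];
    atomic_size_t head; /* frames pushed so far; written by the producer */
    atomic_size_t tail; /* frames popped so far; written by the consumer */
} Frame_Queue;

static void fq_init(Frame_Queue *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* Receive thread: returns 0 (frame dropped) if the queue is full. */
static int fq_push(Frame_Queue *q, Frame f)
{
    size_t h = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t t = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (h - t == FQ_CAP) return 0;
    q->slots[h % FQ_CAP] = f;
    /* Release so the consumer sees the slot contents before the count. */
    atomic_store_explicit(&q->head, h + 1, memory_order_release);
    return 1;
}

/* Render thread: returns 0 if no frame is waiting. */
static int fq_pop(Frame_Queue *q, Frame *out)
{
    size_t t = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t h = atomic_load_explicit(&q->head, memory_order_acquire);
    if (t == h) return 0;
    *out = q->slots[t % FQ_CAP];
    /* Release so the producer may safely reuse the slot afterwards. */
    atomic_store_explicit(&q->tail, t + 1, memory_order_release);
    return 1;
}
```

With this shape, the render thread would call `fq_pop` in a loop on each poll iteration and pass every frame to a push function such as `xorg_viewer_push_mjpeg(v, f.data, f.size)`, so all GL calls stay on the thread that owns the context. For live video it may be preferable to drain the queue and render only the newest frame.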