2af47373c4270c86aea297bc3fc25d1847d1805f
Replaces the old stdin/stdout transcription-only server. Now handles the full pipeline in Python:

- Launches parec or arecord for mic capture
- Runs Silero VAD (via silero-vad, already a faster-whisper dep, so no sherpa-onnx needed)
- Pre-roll ring buffer (0.2s) prepended to each segment for context
- Transcribes with faster-whisper in a separate thread (GPU not blocking VAD)
- Emits JSON line events to stdout: ready, vad_start, vad_end, transcript, error

Event protocol is designed to map directly to WebSocket subscriptions later.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
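
To illustrate two pieces of the pipeline described in the commit message above, here is a minimal sketch of a pre-roll ring buffer and a JSON line event emitter. It is not the repository's actual code. The names `PreRollBuffer` and `emit`, the `"type"` key, the 16 kHz sample rate, and the 512-sample frame size are all assumptions made for the example. Only the 0.2 s pre-roll length and the event names come from the commit message.

```python
import json
import sys
from collections import deque

SAMPLE_RATE = 16000     # assumed capture rate in Hz (not stated in the source)
FRAME_SAMPLES = 512     # assumed number of samples per VAD frame
PRE_ROLL_SECONDS = 0.2  # pre-roll length taken from the commit message


class PreRollBuffer:
    """Hypothetical ring buffer that keeps only the most recent audio frames."""

    def __init__(self, seconds=PRE_ROLL_SECONDS,
                 sample_rate=SAMPLE_RATE, frame_samples=FRAME_SAMPLES):
        # Number of whole frames that roughly cover the pre-roll window.
        max_frames = max(1, round(seconds * sample_rate / frame_samples))
        # A deque with maxlen silently discards the oldest frame when full.
        self._frames = deque(maxlen=max_frames)

    def push(self, frame: bytes) -> None:
        """Store a frame captured while no speech is active."""
        self._frames.append(frame)

    def drain(self) -> bytes:
        """Return the buffered audio in order and empty the buffer."""
        data = b"".join(self._frames)
        self._frames.clear()
        return data


def emit(event_type, stream=None, **fields):
    """Write one event as a single JSON line and flush immediately.

    The {"type": ...} layout is an assumed schema, not the real protocol.
    """
    if stream is None:
        stream = sys.stdout
    line = json.dumps({"type": event_type, **fields})
    stream.write(line + "\n")
    stream.flush()  # let a consumer reading the pipe see the event right away
    return line
```

A capture loop could call `push()` on every frame while the VAD reports silence, then prepend `drain()` to the segment at `vad_start` so the first syllable is not clipped. Flushing after every line matters when stdout is a pipe, because Python otherwise block-buffers the output.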
Voice to text interface
Overview
This project aims to provide voice-to-text transcription using faster-whisper as the backend.
Origin
This project started as a vibe-coded experiment, but this version is somewhat more hands-on.
Setup
Set up a venv for Python
There will be two different setups, depending on whether or not you want to build ctranslate2 locally. Both are still to be documented.
Description
Voice to text server using [faster-whisper](https://github.com/SYSTRAN/faster-whisper)
Languages
Python 56.7%
Shell 43.3%