20873be786240d073e14394cae2940650dfb5a93
- README explaining experimental/transparency purpose
- faster-whisper STT backend (fw-stt.mjs, faster-whisper-server.py, install-faster-whisper.sh)
- Bug fixes: Buffer alignment in on_audio, --debug-waveform URL parsing, silent fetch errors, instant dispatch timer leak
- Global uncaughtException/unhandledRejection handlers in query-demo.mjs
- Design docs: CHANGELOG, COMMAND-DISPATCH, INTERFACE-THEORY, VOICE-POLICY
- Systemd service unit templates

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Voice Pipeline Experiment
This repository is an active experiment. It is published for transparency and reference — not as a finished or production-ready project. Expect rough edges, dead ends, work-in-progress notes, and design docs that describe things not yet built.
What this is
A local voice interface for Claude Code: you speak a query, and the pipeline transcribes it, classifies its intent, and dispatches it to Claude. The stack runs entirely on local hardware, with no cloud STT/TTS.
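The transcribe, classify, dispatch flow described above can be sketched as a small function pipeline. This is an illustration only: `createPipeline`, `handleUtterance`, and the three stage names are hypothetical and are not taken from `query-demo.mjs`, and the stub stages stand in for the real STT backend, intent classifier, and Claude dispatch.

```javascript
// Hypothetical sketch of the utterance flow: audio -> text -> intent -> dispatch.
// Stage names and return shapes are assumptions, not the project's actual API.
// Stages are injected so the flow can be exercised without a mic or a model.
function createPipeline({ transcribe, classifyIntent, dispatch }) {
  return async function handleUtterance(audio) {
    const text = (await transcribe(audio)).trim();
    // Empty transcripts (noise, silence) are dropped before classification.
    if (text === '') return { status: 'ignored', text };
    const intent = await classifyIntent(text);
    const result = await dispatch(intent, text);
    return { status: 'dispatched', text, intent, result };
  };
}

// Stub stages standing in for local STT, an intent classifier, and Claude.
const handleUtterance = createPipeline({
  transcribe: async (audio) => audio.fakeTranscript,
  classifyIntent: async (text) => (/^cancel\b/i.test(text) ? 'command' : 'query'),
  dispatch: async (intent, text) => `[${intent}] ${text}`,
});
```

Injecting the stages keeps the control flow testable on a machine without a GPU or microphone; a real setup would presumably pass in wrappers around the STT and dispatch modules listed below.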
Core components:
| File | Role |
|---|---|
| `query-demo.mjs` | Main entry point: mic → STT → query state machine → dispatch |
| `lib/pending-query.mjs` | Query state machine: wake word, silence timer, send/cancel/pause |
| `lib/stt.mjs` | Silero VAD + Whisper STT (sherpa-onnx backend) |
| `lib/fw-stt.mjs` | faster-whisper STT backend with word-level timestamps |
| `tts-server.mjs` | TTS HTTP server (Chatterbox model, voice switching) |
| `lib/tts-client.mjs` | HTTP client for the TTS server |
| `voices.yaml` | Named voice configuration |
| `faster-whisper-server.py` | Python subprocess for faster-whisper transcription |
| `install-faster-whisper.sh` | Builds ctranslate2 from source for CUDA 13 compatibility |
| `download-models.sh` | Downloads Whisper and VAD models |
| `query-demo.service` / `tts-server.service` | Systemd unit templates |
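To make the "wake word, silence timer, send/cancel/pause" description of the query state machine more concrete, here is a minimal sketch of how such a machine could work. Everything in it is an assumption made for illustration: the class name `PendingQuery`, the wake word `claude`, the spoken command words, the 1500 ms silence timeout, and the injectable `setTimer`/`clearTimer` hooks. It does not reproduce the actual interface of `lib/pending-query.mjs`.

```javascript
// Hypothetical wake-word query state machine. Names, command words, and
// timings are assumptions, not the API of lib/pending-query.mjs.
class PendingQuery {
  constructor({
    wakeWord = 'claude',
    silenceMs = 1500,
    onSend,
    setTimer = (fn, ms) => setTimeout(fn, ms),
    clearTimer = (id) => clearTimeout(id),
  }) {
    this.wakeWord = wakeWord.toLowerCase();
    this.silenceMs = silenceMs;
    this.onSend = onSend;
    this.setTimer = setTimer;
    this.clearTimer = clearTimer;
    this.state = 'idle'; // 'idle' | 'collecting' | 'paused'
    this.parts = [];
    this.timer = null;
  }

  // Feed one transcribed fragment from the STT stage.
  feed(fragment) {
    const text = fragment.trim();
    const lower = text.toLowerCase();
    if (this.state === 'idle') {
      const at = lower.indexOf(this.wakeWord);
      if (at === -1) return; // ignore speech until the wake word is heard
      this.state = 'collecting';
      // Keep whatever followed the wake word, minus leading punctuation.
      const rest = text.slice(at + this.wakeWord.length).replace(/^[\s,.:!?]+/, '');
      if (rest) this.parts.push(rest);
      this.armTimer();
      return;
    }
    if (lower === 'cancel') return this.reset();
    if (lower === 'send') return this.send();
    if (lower === 'pause') { this.state = 'paused'; this.disarmTimer(); return; }
    if (lower === 'resume') { this.state = 'collecting'; this.armTimer(); return; }
    this.parts.push(text);
    // While paused, text accumulates but silence never triggers a send.
    if (this.state === 'collecting') this.armTimer();
  }

  send() {
    const query = this.parts.join(' ');
    this.reset();
    if (query) this.onSend(query);
  }

  reset() {
    this.disarmTimer();
    this.parts = [];
    this.state = 'idle';
  }

  // Always clear the previous timer before arming a new one so timers never leak.
  armTimer() {
    this.disarmTimer();
    this.timer = this.setTimer(() => this.send(), this.silenceMs);
  }

  disarmTimer() {
    if (this.timer !== null) {
      this.clearTimer(this.timer);
      this.timer = null;
    }
  }
}
```

With the default timers, `new PendingQuery({ onSend: console.log })` followed by `feed('Claude, list the files')` would log the query once 1500 ms pass without another fragment. The timer hooks are injectable only so the behaviour can be driven deterministically in tests.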
Status
Working but experimental. The design docs (CLEANUP-PLAN.md, COMMAND-DISPATCH.md, etc.) describe architectural directions not yet implemented. The project will eventually split into separate focused repositories.
Offshoots
Links to derived, cleaner projects will be added here as they become ready.
Requirements
- Node.js (ESM)
- Python 3 with a venv (`setup-venv.sh` or `install-faster-whisper.sh`)
- CUDA GPU (for faster-whisper backend)
- PulseAudio or ALSA for mic capture
- `sherpa-onnx-node` npm package
- Chatterbox TTS model
Languages
- JavaScript: 78.8%
- Python: 14.4%
- Shell: 6.8%