# Voice Pipeline Experiment

This repository is an active experiment. It is published for **transparency and reference**, not as a finished or production-ready project. Expect rough edges, dead ends, work-in-progress notes, and design docs that describe things not yet built.

## What this is

A local voice interface for [Claude Code](https://github.com/anthropics/claude-code): speak a query, it transcribes, classifies intent, and dispatches to Claude. The stack runs entirely on local hardware, with no cloud STT/TTS.

Core components:

| File | Role |
|------|------|
| `query-demo.mjs` | Main entry point: mic → STT → query state machine → dispatch |
| `lib/pending-query.mjs` | Query state machine: wake word, silence timer, send/cancel/pause |
| `lib/stt.mjs` | Silero VAD + Whisper STT (sherpa-onnx backend) |
| `lib/fw-stt.mjs` | faster-whisper STT backend with word-level timestamps |
| `tts-server.mjs` | TTS HTTP server (Chatterbox model, voice switching) |
| `lib/tts-client.mjs` | HTTP client for the TTS server |
| `voices.yaml` | Named voice configuration |
| `faster-whisper-server.py` | Python subprocess for faster-whisper transcription |
| `install-faster-whisper.sh` | Builds ctranslate2 from source for CUDA 13 compatibility |
| `download-models.sh` | Downloads Whisper and VAD models |
| `query-demo.service` / `tts-server.service` | Systemd unit templates |

## Status

Working but experimental. The design docs (`CLEANUP-PLAN.md`, `COMMAND-DISPATCH.md`, etc.) describe architectural directions not yet implemented. The project will eventually split into separate focused repositories.

## Offshoots

Links to derived, cleaner projects will be added here as they become ready.

## Requirements

- Node.js (ESM)
- Python 3 with a venv (`setup-venv.sh` or `install-faster-whisper.sh`)
- CUDA GPU (for faster-whisper backend)
- PulseAudio or ALSA for mic capture
- `sherpa-onnx-node` npm package
- Chatterbox TTS model
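
## Illustrative sketch: query state machine

The component table describes `lib/pending-query.mjs` as a query state machine with a wake word, a silence timer, and send/cancel/pause. The sketch below is one way such a machine could be structured; it is not the code in this repository. The class name `PendingQuery`, its methods, the spoken keywords (`send`, `cancel`, `pause`, `resume`), and the default wake word and timeout are all assumptions made for illustration. Timestamps are passed in explicitly so the logic can be exercised without a real microphone or timers.

```javascript
// Hypothetical sketch only: names, keywords, and defaults are assumptions,
// not the actual API of lib/pending-query.mjs.
class PendingQuery {
  constructor({ wakeWord = "computer", silenceMs = 1500 } = {}) {
    this.wakeWord = wakeWord.toLowerCase();
    this.silenceMs = silenceMs;
    this.reset();
  }

  // Drop any buffered text and wait for the wake word again.
  reset() {
    this.state = "idle"; // "idle" | "listening" | "paused"
    this.parts = [];
    this.lastSpeechAt = null;
  }

  // Feed one transcribed utterance plus its arrival time in milliseconds.
  // Returns { action: "send", text }, { action: "cancel" }, or null.
  onTranscript(text, nowMs) {
    let phrase = text.trim();
    if (this.state === "idle") {
      const idx = phrase.toLowerCase().indexOf(this.wakeWord);
      if (idx === -1) return null; // speech without the wake word is ignored
      this.state = "listening";
      // Keep only what follows the wake word, minus leading punctuation.
      phrase = phrase.slice(idx + this.wakeWord.length).replace(/^[\s,.:!?]+/, "");
    }
    this.lastSpeechAt = nowMs; // any speech restarts the silence window

    const command = phrase.toLowerCase().replace(/[.!?]+$/, "");
    if (command === "cancel") { this.reset(); return { action: "cancel" }; }
    if (command === "pause") { this.state = "paused"; return null; }
    if (command === "resume") { this.state = "listening"; return null; }
    if (command === "send") return this.flush();

    if (phrase) this.parts.push(phrase);
    return null;
  }

  // Call periodically; sends the buffered query once silence has lasted long enough.
  // While paused, the silence timer never fires.
  tick(nowMs) {
    if (this.state !== "listening" || this.parts.length === 0) return null;
    if (nowMs - this.lastSpeechAt < this.silenceMs) return null;
    return this.flush();
  }

  // Emit the buffered text (if any) and return to idle.
  flush() {
    const text = this.parts.join(" ");
    this.reset();
    return text ? { action: "send", text } : null;
  }
}

const q = new PendingQuery({ wakeWord: "computer", silenceMs: 1500 });
q.onTranscript("Computer, list the failing tests", 0);
console.log(q.tick(1000)); // null: still inside the silence window
console.log(q.tick(1600)); // { action: 'send', text: 'list the failing tests' }
```

Passing the clock in as an argument, rather than calling `setTimeout` inside the class, keeps the sketch deterministic; a real pipeline would presumably drive `tick` from a timer or from VAD events.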