mikael-lovqvists-claude-agent/claude-voice-experiment

mikael-lovqvists-claude-agent 20873be786 Add README, faster-whisper backend, and session fixes

- README explaining experimental/transparency purpose
- faster-whisper STT backend (fw-stt.mjs, faster-whisper-server.py, install-faster-whisper.sh)
- Bug fixes: Buffer alignment in on_audio, --debug-waveform URL parsing, silent fetch errors, instant dispatch timer leak
- Global uncaughtException/unhandledRejection handlers in query-demo.mjs
- Design docs: CHANGELOG, COMMAND-DISPATCH, INTERFACE-THEORY, VOICE-POLICY
- Systemd service unit templates

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-06-07 06:39:14 +00:00
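
The commit message mentions a "Buffer alignment in on_audio" fix without showing it. One common form of that problem in Node.js, offered here only as an illustrative sketch, is that an audio chunk can be a view into a larger allocation at an odd `byteOffset`, and constructing an `Int16Array` directly over it then throws a `RangeError`. The helper name `pcm16ToFloat32` and the signed 16-bit PCM input format are assumptions, not details taken from the repository.

```javascript
// Hypothetical helper; assumes chunks hold signed 16-bit PCM in host byte order
// (little-endian on typical x86 and ARM machines).
function pcm16ToFloat32(chunk) {
  // Drop a trailing odd byte, if any, so the sample count is whole.
  const byteLength = chunk.length - (chunk.length % 2);
  let samples;
  if (chunk.byteOffset % 2 === 0) {
    // Aligned: view the existing memory without copying.
    samples = new Int16Array(chunk.buffer, chunk.byteOffset, byteLength / 2);
  } else {
    // Misaligned: copy into a fresh ArrayBuffer, which always starts at offset 0.
    const copy = new Uint8Array(byteLength);
    copy.set(chunk.subarray(0, byteLength));
    samples = new Int16Array(copy.buffer);
  }
  // Scale to the [-1, 1) float range that most STT front ends expect.
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) out[i] = samples[i] / 32768;
  return out;
}
```

Checking `byteOffset` first keeps the common aligned case copy-free and only pays for a copy when the view would otherwise be invalid.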

Name	Last commit	Date
demos	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
lib	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
.gitignore	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
acting-demo-bark.mjs	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
acting-demo-chatterbox.mjs	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
acting-demo.mjs	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
bark-server.py	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
CHANGELOG.md	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
chatterbox-server.py	Add Pending_Query class and voice interaction improvements	2026-05-31 03:59:11 +00:00
CLEANUP-PLAN.md	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
COMMAND-DISPATCH.md	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
demo-bark.mjs	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
demo-kokoro.mjs	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
download-models.sh	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
example.sh	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
faster-whisper-server.py	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
install-faster-whisper.sh	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
INTERFACE-THEORY.md	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
listen.mjs	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
LLM-ROUTING.md	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
NOTES.md	Note Claude Code session name; update voices, presentation, todo	2026-05-30 06:49:11 +00:00
package.json	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
PRESENTATION.md	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
query-demo.mjs	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
query-demo.service	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
README.md	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
requirements.txt	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
setup-venv.sh	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
speak-as.mjs	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
SPEAKER-DIARIZATION.md	Initial commit — voice pipeline experiment	2026-05-30 04:48:54 +00:00
test-chime.mjs	Add Pending_Query class and voice interaction improvements	2026-05-31 03:59:11 +00:00
TODO.md	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
tts-server.mjs	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
tts-server.service	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
voice-buddy.mjs	Add Pending_Query class and voice interaction improvements	2026-05-31 03:59:11 +00:00
VOICE-POLICY.md	Add README, faster-whisper backend, and session fixes	2026-06-07 06:39:14 +00:00
VOICES.md	Note Claude Code session name; update voices, presentation, todo	2026-05-30 06:49:11 +00:00
voices.yaml	Add Pending_Query class and voice interaction improvements	2026-05-31 03:59:11 +00:00
WORKFLOWS.md	Add WORKFLOWS.md — use cases and workflow descriptions	2026-05-30 05:32:32 +00:00

README.md

Voice Pipeline Experiment

This repository is an active experiment. It is published for transparency and reference — not as a finished or production-ready project. Expect rough edges, dead ends, work-in-progress notes, and design docs that describe things not yet built.

What this is

A local voice interface for Claude Code: speak a query, and the pipeline transcribes it, classifies its intent, and dispatches it to Claude. Speech recognition and synthesis run entirely on local hardware; no cloud STT/TTS is used.
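
The flow described above (transcribe, classify intent, dispatch) could be wired together roughly as follows. This is a sketch under stated assumptions rather than the repository's code: `handleUtterance`, `classifyByKeyword`, the `handlers` map, and the keyword rule are all hypothetical, and the stages are injected so that any STT backend or dispatcher can be substituted.

```javascript
// Hypothetical wiring of the three stages; none of these names come from the repo.
async function handleUtterance(audio, { transcribe, classify, handlers }) {
  const text = (await transcribe(audio)).trim();
  if (text === '') return { intent: 'ignore', text }; // nothing was said
  const intent = classify(text);
  const handler = handlers[intent];
  if (!handler) throw new Error(`no handler for intent "${intent}"`);
  return { intent, text, result: await handler(text) };
}

// Toy classifier (an assumption): a transcript that is exactly one control word
// is a command; everything else is treated as a query for Claude.
const CONTROL_WORDS = new Set(['send', 'cancel', 'pause']);
function classifyByKeyword(text) {
  const word = text.toLowerCase().replace(/[^a-z ]/g, '').trim();
  return CONTROL_WORDS.has(word) ? 'command' : 'query';
}
```

Passing the stages in as functions keeps the control flow testable without a microphone, a GPU, or a network connection.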

Core components:

File	Role
`query-demo.mjs`	Main entry point — mic → STT → query state machine → dispatch
`lib/pending-query.mjs`	Query state machine: wake word, silence timer, send/cancel/pause
`lib/stt.mjs`	Silero VAD + Whisper STT (sherpa-onnx backend)
`lib/fw-stt.mjs`	faster-whisper STT backend with word-level timestamps
`tts-server.mjs`	TTS HTTP server (Chatterbox model, voice switching)
`lib/tts-client.mjs`	HTTP client for the TTS server
`voices.yaml`	Named voice configuration
`faster-whisper-server.py`	Python subprocess for faster-whisper transcription
`install-faster-whisper.sh`	Builds ctranslate2 from source for CUDA 13 compatibility
`download-models.sh`	Downloads Whisper and VAD models
`query-demo.service` / `tts-server.service`	Systemd unit templates
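
The table describes `lib/pending-query.mjs` as a query state machine with a wake word, a silence timer, and send/cancel/pause. That file is not reproduced here, so the following is only a minimal sketch of how such a machine might behave. The class name `PendingQuerySketch`, the wake word `computer`, the 1500 ms timeout, and the exact transitions are all assumptions.

```javascript
// Hypothetical sketch; not the repo's Pending_Query implementation.
class PendingQuerySketch {
  constructor({ wakeWord = 'computer', silenceMs = 1500, onSend }) {
    this.wakeWord = wakeWord;
    this.silenceMs = silenceMs;
    this.onSend = onSend;
    this.state = 'idle'; // idle -> listening <-> paused -> idle
    this.parts = [];
    this.timer = null;
  }

  // Feed one transcribed phrase from the STT stage.
  feed(phrase) {
    const text = phrase.trim();
    const lower = text.toLowerCase();
    if (this.state === 'idle') {
      const at = lower.indexOf(this.wakeWord);
      if (at === -1) return; // ignore speech that lacks the wake word
      this.state = 'listening';
      const rest = text.slice(at + this.wakeWord.length).replace(/^[\s,.!?]+/, '');
      if (rest) this.parts.push(rest);
    } else if (lower === 'cancel') {
      return this.reset();
    } else if (lower === 'send') {
      return this.send();
    } else if (lower === 'pause') {
      this.state = 'paused';
    } else {
      this.state = 'listening'; // any other speech resumes a paused query
      this.parts.push(text);
    }
    this.armTimer();
  }

  // Restart the silence timer; it only runs while listening.
  armTimer() {
    clearTimeout(this.timer);
    this.timer = null;
    if (this.state === 'listening') {
      this.timer = setTimeout(() => this.send(), this.silenceMs);
    }
  }

  send() {
    const query = this.parts.join(' ');
    this.reset();
    if (query) this.onSend(query);
  }

  reset() {
    clearTimeout(this.timer);
    this.timer = null;
    this.parts = [];
    this.state = 'idle';
  }
}
```

Every path clears the pending timer before arming a new one or returning to idle, so a stale timeout cannot fire after an explicit send or cancel.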

Status

Working but experimental. The design docs (CLEANUP-PLAN.md, COMMAND-DISPATCH.md, etc.) describe architectural directions not yet implemented. The project will eventually split into separate focused repositories.

Offshoots

Links to derived, cleaner projects will be added here as they become ready.

Requirements

Node.js (ESM)
Python 3 with a venv (setup-venv.sh or install-faster-whisper.sh)
CUDA GPU (for faster-whisper backend)
PulseAudio or ALSA for mic capture
sherpa-onnx-node npm package
Chatterbox TTS model