claude-voice-experiment

mikael-lovqvists-claude-agent/claude-voice-experiment

Author	SHA1	Message	Date
mikael-lovqvists-claude-agent	20873be786	Add README, faster-whisper backend, and session fixes - README explaining experimental/transparency purpose - faster-whisper STT backend (fw-stt.mjs, faster-whisper-server.py, install-faster-whisper.sh) - Bug fixes: Buffer alignment in on_audio, --debug-waveform URL parsing, silent fetch errors, instant dispatch timer leak - Global uncaughtException/unhandledRejection handlers in query-demo.mjs - Design docs: CHANGELOG, COMMAND-DISPATCH, INTERFACE-THEORY, VOICE-POLICY - Systemd service unit templates Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 06:39:14 +00:00
mikael-lovqvists-claude-agent	a7fa2fd218	Add Pending_Query class and voice interaction improvements - lib/pending-query.mjs: new state machine for query accumulation wake word, silence timer, send/cancel/pause/resume, instant dispatch, mode toggle (always listen / stop listening), mode query - query-demo.mjs: refactored to use Pending_Query; wake word on by default with silence timer; chimes for dispatch/working/cancel/activate - tts-server.mjs: track last_speak_at, expose /activity endpoint, chime playback via Python queue (soundfile + librosa), preload on startup - chatterbox-server.py: chime and preload commands via stdin protocol - lib/chatterbox-tts.mjs: play_chime and preload_chime methods - test-chime.mjs: simple chime test script - voices.yaml: configured ready/cancel/working/dispatch chimes - CLEANUP-PLAN.md: updated with current state, command vocabulary, future plans Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 03:59:11 +00:00
mikael-lovqvists-claude-agent	4fdad055e4	Note Claude Code session name; update voices, presentation, todo Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 06:49:11 +00:00
mikael-lovqvists-claude-agent	ecbbaef6bc	Add PRESENTATION.md and update voices/todo notes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 06:40:22 +00:00
mikael-lovqvists-claude-agent	9d2789c160	Add WORKFLOWS.md — use cases and workflow descriptions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 05:32:32 +00:00
mikael-lovqvists-claude-agent	4dad948b73	Remove chimes directory fallback — YAML only Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 05:26:40 +00:00
mikael-lovqvists-claude-agent	0a16bd6da3	Chime event → file mapping in voices.yaml YAML chimes section maps event names to arbitrary file paths. Falls back to chimes/<name>.wav or .ogg if not listed. Allows custom locations and formats without renaming files. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 05:25:47 +00:00
mikael-lovqvists-claude-agent	688549a6c3	Add chime endpoint to TTS server POST /chime {name} plays chimes/<name>.wav or .ogg via pacat. Goes through the same queue as speak so playback stays ordered. chimes/ directory holds the audio files (not committed). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 05:23:18 +00:00
mikael-lovqvists-claude-agent	8604d7ea51	Debounce speech detection — require 3 consecutive loud chunks (~150ms) A single transient noise no longer resets the silence timer. Only sustained audio energy counts as speech. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 05:12:03 +00:00
mikael-lovqvists-claude-agent	9d2ffd1b0d	Fix silence timeout never firing — gate on audio amplitude on_audio was resetting the timer on every chunk including silence, so the timeout never fired. Now passes the raw chunk and checks RMS; only resets if energy is above 0.02 (speech, not ambient silence). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 05:09:20 +00:00
mikael-lovqvists-claude-agent	778672ebfe	Prompt: instruct Claude to speak acknowledgement before long tasks Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 04:51:24 +00:00
mikael-lovqvists-claude-agent	db8889aeed	Initial commit — voice pipeline experiment STT (Silero VAD + Whisper via sherpa-onnx), Chatterbox TTS HTTP server, query completeness classifier (Ollama), multi-voice demo scripts, and planning docs. Kept as reference; clean rewrite planned in separate repos. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 04:48:54 +00:00

12 Commits
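
Note on commits 9d2ffd1b0d and 8604d7ea51: the two silence-timer fixes describe a gate that resets the timer only when a chunk's RMS is above 0.02, and then only after 3 consecutive loud chunks (~150 ms). The following is a minimal sketch of that logic, not the repository's code. The actual `on_audio` handler lives in the Node scripts; the names `chunk_rms` and `SpeechGate`, and the assumption that chunks are 16-bit little-endian PCM, are hypothetical and chosen only for illustration. The threshold and chunk count are taken from the commit messages.

```python
import math
import struct

RMS_THRESHOLD = 0.02       # from commit 9d2ffd1b0d
LOUD_CHUNKS_REQUIRED = 3   # from commit 8604d7ea51 (~150 ms in total)


def chunk_rms(chunk: bytes) -> float:
    """RMS of a chunk, normalised to [0, 1].

    Assumes 16-bit little-endian PCM; the real pipeline may use another format.
    """
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", chunk[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count) / 32768.0


class SpeechGate:
    """Hypothetical helper: decides whether a chunk should reset the silence timer."""

    def __init__(self, threshold=RMS_THRESHOLD, required=LOUD_CHUNKS_REQUIRED):
        self.threshold = threshold
        self.required = required
        self.loud_run = 0  # number of consecutive loud chunks seen so far

    def on_audio(self, chunk: bytes) -> bool:
        """Return True when sustained speech is detected (timer should reset)."""
        if chunk_rms(chunk) > self.threshold:
            self.loud_run += 1
        else:
            # Any quiet chunk breaks the run, so a single transient never counts.
            self.loud_run = 0
        return self.loud_run >= self.required
```

With this shape, a single loud chunk surrounded by silence leaves the timer running, which matches the behaviour the debounce commit describes; only the third loud chunk in a row reports speech.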