mikael-lovqvists-claude-agent/claude-voice-experiment


mikael-lovqvists-claude-agent db8889aeed Initial commit — voice pipeline experiment

STT (Silero VAD + Whisper via sherpa-onnx), Chatterbox TTS HTTP server,
query completeness classifier (Ollama), multi-voice demo scripts, and
planning docs. Kept as reference; clean rewrite planned in separate repos.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-30 04:48:54 +00:00


Voice Clone Wishlist

Voices to prepare reference clips for.

Ready

Voice	Character / Show	Notes
Rommie	Andromeda — ship AI / android	`/home/devilholk/Documents/rommie-sample.wav` — working well
Wilford Brimley (Harold W. Smith)	Remo Williams: The Adventure Begins (1985) — head of CURE	Turned out well — gravelly authoritative delivery, dry clean speech

To Do

Voice	Character / Show	Notes
Fred Ward	Remo Williams: The Adventure Begins (1985) — Remo Williams himself	Distinctive gravelly voice, same film as Brimley so similar source quality
Alan Scarfe	TNG (Romulan), Andromeda S5 (Flavin)	Deep, authoritative voice — hunt for quiet scenes without ship hum
John Fleck	Enterprise (Silik the Suliban)	Distinctive raspy voice — Suliban scenes may have atmosphere noise
Steve Bacic	Andromeda (Telemachus Rhade)	—
Alex Diakun	Andromeda (Perseid character, name ~"Atune"), Stargate SG-1 (science role)	Prolific Vancouver sci-fi character actor, often plays scientists/scholars — distinctive voice
Unknown actress	Andromeda S4/S5 (Dylan's love interest), Babylon 5 (Mars rebellion leader)	Possibly Marjorie Monaghan (Number One in B5) — unconfirmed
Claudia Christian	Babylon 5 — Commander Ivanova	User already has a voice clip ready

Notes

Animated/cartoon voices (e.g. Darkwing Duck) don't clone well — too far outside natural human speech distribution
Heavily compressed or post-processed audio, and clips with background noise (spaceship hum, background score), degrade results even after noise reduction
The OGG vs WAV quality difference likely comes from source quality, not encoding; soundfile reads both formats
Voice cloning quality scales with both clip length and emotional range — varied prosody (questions, statements, different tones) gives the model more to anchor on than flat monotone
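
The notes above suggest vetting a candidate clip for length and background noise before using it as a reference. A minimal sketch of such a check follows. It is not part of the pipeline: the function name `clip_stats`, the 50 ms window, and the "quietest tenth of windows" noise-floor estimate are all assumptions made for illustration, and it uses the standard-library `wave` module (16-bit PCM WAV only) rather than soundfile so that it runs without extra dependencies.

```python
import array
import math
import sys
import wave


def clip_stats(path, frame_ms=50):
    """Return duration and rough level statistics for a 16-bit PCM WAV clip.

    Hypothetical helper: levels are RMS per window in dBFS, and the noise
    floor is approximated by the level of the quietest ~10% of windows.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        n_frames = wav.getnframes()
        samples = array.array("h")
        samples.frombytes(wav.readframes(n_frames))

    # WAV data is little-endian; array("h") uses the native byte order.
    if sys.byteorder == "big":
        samples.byteswap()

    # Window size in samples (interleaved channels are measured together).
    window = max(1, int(rate * frame_ms / 1000)) * channels
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(s * s for s in chunk) / window)
        # Clamp to 1 LSB so digital silence does not produce log10(0).
        levels.append(20 * math.log10(max(rms, 1.0) / 32768.0))

    if not levels:
        raise ValueError("clip is shorter than one analysis window")

    levels.sort()
    return {
        "duration_s": n_frames / rate,
        "sample_rate": rate,
        "channels": channels,
        "noise_floor_db": levels[len(levels) // 10],
        "loudest_window_db": levels[-1],
    }
```

Calling `clip_stats("candidate.wav")` (a placeholder path) returns a dict of these values. A small gap between `noise_floor_db` and `loudest_window_db` hints at constant hum or score under the speech, which is the kind of clip the notes say clones poorly. The check says nothing about prosodic variety, which still has to be judged by ear.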
