Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-30 06:40:22 +00:00

Voice Clone Wishlist

Voices to prepare reference clips for.

Ready

Voice	Character / Show	Notes
Rommie	Andromeda — ship AI / android	`/home/devilholk/Documents/rommie-sample.wav` — working well
Wilford Brimley (Harold W. Smith)	Remo Williams: The Adventure Begins (1985) — head of CURE	Turned out well: gravelly, authoritative delivery and dry, clean speech

Voice	Character / Show	Notes
Fred Ward	Remo Williams: The Adventure Begins (1985) — Remo Williams himself	Distinctive gravelly voice, same film as Brimley so similar source quality
Alan Scarfe	TNG (Romulan), Andromeda S5 (Flavin)	Deep, authoritative voice — hunt for quiet scenes without ship hum
John Fleck	Enterprise (Silik the Suliban)	Distinctive raspy voice — Suliban scenes may have atmosphere noise
Steve Bacic	Andromeda (Telemachus Rhade)	—
Alex Diakun	Andromeda (Perseid character, name ~"Atune"), Stargate SG-1 (science role)	Prolific Vancouver sci-fi character actor, often plays scientists/scholars — distinctive voice
Tim Russ (Tuvok)	Star Trek: Voyager — Tuvok; also TNG "Starship Mine" as a human mercenary	Measured, deep Vulcan delivery. Try ready room or quarters scenes for less ship hum. The TNG villain role offers a different vocal performance from the same actor.
Unknown actress	Andromeda S4/S5 (Dylan's love interest), Babylon 5 (Mars rebellion leader)	Possibly Marjorie Monaghan (Number One in B5) — unconfirmed
Claudia Christian	Babylon 5 — Commander Ivanova	Clip ready, but the clone mispronounces her name: it splits it into "Ivan" and "Nova" instead of using the correct stress (i-VA-no-va). Find a clip where she introduces herself, e.g. "Commander Susan Ivanova" or "Ivanova speaking".

Self-introduction in reference clip: include a clip of the character saying their own name so the model learns the correct pronunciation. Many characters have distinctive names with non-obvious stress. Introductory scenes ("I'm Commander Ivanova", "The name's Rommie") are ideal for this.
Animated/cartoon voices (e.g. Darkwing Duck) don't clone well: they sit too far outside the distribution of natural human speech.
Compressed/heavily post-processed audio (spaceship hum, background score) degrades results even after noise reduction
OGG vs WAV quality difference is likely source quality, not encoding — soundfile handles both
Voice cloning quality scales with both clip length and emotional range: varied prosody (questions, statements, different tones) gives the model more to anchor on than a flat monotone does.
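Since clip length matters, it can help to check a candidate clip before using it as a reference. The following is a minimal sketch, not part of the existing tooling: `clip_summary` and its `min_seconds` threshold are hypothetical names and values chosen for illustration, and the standard-library `wave` module used here reads only PCM WAV files (an OGG clip would need a library such as soundfile instead).

```python
import wave


def clip_summary(path, min_seconds=10.0):
    """Return basic facts about a PCM WAV reference clip.

    min_seconds is an assumed threshold for illustration only;
    these notes do not state a minimum clip length.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()      # total number of audio frames
        rate = wav.getframerate()      # frames per second
        channels = wav.getnchannels()  # 1 = mono, 2 = stereo

    duration = frames / rate if rate else 0.0
    return {
        "sample_rate": rate,
        "channels": channels,
        "duration_s": duration,
        "long_enough": duration >= min_seconds,
    }
```

Calling `clip_summary("/home/devilholk/Documents/rommie-sample.wav")` would report the sample rate, channel count, and duration of that clip. This only measures length; it says nothing about prosodic variety or background noise, which still need a listen.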