f6ff8c72e854f02d36e7cfeb3e1688bdba296d59
Replace stdin/stdout JSON line protocol with a stdlib HTTP server
(ThreadingHTTPServer). Three endpoints: POST /speak, /chime, /preload.
All return {"status": "ok"} after audio is queued for playback.
TTS generation is serialized via a threading.Lock; concurrent chime/preload
requests are handled without waiting for generation.
Add examples/speak.mjs, chime.mjs, voice-clone.mjs using Node.js built-in
fetch (no libraries required, Node 18+).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
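The server behaviour described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the handler class, the `generate_and_queue` placeholder, the `text` field in the request body, and the error responses are all assumptions made for the example. Only the three endpoints, the `{"status": "ok"}` reply, the use of `ThreadingHTTPServer`, and the `threading.Lock` around generation come from the commit message.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Serializes the (hypothetical) TTS generation step across request threads.
generation_lock = threading.Lock()


def generate_and_queue(text):
    """Placeholder for TTS generation and audio queueing (assumed, not the real code)."""
    return len(text)


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length) if length else b""
        try:
            payload = json.loads(raw or b"{}")
        except json.JSONDecodeError:
            self._reply(400, {"status": "error"})  # error shape is an assumption
            return
        if self.path == "/speak":
            # The "text" field name is an assumption for this sketch.
            text = payload.get("text", "") if isinstance(payload, dict) else ""
            # Only one generation runs at a time.
            with generation_lock:
                generate_and_queue(text)
        elif self.path not in ("/chime", "/preload"):
            self._reply(404, {"status": "error"})
            return
        # /chime and /preload never take the lock, so they do not wait for generation.
        self._reply(200, {"status": "ok"})

    def _reply(self, code, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=0):
    """Build the server; port 0 lets the OS pick a free port."""
    return ThreadingHTTPServer((host, port), Handler)
```

Because `ThreadingHTTPServer` handles each request on its own thread, a slow `/speak` call holding the lock does not block a concurrent `/chime` or `/preload` request.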
Text to voice interface
Overview
This project aims to provide text-to-voice synthesis with voice cloning. It uses chatterbox as its backend.
Origin
This project started as a vibe-coded experiment, but this version is somewhat more hands-on.
Setup
Set up a venv for Python
Run setup-venv.sh.
Note
The default location is a directory called venv that is created next to the script, but you can override it by using the environment variable PYTHON_ENV to point to a different location.
PYTHON_ENV='/some/path' ./setup-venv.sh
Environment
| Variable | Purpose |
|---|---|
| HF_TOKEN_FILE | Used to resolve a file for the HF_TOKEN secret that is used to download models from Hugging Face. If it is not set, it defaults to ~/.secrets/hugging-face.token. |
| HF_HUB_CACHE | Location of the Hugging Face model cache. Defaults to ~/.cache/huggingface/hub. |
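As an illustration of the defaults listed in the table, the lookup could be sketched like this. The function names are hypothetical and are not taken from the project; only the variable names and default paths come from the table.

```python
import os
from pathlib import Path

# Defaults as documented in the table above.
DEFAULT_TOKEN_FILE = "~/.secrets/hugging-face.token"
DEFAULT_HUB_CACHE = "~/.cache/huggingface/hub"


def resolve_token_file(env=None):
    """Return the token file path, falling back to the default when HF_TOKEN_FILE is unset."""
    env = os.environ if env is None else env
    return Path(env.get("HF_TOKEN_FILE", DEFAULT_TOKEN_FILE)).expanduser()


def resolve_hub_cache(env=None):
    """Return the model cache directory, falling back to the default when HF_HUB_CACHE is unset."""
    env = os.environ if env is None else env
    return Path(env.get("HF_HUB_CACHE", DEFAULT_HUB_CACHE)).expanduser()
```

Passing a plain dictionary instead of relying on `os.environ` makes the fallback easy to check, for example `resolve_token_file({"HF_TOKEN_FILE": "/tmp/tok"})`.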