mikael-lovqvists-claude-agent f4ae96c6b9 Replace HTTP API with WebSocket server
Single port (TTS_PORT) handles both the WS upgrade handshake and
connections. Adds job queue, generation worker, playback events
(queued/started/finished/aborted/error), and abort_current/abort_all
commands. Fixes BrokenPipeError when pacat is killed mid-write.
Updates all examples to use WebSocket; adds abort-demo.mjs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Text to voice interface

Overview

This project provides text-to-speech with voice cloning. It uses chatterbox as its backend.

Origin

This project started as a vibe-coded experiment, but this version is somewhat more hands-on.

Running

The quickest way to test this is to set things up according to the instructions below and then run the example scripts under examples/.

Setup

Set up venv for Python

Run setup-venv.sh.

Note

By default, the virtual environment is created in a directory called venv next to the script. To use a different location, set the environment variable PYTHON_ENV to the desired path.

PYTHON_ENV='/some/path' ./setup-venv.sh

Environment

Variable | Purpose
HF_TOKEN_FILE | Path to the file containing the HF_TOKEN secret, which is used to download models from Hugging Face. Defaults to ~/.secrets/hugging-face.token if not set.
HF_HUB_CACHE | Location of the Hugging Face model cache. Defaults to ~/.cache/huggingface/hub.
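
The defaults described above can be resolved along the following lines. This is only a minimal sketch and not the project's actual code: the function names (resolve_token_file, resolve_hub_cache, read_token) are hypothetical, and it assumes the token file contains nothing but the token itself, possibly followed by a newline.

```python
import os
from pathlib import Path

# Defaults as documented for the two environment variables.
DEFAULT_TOKEN_FILE = "~/.secrets/hugging-face.token"
DEFAULT_HUB_CACHE = "~/.cache/huggingface/hub"


def resolve_token_file(env=None):
    """Return the path named by HF_TOKEN_FILE, or the documented default."""
    env = os.environ if env is None else env
    return Path(env.get("HF_TOKEN_FILE", DEFAULT_TOKEN_FILE)).expanduser()


def resolve_hub_cache(env=None):
    """Return the path named by HF_HUB_CACHE, or the documented default."""
    env = os.environ if env is None else env
    return Path(env.get("HF_HUB_CACHE", DEFAULT_HUB_CACHE)).expanduser()


def read_token(env=None):
    """Read the token from the resolved file.

    Assumption: the file holds only the token, so surrounding
    whitespace (such as a trailing newline) is stripped.
    """
    return resolve_token_file(env).read_text(encoding="utf-8").strip()
```

Passing a plain dict instead of os.environ, for example resolve_token_file({"HF_TOKEN_FILE": "/some/path/token"}), makes it possible to check the resolution without touching the real environment.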
Description
Basic TTS server based on [chatterbox-tts](https://github.com/resemble-ai/chatterbox)