# Text-to-voice interface

## Overview

This project provides text-to-voice synthesis with voice cloning. It uses Chatterbox as its backend.


## Origin

This project started as a [vibe-coded](https://en.wikipedia.org/wiki/Vibe_coding) [experiment](https://gitea.efforting.tech/mikael-lovqvists-claude-agent/claude-voice-experiment), but this version is somewhat more hands-on.


## Setup

### Set up a [venv](https://docs.python.org/3/library/venv.html) for [Python](https://www.python.org/)

Run [`setup-venv.sh`](./setup-venv.sh).

> [!NOTE]
> By default, the environment is created in a directory called `venv` next to the script. To use a different location, set the environment variable `PYTHON_ENV` to the desired path.
>
> ```console
> PYTHON_ENV='/some/path' ./setup-venv.sh
> ```

### Environment

Variable				|	Purpose
------------------------|-------------------------
`HF_TOKEN_FILE`			|	Path to the file that holds the [`HF_TOKEN`](https://huggingface.co/docs/hub/en/security-tokens) secret used to download models from [Hugging Face](https://huggingface.co/). Defaults to `~/.secrets/hugging-face.token` if unset.
`HF_HUB_CACHE`			|	Location of the Hugging Face model cache. Defaults to `~/.cache/huggingface/hub` if unset.
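
The lookup described in the table can be sketched roughly as follows. This is only an illustration of the fallback behaviour, not the project's actual code: the function names are hypothetical, and treating an empty variable the same as an unset one is an assumption.

```python
import os
from pathlib import Path

# Defaults as documented in the table above.
DEFAULT_TOKEN_FILE = "~/.secrets/hugging-face.token"
DEFAULT_HUB_CACHE = "~/.cache/huggingface/hub"


def resolve_hf_token_file(env=None):
    """Return the token file path, falling back to the documented default."""
    env = os.environ if env is None else env
    # Assumption: an empty value is treated like an unset variable.
    raw = env.get("HF_TOKEN_FILE") or DEFAULT_TOKEN_FILE
    return Path(raw).expanduser()


def resolve_hf_hub_cache(env=None):
    """Return the model cache directory, falling back to the documented default."""
    env = os.environ if env is None else env
    raw = env.get("HF_HUB_CACHE") or DEFAULT_HUB_CACHE
    return Path(raw).expanduser()


def read_hf_token(env=None):
    """Read the token from the resolved file, or return None if it is missing or empty."""
    path = resolve_hf_token_file(env)
    if not path.is_file():
        return None
    return path.read_text(encoding="utf-8").strip() or None
```

Passing a plain dictionary instead of `os.environ` makes it easy to check the fallback without touching the real environment, for example `resolve_hf_hub_cache({})`.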