forked from efforting.tech/stt-server
Add model selection and compute type sections to README
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
29
README.md
@@ -15,4 +15,31 @@ This project started as a [vibe-coded](https://en.wikipedia.org/wiki/Vibe_coding
### Setup [venv](https://docs.python.org/3/library/venv.html) for [python](https://www.python.org/)
We will have two different setups here, depending on whether you want to build ctranslate2 locally or not. This is yet to be documented.
## Model selection
Pass `--model <name>` to `stt-server.py`. Models are downloaded automatically from Hugging Face on first use.

| Model | VRAM | Quality | Notes |
|-------|------|---------|-------|
| `base.en` | ~1 GB | Low | Default. Fast, but struggles with similar-sounding consonants (V/B/D). |
| `small.en` | ~2 GB | Medium | Noticeable improvement over base for most speech. |
| `medium.en` | ~5 GB | Good | Recommended starting point for production use. |
| `large-v3` | ~10 GB | Best | Highest accuracy, use if VRAM allows. |

English-only models (`.en` suffix) are faster and more accurate than multilingual models of the same size for English speech. `large-v3` is multilingual only; it has no `.en` variant.
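The VRAM column can be turned into a simple selection rule. The sketch below is illustrative only: `pick_model` and `MODEL_VRAM_GB` are hypothetical names, not part of `stt-server.py`, and the thresholds are the approximate figures from the table, so real usage may differ.

```python
# Approximate VRAM needs in GB, copied from the table above (smallest first).
# These are rough guides, not measured limits.
MODEL_VRAM_GB = [
    ("base.en", 1),
    ("small.en", 2),
    ("medium.en", 5),
    ("large-v3", 10),
]


def pick_model(free_vram_gb: float) -> str:
    """Return the largest listed model whose approximate VRAM need fits.

    Hypothetical helper for illustration. Falls back to the default
    `base.en` when even the smallest model does not fit.
    """
    choice = MODEL_VRAM_GB[0][0]
    for name, needed_gb in MODEL_VRAM_GB:
        if needed_gb <= free_vram_gb:
            choice = name
    return choice
```

The returned name is what would be passed as `--model`; for example, `pick_model(6)` selects `medium.en` under these assumed thresholds.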
## Compute type
Pass `--compute-type <type>` to control the numeric precision used during inference.

| Type | Notes |
|------|-------|
| `int8_float16` | Default. Good balance of speed and accuracy on modern GPUs. |
| `float16` | Slightly better accuracy, higher VRAM usage. |
| `int8` | CPU-friendly, lower quality. |

If you see a CUDA error about mismatched library versions at startup, use `setup-venv-local-build.sh` to build ctranslate2 against your system CUDA version rather than using the PyPI wheel.
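To check which of the compute types above the installed ctranslate2 build accepts on a given device, one option is `ctranslate2.get_supported_compute_types`. The sketch below is an assumption about how a fallback could work, not a description of what `stt-server.py` does: `choose_compute_type`, `supported_compute_types`, and `PREFERENCE` are hypothetical names, and the fallback order simply follows the table.

```python
# Compute types from the table above, in the order they are listed.
PREFERENCE = ["int8_float16", "float16", "int8"]


def choose_compute_type(requested: str, supported: set) -> str:
    """Return `requested` if it is supported, else the first supported fallback.

    Hypothetical helper for illustration only.
    """
    if requested in supported:
        return requested
    for candidate in PREFERENCE:
        if candidate in supported:
            return candidate
    raise ValueError(
        f"none of {PREFERENCE} is supported; available: {sorted(supported)}"
    )


def supported_compute_types(device: str) -> set:
    """Ask the installed ctranslate2 build what it supports on `device`."""
    # Imported lazily so choose_compute_type can be used without ctranslate2.
    import ctranslate2

    return set(ctranslate2.get_supported_compute_types(device))
```

With ctranslate2 installed, `choose_compute_type("int8_float16", supported_compute_types("cuda"))` would give a value suitable for `--compute-type`; pass `"cpu"` instead on a machine without a GPU.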