kokoro/README.md
Scott Garren 0614418dd4
Update README.md
2026-04-08 17:39:04 -04:00


# kokoro-rocm-wyoming
A Docker image running [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) TTS on AMD GPUs via ROCm, with a [Wyoming protocol](https://github.com/rhasspy/wyoming) server for Home Assistant integration.
## Stack
| Component | Version |
|-----------|---------|
| ROCm | 6.1.2 |
| PyTorch | 2.5.1 |
| Target GPU | AMD RX 6700 XT (gfx1031) |
| Kokoro model | hexgrad/Kokoro-82M |
| Protocol | Wyoming (TCP, port 10200) |
## Quick start
```bash
docker compose up -d
```
The Wyoming server will be available at `<host-ip>:10200`.
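To confirm the server is reachable, a client can send a Wyoming `describe` event and read back the `info` reply. The following is a minimal sketch using only the Python standard library. The helper names (`encode_event`, `read_event`, `describe`) are illustrative and not part of this repository, and the framing assumes the newline-terminated JSON header format described in the Wyoming protocol README.

```python
import json
import socket


def encode_event(event_type, data=None):
    """Serialize a Wyoming event with no binary payload as one JSON header line."""
    header = {"type": event_type}
    if data:
        header["data"] = data
    return (json.dumps(header) + "\n").encode("utf-8")


def read_event(stream):
    """Read one event from a binary file-like object.

    Returns (type, data, payload). Extra JSON data and the binary payload
    are read only if the header announces their lengths.
    """
    header = json.loads(stream.readline())
    data = dict(header.get("data") or {})
    data_length = header.get("data_length") or 0
    if data_length:
        data.update(json.loads(stream.read(data_length)))
    payload_length = header.get("payload_length") or 0
    payload = stream.read(payload_length) if payload_length else b""
    return header["type"], data, payload


def describe(host, port=10200, timeout=5.0):
    """Connect, send `describe`, and return the first event the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_event("describe"))
        with sock.makefile("rb") as stream:
            return read_event(stream)
```

Calling `describe("<host-ip>")` against a running container should return an event of type `info`; the exact contents of that reply depend on the server and are not shown here.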
## Home Assistant setup
1. In Home Assistant, go to **Settings → Devices & Services → Add Integration**
2. Search for **Wyoming Protocol**
3. Enter your host IP and port `10200`
4. Kokoro voices will appear in your voice assistant configuration
## Configuration
Edit `config.yaml` before building to change the default voice, language, speed, or the list of voices advertised to Home Assistant:
```yaml
tts:
  device: cuda            # ROCm presents as 'cuda' to PyTorch via HIP
  language: a             # a=American English, b=British English, etc.
  default_voice: af_heart
  default_speed: 1.0
voices:
  - name: af_heart
    description: "Heart (Female, American English)"
    language: en-us
  # add more voices here
```
Available language codes: `a` (American English), `b` (British English), `e` (Spanish), `f` (French), `h` (Hindi), `i` (Italian), `j` (Japanese), `p` (Portuguese), `z` (Mandarin).
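A small consistency check can catch a mismatched language code or default voice before the image is built. The sketch below is not part of this repository: `check_config` is a hypothetical helper that takes the already-parsed config as a dict. It assumes a `tts` mapping and a top-level `voices` list, as in the example above, and it assumes Kokoro's voice naming convention, in which the first letter of a voice name (the `a` in `af_heart`) is its language code.

```python
# Language codes as listed in this README.
LANGUAGE_CODES = {
    "a": "American English",
    "b": "British English",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "p": "Portuguese",
    "z": "Mandarin",
}


def check_config(config):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    tts = config.get("tts") or {}

    language = tts.get("language")
    if language not in LANGUAGE_CODES:
        problems.append(f"unknown language code {language!r}")

    default_voice = tts.get("default_voice")
    voice_names = [voice.get("name") for voice in config.get("voices") or []]
    if default_voice not in voice_names:
        problems.append(f"default_voice {default_voice!r} is not in the voices list")

    # Assumption: the first letter of a Kokoro voice name is its language code.
    if default_voice and language in LANGUAGE_CODES and default_voice[0] != language:
        problems.append(
            f"default_voice {default_voice!r} does not start with language code {language!r}"
        )

    speed = tts.get("default_speed", 1.0)
    if not isinstance(speed, (int, float)) or speed <= 0:
        problems.append(f"default_speed must be a positive number, got {speed!r}")

    return problems
```

To use it with the real file, parse `config.yaml` with a YAML library of your choice (PyYAML's `yaml.safe_load` is one option) and pass the resulting dict to `check_config`.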
## Building
The image is built automatically by Gitea Actions on every push to `main` and on `v*` tags. To build locally:
```bash
docker build -t kokoro-rocm-wyoming .
```
Model weights are downloaded from Hugging Face at build time. Voice files are fetched on first use and cached in the `hf_cache` Docker volume.
## GPU passthrough
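The compose file passes through `/dev/kfd` and `/dev/dri` and adds the `video` and `render` groups. The RX 6700 XT (gfx1031) is not on ROCm's official support list, so if ROCm does not detect it, uncomment the override in `docker-compose.yml`, which makes the runtime treat the card as gfx1030: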
```yaml
environment:
  - HSA_OVERRIDE_GFX_VERSION=10.3.0
```
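When passthrough fails, it helps to check from inside the container whether the device nodes are visible and whether PyTorch sees a GPU. The following is a rough diagnostic sketch, not something shipped in this image. `missing_device_nodes` and `gpu_report` are hypothetical names, and the PyTorch check is skipped if `torch` cannot be imported.

```python
import os


def missing_device_nodes(paths=("/dev/kfd", "/dev/dri")):
    """Return the device paths that are not visible in this environment."""
    return [path for path in paths if not os.path.exists(path)]


def gpu_report():
    """Collect basic facts relevant to ROCm passthrough."""
    report = {
        "missing_devices": missing_device_nodes(),
        "hsa_override": os.environ.get("HSA_OVERRIDE_GFX_VERSION"),
    }
    try:
        import torch  # ROCm builds of PyTorch expose the GPU through torch.cuda
    except ImportError:
        report["torch_sees_gpu"] = None  # PyTorch not installed here
    else:
        report["torch_sees_gpu"] = torch.cuda.is_available()
    return report
```

Running `print(gpu_report())` inside the container gives a quick picture: a non-empty `missing_devices` list points at the compose device mappings, while `torch_sees_gpu` being `False` with both devices present suggests trying the override above.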
## Audio output
Kokoro outputs 24 kHz 16-bit mono PCM. The Wyoming server streams chunks to Home Assistant as they are generated, so long utterances start playing before synthesis is complete.
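The format above fixes the data rate at 24,000 samples/s × 2 bytes × 1 channel = 48,000 bytes per second of audio. As a small illustration (a sketch with hypothetical helper names, not code from this repository), the snippet below computes the duration of a raw PCM buffer and wraps it in a WAV container using Python's standard `wave` module, which is handy for saving captured audio to a playable file.

```python
import io
import wave

SAMPLE_RATE = 24_000  # Hz
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)
CHANNELS = 1          # mono


def pcm_duration_seconds(num_bytes):
    """Duration of a raw PCM buffer in the format described above."""
    return num_bytes / (SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS)


def pcm_to_wav(pcm):
    """Wrap raw PCM bytes in a WAV container and return the file contents."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

For example, `pcm_duration_seconds(48_000)` is `1.0`, and writing `pcm_to_wav(pcm)` to a `.wav` file gives something ordinary audio players can open.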
## License
Model weights: [Apache 2.0](https://huggingface.co/hexgrad/Kokoro-82M)