Close TCP connection after synthesis so HA receives FIN and unblocks

disconnect() was a no-op in the base class; writer.close() was never
called, leaving HA waiting for a TCP FIN that never arrived.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-08 19:58:54 -04:00

.gitea/workflows

Update .gitea/workflows/docker-build.yml

2026-04-08 17:54:20 -04:00

.gitignore

Switch hf_cache from Docker volume to host bind mount

2026-04-08 19:14:44 -04:00

config.yaml

Change Wyoming port from 10200 to 10300

2026-04-08 18:57:02 -04:00

docker-compose.yml

Switch hf_cache from Docker volume to host bind mount

2026-04-08 19:14:44 -04:00

Dockerfile

Change Wyoming port from 10200 to 10300

2026-04-08 18:57:02 -04:00

entrypoint.sh

Replace upstream library with ROCm/Wyoming deployment project

2026-04-08 13:30:54 -04:00

LICENSE

Initial commit

2025-01-10 13:37:06 -08:00

README.md

Change Wyoming port from 10200 to 10300

2026-04-08 18:57:02 -04:00

requirements.txt

Replace upstream library with ROCm/Wyoming deployment project

2026-04-08 13:30:54 -04:00

server.py

Close TCP connection after synthesis so HA receives FIN and unblocks

2026-04-08 19:58:54 -04:00

README.md

kokoro-rocm-wyoming

A Docker image running Kokoro-82M TTS on AMD GPUs via ROCm, with a Wyoming protocol server for Home Assistant integration.

Stack

Component	Version / value
ROCm	6.1.2
PyTorch	2.5.1
Target GPU	AMD RX 6700 XT (gfx1031)
Kokoro model	hexgrad/Kokoro-82M
Protocol	Wyoming (TCP, port 10300)

Quick start

docker compose up -d

The Wyoming server will be available at <host-ip>:10300.
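To confirm the container is actually listening before configuring Home Assistant, a plain TCP connection attempt is enough. The following is a minimal sketch that only tests reachability and does not speak the Wyoming protocol; the helper name `is_port_open` is hypothetical and not part of this repository.

```python
import socket


def is_port_open(host: str, port: int = 10300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; 10300 is the port from this README.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `is_port_open("<host-ip>")` with the Docker host's address substituted should return `True` once `docker compose up -d` has finished starting the server.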

Home Assistant setup

1. In Home Assistant, go to Settings → Devices & Services → Add Integration.
2. Search for Wyoming Protocol.
3. Enter your host IP and port 10300.
4. The Kokoro voices then appear in your voice assistant configuration.

Configuration

Edit config.yaml before building to change the default voice, language, speed, or the list of voices advertised to Home Assistant:

tts:
  device: cuda          # ROCm presents as 'cuda' to PyTorch via HIP
  language: a           # a=American English, b=British English, etc.
  default_voice: af_heart
  default_speed: 1.0
  voices:
    - name: af_heart
      description: "Heart (Female, American English)"
      language: en-us
    # add more voices here

Available language codes: a (American English), b (British English), e (Spanish), f (French), h (Hindi), i (Italian), j (Japanese), p (Portuguese), z (Mandarin).
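Because config.yaml is read before the image is built, it can help to sanity-check the values first. The sketch below checks a dict shaped like the `tts:` section against the language codes listed above. The function name `check_tts_config` and the two extra rules (the default voice must be among the listed voices, and the speed must be positive) are assumptions made for illustration, not behaviour confirmed by server.py.

```python
# Language codes exactly as listed in this README.
LANGUAGE_CODES = {
    "a": "American English",
    "b": "British English",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "p": "Portuguese",
    "z": "Mandarin",
}


def check_tts_config(tts: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical validator: `tts` is the parsed `tts:` mapping of config.yaml.
    """
    problems = []

    language = tts.get("language")
    if language not in LANGUAGE_CODES:
        problems.append(f"unknown language code: {language!r}")

    # Assumption: the default voice should be one of the advertised voices.
    voice_names = [v.get("name") for v in (tts.get("voices") or [])]
    default_voice = tts.get("default_voice")
    if default_voice not in voice_names:
        problems.append(f"default_voice {default_voice!r} is not in voices")

    # Assumption: speed is a positive multiplier, with 1.0 as normal speed.
    speed = tts.get("default_speed", 1.0)
    if not isinstance(speed, (int, float)) or speed <= 0:
        problems.append(f"default_speed must be a positive number: {speed!r}")

    return problems
```

The mapping would typically come from loading config.yaml with a YAML parser and taking its `tts` key; the sketch takes a plain dict so that it stays free of third-party dependencies.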

Building

The image is built automatically by Gitea Actions on every push to main and on v* tags. To build locally:

docker build -t kokoro-rocm-wyoming .

Model weights are downloaded from Hugging Face at build time. Voice files are fetched on first use and cached in hf_cache, which is now a host bind mount rather than a Docker volume.

GPU passthrough

The compose file passes through /dev/kfd and /dev/dri and adds the video and render groups. If ROCm does not detect the 6700 XT, uncomment the override in docker-compose.yml, which makes the ROCm runtime treat the gfx1031 card as gfx1030:

environment:
  - HSA_OVERRIDE_GFX_VERSION=10.3.0
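Before reaching for the override, it is worth checking that the device nodes were passed through at all. The following minimal sketch, run inside the container, reports which of the two nodes named above are missing; the helper name `missing_devices` is hypothetical.

```python
import os

# Device nodes that the compose file passes through, per this README.
REQUIRED_DEVICES = ("/dev/kfd", "/dev/dri")


def missing_devices(paths=REQUIRED_DEVICES) -> list[str]:
    """Return the subset of `paths` that does not exist on this filesystem."""
    return [p for p in paths if not os.path.exists(p)]
```

An empty result only shows that the nodes are visible. Whether PyTorch can use the GPU is a separate check, for which `torch.cuda.is_available()` is the usual test, since ROCm presents itself as `cuda`.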

Audio output

Kokoro outputs 24 kHz, 16-bit mono PCM. The Wyoming server streams chunks to Home Assistant as they are generated, so long utterances start playing before synthesis is complete.
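The format above fixes the data rate at 24,000 samples/s × 2 bytes × 1 channel = 48,000 bytes per second, which is handy when reasoning about chunk sizes. The sketch below converts between byte counts and durations; the function names are hypothetical and say nothing about the chunk size server.py actually uses.

```python
SAMPLE_RATE_HZ = 24_000   # 24 kHz, from this README
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples
CHANNELS = 1              # mono

BYTES_PER_SECOND = SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS  # 48,000


def pcm_duration_seconds(num_bytes: int) -> float:
    """Playback duration of `num_bytes` of raw PCM in this format."""
    return num_bytes / BYTES_PER_SECOND


def bytes_for_duration(seconds: float) -> int:
    """Number of PCM bytes for `seconds` of audio, rounded to whole frames."""
    frames = round(seconds * SAMPLE_RATE_HZ)
    return frames * SAMPLE_WIDTH_BYTES * CHANNELS
```

For example, a 4,800-byte chunk holds 0.1 s of audio, so playback can begin after the first such chunk arrives rather than after the whole utterance is synthesized.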

License

Model weights: Apache 2.0