
kokoro-rocm-wyoming

A Docker image running Kokoro-82M TTS on AMD GPUs via ROCm, with a Wyoming protocol server for Home Assistant integration.

Stack

| Component | Version |
| --- | --- |
| ROCm | 6.1.2 |
| PyTorch | 2.5.1 |
| Target GPU | AMD RX 6700 XT (gfx1031) |
| Kokoro model | hexgrad/Kokoro-82M |
| Protocol | Wyoming (TCP, port 10300) |

Quick start

docker compose up -d

The Wyoming server will be available at <host-ip>:10300.
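
To confirm that the server is accepting connections before configuring Home Assistant, a plain TCP probe is enough. The following is a minimal sketch using only the Python standard library; the helper name `port_open` is illustrative and not part of this repository.

```python
import socket


def port_open(host: str, port: int = 10300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example: replace the address with the IP of the machine running the container.
# print(port_open("192.168.1.50"))
```

A `True` result only shows that something is listening on the port, not that synthesis works.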

Home Assistant setup

  1. In Home Assistant, go to Settings → Devices & Services → Add Integration
  2. Search for Wyoming Protocol
  3. Enter your host IP and port 10300
  4. Kokoro voices will appear in your voice assistant configuration
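
If no voices appear, it can help to ask the server directly what it advertises. The sketch below assumes the event framing described in the Wyoming protocol documentation: a JSON header line ending in a newline, optionally followed by `data_length` bytes of extra JSON and `payload_length` bytes of binary payload. The helper names `encode_event` and `read_event` are hypothetical; the official `wyoming` Python package is the usual way to speak the protocol.

```python
import json
from typing import BinaryIO, Optional, Tuple


def encode_event(event_type: str, data: Optional[dict] = None) -> bytes:
    """Encode an event without a binary payload as a single JSON header line."""
    header = {"type": event_type}
    if data:
        header["data"] = data
    return json.dumps(header).encode("utf-8") + b"\n"


def read_event(stream: BinaryIO) -> Tuple[str, dict, bytes]:
    """Read one event: header line, optional extra JSON data, optional payload."""
    line = stream.readline()
    if not line:
        raise EOFError("connection closed before an event header arrived")
    header = json.loads(line)
    data = dict(header.get("data") or {})
    data_length = header.get("data_length") or 0
    if data_length:
        # Extra data is merged on top of the data found in the header.
        data.update(json.loads(stream.read(data_length)))
    payload_length = header.get("payload_length") or 0
    payload = stream.read(payload_length) if payload_length else b""
    return header["type"], data, payload
```

To use it against a running server, open a socket with `socket.create_connection((host, 10300))`, wrap it with `sock.makefile("rwb")`, write `encode_event("describe")`, flush, and call `read_event` on the same stream. Under the assumptions above, the reply should be an `info` event whose data lists the available TTS voices.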

Configuration

Edit config.yaml before building to change the default voice, language, speed, or the list of voices advertised to Home Assistant:

tts:
  device: cuda          # ROCm presents as 'cuda' to PyTorch via HIP
  language: a           # a=American English, b=British English, etc.
  default_voice: af_heart
  default_speed: 1.0
  voices:
    - name: af_heart
      description: "Heart (Female, American English)"
      language: en-us
    # add more voices here

Available language codes: a (American English), b (British English), e (Spanish), f (French), h (Hindi), i (Italian), j (Japanese), p (Portuguese), z (Mandarin).
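
A quick sanity check of the `tts` section before building can catch typos early. The sketch below works on the dictionary a YAML loader would produce from config.yaml. The function name `check_tts_config` is hypothetical, and the rule that `default_voice` must appear in `voices` is an assumption about how the server uses the file, not something this README states.

```python
LANGUAGE_CODES = {
    "a": "American English",
    "b": "British English",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "p": "Portuguese",
    "z": "Mandarin",
}


def check_tts_config(tts: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if tts.get("language") not in LANGUAGE_CODES:
        problems.append(f"unknown language code: {tts.get('language')!r}")
    # Assumption: the default voice should be one of the advertised voices.
    names = [v.get("name") for v in (tts.get("voices") or [])]
    if tts.get("default_voice") not in names:
        problems.append(f"default_voice {tts.get('default_voice')!r} is not in voices")
    speed = tts.get("default_speed")
    if not isinstance(speed, (int, float)) or speed <= 0:
        problems.append(f"default_speed must be a positive number, got {speed!r}")
    return problems


# The example configuration from this README, as a parsed dictionary.
example = {
    "device": "cuda",
    "language": "a",
    "default_voice": "af_heart",
    "default_speed": 1.0,
    "voices": [
        {
            "name": "af_heart",
            "description": "Heart (Female, American English)",
            "language": "en-us",
        }
    ],
}
print(check_tts_config(example))  # []
```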

Building

The image is built automatically by Gitea Actions on every push to main and on v* tags. To build locally:

docker build -t kokoro-rocm-wyoming .

Model weights are downloaded from Hugging Face at build time. Voice files are fetched on first use and cached in the hf_cache Docker volume.

GPU passthrough

The compose file passes through /dev/kfd and /dev/dri and adds the video and render groups. If ROCm does not detect the 6700 XT, uncomment the override in docker-compose.yml, which makes the ROCm runtime treat the card (gfx1031) as gfx1030:

environment:
  - HSA_OVERRIDE_GFX_VERSION=10.3.0
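
Before reaching for the override, it is worth checking that the device nodes are actually visible inside the container. A minimal sketch, assuming it is run with the container's Python interpreter; the helper name `missing_gpu_devices` is illustrative.

```python
import os


def missing_gpu_devices(paths=("/dev/kfd", "/dev/dri")) -> list:
    """Return the device paths that are not visible in the current environment."""
    return [p for p in paths if not os.path.exists(p)]


# An empty list means both nodes were passed through.
print(missing_gpu_devices())
```

If both nodes are present, calling `torch.cuda.is_available()` in the same container shows whether PyTorch can use the GPU. A `False` result at that point suggests the architecture override may be needed.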

Audio output

Kokoro outputs 24 kHz 16-bit mono PCM. The Wyoming server streams chunks to Home Assistant as they are generated, so long utterances start playing before synthesis is complete.
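
For debugging captured audio, the format above translates to 48,000 bytes per second (24,000 samples × 2 bytes × 1 channel). The sketch below, using only the standard library, computes the duration of a raw PCM buffer and wraps it in a WAV container. The helper names are illustrative and not part of this repository.

```python
import io
import wave

SAMPLE_RATE = 24_000   # Hz
SAMPLE_WIDTH = 2       # bytes per sample (16-bit)
CHANNELS = 1           # mono


def pcm_duration(num_bytes: int) -> float:
    """Seconds of audio represented by num_bytes of raw PCM in this format."""
    return num_bytes / (SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS)


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM in a WAV container so ordinary players can open it."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


one_second = bytes(SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS)  # 48,000 bytes of silence
print(pcm_duration(len(one_second)))  # 1.0
```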

License

Model weights: Apache 2.0
