Switch to ONNX runtime with chatterbox-turbo-ONNX (fp16)
Some checks failed
Build ROCm Image / build (push) Has been cancelled
Replaces the PyTorch/chatterbox-tts stack with direct ONNX inference using ResembleAI/chatterbox-turbo-ONNX fp16 weights.

- engine.py: full rewrite (ONNX sessions, autoregressive KV-cache LM loop, voice conditionals cache via speech_encoder outputs)
- wyoming_handler.py: remove torch dep, use np.asarray for audio bytes
- requirements-rocm-init.txt: onnxruntime-rocm replaces torch wheels
- requirements-rocm.txt: drop chatterbox/torch deps, keep audio utils
- Dockerfile.rocm: remove chatterbox-tts install step

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
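The engine.py rewrite is not part of the diff shown on this page, so the following is only a toy NumPy sketch of what an autoregressive loop with a KV cache generally looks like. It is not the actual ONNX code. `toy_decoder_step` and `greedy_generate` are hypothetical names, and the single-head attention over an embedding table stands in for a real language-model session call.

```python
import numpy as np


def toy_decoder_step(token_id, past_k, past_v, embed):
    """Hypothetical stand-in for one LM call: append to the cache, attend over it."""
    x = embed[token_id]                          # (d,) embedding of the new token
    k = np.concatenate([past_k, x[None, :]])     # cache grows by one row: (t+1, d)
    v = np.concatenate([past_v, x[None, :]])
    scores = k @ x / np.sqrt(x.size)             # attention scores over all cached steps
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    hidden = weights @ v                         # (d,)
    logits = embed @ hidden                      # (vocab,)
    return logits, k, v


def greedy_generate(prompt_ids, embed, max_new_tokens, stop_id):
    """Greedy decoding that feeds only the newest token and reuses cached keys/values."""
    if len(prompt_ids) == 0:
        raise ValueError("prompt_ids must not be empty")
    d = embed.shape[1]
    past_k = np.zeros((0, d), dtype=embed.dtype)
    past_v = np.zeros((0, d), dtype=embed.dtype)
    # Prefill: run the prompt through the model to populate the cache.
    for tok in prompt_ids:
        logits, past_k, past_v = toy_decoder_step(tok, past_k, past_v, embed)
    generated = []
    for _ in range(max_new_tokens):
        next_id = int(np.argmax(logits))
        if next_id == stop_id:
            break
        generated.append(next_id)
        # Only the new token is passed in; earlier steps come from the cache.
        logits, past_k, past_v = toy_decoder_step(next_id, past_k, past_v, embed)
    return generated, past_k.shape[0]
```

The point of the pattern is that each step costs one new token rather than a re-run over the whole sequence; with a real ONNX session the cached tensors would be passed back in as inputs on the next call.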
@@ -3,6 +3,8 @@ import logging
 import time
 from typing import Dict, Optional
 
+import numpy as np
+
 from wyoming.audio import AudioChunk, AudioStart, AudioStop
 from wyoming.event import Event
 from wyoming.info import Describe, Info
@@ -151,7 +153,7 @@ class ChatterboxWyomingHandler(AsyncEventHandler):
 continue
 
 audio_bytes = (
-    audio_tensor.cpu().numpy().squeeze() * 32767
+    np.asarray(audio_tensor).squeeze() * 32767
 ).astype("int16").tobytes()
 
 if first_chunk:
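The second hunk swaps a torch-specific conversion for `np.asarray`, which accepts NumPy arrays and plain sequences alike. A minimal standalone sketch of that float-to-PCM step follows. `float_audio_to_pcm16` is a hypothetical helper name, and the clipping step is an addition for illustration that is not in the commit.

```python
import numpy as np


def float_audio_to_pcm16(audio) -> bytes:
    """Convert float audio in [-1.0, 1.0] to 16-bit little-endian PCM bytes."""
    # np.asarray is a no-op for existing ndarrays and converts lists/tuples.
    samples = np.asarray(audio, dtype=np.float32).squeeze()
    # Assumption (not in the commit): clip so out-of-range samples
    # saturate instead of wrapping around on the int16 cast.
    samples = np.clip(samples, -1.0, 1.0)
    return (samples * 32767).astype("<i2").tobytes()
```

Each sample becomes two bytes, so a mono buffer of N samples yields 2N bytes, suitable as a raw 16-bit audio payload.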