rocm-chatterbox-whisper/engine.py at f20699aed3ad7c54541bbb2e8afd9316d762437b


scott f20699aed3

Build ROCm Image / build (push) Successful in 2m49s


Add fp16 autocast to synthesis for faster GPU throughput

The 6700 XT has significantly higher fp16 throughput than fp32.
autocast("cuda") uses fp16 for matmuls and convolutions (HiFiGAN,
S3 tokenizer, flow matching) while keeping fp32 for precision-sensitive
ops like softmax and layer norm.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
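
The autocast behaviour described in the commit message can be illustrated with a minimal sketch. This is not the code from engine.py, which is not shown on this page: the helper name `run_with_autocast` and the small stand-in model are hypothetical, and the CPU/bfloat16 fallback is an assumption added only so the example also runs on a machine without a GPU. ROCm builds of PyTorch expose the GPU under the "cuda" device type, which is why `autocast("cuda")` applies on a 6700 XT.

```python
import torch
import torch.nn as nn


def run_with_autocast(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Run a forward pass under autocast (hypothetical helper, not from engine.py)."""
    # ROCm builds of PyTorch report the GPU device type as "cuda".
    device_type = x.device.type
    # fp16 on the GPU; bfloat16 on CPU is an assumption so the sketch runs anywhere.
    dtype = torch.float16 if device_type == "cuda" else torch.bfloat16
    with torch.inference_mode(), torch.autocast(device_type=device_type, dtype=dtype):
        # Matmul-heavy layers run in reduced precision inside this context.
        out = model(x)
    # Cast back to fp32 so downstream code sees a consistent dtype.
    return out.float()


device = "cuda" if torch.cuda.is_available() else "cpu"
# Stand-in model; the real pipeline (HiFiGAN, S3 tokenizer, flow matching) is far larger.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8)).to(device).eval()
x = torch.randn(4, 16, device=device)
y = run_with_autocast(model, x)
```

Wrapping only the forward pass keeps model weights in fp32, so the change is limited to the synthesis call and can be reverted by removing the context manager.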

2026-04-05 13:34:21 -04:00

3.9 KiB
