51188ca973f9523327e7453ae777d8e61e10ca1e
All checks were successful
Build ROCm Image / build (push) Successful in 2m39s
s3gen.speaker_encoder (CAMPPlus xvector) hardcodes float32 inputs in its inference() method, which causes a dtype mismatch when its weights are fp16. T3 (the autoregressive GPT-2-medium LLM) has no such constraint and, as the token-generation bottleneck, benefits most from fp16.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
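As a rough illustration of the approach described in this message, the sketch below casts only the T3 submodule to fp16 and leaves the speaker encoder in float32. `TinyTTS` and `cast_t3_only` are hypothetical stand-ins, and the attribute layout (`model.t3`, `model.s3gen.speaker_encoder`) is assumed from the names used above; this is not the repository's actual code.

```python
import torch
from torch import nn


class TinyTTS(nn.Module):
    """Hypothetical stand-in for the real model; layer sizes are arbitrary."""

    def __init__(self):
        super().__init__()
        # Placeholder for the autoregressive token generator (T3).
        self.t3 = nn.Linear(8, 8)
        # Placeholder container for s3gen with its speaker encoder.
        self.s3gen = nn.Module()
        self.s3gen.speaker_encoder = nn.Linear(8, 4)


def cast_t3_only(model: nn.Module) -> nn.Module:
    """Cast only T3 to fp16; everything else keeps its original dtype."""
    model.t3.half()  # in-place cast of T3's floating-point parameters and buffers
    return model


model = cast_t3_only(TinyTTS())

# Collect the parameter dtypes of each part to confirm the split.
t3_dtypes = {p.dtype for p in model.t3.parameters()}
enc_dtypes = {p.dtype for p in model.s3gen.speaker_encoder.parameters()}
print(t3_dtypes, enc_dtypes)
```

Casting the whole model with a single `model.half()` call would also convert the speaker encoder's weights, which is the situation the message says leads to the dtype mismatch.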