Fix warmup text length and ve attribute for torch.compile

- Warmup now uses a ~170-char representative sentence so torch.compile
  JIT-compiles for typical token sequence lengths. Previously "Warmup."
  compiled for very short shapes, causing a full re-compile (17s) on the
  first real HA request and pushing total synthesis past 30s.
- Compile model.ve (voice encoder) in addition to s3gen — both are
  convolutional and hit the MIOpen workspace=0 bug.
- Fix _patch_timing: attribute is model.ve not model.voice_encoder,
  so the timing wrap was silently skipping the speaker embedding.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 14:51:08 -04:00
parent 5766870304
commit 169e003a34
2 changed files with 19 additions and 10 deletions


@@ -23,7 +23,13 @@ def _warmup(voices: dict) -> None:
     audio_prompt = resolve_voice(None, voices) if voices else None
     logger.info("Running warmup synthesis to populate MIOpen kernel cache...")
     try:
-        engine.synthesize(text="Warmup.", audio_prompt_path=audio_prompt)
+        engine.synthesize(
+            text=(
+                "This is a warmup synthesis request used to pre-compile neural network kernels "
+                "for typical text lengths, so that the first real request runs at full speed."
+            ),
+            audio_prompt_path=audio_prompt,
+        )
         logger.info("Warmup complete — MIOpen cache populated")
     except Exception:
         logger.warning("Warmup synthesis failed (non-fatal)", exc_info=True)
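Why the longer warmup sentence helps can be illustrated with a toy model of shape-specialized compilation: kernels compiled for one sequence-length bucket are not reused for another, so a warmup whose token count lands in a different bucket than real traffic buys nothing. The bucket size and rounding here are arbitrary illustration, not actual torch.compile or MIOpen behavior:

```python
def padding_bucket(n_tokens, bucket=32):
    """Round a token count up to the next compile-shape bucket boundary.

    Toy model only: real torch.compile specializes on concrete shapes
    (recompiling on a new one), but the bucketing intuition is the same.
    """
    return ((n_tokens + bucket - 1) // bucket) * bucket


# "Warmup." is a handful of tokens; a typical announcement is dozens.
# Under shape-specialized compilation they land in different buckets,
# so the short warmup leaves the real request facing a cold compile.
assert padding_bucket(3) != padding_bucket(45)
```

A warmup text sized like real requests (~170 chars here) puts the warmup and the first Home Assistant request in the same bucket, which is what eliminates the 17s recompile described above.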