Two changes:
- ulimits nofile=65536: MIOpen's exhaustive search compiles many MLIR
kernels in parallel, each opening temp files in /tmp. The default container
limit (1024) is too low, so ld.lld fails with 'too many open files'.
- MIOPEN_DEBUG_CONV_IMPLICIT_GEMM=0: disables the MLIR-based ImplicitGEMM
solvers that generate the failing kernels, leaving Direct/Winograd/GEMM.
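In docker-compose terms, the two changes might look like the following sketch (the `tts` service name is a placeholder):

```yaml
services:
  tts:
    ulimits:
      nofile:        # raise the per-process open-file limit for parallel MLIR kernel compiles
        soft: 65536
        hard: 65536
    environment:
      - MIOPEN_DEBUG_CONV_IMPLICIT_GEMM=0   # drop the MLIR ImplicitGEMM solvers
```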
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cudnn.benchmark triggers MIOpen's exhaustive kernel search, which then
crashes while writing results to SQLite. Disabling the cache skips the write.
PyTorch's in-memory benchmark cache still applies, so warmup results are
reused for all requests within a container run.
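Assuming the cache is turned off via MIOpen's MIOPEN_DISABLE_CACHE switch (the commit does not name the exact mechanism), the change might be sketched as:

```yaml
environment:
  - MIOPEN_DISABLE_CACHE=1   # skip the SQLite write that crashes during exhaustive search
```

torch.backends.cudnn.benchmark stays enabled on the Python side; its in-process cache covers repeat requests.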
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The named volume overlay was causing SQLite 'unable to open database file'
crashes. MIOpen's default cache location (~/.config/miopen) works reliably
inside the container. The startup warmup repopulates it each run.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gfx1031 is not natively supported in ROCm 7.2. Without the override,
the GPU falls back to software emulation, causing 40+ second synthesis.
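A sketch of the override, assuming the common gfx1030 spoof used for RDNA2 cards such as the RX 6700 XT (gfx1031):

```yaml
environment:
  - HSA_OVERRIDE_GFX_VERSION=10.3.0   # present gfx1031 as the supported gfx1030 target
```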
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PyTorch passes a ptr=0, size=0 workspace to MIOpen convolutions, causing
GemmFwdRest to warn and fall back to a slow path on every operation.
MIOPEN_DEBUG_CONV_GEMM=0 skips GEMM entirely and uses Direct/Winograd
solvers which have no workspace requirement.
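The change can be expressed as a one-line environment tweak:

```yaml
environment:
  - MIOPEN_DEBUG_CONV_GEMM=0   # skip GEMM solvers; Direct/Winograd need no workspace
```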
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PyTorch 2.11.0 with ROCm 7.2 wheels against rocm/dev-ubuntu-22.04:latest
causes MIOpen version mismatches that force every convolution onto a slow
zero-workspace fallback path (41s synthesis). The existing working project
uses torch 2.5.1 + ROCm 6.1 successfully on the same base image.
Also remove MIOPEN_FIND_ENFORCE override - unnecessary with matched versions.
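Pinned this way, the install could be sketched as a requirements fragment (the exact torchaudio pin is an assumption):

```
--index-url https://download.pytorch.org/whl/rocm6.1
torch==2.5.1
torchaudio==2.5.1
```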
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Enforce=3 (SEARCH_DB_UPDATE) runs exhaustive kernel benchmarking on
every single GPU operation, making inference impossibly slow. Enforce=1
searches once, writes to cache, then reuses cached results on all
subsequent calls.
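Sketched as the environment change this entry describes:

```yaml
environment:
  - MIOPEN_FIND_ENFORCE=1   # per this entry: search once, write the cache, reuse on later calls
```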
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MIOPEN_FIND_ENFORCE=3 tells MIOpen to only select solvers that fit in
available workspace, eliminating the GemmFwdRest fallback warnings and
the associated performance hit. Persisting the MIOpen cache via a named
volume avoids kernel recompilation on every container start.
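A compose-style sketch of the two changes (the volume name and cache path are assumptions; MIOpen defaults to ~/.config/miopen for the running user):

```yaml
services:
  tts:
    environment:
      - MIOPEN_FIND_ENFORCE=3
    volumes:
      - miopen-cache:/root/.config/miopen   # persist compiled kernels across container starts
volumes:
  miopen-cache:
```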
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Update torch/torchaudio to 2.11.0 with ROCm 7.2 wheel index
- Drop torchvision (unused for TTS) and pytorch_triton_rocm (bundled in 2.11)
- Update HSA_OVERRIDE_GFX_VERSION docs; RX 7000+ natively supported in ROCm 7.2
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wyoming-only server built around the official chatterbox TTS model.
Includes ROCm/AMD GPU support, sentence-level streaming, config.yaml
management, and Gitea CI for container builds.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>