maintainer edit: WebUI settings still treated non-registry custom providers as unknown, so users could not select them in model configurations or fetch their model list. Reuse dynamic provider specs for settings payloads, model-list requests, and provider updates.
Add AssemblyAI as a third transcription provider option alongside
OpenAI and Groq. AssemblyAI offers better accuracy for certain
audio types (distant voices, noisy environments) and serves as a
reliable fallback when other providers struggle.
Changes:
- Add AssemblyAITranscriptionProvider class in providers/transcription.py
- Add 'assemblyai' option in base channel's transcribe_audio()
- Per-channel configuration via transcriptionProvider in config
Usage:
Set transcriptionProvider: 'assemblyai' and provide an AssemblyAI
API key via transcriptionApiKey in the channel config.
Add support for Xiaomi MiMo ASR as a third transcription backend alongside
Groq and OpenAI Whisper. Xiaomi ASR uses the /v1/chat/completions endpoint
with base64-encoded audio input, rather than the standard Whisper multipart
upload format.
Co-Authored-By:连 <lian@tangping.homes>
Add a `transcriptionModel` channel setting and an OpenRouter transcription
backend so voice messages can be transcribed through OpenRouter's
speech-to-text endpoint (e.g. nvidia/parakeet-tdt-0.6b-v3, openai/whisper-1),
alongside the existing Groq/OpenAI Whisper providers.
- schema: add channels.transcriptionModel (None = provider default)
- providers/transcription: extract a shared POST/retry skeleton; add a
JSON+base64 OpenRouterTranscriptionProvider; make the STT model a
constructor param on all providers instead of hardcoding it
- channels: route transcriptionProvider="openrouter" and thread the model
through the manager to each channel
- docs + tests
Only dedicated STT models work on OpenRouter's transcription endpoint;
chat LLMs (e.g. google/gemini-3.5-flash) are rejected there.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>