When LLM generates faster than channel can process, asyncio.Queue
accumulates multiple _stream_delta messages. Each delta triggers a
separate API call (~700ms each), causing visible delay after LLM
finishes.
Solution: In _dispatch_outbound, drain all queued deltas for the same
(channel, chat_id) before sending, combining them into a single API
call. Non-matching messages are preserved in a pending buffer for
subsequent processing.
This reduces N API calls to 1 when queue has N accumulated deltas.
Make channel delivery failures raise consistently so retry policy lives in ChannelManager rather than being split across individual channels. Tighten Telegram stream finalization, clarify sendMaxRetries semantics, and align the docs with the behavior the system actually guarantees.
- Prevent repeated retries on expired sessions in the polling thread
- Stop sending messages to invalid agent sessions to eliminate noise logs and unnecessary requests