Add two focused regression tests for the retry-wait leak this PR fixes:
- tests/agent/test_runner.py::test_runner_binds_on_retry_wait_to_retry_callback_not_progress
locks in that `AgentRunSpec.retry_wait_callback` (not `progress_callback`) is
what `_build_request_kwargs` forwards to the provider as `on_retry_wait`.
- tests/channels/test_channel_manager_delta_coalescing.py::TestRetryWaitFiltering
runs `_dispatch_outbound` end-to-end and asserts that `_retry_wait: True`
messages never reach channel send.
Both tests fail on origin/main and pass with this PR's fix applied.
Made-with: Cursor
When LLM generates faster than channel can process, asyncio.Queue
accumulates multiple _stream_delta messages. Each delta triggers a
separate API call (~700ms each), causing visible delay after LLM
finishes.
Solution: In _dispatch_outbound, drain all queued deltas for the same
(channel, chat_id) before sending, combining them into a single API
call. Non-matching messages are preserved in a pending buffer for
subsequent processing.
This reduces N API calls to 1 when queue has N accumulated deltas.