3 Commits

Author SHA1 Message Date
Xubin Ren
70a1279b86 test: pin retry-wait callback routing so internal heartbeats stay off channels
Add two focused regression tests for the retry-wait leak this PR fixes:

- tests/agent/test_runner.py::test_runner_binds_on_retry_wait_to_retry_callback_not_progress
  locks in that `AgentRunSpec.retry_wait_callback` (not `progress_callback`) is
  what `_build_request_kwargs` forwards to the provider as `on_retry_wait`.

- tests/channels/test_channel_manager_delta_coalescing.py::TestRetryWaitFiltering
  runs `_dispatch_outbound` end-to-end and asserts that `_retry_wait: True`
  messages never reach channel send.

Both tests fail on origin/main and pass with this PR's fix applied.

Made-with: Cursor
2026-04-18 13:50:05 +08:00
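
The filtering behaviour that the second test pins can be sketched as below. This is a minimal illustration, not the project's code: only the `_retry_wait: True` marker comes from the commit message. The dict-shaped message, the `metadata` key, and the `should_send_to_channel` and `dispatch` helpers are hypothetical.

```python
from typing import Callable, Iterable


def should_send_to_channel(metadata: dict) -> bool:
    """Return False for internal retry-wait heartbeats (assumed to be flagged in metadata)."""
    return not metadata.get("_retry_wait", False)


def dispatch(messages: Iterable[dict], send: Callable[[dict], None]) -> int:
    """Forward user-facing messages to `send`; drop retry-wait heartbeats.

    Returns the number of messages actually sent.
    """
    sent = 0
    for msg in messages:
        # Hypothetical layout: the flag lives in a per-message metadata dict.
        if not should_send_to_channel(msg.get("metadata", {})):
            continue
        send(msg)
        sent += 1
    return sent
```

A regression test in this style would feed one normal message and one flagged message through `dispatch` and assert that only the normal one reaches `send`.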
Xubin Ren
cf25a582ba fix(channel): stop delta coalescing at stream boundaries
2026-03-27 21:43:57 +08:00
chengyongru
5ff9146a24 fix(channel): coalesce queued stream deltas to reduce API calls
When the LLM generates faster than the channel can process, the
asyncio.Queue accumulates multiple _stream_delta messages. Each delta
triggers a separate API call (~700ms each), causing a visible delay
after the LLM finishes.

Solution: In _dispatch_outbound, drain all queued deltas for the same
(channel, chat_id) before sending, combining them into a single API
call. Non-matching messages are preserved in a pending buffer for
subsequent processing.

This reduces N API calls to 1 when the queue holds N accumulated deltas.
2026-03-27 21:43:57 +08:00
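
The drain-and-merge step described in 5ff9146a24 can be sketched roughly as follows. This assumes a simplified message shape: the `Outbound` dataclass, its `is_delta` field, and the `coalesce_deltas` helper are hypothetical stand-ins, and the real `_dispatch_outbound` may differ. The sketch does not model the stream-boundary handling added in cf25a582ba.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Outbound:
    # Hypothetical message shape; the project's real type may differ.
    channel: str
    chat_id: str
    content: str
    is_delta: bool = False


def coalesce_deltas(first: Outbound, queue: asyncio.Queue) -> Tuple[Outbound, List[Outbound]]:
    """Drain the queue, merging deltas that share first's (channel, chat_id).

    Returns the merged delta plus a pending list of non-matching messages,
    kept in their original order for later processing.
    """
    target = (first.channel, first.chat_id)
    parts = [first.content]
    pending: List[Outbound] = []
    while True:
        try:
            msg = queue.get_nowait()  # non-blocking; raises QueueEmpty when drained
        except asyncio.QueueEmpty:
            break
        if msg.is_delta and (msg.channel, msg.chat_id) == target:
            parts.append(msg.content)
        else:
            pending.append(msg)
    merged = Outbound(first.channel, first.chat_id, "".join(parts), is_delta=True)
    return merged, pending
```

With N matching deltas queued, the caller sends `merged` once instead of making N sends, then works through `pending` before reading from the queue again.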