20 Commits

Author SHA1 Message Date
chengyongru
a1e1eed2f1 refactor(runner): consolidate all injection drain paths and deduplicate tests
- Migrate "after tools" inline drain to use _try_drain_injections,
  completing the refactoring (all 6 drain sites now use the helper).
- Move checkpoint emission into _try_drain_injections via optional
  iteration parameter, eliminating the leaky split between helper
  and caller for the final-response path.
- Extract _make_injection_callback() test helper to replace 7
  identical inject_cb function bodies.
- Add test_injection_cycle_cap_on_error_path to verify the cycle
  cap is enforced on error exit paths.
2026-04-14 00:30:30 +08:00
chengyongru
d849a3fa06 fix(agent): drain injection queue on error/edge-case exit paths
When the agent runner exits due to LLM error, tool error, empty response,
or max_iterations, it breaks out of the iteration loop without draining
the pending injection queue. This causes leftover messages to be
re-published as independent inbound messages, resulting in duplicate or
confusing replies to the user.

Extract the injection drain logic into a `_try_drain_injections` helper
and call it before each break in the error/edge-case paths. If injections
are found, continue the loop instead of breaking. For max_iterations
(where the loop is exhausted), drain injections to prevent re-publish
without continuing.
2026-04-14 00:30:30 +08:00
layla
f25cdb7138
Merge branch 'main' into fix/tool-call-result-order-2943 2026-04-11 22:00:07 +08:00
04cb
4cd4ed8ada fix(agent): preserve tool results on fatal error to prevent orphan tool_calls (#2943) 2026-04-11 21:50:44 +08:00
Xubin Ren
cf8381f517 feat(agent): enhance message injection handling and content merging 2026-04-11 21:43:23 +08:00
Xubin Ren
f6c39ec946 feat(agent): enhance session key handling for follow-up messages 2026-04-11 21:43:23 +08:00
chengyongru
36d2a11e73 feat(agent): mid-turn message injection for responsive follow-ups (#2985)
* feat(agent): add mid-turn message injection for responsive follow-ups

Allow user messages sent during an active agent turn to be injected
into the running LLM context instead of being queued behind a
per-session lock. Inspired by Claude Code's mid-turn queue drain
mechanism (query.ts:1547-1643).

Key design decisions:
- Messages are injected as natural user messages between iterations,
  no tool cancellation or special system prompt needed
- Two drain checkpoints: after tool execution and after final LLM
  response ("last-mile" to prevent dropping late arrivals)
- Bounded by MAX_INJECTION_CYCLES (5) to prevent consuming the
  iteration budget on rapid follow-ups
- had_injections flag bypasses _sent_in_turn suppression so follow-up
  responses are always delivered

Closes #1609

* fix(agent): harden mid-turn injection with streaming fix, bounded queue, and message safety

- Fix streaming protocol violation: Checkpoint 2 now checks for injections
  BEFORE calling on_stream_end, passing resuming=True when injections found
  so streaming channels (Feishu) don't prematurely finalize the card
- Bound pending queue to maxsize=20 with QueueFull handling
- Add warning log when injection batch exceeds _MAX_INJECTIONS_PER_TURN
- Re-publish leftover queue messages to bus in _dispatch finally block to
  prevent silent message loss on early exit (max_iterations, tool_error, cancel)
- Fix PEP 8 blank line before dataclass and logger.info indentation
- Add 12 new tests covering drain, checkpoints, cycle cap, queue routing,
  cleanup, and leftover re-publish
2026-04-11 21:43:23 +08:00
Xubin Ren
2bef9cb650 fix(agent): preserve interrupted tool-call turns
Keep tool-call assistant messages valid across provider sanitization and avoid trailing user-only history after model errors. This prevents follow-up requests from sending broken tool chains back to the gateway.
2026-04-10 05:37:25 +00:00
Xubin Ren
363a0704db refactor(runner): update message processing to preserve historical context
- Adjusted message handling in AgentRunner to ensure that historical messages remain unchanged during context governance.
- Introduced tests to verify that backfill operations do not alter the saved message boundary, maintaining the integrity of the conversation history.
2026-04-10 04:46:48 +00:00
Xubin Ren
c625c0c2a7 Merge origin/main and add regression tests for streaming error delivery
- Merged latest main (no conflicts)
- Added test_llm_error_not_appended_to_session_messages: verifies error
  content stays out of session messages
- Added test_streamed_flag_not_set_on_llm_error: verifies _streamed is
  not set when LLM returns an error, so ChannelManager delivers it

Made-with: Cursor
2026-04-09 23:10:46 +08:00
yanghan-cyber
10f6c875a5 fix(agent): deliver LLM errors to streaming channels and avoid polluting session context
When the LLM returns an error (e.g. 429 quota exceeded, stream timeout),
streaming channels silently drop the error message because `_streamed=True`
is set in metadata even though no content was actually streamed.

This change:
- Skips setting `_streamed` when stop_reason is "error", so error messages
  go through the normal channel.send() path and reach the user
- Stops appending error content to session history, preventing error
  messages from polluting subsequent conversation context
- Exposes stop_reason from _run_agent_loop to enable the above check
2026-04-09 23:10:46 +08:00
Xubin Ren
edb821e10d feat(agent): prompt behavior directives, tool descriptions, and loop robustness 2026-04-08 02:22:25 +08:00
Xubin Ren
02597c3ec9 fix(runner): silent retry on empty response before finalization 2026-04-07 15:03:41 +08:00
Xubin Ren
e4b335ce81 refactor: extract runtime response guards into utils runtime module 2026-04-02 13:54:40 +00:00
Xubin Ren
714a4c7bb6 fix(runtime): address review feedback on retry and cleanup 2026-04-02 10:57:12 +00:00
Xubin Ren
eefd7e60f2 Merge remote-tracking branch 'origin/main' into feat/runtime-hardening 2026-04-02 10:40:49 +00:00
chengyongru
da08dee144 feat(provider): show cache hit rate in /status (#2645) 2026-04-02 12:51:45 +08:00
Xubin Ren
fbedf7ad77 feat: harden agent runtime for long-running tasks 2026-04-01 19:12:49 +00:00
Xubin Ren
5bf0f6fe7d refactor: unify agent runner lifecycle hooks 2026-03-27 12:41:17 +08:00
Xubin Ren
e7d371ec1e refactor: extract shared agent runner and preserve subagent progress on failure 2026-03-27 02:49:43 +08:00