* feat(agent): add mid-turn message injection for responsive follow-ups
Allow user messages sent during an active agent turn to be injected
into the running LLM context instead of being queued behind a
per-session lock. Inspired by Claude Code's mid-turn queue drain
mechanism (query.ts:1547-1643).
Key design decisions:
- Messages are injected as natural user messages between iterations,
no tool cancellation or special system prompt needed
- Two drain checkpoints: after tool execution and after final LLM
response ("last-mile" to prevent dropping late arrivals)
- Bounded by MAX_INJECTION_CYCLES (5) to prevent consuming the
iteration budget on rapid follow-ups
- had_injections flag bypasses _sent_in_turn suppression so follow-up
responses are always delivered
Closes#1609
* fix(agent): harden mid-turn injection with streaming fix, bounded queue, and message safety
- Fix streaming protocol violation: Checkpoint 2 now checks for injections
BEFORE calling on_stream_end, passing resuming=True when injections found
so streaming channels (Feishu) don't prematurely finalize the card
- Bound pending queue to maxsize=20 with QueueFull handling
- Add warning log when injection batch exceeds _MAX_INJECTIONS_PER_TURN
- Re-publish leftover queue messages to bus in _dispatch finally block to
prevent silent message loss on early exit (max_iterations, tool_error, cancel)
- Fix PEP 8 blank line before dataclass and logger.info indentation
- Add 12 new tests covering drain, checkpoints, cycle cap, queue routing,
cleanup, and leftover re-publish
Keep tool-call assistant messages valid across provider sanitization and avoid trailing user-only history after model errors. This prevents follow-up requests from sending broken tool chains back to the gateway.
- Adjusted message handling in AgentRunner to ensure that historical messages remain unchanged during context governance.
- Introduced tests to verify that backfill operations do not alter the saved message boundary, maintaining the integrity of the conversation history.
- Merged latest main (no conflicts)
- Added test_llm_error_not_appended_to_session_messages: verifies error
content stays out of session messages
- Added test_streamed_flag_not_set_on_llm_error: verifies _streamed is
not set when LLM returns an error, so ChannelManager delivers it
Made-with: Cursor
When the LLM returns an error (e.g. 429 quota exceeded, stream timeout),
streaming channels silently drop the error message because `_streamed=True`
is set in metadata even though no content was actually streamed.
This change:
- Skips setting `_streamed` when stop_reason is "error", so error messages
go through the normal channel.send() path and reach the user
- Stops appending error content to session history, preventing error
messages from polluting subsequent conversation context
- Exposes stop_reason from _run_agent_loop to enable the above check