mirror of
https://github.com/HKUDS/nanobot.git
synced 2026-05-19 16:12:30 +00:00
* feat(long-task): add LongTaskTool for multi-step agent tasks
Implements a meta-ReAct loop where long-running tasks are broken into
sequential subagent steps, each starting fresh with the original goal
and progress from the previous step. This prevents context drift when
agents work on complex, multi-step tasks.
- Extract build_tool_registry() from SubagentManager for reuse
- Add run_step() for synchronous subagent execution (no bus announcement)
- Add HandoffTool and CompleteTool as signal mechanisms via shared dict
- Add LongTaskTool orchestrator with simplified prompt (8 iterations/step)
- Register LongTaskTool in main agent loop
- Add _extract_handoff_from_messages fallback for robustness
* fix(long-task): add debug logging for step-level observability
* feat(long-task): major overhaul with structured handoffs, validation, and observability
- Structured HandoffState: HandoffTool now accepts files_created,
files_modified, next_step_hint, and verification fields instead of
a plain string. Progress is passed between steps as structured data.
- Completion validation round: After complete() is called, a dedicated
validator step runs to verify the claim against the original goal.
If validation fails, the task continues rather than returning
a false completion.
- Dynamic prompt system: 3 Jinja2 templates (step_start, step_middle,
step_final) selected based on step number. Final steps get tighter
budget and stronger "wrap up" guidance.
- Automatic file change tracking: Extracts write_file/edit_file events
from tool_events and injects them into the next step's context if
the subagent forgot to report them explicitly.
- Budget tracking & adaptive strategy: Cumulative token usage is tracked
across steps. Per-step tool budget drops from 8 to 4 in the last
two steps to force handoff/completion.
- Crash retry with graceful degradation: A step that crashes is retried
once. Persistent crashes terminate the task and return partial progress.
- Full observability hooks for future WebUI integration:
- set_hooks() with on_step_start, on_step_complete, on_handoff,
on_validation_started, on_validation_passed, on_validation_failed,
on_task_complete, on_task_error, and catch-all on_event.
- Readable state properties: current_step, total_steps, status,
last_handoff, cumulative_usage, goal.
- inject_correction() allows external code to send user corrections
that are injected into the next step's prompt.
- run_step() accepts optional max_iterations for dynamic budget control.
All 27 long-task tests and 11 subagent tests pass.
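The adaptive budget rule above (8 tool iterations per step, dropping to 4 in the last two steps) can be sketched roughly as below. This is an illustration only, not the code in LongTaskTool: the function name `step_budget`, the constant names, and the 0-based step index are assumptions.

```python
# Hypothetical sketch: names and 0-based indexing are assumptions, not nanobot's API.
NORMAL_BUDGET = 8  # tool iterations for an ordinary step (figure from the commit message)
FINAL_BUDGET = 4   # tighter budget for the last two steps


def step_budget(step: int, max_steps: int) -> int:
    """Return the tool-iteration budget for a 0-based step index."""
    if step < 0 or step >= max_steps:
        raise ValueError("step must be in range(max_steps)")
    # Indices max_steps - 2 and max_steps - 1 are the last two steps; they get
    # the reduced budget so the subagent is pushed toward handoff or completion.
    if step >= max_steps - 2:
        return FINAL_BUDGET
    return NORMAL_BUDGET


budgets = [step_budget(i, 5) for i in range(5)]
```

Under these assumptions a single-step task (`max_steps=1`) falls straight into the reduced budget, which is the boundary the later test commit mentions.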
* test(long-task): add boundary tests and fix race conditions
- Add 7 edge-case tests: validation crash resilience, hook exception safety, mid-run correction injection, FIFO correction ordering, explicit file changes overriding auto-detection, final budget for max_steps=1, and dynamic budget switching boundaries
- Fix assertion in test_long_task_completes_after_multiple_handoffs to match exact prompt format
- Remove asyncio timing hack from test_state_exposure
- Add asyncio.sleep(0) yield in test_inject_correction_during_execution to prevent race between signal injection and step continuation
- All 34 tests passing
* fix(long-task): address code review findings
- Declare _scopes = {"core"} explicitly to prevent recursive nesting in subagent scope
- Document fragile coupling in _extract_file_changes: path extraction depends on
write_file/edit_file detail format; add debug log for unexpected formats
- Align final-template threshold (max_steps - 2) with budget switch threshold
- Eliminate hasattr(self, "_state") in _reset_state by initializing in __init__
* fix(long-task): honor final signal and file tracking
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(long-task): improve prompt structure and agent contract
- Expand LongTaskTool.description to instruct parent agent on goal
construction, return value semantics, and how to handle results.
- Expand CompleteTool.description to emphasize that the summary IS the
final answer returned to the parent agent.
- Prefix validated return value with an explicit "final answer" directive
to stop parent agent from re-running work.
- Redesign step_start.md: Step 1 is now explicitly for exploration,
planning, and skeleton-building. complete() is discouraged.
- Remove bulky payload debug logging from _emit(); add targeted
info/warning/error logs at key state transitions instead.
- Add signal_type to HandoffState for cleaner signal detection.
* test(long-task): expect wrapped completion message after validation
Align assertions with LongTaskTool final return shape on main.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(webui): turn timing strip, latency, and session-switch restore
- Agent loop: publish goal_status run/idle for WebSocket turns; attach
wall-clock latency_ms on turn_end and persisted assistant metadata.
- WebSocket channel: forward goal_status and latency fields to clients.
- NanobotClient: track goal_status started_at per chat without requiring
onChat; useNanobotStream restores run strip when returning to a chat.
- Thread UI: composer/shell viewport hooks for run duration and latency;
format helpers and i18n strings.
- MessageBubble: drop trailing StreamCursor (layout artifact vs block markdown).
- Builtin / tests: model command coverage, websocket and loop tests.
Covers multi-session UX and round-trip timing visibility for the WebUI.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: keep message-tool file attachments after canonical history hydrate
- MessageTool records per-turn media paths delivered to the active chat.
- nanobot.utils.session_attachments stages out-of-media-root files and
merges into the last assistant message before save (loop stays a thin call).
- WebUI MediaCell: use a signed URL as a real download link when present.
Fixes attachments flashing then vanishing on turn_end when paths lived
outside get_media_dir (e.g. workspace files).
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(webui): agent activity cluster, stable keys, LTR sheen labels
- Group reasoning and tool traces in AgentActivityCluster with i18n summaries
- Stabilize React list keys for activity clusters (first message id anchor)
- Replace background-clip shimmer with overlay sheen for streaming labels
- ThreadMessages/MessageList integration and locale strings
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(webui): render assistant reasoning with Markdown + deferred stream
- Use MarkdownText for ReasoningBubble body (same GFM/KaTeX path as replies)
- Apply muted/italic prose tokens so thinking stays visually subordinate
- useDeferredValue while reasoningStreaming to ease parser work during deltas
- Preload markdown chunk when trace opens; add regression test with preloaded renderer
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(webui): default-collapse agent activity cluster while Working
Outer fold no longer auto-expands during isTurnStreaming; user opens to see traces.
Header sheen and live summary unchanged.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(long_task): cumulative run history, file union, and prompt tuning
Inject cross-step summaries and merged file paths into middle/final step
templates so chains do not lose early context. Strip the last run-history
block when it duplicates Previous Progress to save tokens. Add optional
cumulative_prompt_max_chars and cumulative_step_body_max_chars parameters
with clamped defaults.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(webui): session switch keeps in-flight thread and replays buffered WS
Save the prior chat message list to the per-chat cache in a layout effect
when chatId changes (before stale writes could corrupt another chat).
Skip one post-switch layout cache tick so we do not snapshot the wrong tab.
Buffer inbound events per chat_id when no onChat subscriber is registered
(e.g. user focused another session) and drain on resubscribe up to a cap,
so streaming deltas are not lost while off-tab.
Co-authored-by: Cursor <cursoragent@cursor.com>
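The per-chat buffering described in the commit above (hold inbound events while no subscriber is registered, replay them on resubscribe, bounded by a cap) might look like the following. The real change lives in the WebUI client; this is a Python sketch of the idea only, and `ChatEventBuffer`, its method names, and the drop-oldest policy are all assumptions.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict

Event = dict
Handler = Callable[[Event], None]


class ChatEventBuffer:
    """Hypothetical helper: buffers events per chat_id while nobody listens."""

    def __init__(self, cap: int = 256) -> None:
        self._cap = cap
        self._handlers: Dict[str, Handler] = {}
        # deque(maxlen=cap) silently drops the oldest event once the cap is hit
        # (an assumed policy; the commit only says "up to a cap").
        self._pending: Dict[str, Deque[Event]] = defaultdict(
            lambda: deque(maxlen=self._cap)
        )

    def dispatch(self, chat_id: str, event: Event) -> None:
        handler = self._handlers.get(chat_id)
        if handler is not None:
            handler(event)
        else:
            self._pending[chat_id].append(event)

    def subscribe(self, chat_id: str, handler: Handler) -> None:
        self._handlers[chat_id] = handler
        # Drain anything that arrived while the chat was off-tab, in order.
        for event in self._pending.pop(chat_id, ()):
            handler(event)

    def unsubscribe(self, chat_id: str) -> None:
        self._handlers.pop(chat_id, None)


buf = ChatEventBuffer(cap=2)
seen: list = []
for n in ("1", "2", "3"):
    buf.dispatch("chat-a", {"delta": n})  # no subscriber yet, so these are buffered
buf.subscribe("chat-a", seen.append)      # replays the two newest buffered events
```

A bounded deque keeps memory flat if a user never returns to a chat, at the cost of losing the oldest deltas.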
* fix(webui): snap thread scroll to bottom on session open (no smooth glide)
Use scroll-behavior auto on the viewport, instant programmatic scroll when
following new messages and on scrollToBottomSignal. Keep smooth only for
the explicit scroll-to-bottom button.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(webui): respect manual scroll-up after opening a session
Track when the user leaves the bottom with a ref and skip ResizeObserver
and deferred bottom snaps until they return or the conversation is reset.
Remove the time-based force-bottom window that overrode atBottom.
Multi-frame scrollToBottom honours the same guard unless force (scroll button).
Co-authored-by: Cursor <cursoragent@cursor.com>
* Publish long_task UI snapshots on outbound metadata
- Add OUTBOUND_META_AGENT_UI (_agent_ui) for channel-agnostic structured state
- LongTaskTool publishes {kind: long_task, data: snapshot} on the bus with _progress
- WebSocket send forwards metadata as agent_ui for WebUI clients
- Tests for bus payload, WS frame, and progress assertions
- Fix loop progress tests: ignore _goal_status in streaming final filter and
avoid brittle outbound[-1] ordering after goal status idle messages
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat: WebUI long_task activity card and resilient history merge
Add optional ui_summary to the long_task tool for one-line UI labels. Stream
long_task agent_ui into a dedicated message row with timeline, markdown peek,
and a right sheet for details. Merge canonical history after turn_end while
re-inserting long_task rows before the final assistant reply. Collapse
duplicate task_start/step_start steps in the timeline and extend i18n.
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor: align long_task with thread_goal and drop orchestrator UI
- Persist sustained objectives via session metadata (long_task / complete_goal); no subagent wiring or tool-driven agent_ui payloads.
- Remove WebUI long-task activity UI, types, and translations; history merge preserves trace replay only, with legacy long_task rows normalized to traces.
- Drop long_task prompt templates and get_long_task_run_dir; add webui thread disk helper for gateway persistence tests.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(agent): thread goal runtime context, tools, and skill
- Add thread_goal_state helper and mirror active objectives into Runtime Context
- Wire loop/context/memory/events as needed for goal metadata in turns
- Expand long_task / complete_goal semantics (pivot/cancel/honest recap)
- Add always-on thread-goal SKILL.md; align /goal command prompt
- Tests for context builder and thread goal state
- Remove unused webui ChatPane component
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(thread-goal): add websocket snapshot helper and publish goal updates from long_task
Introduce thread_goal_ws_blob for bounded JSON snapshots, attach snapshots to
websocket turn_end metadata in AgentLoop, and let long_task fan-out dedicated
thread_goal frames on the websocket channel after persisting session metadata.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(channels): websocket thread_goal frames, turn_end replay, and session API scrub for subagent inject
Emit thread_goal events and optional thread_goal on turn_end; scrub persisted
subagent announce blobs on GET /api/sessions/.../messages and shorten session
list previews so WebUI does not surface full Task/Summarize scaffolding.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(webui): merge ephemeral traces per user turn when reconciling canonical history
Preserve disk/live trace rows inside the matching user–assistant segment instead
of stacking every trace before the final assistant reply (fixes inflated tool
counts after refresh or session switch).
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(webui): show assistant reply copy only on the last slice before the next user turn
Avoid duplicate copy affordances on intermediate assistant bubbles that precede
more agent activity in the same turn (tools or further assistant text).
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(webui): thread_goal stream plumbing, composer goal strip, sky glow, and client-side subagent scrub projection
Track thread_goal and turn_goal snapshots in NanobotClient, hydrate React state
from thread_goal frames and turn_end, surface objective/elapsed in the composer,
add breathing sky halo CSS while goals are active, mirror server scrub logic on
history hydration and webui_thread snapshots, and extend tests/client mocks.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(channels): add Slack Socket Mode connect timeout with actionable timeout errors
Abort hung websockets.connect handshakes after a bounded wait, log REST-vs-WSS
guidance, surface RuntimeError to channel startup, and log successful WSS setup.
Co-authored-by: Cursor <cursoragent@cursor.com>
* webui: expand thread goal in composer bottom sheet
Add ChevronUp control on the run/goal strip that opens a bottom Sheet
with full ui_summary and objective. Inline preview logic in RunElapsedStrip,
add i18n strings across locales, and a composer unit test.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(webui): widen dedupeToolCallsForUi input for session API typing
fetchSessionMessages types tool_calls as unknown; accept unknown so tsc
build passes when passing message.tool_calls through.
Co-authored-by: Cursor <cursoragent@cursor.com>
* refactor(agent): extract WebSocket turn run status to webui_turn_helpers
* refactor(skills): rename thread-goal to long-task and document idempotent goals
* feat(skills): rename sustained-goal skill to long-goal and tighten long_task guidance
* chore: remove unused subagent/context/router helpers
* feat(session): rename sustained goal to goal_state and align WS/WebUI
- Move helpers from agent/thread_goal_state to session/goal_state:
GOAL_STATE_KEY, goal_state_runtime_lines, goal_state_ws_blob, parse_goal_state.
- Session metadata now uses "goal_state"; still read legacy "thread_goal";
long_task writes drop the legacy key after save.
- WebSocket: event/field goal_state, _goal_state_sync; turn_end carries goal_state;
accept legacy _thread_goal_sync/thread_goal inbound metadata for dispatch.
- WebUI: GoalStateWsPayload, goalState hook/client props, i18n keys goalState*.
- Runtime Context copy uses "Goal (active):" instead of "Thread goal".
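The legacy-key handling in this commit (read "goal_state", fall back to "thread_goal", drop the legacy key on write) can be pictured roughly like this. The two key strings come from the commit message; the helper names `read_goal_state` and `write_goal_state` are hypothetical and are not the functions in session/goal_state.

```python
from typing import Any, Dict, Optional

GOAL_STATE_KEY = "goal_state"     # current metadata key (from the commit message)
LEGACY_GOAL_KEY = "thread_goal"   # legacy metadata key (from the commit message)


def read_goal_state(metadata: Dict[str, Any]) -> Optional[Any]:
    """Hypothetical reader: prefer the new key, fall back to the legacy one."""
    if GOAL_STATE_KEY in metadata:
        return metadata[GOAL_STATE_KEY]
    return metadata.get(LEGACY_GOAL_KEY)


def write_goal_state(metadata: Dict[str, Any], state: Any) -> None:
    """Hypothetical writer: store under the new key and drop the legacy key."""
    metadata[GOAL_STATE_KEY] = state
    metadata.pop(LEGACY_GOAL_KEY, None)


meta = {"thread_goal": {"objective": "ship it"}}
legacy_value = read_goal_state(meta)  # served from the legacy key
write_goal_state(meta, {"objective": "ship it", "status": "active"})
```

Reading both keys but writing only one lets old sessions migrate lazily, the first time a goal is updated.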
* feat(agent): stream Anthropic thinking deltas and fix stream idle timeout
* refactor(webui): transcript jsonl as sole timeline source
* fix(agent): reject mismatched WS message chat_id and stream reasoning deltas
* feat(webui): hydrate sustained goal and run timer after websocket subscribe
* chore(webui,websocket): remove unused fetch helpers and legacy thread_goal WS paths
* Raise default max_tokens and context window in agent schema.
Align AgentDefaults and ModelPresetConfig with typical Claude-scale usage
(32k completion budget, 256k context window) and update migration tests.
Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(gateway): bootstrap prefers in-memory model; clarify websocket naming
* fix(websocket): websocket _handle_message passes is_dm; refresh /status test expectations
---------
Co-authored-by: chengyongru <2755839590@qq.com>
Co-authored-by: chengyongru <chengyongru.ai@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
893 lines
32 KiB
Python
import asyncio
from pathlib import Path
from unittest.mock import AsyncMock, MagicMock

import pytest

from nanobot.agent.context import ContextBuilder
from nanobot.agent.loop import AgentLoop
from nanobot.bus.events import InboundMessage
from nanobot.bus.queue import MessageBus
from nanobot.providers.base import LLMResponse
from nanobot.session.manager import Session
from nanobot.utils.webui_titles import (
    WEBUI_SESSION_METADATA_KEY,
    WEBUI_TITLE_METADATA_KEY,
    maybe_generate_webui_title,
)


def _mk_loop() -> AgentLoop:
    loop = AgentLoop.__new__(AgentLoop)
    from nanobot.config.schema import AgentDefaults

    loop.max_tool_result_chars = AgentDefaults().max_tool_result_chars
    return loop


def _make_full_loop(tmp_path: Path) -> AgentLoop:
    provider = MagicMock()
    provider.get_default_model.return_value = "test-model"
    provider.chat_with_retry = AsyncMock(return_value=LLMResponse(content="Test title"))
    return AgentLoop(bus=MessageBus(), provider=provider, workspace=tmp_path, model="test-model")


@pytest.mark.asyncio
async def test_generate_webui_title_only_for_marked_webui_sessions(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    loop.provider.chat_with_retry = AsyncMock(
        return_value=LLMResponse(content='"优化 WebUI 侧边栏。"', finish_reason="stop")
    )
    session = loop.sessions.get_or_create("websocket:chat-title")
    session.metadata[WEBUI_SESSION_METADATA_KEY] = True
    session.add_message("user", "帮我优化一下 webui 的 sidebar")
    session.add_message("assistant", "可以,我会先调整布局和视觉层级。")
    loop.sessions.save(session)

    generated = await maybe_generate_webui_title(
        sessions=loop.sessions,
        session_key="websocket:chat-title",
        provider=loop.provider,
        model=loop.model,
    )

    assert generated is True
    assert session.metadata[WEBUI_TITLE_METADATA_KEY] == "优化 WebUI 侧边栏"
    loop.provider.chat_with_retry.assert_awaited_once()


@pytest.mark.asyncio
async def test_generate_webui_title_skips_plain_websocket_sessions(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    loop.provider.chat_with_retry = AsyncMock(
        return_value=LLMResponse(content="Plain websocket title", finish_reason="stop")
    )
    session = loop.sessions.get_or_create("websocket:custom-client")
    session.add_message("user", "hello from a custom websocket client")
    loop.sessions.save(session)

    generated = await maybe_generate_webui_title(
        sessions=loop.sessions,
        session_key="websocket:custom-client",
        provider=loop.provider,
        model=loop.model,
    )

    assert generated is False
    assert WEBUI_TITLE_METADATA_KEY not in session.metadata
    loop.provider.chat_with_retry.assert_not_awaited()


def test_save_turn_skips_multimodal_user_when_only_runtime_context() -> None:
    loop = _mk_loop()
    session = Session(key="test:runtime-only")
    runtime = ContextBuilder._RUNTIME_CONTEXT_TAG + "\nCurrent Time: now (UTC)"

    loop._save_turn(
        session,
        [{"role": "user", "content": [{"type": "text", "text": runtime}]}],
        skip=0,
    )
    assert session.messages == []


def test_save_turn_keeps_image_placeholder_with_path_after_runtime_strip() -> None:
    loop = _mk_loop()
    session = Session(key="test:image")
    runtime = ContextBuilder._RUNTIME_CONTEXT_TAG + "\nCurrent Time: now (UTC)"

    loop._save_turn(
        session,
        [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "data:image/png;base64,abc"}, "_meta": {"path": "/media/feishu/photo.jpg"}},
                {"type": "text", "text": runtime},
            ],
        }],
        skip=0,
    )
    assert session.messages[0]["content"] == [{"type": "text", "text": "[image: /media/feishu/photo.jpg]"}]


def test_save_turn_keeps_image_placeholder_without_meta() -> None:
    loop = _mk_loop()
    session = Session(key="test:image-no-meta")
    runtime = ContextBuilder._RUNTIME_CONTEXT_TAG + "\nCurrent Time: now (UTC)"

    loop._save_turn(
        session,
        [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "data:image/png;base64,abc"}},
                {"type": "text", "text": runtime},
            ],
        }],
        skip=0,
    )
    assert session.messages[0]["content"] == [{"type": "text", "text": "[image]"}]


def test_save_turn_strips_runtime_context_suffix_from_string() -> None:
    loop = _mk_loop()
    session = Session(key="test:suffix-strip")
    runtime = (
        ContextBuilder._RUNTIME_CONTEXT_TAG
        + "\nCurrent Time: now\n"
        + ContextBuilder._RUNTIME_CONTEXT_END
    )

    loop._save_turn(
        session,
        [{"role": "user", "content": f"hello world\n\n{runtime}"}],
        skip=0,
    )
    assert session.messages[0]["content"] == "hello world"


def test_save_turn_skips_string_user_when_only_runtime_context_suffix() -> None:
    loop = _mk_loop()
    session = Session(key="test:suffix-only")
    runtime = (
        ContextBuilder._RUNTIME_CONTEXT_TAG
        + "\nCurrent Time: now\n"
        + ContextBuilder._RUNTIME_CONTEXT_END
    )

    loop._save_turn(
        session,
        [{"role": "user", "content": runtime}],
        skip=0,
    )
    assert session.messages == []


def test_save_turn_keeps_tool_results_under_16k() -> None:
    loop = _mk_loop()
    session = Session(key="test:tool-result")
    content = "x" * 12_000

    loop._save_turn(
        session,
        [{"role": "tool", "tool_call_id": "call_1", "name": "read_file", "content": content}],
        skip=0,
    )

    assert session.messages[0]["content"] == content


def test_save_turn_stamps_latency_on_last_assistant() -> None:
    loop = _mk_loop()
    session = Session(key="test:latency")

    loop._save_turn(
        session,
        [
            {"role": "assistant", "content": "hello", "tool_calls": [{"id": "c1"}]},
            {"role": "assistant", "content": "final answer"},
        ],
        skip=0,
        turn_latency_ms=12345,
    )

    assert session.messages[-1]["role"] == "assistant"
    assert session.messages[-1]["content"] == "final answer"
    assert session.messages[-1]["latency_ms"] == 12345


def test_restore_runtime_checkpoint_rehydrates_completed_and_pending_tools() -> None:
    loop = _mk_loop()
    session = Session(
        key="test:checkpoint",
        metadata={
            AgentLoop._RUNTIME_CHECKPOINT_KEY: {
                "assistant_message": {
                    "role": "assistant",
                    "content": "working",
                    "tool_calls": [
                        {
                            "id": "call_done",
                            "type": "function",
                            "function": {"name": "read_file", "arguments": "{}"},
                        },
                        {
                            "id": "call_pending",
                            "type": "function",
                            "function": {"name": "exec", "arguments": "{}"},
                        },
                    ],
                },
                "completed_tool_results": [
                    {
                        "role": "tool",
                        "tool_call_id": "call_done",
                        "name": "read_file",
                        "content": "ok",
                    }
                ],
                "pending_tool_calls": [
                    {
                        "id": "call_pending",
                        "type": "function",
                        "function": {"name": "exec", "arguments": "{}"},
                    }
                ],
            }
        },
    )

    restored = loop._restore_runtime_checkpoint(session)

    assert restored is True
    assert session.metadata.get(AgentLoop._RUNTIME_CHECKPOINT_KEY) is None
    assert session.messages[0]["role"] == "assistant"
    assert session.messages[1]["tool_call_id"] == "call_done"
    assert session.messages[2]["tool_call_id"] == "call_pending"
    assert "interrupted before this tool finished" in session.messages[2]["content"].lower()


def test_restore_runtime_checkpoint_dedupes_overlapping_tail() -> None:
    loop = _mk_loop()
    session = Session(
        key="test:checkpoint-overlap",
        messages=[
            {
                "role": "assistant",
                "content": "working",
                "tool_calls": [
                    {
                        "id": "call_done",
                        "type": "function",
                        "function": {"name": "read_file", "arguments": "{}"},
                    },
                    {
                        "id": "call_pending",
                        "type": "function",
                        "function": {"name": "exec", "arguments": "{}"},
                    },
                ],
            },
            {
                "role": "tool",
                "tool_call_id": "call_done",
                "name": "read_file",
                "content": "ok",
            },
        ],
        metadata={
            AgentLoop._RUNTIME_CHECKPOINT_KEY: {
                "assistant_message": {
                    "role": "assistant",
                    "content": "working",
                    "tool_calls": [
                        {
                            "id": "call_done",
                            "type": "function",
                            "function": {"name": "read_file", "arguments": "{}"},
                        },
                        {
                            "id": "call_pending",
                            "type": "function",
                            "function": {"name": "exec", "arguments": "{}"},
                        },
                    ],
                },
                "completed_tool_results": [
                    {
                        "role": "tool",
                        "tool_call_id": "call_done",
                        "name": "read_file",
                        "content": "ok",
                    }
                ],
                "pending_tool_calls": [
                    {
                        "id": "call_pending",
                        "type": "function",
                        "function": {"name": "exec", "arguments": "{}"},
                    }
                ],
            }
        },
    )

    restored = loop._restore_runtime_checkpoint(session)

    assert restored is True
    assert session.metadata.get(AgentLoop._RUNTIME_CHECKPOINT_KEY) is None
    assert len(session.messages) == 3
    assert session.messages[0]["role"] == "assistant"
    assert session.messages[1]["tool_call_id"] == "call_done"
    assert session.messages[2]["tool_call_id"] == "call_pending"


@pytest.mark.asyncio
async def test_process_message_persists_user_message_before_turn_completes(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]
    loop._run_agent_loop = AsyncMock(side_effect=RuntimeError("boom"))  # type: ignore[method-assign]

    msg = InboundMessage(channel="feishu", sender_id="u1", chat_id="c1", content="persist me")
    with pytest.raises(RuntimeError, match="boom"):
        await loop._process_message(msg)

    loop.sessions.invalidate("feishu:c1")
    persisted = loop.sessions.get_or_create("feishu:c1")
    assert [m["role"] for m in persisted.messages] == ["user"]
    assert persisted.messages[0]["content"] == "persist me"
    assert persisted.metadata.get(AgentLoop._PENDING_USER_TURN_KEY) is True
    assert persisted.updated_at >= persisted.created_at


# 1x1 PNG used by the media-persistence tests. ``extract_documents`` runs
# at the top of ``_process_message`` and filters ``msg.media`` down to
# paths that magic-byte-sniff as images, so the test fixture needs real
# bytes on disk (not just placeholder paths).
_PNG_1X1 = (
    b"\x89PNG\r\n\x1a\n"
    b"\x00\x00\x00\rIHDR\x00\x00\x00\x01\x00\x00\x00\x01"
    b"\x08\x06\x00\x00\x00\x1f\x15\xc4\x89"
    b"\x00\x00\x00\nIDATx\x9cc\x00\x00\x00\x02\x00\x01"
    b"\x00\x00\x00\x00IEND\xaeB`\x82"
)


@pytest.mark.asyncio
async def test_process_message_persists_media_paths_on_user_turn(tmp_path: Path) -> None:
    """User turns that attach images must record the media paths alongside
    the text so the webui can rehydrate previews on session replay.

    This is the producer half of the signed-media-URL round-trip: paths are
    stored here, then :meth:`WebSocketChannel._augment_media_urls` maps them
    onto signed URLs on the way out.
    """
    img_a = tmp_path / "uuid-1.png"
    img_a.write_bytes(_PNG_1X1)
    img_b = tmp_path / "uuid-2.png"
    img_b.write_bytes(_PNG_1X1)

    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]
    loop._run_agent_loop = AsyncMock(side_effect=RuntimeError("interrupt"))  # type: ignore[method-assign]

    msg = InboundMessage(
        channel="websocket",
        sender_id="u1",
        chat_id="c-media",
        content="look",
        media=[str(img_a), str(img_b)],
    )
    with pytest.raises(RuntimeError, match="interrupt"):
        await loop._process_message(msg)

    loop.sessions.invalidate("websocket:c-media")
    persisted = loop.sessions.get_or_create("websocket:c-media")
    assert [m["role"] for m in persisted.messages] == ["user"]
    assert persisted.messages[0]["content"] == "look"
    assert persisted.messages[0]["media"] == [str(img_a), str(img_b)]


@pytest.mark.asyncio
async def test_process_message_persists_media_only_turn_without_text(tmp_path: Path) -> None:
    """A turn with images but no text still persists (previously silent-dropped).

    The old early-persist gate skipped messages without text, leaving pure
    image turns un-checkpointed. They now materialise as an empty-content
    user row with ``media`` attached.
    """
    img = tmp_path / "only.png"
    img.write_bytes(_PNG_1X1)

    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]
    loop._run_agent_loop = AsyncMock(side_effect=RuntimeError("boom"))  # type: ignore[method-assign]

    msg = InboundMessage(
        channel="websocket",
        sender_id="u1",
        chat_id="c-images-only",
        content="",
        media=[str(img)],
    )
    with pytest.raises(RuntimeError):
        await loop._process_message(msg)

    loop.sessions.invalidate("websocket:c-images-only")
    persisted = loop.sessions.get_or_create("websocket:c-images-only")
    assert len(persisted.messages) == 1
    assert persisted.messages[0]["role"] == "user"
    assert persisted.messages[0]["content"] == ""
    assert persisted.messages[0]["media"] == [str(img)]


@pytest.mark.asyncio
async def test_process_message_does_not_duplicate_early_persisted_user_message(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]
    loop._run_agent_loop = AsyncMock(return_value=(
        "done",
        None,
        [
            {"role": "system", "content": "system"},
            {"role": "user", "content": "hello"},
            {"role": "assistant", "content": "done"},
        ],
        "stop",
        False,
    ))  # type: ignore[method-assign]

    result = await loop._process_message(
        InboundMessage(channel="feishu", sender_id="u1", chat_id="c2", content="hello")
    )

    assert result is not None
    assert result.content == "done"
    session = loop.sessions.get_or_create("feishu:c2")
    assert [
        {k: v for k, v in m.items() if k in {"role", "content"}}
        for m in session.messages
    ] == [
        {"role": "user", "content": "hello"},
        {"role": "assistant", "content": "done"},
    ]
    assert AgentLoop._PENDING_USER_TURN_KEY not in session.metadata


@pytest.mark.asyncio
async def test_process_message_uses_context_chat_id_for_runtime_prompt(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]
    loop.context.build_messages = MagicMock(  # type: ignore[method-assign]
        return_value=[
            {"role": "system", "content": "system"},
            {"role": "user", "content": "runtime + hello"},
        ]
    )
    loop._run_agent_loop = AsyncMock(return_value=(  # type: ignore[method-assign]
        "done",
        [],
        [
            {"role": "system", "content": "system"},
            {"role": "user", "content": "runtime + hello"},
            {"role": "assistant", "content": "done"},
        ],
        "stop",
        False,
    ))

    result = await loop._process_message(
        InboundMessage(
            channel="discord",
            sender_id="u1",
            chat_id="thread-777",
            content="hello",
            metadata={"context_chat_id": "parent-456"},
            session_key_override="discord:parent-456:thread:thread-777",
        )
    )

    assert result is not None
    assert result.chat_id == "thread-777"
    assert loop.context.build_messages.call_args.kwargs["chat_id"] == "parent-456"
    assert loop._run_agent_loop.call_args.kwargs["chat_id"] == "thread-777"


def test_set_tool_context_uses_effective_key_for_spawn_tool(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    spawn_tool = loop.tools.get("spawn")
    assert spawn_tool is not None

    loop._set_tool_context(
        "discord",
        "thread-777",
        session_key="discord:parent-456:thread:thread-777",
    )

    assert spawn_tool._origin_channel.get() == "discord"  # type: ignore[attr-defined]
    assert spawn_tool._origin_chat_id.get() == "thread-777"  # type: ignore[attr-defined]
    assert spawn_tool._session_key.get() == "discord:parent-456:thread:thread-777"  # type: ignore[attr-defined]


@pytest.mark.asyncio
async def test_next_turn_after_crash_closes_pending_user_turn_before_new_input(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]
    loop.provider.chat_with_retry = AsyncMock(return_value=MagicMock())  # unused because _run_agent_loop is stubbed

    session = loop.sessions.get_or_create("feishu:c3")
    session.add_message("user", "old question")
    session.metadata[AgentLoop._PENDING_USER_TURN_KEY] = True
    loop.sessions.save(session)

    loop._run_agent_loop = AsyncMock(return_value=(
        "new answer",
        None,
        [
            {"role": "system", "content": "system"},
            {"role": "user", "content": "old question"},
            {"role": "assistant", "content": "Error: Task interrupted before a response was generated."},
            {"role": "user", "content": "new question"},
            {"role": "assistant", "content": "new answer"},
        ],
        "stop",
        False,
    ))  # type: ignore[method-assign]

    result = await loop._process_message(
        InboundMessage(channel="feishu", sender_id="u1", chat_id="c3", content="new question")
    )

    assert result is not None
    assert result.content == "new answer"
    session = loop.sessions.get_or_create("feishu:c3")
    assert [
        {k: v for k, v in m.items() if k in {"role", "content"}}
        for m in session.messages
    ] == [
        {"role": "user", "content": "old question"},
        {"role": "assistant", "content": "Error: Task interrupted before a response was generated."},
        {"role": "user", "content": "new question"},
        {"role": "assistant", "content": "new answer"},
    ]
    assert AgentLoop._PENDING_USER_TURN_KEY not in session.metadata


@pytest.mark.asyncio
async def test_stop_preserves_runtime_checkpoint_for_next_turn(tmp_path: Path) -> None:
    from nanobot.command.builtin import cmd_stop
    from nanobot.command.router import CommandContext

    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]

    checkpoint_saved = asyncio.Event()

    async def interrupted_run_agent_loop(_initial_messages, *, session=None, **_kwargs):
        assert session is not None
        loop._set_runtime_checkpoint(
            session,
            {
                "assistant_message": {
                    "role": "assistant",
                    "content": "working",
                    "tool_calls": [
                        {
                            "id": "call_done",
                            "type": "function",
                            "function": {"name": "read_file", "arguments": "{}"},
                        },
                        {
                            "id": "call_pending",
                            "type": "function",
                            "function": {"name": "exec", "arguments": "{}"},
                        },
                    ],
                },
                "completed_tool_results": [
                    {
                        "role": "tool",
                        "tool_call_id": "call_done",
                        "name": "read_file",
                        "content": "ok",
                    }
                ],
                "pending_tool_calls": [
                    {
                        "id": "call_pending",
                        "type": "function",
                        "function": {"name": "exec", "arguments": "{}"},
                    }
                ],
            },
        )
        checkpoint_saved.set()
        await asyncio.Event().wait()

    loop._run_agent_loop = interrupted_run_agent_loop  # type: ignore[method-assign]

    first_msg = InboundMessage(channel="feishu", sender_id="u1", chat_id="c4", content="keep progress")
    task = asyncio.create_task(loop._process_message(first_msg))
    loop._active_tasks[first_msg.session_key] = [task]
    await asyncio.wait_for(checkpoint_saved.wait(), timeout=1.0)

    stop_msg = InboundMessage(channel="feishu", sender_id="u1", chat_id="c4", content="/stop")
    stop_ctx = CommandContext(msg=stop_msg, session=None, key=stop_msg.session_key, raw="/stop", loop=loop)
    stop_result = await cmd_stop(stop_ctx)

    assert "Stopped 1 task" in stop_result.content
    assert task.done()

    loop.sessions.invalidate("feishu:c4")
    interrupted = loop.sessions.get_or_create("feishu:c4")
    assert interrupted.metadata.get(AgentLoop._PENDING_USER_TURN_KEY) is True
    assert interrupted.metadata.get(AgentLoop._RUNTIME_CHECKPOINT_KEY) is not None

    async def resumed_run_agent_loop(initial_messages, **_kwargs):
        return (
            "next answer",
            None,
            [*initial_messages, {"role": "assistant", "content": "next answer"}],
            "stop",
            False,
        )

    loop._run_agent_loop = resumed_run_agent_loop  # type: ignore[method-assign]
    result = await loop._process_message(
        InboundMessage(channel="feishu", sender_id="u1", chat_id="c4", content="continue here")
    )

    assert result is not None
    assert result.content == "next answer"

    session = loop.sessions.get_or_create("feishu:c4")
    assert [
        {k: v for k, v in m.items() if k in {"role", "content", "tool_call_id", "name"}}
        for m in session.messages
    ] == [
        {"role": "user", "content": "keep progress"},
        {"role": "assistant", "content": "working"},
        {"role": "tool", "tool_call_id": "call_done", "name": "read_file", "content": "ok"},
        {
            "role": "tool",
            "tool_call_id": "call_pending",
            "name": "exec",
            "content": "Error: Task interrupted before this tool finished.",
        },
        {"role": "user", "content": "continue here"},
        {"role": "assistant", "content": "next answer"},
    ]
    assert AgentLoop._PENDING_USER_TURN_KEY not in session.metadata
    assert AgentLoop._RUNTIME_CHECKPOINT_KEY not in session.metadata


@pytest.mark.asyncio
async def test_system_subagent_followup_is_persisted_before_prompt_assembly(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]

    session = loop.sessions.get_or_create("cli:test")
    session.add_message("user", "question")
    session.add_message("assistant", "working")
    loop.sessions.save(session)

    seen: dict[str, list[dict]] = {}

    async def fake_run_agent_loop(initial_messages, **_kwargs):
        seen["initial_messages"] = initial_messages
        return (
            "done",
            [],
            [*initial_messages, {"role": "assistant", "content": "done"}],
            "stop",
            False,
        )

    loop._run_agent_loop = fake_run_agent_loop  # type: ignore[method-assign]

    await loop._process_message(
        InboundMessage(
            channel="system",
            sender_id="subagent",
            chat_id="cli:test",
            content="subagent result",
            metadata={"subagent_task_id": "sub-1"},
        )
    )

    non_system = [m for m in seen["initial_messages"] if m.get("role") != "system"]
    assert "question" in non_system[0]["content"]
    assert "working" in non_system[1]["content"]
    # User turns carry the timestamp prefix so the model can reason about
    # relative time. Assistant turns do NOT, otherwise the model treats those
    # past replies as in-context examples and starts its own outputs with
    # ``[Message Time: ...]`` (which then leaks back to the user).
    assert "[Message Time:" in non_system[0]["content"]
    assert "[Message Time:" not in non_system[1]["content"]
    assert non_system[2]["content"].count("subagent result") == 1
    assert "Current Time:" in non_system[2]["content"]

    loop.sessions.invalidate("cli:test")
    persisted = loop.sessions.get_or_create("cli:test")
    assert [
        {k: v for k, v in m.items() if k in {"role", "content", "injected_event", "subagent_task_id"}}
        for m in persisted.messages
    ] == [
        {"role": "user", "content": "question"},
        {"role": "assistant", "content": "working"},
        {
            "role": "assistant",
            "content": "subagent result",
            "injected_event": "subagent_result",
            "subagent_task_id": "sub-1",
        },
        {"role": "assistant", "content": "done"},
    ]


@pytest.mark.asyncio
async def test_multiple_subagent_followups_all_persist_as_standalone_history(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]

    async def fake_run_agent_loop(initial_messages, **_kwargs):
        return (
            "ack",
            [],
            [*initial_messages, {"role": "assistant", "content": "ack"}],
            "stop",
            False,
        )

    loop._run_agent_loop = fake_run_agent_loop  # type: ignore[method-assign]

    for idx in range(3):
        await loop._process_message(
            InboundMessage(
                channel="system",
                sender_id="subagent",
                chat_id="cli:multi",
                content=f"subagent result {idx}",
                metadata={"subagent_task_id": f"sub-{idx}"},
            )
        )

    loop.sessions.invalidate("cli:multi")
    persisted = loop.sessions.get_or_create("cli:multi")
    followups = [m for m in persisted.messages if m.get("injected_event") == "subagent_result"]
    assert [m["content"] for m in followups] == [
        "subagent result 0",
        "subagent result 1",
        "subagent result 2",
    ]


def test_prompt_merge_does_not_replace_standalone_subagent_history_entry(tmp_path: Path) -> None:
    loop = _mk_loop()
    session = Session(key="cli:merge")
    session.add_message("assistant", "previous assistant")

    inserted = loop._persist_subagent_followup(
        session,
        InboundMessage(
            channel="system",
            sender_id="subagent",
            chat_id="cli:merge",
            content="subagent result",
            metadata={"subagent_task_id": "sub-1"},
        ),
    )

    assert inserted is True

    builder = ContextBuilder(tmp_path)
    projected = builder.build_messages(
        history=session.get_history(max_messages=0),
        current_message="",
        current_role="assistant",
        channel="cli",
        chat_id="merge",
    )

    non_system = [m for m in projected if m.get("role") != "system"]
    assert len(non_system) == 2
    assert "subagent result" in non_system[-1]["content"]
    assert session.messages[-1]["content"] == "subagent result"
    assert session.messages[-1]["injected_event"] == "subagent_result"


def test_subagent_followup_dedupes_by_task_id() -> None:
    loop = _mk_loop()
    session = Session(key="cli:dedupe")
    msg = InboundMessage(
        channel="system",
        sender_id="subagent",
        chat_id="cli:dedupe",
        content="subagent result",
        metadata={"subagent_task_id": "sub-1"},
    )

    assert loop._persist_subagent_followup(session, msg) is True
    assert loop._persist_subagent_followup(session, msg) is False
    assert len(session.messages) == 1


def test_subagent_followup_skips_empty_content() -> None:
    loop = _mk_loop()
    session = Session(key="cli:empty")
    msg = InboundMessage(
        channel="system",
        sender_id="subagent",
        chat_id="cli:empty",
        content="",
        metadata={"subagent_task_id": "sub-empty"},
    )

    assert loop._persist_subagent_followup(session, msg) is False
    assert session.messages == []


def test_set_tool_context_passes_thread_session_key_to_spawn(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)

    loop._set_tool_context(
        "slack",
        "C123",
        message_id="msg-123",
        metadata={"slack": {"thread_ts": "1700.42", "channel_type": "channel"}},
        session_key="slack:C123:1700.42",
    )

    spawn_tool = loop.tools.get("spawn")
    assert spawn_tool is not None
    assert spawn_tool._session_key.get() == "slack:C123:1700.42"
    assert spawn_tool._origin_message_id.get() == "msg-123"


@pytest.mark.asyncio
async def test_system_subagent_followup_uses_thread_session_and_slack_metadata(tmp_path: Path) -> None:
    loop = _make_full_loop(tmp_path)
    loop.consolidator.maybe_consolidate_by_tokens = AsyncMock(return_value=False)  # type: ignore[method-assign]

    thread_session = loop.sessions.get_or_create("slack:C123:1700.42")
    thread_session.add_message("user", "thread question")
    loop.sessions.save(thread_session)

    seen: dict[str, list[dict]] = {}

    async def fake_run_agent_loop(initial_messages, **_kwargs):
        seen["initial_messages"] = initial_messages
        return (
            "done",
            [],
            [*initial_messages, {"role": "assistant", "content": "done"}],
            "stop",
            False,
        )

    loop._run_agent_loop = fake_run_agent_loop  # type: ignore[method-assign]

    outbound = await loop._process_message(
        InboundMessage(
            channel="system",
            sender_id="subagent",
            chat_id="slack:C123",
            content="subagent result",
            session_key_override="slack:C123:1700.42",
            metadata={"subagent_task_id": "sub-1", "origin_message_id": "msg-123"},
        )
    )

    assert outbound is not None
    assert outbound.channel == "slack"
    assert outbound.chat_id == "C123"
    assert outbound.metadata == {
        "slack": {"thread_ts": "1700.42"},
        "origin_message_id": "msg-123",
    }
    assert "thread question" in seen["initial_messages"][1]["content"]

    loop.sessions.invalidate("slack:C123:1700.42")
    persisted = loop.sessions.get_or_create("slack:C123:1700.42")
    assert any(m.get("subagent_task_id") == "sub-1" for m in persisted.messages)