diff --git a/docs/channel-plugin-guide.md b/docs/channel-plugin-guide.md index d37a92883..da668c9ee 100644 --- a/docs/channel-plugin-guide.md +++ b/docs/channel-plugin-guide.md @@ -238,6 +238,9 @@ nanobot channels login --force # re-authenticate | `supports_streaming` (property) | `True` when config has `"streaming": true` **and** subclass overrides `send_delta()`. | | `is_running` | Returns `self._running`. | | `login(force=False)` | Perform interactive login (e.g. QR code scan). Returns `True` if already authenticated or login succeeds. Override in subclasses that support interactive login. | +| `send_reasoning_delta(chat_id, delta, metadata?)` | Optional hook for streamed model reasoning/thinking content. Default is no-op. | +| `send_reasoning_end(chat_id, metadata?)` | Optional hook marking the end of a reasoning block. Default is no-op. | +| `send_reasoning(msg)` | Optional one-shot reasoning fallback. Default translates to `send_reasoning_delta()` + `send_reasoning_end()`. | ### Optional (streaming) @@ -350,6 +353,112 @@ When `streaming` is `false` (default) or omitted, only `send()` is called — no | `async send_delta(chat_id, delta, metadata?)` | Override to handle streaming chunks. No-op by default. | | `supports_streaming` (property) | Returns `True` when config has `streaming: true` **and** subclass overrides `send_delta`. | +## Progress, Tool Hints, and Reasoning + +Besides normal assistant text, nanobot can emit low-emphasis trace blocks. These are intended for UI affordances like status rows, collapsible "used tools" groups, or reasoning/thinking blocks. Platforms that do not have a good place for them can ignore them safely. + +### Progress and Tool Hints + +Progress and tool hints arrive through the normal `send(msg)` path. Check `msg.metadata` before rendering: + +```python +async def send(self, msg: OutboundMessage) -> None: + meta = msg.metadata or {} + + if meta.get("_tool_hint"): + # A short tool breadcrumb, e.g. read_file("config.json") + await self._send_trace(msg.chat_id, msg.content, kind="tool") + return + + if meta.get("_progress"): + # Generic non-final status, e.g. "Thinking..." or "Running command..." + await self._send_trace(msg.chat_id, msg.content, kind="progress") + return + + await self._send_message(msg.chat_id, msg.content, media=msg.media) +``` + +Tool hints are off by default for most channels. Users can enable them globally or per channel: + +```json +{ + "channels": { + "sendToolHints": true, + "webhook": { + "enabled": true, + "sendToolHints": true + } + } +} +``` + +### Reasoning Blocks + +Reasoning is delivered through dedicated optional hooks, not `send()`. Override `send_reasoning_delta()` and `send_reasoning_end()` if your platform can show model reasoning as a subdued/collapsible block. The default implementation is a no-op, so unsupported channels simply drop reasoning content. + +```python +class WebhookChannel(BaseChannel): + name = "webhook" + display_name = "Webhook" + + def __init__(self, config: Any, bus: MessageBus): + if isinstance(config, dict): + config = WebhookConfig(**config) + super().__init__(config, bus) + self._reasoning_buffers: dict[str, str] = {} + + async def send_reasoning_delta( + self, + chat_id: str, + delta: str, + metadata: dict[str, Any] | None = None, + ) -> None: + meta = metadata or {} + stream_id = str(meta.get("_stream_id") or chat_id) + self._reasoning_buffers[stream_id] = self._reasoning_buffers.get(stream_id, "") + delta + await self._update_reasoning_block(chat_id, self._reasoning_buffers[stream_id], final=False) + + async def send_reasoning_end( + self, + chat_id: str, + metadata: dict[str, Any] | None = None, + ) -> None: + meta = metadata or {} + stream_id = str(meta.get("_stream_id") or chat_id) + text = self._reasoning_buffers.pop(stream_id, "") + if text: + await self._update_reasoning_block(chat_id, text, final=True) +``` + +**Reasoning metadata flags:** + +| Flag | Meaning | +|------|---------| +| `_reasoning_delta: True` | A reasoning/thinking chunk; `delta` contains the new text. | +| `_reasoning_end: True` | The current reasoning block is complete; `delta` is empty. | +| `_reasoning: True` | Legacy one-shot reasoning. `BaseChannel.send_reasoning()` converts it to delta + end. | +| `_stream_id` | Stable id for this assistant turn/segment. Use it to key buffers instead of only `chat_id`. | + +Reasoning visibility is controlled by `showReasoning` globally or per channel: + +```json +{ + "channels": { + "showReasoning": true, + "webhook": { + "enabled": true, + "showReasoning": true + } + } +} +``` + +Recommended rendering: + +- Render tool hints and progress as trace/status UI, not as normal assistant replies. +- Render reasoning with lower visual emphasis and collapse it after completion when the platform supports that. +- Keep reasoning separate from final answer text. A final answer still arrives through `send()` or `send_delta()`. + ## Config ### Why Pydantic model is required diff --git a/docs/configuration.md b/docs/configuration.md index c0d73e7b2..0123017d2 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -743,6 +743,7 @@ Global settings that apply to all channels. Configure under the `channels` secti |---------|---------|-------------| | `sendProgress` | `true` | Stream agent's text progress to the channel | | `sendToolHints` | `false` | Stream tool-call hints (e.g. `read_file("…")`) | +| `showReasoning` | `true` | Allow channels to surface model reasoning/thinking content (DeepSeek-R1 `reasoning_content`, Anthropic `thinking_blocks`, inline `` tags). Reasoning flows as a dedicated stream with `_reasoning_delta` / `_reasoning_end` markers — channels override `send_reasoning_delta` / `send_reasoning_end` to render in-place updates. Even with `true`, channels without those overrides stay no-op silently. Currently surfaced on CLI and WebSocket/WebUI (italic shimmer header, auto-collapses after the stream ends); Telegram / Slack / Discord / Feishu / WeChat / Matrix keep the base no-op until their bubble UI is adapted. Independent of `sendProgress`. | | `sendMaxRetries` | `3` | Max delivery attempts per outbound message, including the initial send (0-10 configured, minimum 1 actual attempt) | | `transcriptionProvider` | `"groq"` | Voice transcription backend: `"groq"` (free tier, default) or `"openai"`. API key is auto-resolved from the matching provider config. | | `transcriptionLanguage` | `null` | Optional ISO-639-1 language hint for audio transcription, e.g. `"en"`, `"ko"`, `"ja"`. | diff --git a/docs/websocket.md b/docs/websocket.md index 556bb5bb6..d6a816ac1 100644 --- a/docs/websocket.md +++ b/docs/websocket.md @@ -128,6 +128,29 @@ All frames are JSON text. Each message has an `event` field. } ``` +**`reasoning_delta`** — incremental model reasoning / thinking chunk for the active assistant turn. Mirrors `delta` but targets the reasoning bubble above the answer rather than the answer body: + +```json +{ + "event": "reasoning_delta", + "chat_id": "uuid-v4", + "text": "Let me decompose ", + "stream_id": "r1" +} +``` + +**`reasoning_end`** — close marker for the active reasoning stream. WebUI uses this to lock the in-place bubble and switch from the shimmer header to a static collapsed state: + +```json +{ + "event": "reasoning_end", + "chat_id": "uuid-v4", + "stream_id": "r1" +} +``` + +Reasoning frames only flow when the channel's `showReasoning` is `true` (default) and the model returns reasoning content (DeepSeek-R1 / Kimi / MiMo / OpenAI reasoning models, Anthropic extended thinking, or inline `` / `` tags). Models without reasoning produce zero `reasoning_delta` frames. + **`runtime_model_updated`** — broadcast when the gateway runtime model changes, for example after `/model `: ```json diff --git a/nanobot/agent/hook.py b/nanobot/agent/hook.py index d0106cfb6..5b6fed445 100644 --- a/nanobot/agent/hook.py +++ b/nanobot/agent/hook.py @@ -22,6 +22,7 @@ class AgentHookContext: tool_results: list[Any] = field(default_factory=list) tool_events: list[dict[str, str]] = field(default_factory=list) streamed_content: bool = False + streamed_reasoning: bool = False final_content: str | None = None stop_reason: str | None = None error: str | None = None @@ -48,6 +49,17 @@ class AgentHook: async def before_execute_tools(self, context: AgentHookContext) -> None: pass + async def emit_reasoning(self, reasoning_content: str | None) -> None: + pass + + async def emit_reasoning_end(self) -> None: + """Mark the end of an in-flight reasoning stream. + + Hooks that buffer ``emit_reasoning`` chunks (for in-place UI updates) + flush and freeze the rendered group here. One-shot hooks ignore. + """ + pass + async def after_iteration(self, context: AgentHookContext) -> None: pass @@ -95,6 +107,12 @@ class CompositeHook(AgentHook): async def before_execute_tools(self, context: AgentHookContext) -> None: await self._for_each_hook_safe("before_execute_tools", context) + async def emit_reasoning(self, reasoning_content: str | None) -> None: + await self._for_each_hook_safe("emit_reasoning", reasoning_content) + + async def emit_reasoning_end(self) -> None: + await self._for_each_hook_safe("emit_reasoning_end") + async def after_iteration(self, context: AgentHookContext) -> None: await self._for_each_hook_safe("after_iteration", context) diff --git a/nanobot/agent/loop.py b/nanobot/agent/loop.py index 476a2caf2..9bfce39fb 100644 --- a/nanobot/agent/loop.py +++ b/nanobot/agent/loop.py @@ -4,7 +4,6 @@ from __future__ import annotations import asyncio import dataclasses -import json import os import time from contextlib import AsyncExitStack, nullcontext, suppress @@ -15,11 +14,12 @@ from typing import TYPE_CHECKING, Any, Awaitable, Callable from loguru import logger +from nanobot.agent import model_presets as preset_helpers from nanobot.agent.autocompact import AutoCompact from nanobot.agent.context import ContextBuilder -from nanobot.agent.hook import AgentHook, AgentHookContext, CompositeHook +from nanobot.agent.hook import AgentHook, CompositeHook from nanobot.agent.memory import Consolidator, Dream -from nanobot.agent import model_presets as preset_helpers +from nanobot.agent.progress_hook import AgentProgressHook from nanobot.agent.runner import _MAX_INJECTIONS_PER_TURN, AgentRunner, AgentRunSpec from nanobot.agent.subagent import SubagentManager from nanobot.agent.tools.file_state import FileStateStore, bind_file_states, reset_file_states @@ -38,12 +38,6 @@ from nanobot.utils.document import extract_documents from nanobot.utils.helpers import image_placeholder_text from nanobot.utils.helpers import truncate_text as truncate_text_fn from nanobot.utils.image_generation_intent import image_generation_prompt -from nanobot.utils.progress_events import ( - build_tool_event_finish_payloads, - build_tool_event_start_payload, - invoke_on_progress, - on_progress_accepts_tool_events, -) from nanobot.utils.runtime import EMPTY_FINAL_RESPONSE_MESSAGE from nanobot.utils.webui_titles import mark_webui_session, maybe_generate_webui_title_after_turn @@ -59,114 +53,6 @@ if TYPE_CHECKING: UNIFIED_SESSION_KEY = "unified:default" -class _LoopHook(AgentHook): - """Core hook for the main loop.""" - - def __init__( - self, - agent_loop: AgentLoop, - on_progress: Callable[..., Awaitable[None]] | None = None, - on_stream: Callable[[str], Awaitable[None]] | None = None, - on_stream_end: Callable[..., Awaitable[None]] | None = None, - *, - channel: str = "cli", - chat_id: str = "direct", - message_id: str | None = None, - metadata: dict[str, Any] | None = None, - session_key: str | None = None, - ) -> None: - super().__init__(reraise=True) - self._loop = agent_loop - self._on_progress = on_progress - self._on_stream = on_stream - self._on_stream_end = on_stream_end - self._channel = channel - self._chat_id = chat_id - self._message_id = message_id - self._metadata = metadata or {} - self._session_key = session_key - self._stream_buf = "" - - def wants_streaming(self) -> bool: - return self._on_stream is not None - - async def on_stream(self, context: AgentHookContext, delta: str) -> None: - from nanobot.utils.helpers import strip_think - - prev_clean = strip_think(self._stream_buf) - self._stream_buf += delta - new_clean = strip_think(self._stream_buf) - incremental = new_clean[len(prev_clean) :] - if incremental and self._on_stream: - await self._on_stream(incremental) - - async def on_stream_end(self, context: AgentHookContext, *, resuming: bool) -> None: - if self._on_stream_end: - await self._on_stream_end(resuming=resuming) - self._stream_buf = "" - - async def before_iteration(self, context: AgentHookContext) -> None: - self._loop._current_iteration = context.iteration - logger.debug( - "Starting agent loop iteration {} for session {}", - context.iteration, - self._session_key, - ) - - async def before_execute_tools(self, context: AgentHookContext) -> None: - if self._on_progress: - if not self._on_stream and not context.streamed_content: - thought = self._loop._strip_think( - context.response.content if context.response else None - ) - if thought: - await self._on_progress(thought) - tool_hint = self._loop._strip_think(self._loop._tool_hint(context.tool_calls)) - tool_events = [build_tool_event_start_payload(tc) for tc in context.tool_calls] - await invoke_on_progress( - self._on_progress, - tool_hint, - tool_hint=True, - tool_events=tool_events, - ) - for tc in context.tool_calls: - args_str = json.dumps(tc.arguments, ensure_ascii=False) - logger.info("Tool call: {}({})", tc.name, args_str[:200]) - self._loop._set_tool_context( - self._channel, - self._chat_id, - self._message_id, - self._metadata, - session_key=self._session_key, - ) - - async def after_iteration(self, context: AgentHookContext) -> None: - if ( - self._on_progress - and context.tool_calls - and context.tool_events - and on_progress_accepts_tool_events(self._on_progress) - ): - tool_events = build_tool_event_finish_payloads(context) - if tool_events: - await invoke_on_progress( - self._on_progress, - "", - tool_hint=False, - tool_events=tool_events, - ) - u = context.usage or {} - logger.debug( - "LLM usage: prompt={} completion={} cached={}", - u.get("prompt_tokens", 0), - u.get("completion_tokens", 0), - u.get("cached_tokens", 0), - ) - - def finalize_content(self, context: AgentHookContext, content: str | None) -> str | None: - return self._loop._strip_think(content) - - class TurnState(Enum): RESTORE = auto() COMPACT = auto() @@ -617,26 +503,11 @@ class AgentLoop: if tool and isinstance(tool, ContextAware): tool.set_context(request_ctx) - @staticmethod - def _strip_think(text: str | None) -> str | None: - """Remove blocks that some models embed in content.""" - if not text: - return None - from nanobot.utils.helpers import strip_think - - return strip_think(text) or None - @staticmethod def _runtime_chat_id(msg: InboundMessage) -> str: """Return the chat id shown in runtime metadata for the model.""" return str(msg.metadata.get("context_chat_id") or msg.chat_id) - def _tool_hint(self, tool_calls: list) -> str: - """Format tool calls as concise hints with smart abbreviation.""" - from nanobot.utils.tool_hints import format_tool_hints - - return format_tool_hints(tool_calls, max_length=self.tool_hint_max_length) - async def _build_bus_progress_callback( self, msg: InboundMessage ) -> Callable[..., Awaitable[None]]: @@ -647,10 +518,16 @@ class AgentLoop: *, tool_hint: bool = False, tool_events: list[dict[str, Any]] | None = None, + reasoning: bool = False, + reasoning_end: bool = False, ) -> None: meta = dict(msg.metadata or {}) meta["_progress"] = True meta["_tool_hint"] = tool_hint + if reasoning: + meta["_reasoning_delta"] = True + if reasoning_end: + meta["_reasoning_end"] = True if tool_events: meta["_tool_events"] = tool_events await self.bus.publish_outbound( @@ -794,8 +671,7 @@ class AgentLoop: """ self._sync_subagent_runtime_limits() - loop_hook = _LoopHook( - self, + loop_hook = AgentProgressHook( on_progress=on_progress, on_stream=on_stream, on_stream_end=on_stream_end, @@ -804,6 +680,9 @@ class AgentLoop: message_id=message_id, metadata=metadata, session_key=session_key, + tool_hint_max_length=self.tool_hint_max_length, + set_tool_context=self._set_tool_context, + on_iteration=lambda iteration: setattr(self, "_current_iteration", iteration), ) hook: AgentHook = ( CompositeHook([loop_hook] + self._extra_hooks) if self._extra_hooks else loop_hook diff --git a/nanobot/agent/progress_hook.py b/nanobot/agent/progress_hook.py new file mode 100644 index 000000000..a9bf6a1e9 --- /dev/null +++ b/nanobot/agent/progress_hook.py @@ -0,0 +1,178 @@ +"""Agent hook that adapts runner events into channel progress UI.""" + +from __future__ import annotations + +import inspect +import json +from typing import Any, Awaitable, Callable + +from loguru import logger + +from nanobot.agent.hook import AgentHook, AgentHookContext +from nanobot.utils.helpers import IncrementalThinkExtractor, strip_think +from nanobot.utils.progress_events import ( + build_tool_event_finish_payloads, + build_tool_event_start_payload, + invoke_on_progress, + on_progress_accepts_tool_events, +) +from nanobot.utils.tool_hints import format_tool_hints + + +class AgentProgressHook(AgentHook): + """Translate runner lifecycle events into user-visible progress signals.""" + + def __init__( + self, + on_progress: Callable[..., Awaitable[None]] | None = None, + on_stream: Callable[[str], Awaitable[None]] | None = None, + on_stream_end: Callable[..., Awaitable[None]] | None = None, + *, + channel: str = "cli", + chat_id: str = "direct", + message_id: str | None = None, + metadata: dict[str, Any] | None = None, + session_key: str | None = None, + tool_hint_max_length: int = 40, + set_tool_context: Callable[..., None] | None = None, + on_iteration: Callable[[int], None] | None = None, + ) -> None: + super().__init__(reraise=True) + self._on_progress = on_progress + self._on_stream = on_stream + self._on_stream_end = on_stream_end + self._channel = channel + self._chat_id = chat_id + self._message_id = message_id + self._metadata = metadata or {} + self._session_key = session_key + self._tool_hint_max_length = tool_hint_max_length + self._set_tool_context = set_tool_context + self._on_iteration = on_iteration + self._stream_buf = "" + self._think_extractor = IncrementalThinkExtractor() + self._reasoning_open = False + + def wants_streaming(self) -> bool: + return self._on_stream is not None + + @staticmethod + def _strip_think(text: str | None) -> str | None: + if not text: + return None + return strip_think(text) or None + + def _tool_hint(self, tool_calls: list[Any]) -> str: + return format_tool_hints(tool_calls, max_length=self._tool_hint_max_length) + + @staticmethod + def _on_progress_accepts(cb: Callable[..., Any], name: str) -> bool: + try: + sig = inspect.signature(cb) + except (TypeError, ValueError): + return False + if any(p.kind == inspect.Parameter.VAR_KEYWORD for p in sig.parameters.values()): + return True + return name in sig.parameters + + async def on_stream(self, context: AgentHookContext, delta: str) -> None: + prev_clean = strip_think(self._stream_buf) + self._stream_buf += delta + new_clean = strip_think(self._stream_buf) + incremental = new_clean[len(prev_clean) :] + + if await self._think_extractor.feed(self._stream_buf, self.emit_reasoning): + context.streamed_reasoning = True + + if incremental: + # Answer text has started; close the reasoning segment so the UI can + # lock the bubble before the answer renders below it. + await self.emit_reasoning_end() + if self._on_stream: + await self._on_stream(incremental) + + async def on_stream_end(self, context: AgentHookContext, *, resuming: bool) -> None: + await self.emit_reasoning_end() + if self._on_stream_end: + await self._on_stream_end(resuming=resuming) + self._stream_buf = "" + self._think_extractor.reset() + + async def before_iteration(self, context: AgentHookContext) -> None: + if self._on_iteration: + self._on_iteration(context.iteration) + logger.debug( + "Starting agent loop iteration {} for session {}", + context.iteration, + self._session_key, + ) + + async def before_execute_tools(self, context: AgentHookContext) -> None: + if self._on_progress: + if not self._on_stream and not context.streamed_content: + thought = self._strip_think(context.response.content if context.response else None) + if thought: + await self._on_progress(thought) + tool_hint = self._strip_think(self._tool_hint(context.tool_calls)) + tool_events = [build_tool_event_start_payload(tc) for tc in context.tool_calls] + await invoke_on_progress( + self._on_progress, + tool_hint, + tool_hint=True, + tool_events=tool_events, + ) + for tc in context.tool_calls: + args_str = json.dumps(tc.arguments, ensure_ascii=False) + logger.info("Tool call: {}({})", tc.name, args_str[:200]) + if self._set_tool_context: + self._set_tool_context( + self._channel, + self._chat_id, + self._message_id, + self._metadata, + session_key=self._session_key, + ) + + async def emit_reasoning(self, reasoning_content: str | None) -> None: + """Publish a reasoning chunk; channel plugins decide whether to render.""" + if ( + self._on_progress + and reasoning_content + and self._on_progress_accepts(self._on_progress, "reasoning") + ): + self._reasoning_open = True + await self._on_progress(reasoning_content, reasoning=True) + + async def emit_reasoning_end(self) -> None: + """Close the current reasoning stream segment, if any was open.""" + if self._reasoning_open and self._on_progress: + self._reasoning_open = False + await self._on_progress("", reasoning_end=True) + else: + self._reasoning_open = False + + async def after_iteration(self, context: AgentHookContext) -> None: + if ( + self._on_progress + and context.tool_calls + and context.tool_events + and on_progress_accepts_tool_events(self._on_progress) + ): + tool_events = build_tool_event_finish_payloads(context) + if tool_events: + await invoke_on_progress( + self._on_progress, + "", + tool_hint=False, + tool_events=tool_events, + ) + u = context.usage or {} + logger.debug( + "LLM usage: prompt={} completion={} cached={}", + u.get("prompt_tokens", 0), + u.get("completion_tokens", 0), + u.get("cached_tokens", 0), + ) + + def finalize_content(self, context: AgentHookContext, content: str | None) -> str | None: + return self._strip_think(content) diff --git a/nanobot/agent/runner.py b/nanobot/agent/runner.py index 9ea0d26de..37da63872 100644 --- a/nanobot/agent/runner.py +++ b/nanobot/agent/runner.py @@ -16,9 +16,11 @@ from nanobot.agent.hook import AgentHook, AgentHookContext from nanobot.agent.tools.registry import ToolRegistry from nanobot.providers.base import LLMProvider, LLMResponse, ToolCallRequest from nanobot.utils.helpers import ( + IncrementalThinkExtractor, build_assistant_message, estimate_message_tokens, estimate_prompt_tokens_chain, + extract_reasoning, find_legal_message_start, maybe_persist_tool_result, strip_think, @@ -281,6 +283,17 @@ class AgentRunner: context.tool_calls = list(response.tool_calls) self._accumulate_usage(usage, raw_usage) + reasoning_text, cleaned_content = extract_reasoning( + response.reasoning_content, + response.thinking_blocks, + response.content, + ) + response.content = cleaned_content + if reasoning_text and not context.streamed_reasoning: + await hook.emit_reasoning(reasoning_text) + await hook.emit_reasoning_end() + context.streamed_reasoning = True + if response.should_execute_tools: context.tool_calls = list(response.tool_calls) if hook.wants_streaming(): @@ -605,6 +618,8 @@ class AgentRunner: and getattr(self.provider, "supports_progress_deltas", False) is True ) + progress_state: dict[str, bool] | None = None + if wants_streaming: async def _stream(delta: str) -> None: if delta: @@ -617,6 +632,8 @@ class AgentRunner: ) elif wants_progress_streaming: stream_buf = "" + think_extractor = IncrementalThinkExtractor() + progress_state = {"reasoning_open": False} async def _stream_progress(delta: str) -> None: nonlocal stream_buf @@ -626,7 +643,15 @@ class AgentRunner: stream_buf += delta new_clean = strip_think(stream_buf) incremental = new_clean[len(prev_clean):] + + if await think_extractor.feed(stream_buf, hook.emit_reasoning): + context.streamed_reasoning = True + progress_state["reasoning_open"] = True + if incremental: + if progress_state["reasoning_open"]: + await hook.emit_reasoning_end() + progress_state["reasoning_open"] = False context.streamed_content = True await spec.progress_callback(incremental) @@ -637,16 +662,20 @@ class AgentRunner: else: coro = self.provider.chat_with_retry(**kwargs) - if timeout_s is None: - return await coro try: - return await asyncio.wait_for(coro, timeout=timeout_s) + response = ( + await coro if timeout_s is None + else await asyncio.wait_for(coro, timeout=timeout_s) + ) except asyncio.TimeoutError: return LLMResponse( content=f"Error calling LLM: timed out after {timeout_s:g}s", finish_reason="error", error_kind="timeout", ) + if progress_state and progress_state.get("reasoning_open"): + await hook.emit_reasoning_end() + return response async def _request_finalization_retry( self, diff --git a/nanobot/channels/base.py b/nanobot/channels/base.py index 087677494..257127d5a 100644 --- a/nanobot/channels/base.py +++ b/nanobot/channels/base.py @@ -28,6 +28,7 @@ class BaseChannel(ABC): transcription_language: str | None = None send_progress: bool = True send_tool_hints: bool = False + show_reasoning: bool = True def __init__(self, config: Any, bus: MessageBus): """ @@ -120,6 +121,53 @@ class BaseChannel(ABC): """ pass + async def send_reasoning_delta( + self, chat_id: str, delta: str, metadata: dict[str, Any] | None = None + ) -> None: + """Stream a chunk of model reasoning/thinking content. + + Default is no-op. Channels with a native low-emphasis primitive + (Slack context block, Telegram expandable blockquote, Discord + subtext, WebUI italic bubble, ...) override to render reasoning + as a subordinate trace that updates in place as the model thinks. + + Streaming contract mirrors :meth:`send_delta`: ``_reasoning_delta`` + is a chunk, ``_reasoning_end`` ends the current reasoning segment, + and stateful implementations should key buffers by ``_stream_id`` + rather than only by ``chat_id``. + """ + return + + async def send_reasoning_end( + self, chat_id: str, metadata: dict[str, Any] | None = None + ) -> None: + """Mark the end of a reasoning stream segment. + + Default is no-op. Channels that buffer ``send_reasoning_delta`` + chunks for in-place updates use this signal to flush and freeze + the rendered group; one-shot channels can ignore it entirely. + """ + return + + async def send_reasoning(self, msg: OutboundMessage) -> None: + """Deliver a complete reasoning block. + + Default implementation reuses the streaming pair so plugins only + need to override the delta/end methods. Equivalent to one delta + with the full content followed immediately by an end marker — + keeps a single rendering path for both streamed and one-shot + reasoning (e.g. DeepSeek-R1's final-response ``reasoning_content``). + """ + if not msg.content: + return + meta = dict(msg.metadata or {}) + meta.setdefault("_reasoning_delta", True) + await self.send_reasoning_delta(msg.chat_id, msg.content, meta) + end_meta = dict(meta) + end_meta.pop("_reasoning_delta", None) + end_meta["_reasoning_end"] = True + await self.send_reasoning_end(msg.chat_id, end_meta) + @property def supports_streaming(self) -> bool: """True when config enables streaming AND this subclass implements send_delta.""" diff --git a/nanobot/channels/manager.py b/nanobot/channels/manager.py index 1d92bb879..3a6b6e50f 100644 --- a/nanobot/channels/manager.py +++ b/nanobot/channels/manager.py @@ -36,6 +36,7 @@ _SEND_RETRY_DELAYS = (1, 2, 4) _BOOL_CAMEL_ALIASES: dict[str, str] = { "send_progress": "sendProgress", "send_tool_hints": "sendToolHints", + "show_reasoning": "showReasoning", } class ChannelManager: @@ -104,6 +105,9 @@ class ChannelManager: channel.send_tool_hints = self._resolve_bool_override( section, "send_tool_hints", self.config.channels.send_tool_hints, ) + channel.show_reasoning = self._resolve_bool_override( + section, "show_reasoning", self.config.channels.show_reasoning, + ) self.channels[name] = channel logger.info("{} channel enabled", cls.display_name) except Exception as e: @@ -279,6 +283,23 @@ class ChannelManager: timeout=1.0 ) + if ( + msg.metadata.get("_reasoning_delta") + or msg.metadata.get("_reasoning_end") + or msg.metadata.get("_reasoning") + ): + # Reasoning rides its own plugin channel: only delivered + # when the destination channel opts in via ``show_reasoning`` + # and overrides the streaming primitives. Channels without + # a low-emphasis UI affordance keep the base no-op and the + # content silently drops here. ``_reasoning`` (one-shot) + # is accepted for backward compatibility with hooks that + # haven't migrated to delta/end yet. + channel = self.channels.get(msg.channel) + if channel is not None and channel.show_reasoning: + await self._send_with_retry(channel, msg) + continue + if msg.metadata.get("_progress"): if msg.metadata.get("_tool_hint") and not self._should_send_progress( msg.channel, tool_hint=True, @@ -329,7 +350,16 @@ class ChannelManager: @staticmethod async def _send_once(channel: BaseChannel, msg: OutboundMessage) -> None: """Send one outbound message without retry policy.""" - if msg.metadata.get("_stream_delta") or msg.metadata.get("_stream_end"): + if msg.metadata.get("_reasoning_end"): + await channel.send_reasoning_end(msg.chat_id, msg.metadata) + elif msg.metadata.get("_reasoning_delta"): + await channel.send_reasoning_delta(msg.chat_id, msg.content, msg.metadata) + elif msg.metadata.get("_reasoning"): + # Back-compat: one-shot reasoning. BaseChannel translates this + # to a single delta + end pair so plugins only implement the + # streaming primitives. + await channel.send_reasoning(msg) + elif msg.metadata.get("_stream_delta") or msg.metadata.get("_stream_end"): await channel.send_delta(msg.chat_id, msg.content, msg.metadata) elif not msg.metadata.get("_streamed"): await channel.send(msg) diff --git a/nanobot/channels/websocket.py b/nanobot/channels/websocket.py index 76ca513d0..a77c8594f 100644 --- a/nanobot/channels/websocket.py +++ b/nanobot/channels/websocket.py @@ -1487,6 +1487,54 @@ class WebSocketChannel(BaseChannel): for connection in conns: await self._safe_send_to(connection, raw, label=" ") + async def send_reasoning_delta( + self, + chat_id: str, + delta: str, + metadata: dict[str, Any] | None = None, + ) -> None: + """Push one chunk of model reasoning. Mirrors ``send_delta`` shape so + WebUI receives a stream that opens, updates in place, and closes — + rendered above the active assistant bubble with a shimmer header + until the matching ``reasoning_end`` arrives. + """ + conns = list(self._subs.get(chat_id, ())) + if not conns or not delta: + return + meta = metadata or {} + body: dict[str, Any] = { + "event": "reasoning_delta", + "chat_id": chat_id, + "text": delta, + } + stream_id = meta.get("_stream_id") + if stream_id is not None: + body["stream_id"] = stream_id + raw = json.dumps(body, ensure_ascii=False) + for connection in conns: + await self._safe_send_to(connection, raw, label=" reasoning ") + + async def send_reasoning_end( + self, + chat_id: str, + metadata: dict[str, Any] | None = None, + ) -> None: + """Close the current reasoning stream segment for in-place renderers.""" + conns = list(self._subs.get(chat_id, ())) + if not conns: + return + meta = metadata or {} + body: dict[str, Any] = { + "event": "reasoning_end", + "chat_id": chat_id, + } + stream_id = meta.get("_stream_id") + if stream_id is not None: + body["stream_id"] = stream_id + raw = json.dumps(body, ensure_ascii=False) + for connection in conns: + await self._safe_send_to(connection, raw, label=" reasoning_end ") + async def send_delta( self, chat_id: str, diff --git a/nanobot/cli/commands.py b/nanobot/cli/commands.py index 0d71d91db..e02653bf9 100644 --- a/nanobot/cli/commands.py +++ b/nanobot/cli/commands.py @@ -176,13 +176,15 @@ def _print_agent_response( response: str, render_markdown: bool, metadata: dict | None = None, + show_header: bool = True, ) -> None: """Render assistant response with consistent terminal styling.""" console = _make_console() content = response or "" body = _response_renderable(content, render_markdown, metadata) - console.print() - console.print(f"[cyan]{__logo__} nanobot[/cyan]") + if show_header: + console.print() + console.print(f"[cyan]{__logo__} nanobot[/cyan]") console.print(body) console.print() @@ -228,42 +230,70 @@ async def _print_interactive_response( await run_in_terminal(_write) -def _print_cli_progress_line(text: str, thinking: ThinkingSpinner | None) -> None: +def _print_cli_progress_line(text: str, thinking: ThinkingSpinner | None, renderer: StreamRenderer | None = None) -> None: """Print a CLI progress line, pausing the spinner if needed.""" if not text.strip(): return - with thinking.pause() if thinking else nullcontext(): - console.print(f" [dim]↳ {text}[/dim]") + target = renderer.console if renderer else console + pause = renderer.pause_spinner() if renderer else (thinking.pause() if thinking else nullcontext()) + with pause: + if renderer: + renderer.ensure_header() + target.print(f" [dim]↳ {text}[/dim]") -async def _print_interactive_progress_line(text: str, renderer: StreamRenderer | None) -> None: - """Print an interactive progress line, pausing the renderer's spinner if needed.""" +def _print_cli_reasoning(text: str, thinking: ThinkingSpinner | None, renderer: StreamRenderer | None = None) -> None: + """Print reasoning/thinking content in a distinct style.""" if not text.strip(): return - with renderer.pause() if renderer else nullcontext(): - await _print_interactive_line(text) + target = renderer.console if renderer else console + pause = renderer.pause_spinner() if renderer else (thinking.pause() if thinking else nullcontext()) + with pause: + if renderer: + renderer.ensure_header() + target.print(f"[dim italic]✻ {text}[/dim italic]") + + +async def _print_interactive_progress_line(text: str, thinking: ThinkingSpinner | None, renderer: StreamRenderer | None = None) -> None: + """Print an interactive progress line, pausing the spinner if needed.""" + if not text.strip(): + return + if renderer: + with renderer.pause_spinner(): + renderer.ensure_header() + renderer.console.print(f" [dim]↳ {text}[/dim]") + else: + with thinking.pause() if thinking else nullcontext(): + await _print_interactive_line(text) async def _maybe_print_interactive_progress( msg: Any, - renderer: StreamRenderer | None, + thinking: ThinkingSpinner | None, channels_config: Any, + renderer: StreamRenderer | None = None, ) -> bool: metadata = msg.metadata or {} if metadata.get("_retry_wait"): - await _print_interactive_progress_line(msg.content, renderer) + await _print_interactive_progress_line(msg.content, thinking, renderer) return True if not metadata.get("_progress"): return False is_tool_hint = metadata.get("_tool_hint", False) + is_reasoning = metadata.get("_reasoning", False) or metadata.get("_reasoning_delta", False) + if is_reasoning: + if channels_config and not channels_config.show_reasoning: + return True + _print_cli_reasoning(msg.content, thinking, renderer) + return True if channels_config and is_tool_hint and not channels_config.send_tool_hints: return True if channels_config and not is_tool_hint and not channels_config.send_progress: return True - await _print_interactive_progress_line(msg.content, renderer) + await _print_interactive_progress_line(msg.content, thinking, renderer) return True @@ -1064,13 +1094,20 @@ def agent( # Shared reference for progress callbacks _thinking: ThinkingSpinner | None = None - async def _cli_progress(content: str, *, tool_hint: bool = False, **_kwargs: Any) -> None: - ch = agent_loop.channels_config - if ch and tool_hint and not ch.send_tool_hints: - return - if ch and not tool_hint and not ch.send_progress: - return - _print_cli_progress_line(content, _thinking) + def _make_progress(renderer: StreamRenderer | None = None): + async def _cli_progress(content: str, *, tool_hint: bool = False, reasoning: bool = False, **_kwargs: Any) -> None: + ch = agent_loop.channels_config + if reasoning: + if ch and not ch.show_reasoning: + return + _print_cli_reasoning(content, _thinking, renderer) + return + if ch and tool_hint and not ch.send_tool_hints: + return + if ch and not tool_hint and not ch.send_progress: + return + _print_cli_progress_line(content, _thinking, renderer) + return _cli_progress if message: # Single message mode — direct call, no bus needed @@ -1082,16 +1119,20 @@ def agent( ) response = await agent_loop.process_direct( message, session_id, - on_progress=_cli_progress, + on_progress=_make_progress(renderer), on_stream=renderer.on_delta, on_stream_end=renderer.on_end, ) if not renderer.streamed: await renderer.close() + print_kwargs: dict[str, Any] = {} + if renderer.header_printed: + print_kwargs["show_header"] = False _print_agent_response( response.content if response else "", render_markdown=markdown, metadata=response.metadata if response else None, + **print_kwargs, ) await agent_loop.close_mcp() @@ -1154,6 +1195,7 @@ def agent( msg, renderer, agent_loop.channels_config, + renderer, ): continue @@ -1215,8 +1257,14 @@ def agent( if content and not meta.get("_streamed"): if renderer: await renderer.close() + print_kwargs: dict[str, Any] = {} + if renderer and renderer.header_printed: + print_kwargs["show_header"] = False _print_agent_response( - content, render_markdown=markdown, metadata=meta, + content, + render_markdown=markdown, + metadata=meta, + **print_kwargs, ) elif renderer and not renderer.streamed: await renderer.close() diff --git a/nanobot/cli/stream.py b/nanobot/cli/stream.py index c6b5a87ad..24a141cdd 100644 --- a/nanobot/cli/stream.py +++ b/nanobot/cli/stream.py @@ -1,13 +1,16 @@ """Streaming renderer for CLI output. -Uses Rich Live with auto_refresh=False for stable, flicker-free -markdown rendering during streaming. Ellipsis mode handles overflow. +Uses Rich Live with ``transient=True`` for in-place markdown updates during +streaming. After the live display stops, a final clean render is printed +so the content persists on screen. ``transient=True`` ensures the live +area is erased before ``stop()`` returns, avoiding the duplication bug +that plagued earlier approaches. """ from __future__ import annotations import sys -import time +from contextlib import contextmanager, nullcontext from rich.console import Console from rich.live import Live @@ -15,6 +18,16 @@ from rich.markdown import Markdown from rich.text import Text +def _clear_current_line(console: Console) -> None: + """Erase a transient status line before printing persistent output.""" + file = console.file + isatty = getattr(file, "isatty", lambda: False) + if not isatty(): + return + file.write("\r\x1b[2K") + file.flush() + + def _make_console() -> Console: """Create a Console that emits plain text when stdout is not a TTY. @@ -34,6 +47,7 @@ class ThinkingSpinner: def __init__(self, console: Console | None = None, bot_name: str = "nanobot"): c = console or _make_console() + self._console = c self._spinner = c.status(f"[dim]{bot_name} is thinking...[/dim]", spinner="dots") self._active = False @@ -45,6 +59,7 @@ class ThinkingSpinner: def __exit__(self, *exc): self._active = False self._spinner.stop() + _clear_current_line(self._console) return False def pause(self): @@ -55,6 +70,7 @@ class ThinkingSpinner: def _ctx(): if self._spinner and self._active: self._spinner.stop() + _clear_current_line(self._console) try: yield finally: @@ -65,13 +81,14 @@ class ThinkingSpinner: class StreamRenderer: - """Rich Live streaming with markdown. auto_refresh=False avoids render races. + """Streaming renderer with Rich Live for in-place updates. - Deltas arrive pre-filtered (no tags) from the agent loop. + During streaming: updates content in-place via Rich Live. + On end: stops Live (transient=True erases it), then prints final render. Flow per round: - spinner -> first visible delta -> header + Live renders -> - on_end -> Live stops (content stays on screen) + spinner -> first delta -> header + Live updates -> + on_end -> stop Live + final render """ def __init__( @@ -86,14 +103,24 @@ class StreamRenderer: self._bot_name = bot_name self._bot_icon = bot_icon self._buf = "" - self._live: Live | None = None - self._t = 0.0 self.streamed = False + self._console = _make_console() + self._live: Live | None = None self._spinner: ThinkingSpinner | None = None + self._header_printed = False self._start_spinner() - def _render(self): - return Markdown(self._buf) if self._md and self._buf else Text(self._buf or "") + def _renderable(self): + """Create a renderable from the current buffer.""" + if self._md and self._buf: + return Markdown(self._buf) + return Text(self._buf or "") + + def _render_str(self) -> str: + """Render current buffer to a plain string via Rich.""" + with self._console.capture() as cap: + self._console.print(self._renderable()) + return cap.get() def _start_spinner(self) -> None: if self._show_spinner: @@ -105,37 +132,85 @@ class StreamRenderer: self._spinner.__exit__(None, None, None) self._spinner = None + @property + def console(self) -> Console: + """Expose the Live's console so external print functions can use it.""" + return self._console + + @property + def header_printed(self) -> bool: + """Whether this turn has already opened the assistant output block.""" + return self._header_printed + + def ensure_header(self) -> None: + """Stop transient status and print the assistant header once.""" + # A turn can print trace rows before the final answer, then restart the + # spinner while tools run. The next answer delta still needs to stop + # that spinner even though the header was already printed. + self._stop_spinner() + if self._header_printed: + return + self._console.print() + header = f"{self._bot_icon} {self._bot_name}" if self._bot_icon else self._bot_name + self._console.print(f"[cyan]{header}[/cyan]") + self._header_printed = True + + def pause_spinner(self): + """Context manager: temporarily stop transient output for clean trace lines.""" + @contextmanager + def _pause(): + live_was_active = self._live is not None + if self._live: + # Trace/reasoning can arrive after answer streaming has started. + # Stop the transient Live view first so it does not leak a raw + # partial markdown frame before the trace line. + self._live.stop() + self._live = None + with self._spinner.pause() if self._spinner else nullcontext(): + yield + # If more answer deltas arrive after the trace, on_delta() will + # create a fresh Live using the existing buffer. If no deltas arrive, + # on_end() prints the final buffered answer once. + if live_was_active: + return + + return _pause() + async def on_delta(self, delta: str) -> None: self.streamed = True self._buf += delta if self._live is None: if not self._buf.strip(): return - self._stop_spinner() - c = _make_console() - c.print() - header = f"{self._bot_icon} {self._bot_name}" if self._bot_icon else self._bot_name - c.print(f"[cyan]{header}[/cyan]") - self._live = Live(self._render(), console=c, auto_refresh=False) + self.ensure_header() + self._live = Live( + self._renderable(), + console=self._console, + auto_refresh=False, + transient=True, + ) self._live.start() - now = time.monotonic() - if (now - self._t) > 0.15: - self._live.update(self._render()) - self._live.refresh() - self._t = now + else: + self._live.update(self._renderable()) + self._live.refresh() async def on_end(self, *, resuming: bool = False) -> None: if self._live: - self._live.update(self._render()) + # Double-refresh to sync _shape before stop() calls refresh(). + self._live.refresh() + self._live.update(self._renderable()) self._live.refresh() self._live.stop() self._live = None self._stop_spinner() + if self._buf.strip(): + # Print final rendered content (persists after Live is gone). + out = sys.stdout + out.write(self._render_str()) + out.flush() if resuming: self._buf = "" self._start_spinner() - else: - _make_console().print() def stop_for_input(self) -> None: """Stop spinner before user input to avoid prompt_toolkit conflicts.""" @@ -143,7 +218,6 @@ class StreamRenderer: def pause(self): """Context manager: pause spinner for external output. No-op once streaming has started.""" - from contextlib import nullcontext if self._spinner: return self._spinner.pause() return nullcontext() diff --git a/nanobot/config/schema.py b/nanobot/config/schema.py index 0f1f06c69..ff7454d71 100644 --- a/nanobot/config/schema.py +++ b/nanobot/config/schema.py @@ -35,6 +35,7 @@ class ChannelsConfig(Base): send_progress: bool = True # stream agent's text progress to the channel send_tool_hints: bool = False # stream tool-call hints (e.g. read_file("…")) + show_reasoning: bool = True # surface model reasoning when channel implements it send_max_retries: int = Field(default=3, ge=0, le=10) # Max delivery attempts (initial send included) transcription_provider: str = "groq" # Voice transcription backend: "groq" or "openai" transcription_language: str | None = Field(default=None, pattern=r"^[a-z]{2,3}$") # Optional ISO-639-1 hint for audio transcription diff --git a/nanobot/utils/helpers.py b/nanobot/utils/helpers.py index 0655b4439..2a969298c 100644 --- a/nanobot/utils/helpers.py +++ b/nanobot/utils/helpers.py @@ -71,6 +71,93 @@ def strip_think(text: str) -> str: return text.strip() +def extract_think(text: str) -> tuple[str | None, str]: + """Extract thinking content from inline ```` / ```` blocks. + + Returns ``(thinking_text, cleaned_text)``. Only closed blocks are + extracted; unclosed streaming prefixes are stripped from the cleaned + text but not surfaced — :func:`strip_think` handles that case. + """ + parts: list[str] = [] + for m in re.finditer(r"([\s\S]*?)", text): + parts.append(m.group(1).strip()) + for m in re.finditer(r"([\s\S]*?)", text): + parts.append(m.group(1).strip()) + thinking = "\n\n".join(parts) if parts else None + return thinking, strip_think(text) + + +class IncrementalThinkExtractor: + """Stateful inline ```` extractor for streaming buffers. + + Streaming providers expose only a single content delta channel. When a + model embeds reasoning in ``...`` blocks inside that + channel, callers need to surface the reasoning incrementally as it + arrives without re-emitting earlier text. This holds the "already + emitted" cursor so the runner and the loop hook share one shape. + """ + + __slots__ = ("_emitted",) + + def __init__(self) -> None: + self._emitted = "" + + def reset(self) -> None: + self._emitted = "" + + async def feed(self, buf: str, emit: Any) -> bool: + """Emit any new thinking text found in ``buf``. + + Returns True if anything was emitted this call. ``emit`` is an + async callable taking a single string (typically + ``hook.emit_reasoning``). + """ + thinking, _ = extract_think(buf) + if not thinking or thinking == self._emitted: + return False + new = thinking[len(self._emitted):].strip() + self._emitted = thinking + if not new: + return False + await emit(new) + return True + + +def extract_reasoning( + reasoning_content: str | None, + thinking_blocks: list[dict[str, Any]] | None, + content: str | None, +) -> tuple[str | None, str | None]: + """Return ``(reasoning_text, cleaned_content)`` from one model response. + + Single source of truth for "what reasoning did this response carry, and + what answer text remains after we peel it out". Fallback order: + + 1. Dedicated ``reasoning_content`` (DeepSeek-R1, Kimi, MiMo, OpenAI + reasoning models, Bedrock). + 2. Anthropic ``thinking_blocks``. + 3. Inline ```` / ```` blocks in ``content``. + + Only one source contributes per response; lower-priority sources are + ignored if a higher-priority one is present, but inline ```` + tags are still stripped from ``content`` so they never leak into the + final answer. + """ + if reasoning_content: + return reasoning_content, strip_think(content) if content else content + if thinking_blocks: + parts = [ + tb.get("thinking", "") + for tb in thinking_blocks + if isinstance(tb, dict) and tb.get("type") == "thinking" + ] + joined = "\n\n".join(p for p in parts if p) + return (joined or None), strip_think(content) if content else content + if content: + return extract_think(content) + return None, content + + def detect_image_mime(data: bytes) -> str | None: """Detect image MIME type from magic bytes, ignoring file extension.""" if data[:8] == b"\x89PNG\r\n\x1a\n": diff --git a/tests/agent/test_hook_composite.py b/tests/agent/test_hook_composite.py index 8971d48ec..9b6c2820d 100644 --- a/tests/agent/test_hook_composite.py +++ b/tests/agent/test_hook_composite.py @@ -13,6 +13,17 @@ def _ctx() -> AgentHookContext: return AgentHookContext(iteration=0, messages=[]) +# --------------------------------------------------------------------------- +# Base AgentHook emit_reasoning: no-op +# --------------------------------------------------------------------------- + + +@pytest.mark.asyncio +async def test_base_hook_emit_reasoning_is_noop(): + hook = AgentHook() + await hook.emit_reasoning("should not raise") + + # --------------------------------------------------------------------------- # Fan-out: every hook is called in order # --------------------------------------------------------------------------- @@ -45,6 +56,9 @@ async def test_composite_fans_out_all_async_methods(): async def before_iteration(self, context: AgentHookContext) -> None: events.append("before_iteration") + async def emit_reasoning(self, reasoning_content: str | None) -> None: + events.append(f"emit_reasoning:{reasoning_content}") + async def on_stream(self, context: AgentHookContext, delta: str) -> None: events.append(f"on_stream:{delta}") @@ -61,6 +75,7 @@ async def test_composite_fans_out_all_async_methods(): ctx = _ctx() await hook.before_iteration(ctx) + await hook.emit_reasoning("thinking...") await hook.on_stream(ctx, "hi") await hook.on_stream_end(ctx, resuming=True) await hook.before_execute_tools(ctx) @@ -68,6 +83,7 @@ async def test_composite_fans_out_all_async_methods(): assert events == [ "before_iteration", "before_iteration", + "emit_reasoning:thinking...", "emit_reasoning:thinking...", "on_stream:hi", "on_stream:hi", "on_stream_end:True", "on_stream_end:True", "before_execute_tools", "before_execute_tools", @@ -120,6 +136,8 @@ async def test_composite_error_isolation_all_async(): calls: list[str] = [] class Bad(AgentHook): + async def emit_reasoning(self, reasoning_content): + raise RuntimeError("err") async def on_stream_end(self, context, *, resuming): raise RuntimeError("err") async def before_execute_tools(self, context): @@ -128,6 +146,8 @@ async def test_composite_error_isolation_all_async(): raise RuntimeError("err") class Good(AgentHook): + async def emit_reasoning(self, reasoning_content): + calls.append("emit_reasoning") async def on_stream_end(self, context, *, resuming): calls.append("on_stream_end") async def before_execute_tools(self, context): @@ -137,10 +157,11 @@ async def test_composite_error_isolation_all_async(): hook = CompositeHook([Bad(), Good()]) ctx = _ctx() + await hook.emit_reasoning("test") await hook.on_stream_end(ctx, resuming=False) await hook.before_execute_tools(ctx) await hook.after_iteration(ctx) - assert calls == ["on_stream_end", "before_execute_tools", "after_iteration"] + assert calls == ["emit_reasoning", "on_stream_end", "before_execute_tools", "after_iteration"] # --------------------------------------------------------------------------- diff --git a/tests/agent/test_runner_reasoning.py b/tests/agent/test_runner_reasoning.py new file mode 100644 index 000000000..d971e05a1 --- /dev/null +++ b/tests/agent/test_runner_reasoning.py @@ -0,0 +1,321 @@ +"""Tests for AgentRunner reasoning extraction and emission. + +Covers the three sources of model reasoning (dedicated ``reasoning_content``, +Anthropic ``thinking_blocks``, inline ````/```` tags) plus +the streaming interaction: reasoning and answer streams are independent +channels, gated by ``context.streamed_reasoning`` rather than +``context.streamed_content``. +""" + +from __future__ import annotations + +from unittest.mock import AsyncMock, MagicMock + +import pytest + +from nanobot.agent.hook import AgentHook +from nanobot.config.schema import AgentDefaults +from nanobot.providers.base import LLMResponse, ToolCallRequest + +_MAX_TOOL_RESULT_CHARS = AgentDefaults().max_tool_result_chars + + +class _RecordingHook(AgentHook): + def __init__(self) -> None: + super().__init__() + self.emitted: list[str] = [] + self.end_calls = 0 + + async def emit_reasoning(self, reasoning_content: str | None) -> None: + if reasoning_content: + self.emitted.append(reasoning_content) + + async def emit_reasoning_end(self) -> None: + self.end_calls += 1 + + +@pytest.mark.asyncio +async def test_runner_preserves_reasoning_fields_in_assistant_history(): + """Reasoning fields ride along on the persisted assistant message so + follow-up provider calls retain the model's prior thinking context.""" + from nanobot.agent.runner import AgentRunSpec, AgentRunner + + provider = MagicMock() + captured_second_call: list[dict] = [] + call_count = {"n": 0} + + async def chat_with_retry(*, messages, **kwargs): + call_count["n"] += 1 + if call_count["n"] == 1: + return LLMResponse( + content="thinking", + tool_calls=[ToolCallRequest(id="call_1", name="list_dir", arguments={"path": "."})], + reasoning_content="hidden reasoning", + thinking_blocks=[{"type": "thinking", "thinking": "step"}], + usage={"prompt_tokens": 5, "completion_tokens": 3}, + ) + captured_second_call[:] = messages + return LLMResponse(content="done", tool_calls=[], usage={}) + + provider.chat_with_retry = chat_with_retry + tools = MagicMock() + tools.get_definitions.return_value = [] + tools.execute = AsyncMock(return_value="tool result") + + runner = AgentRunner(provider) + result = await runner.run(AgentRunSpec( + initial_messages=[ + {"role": "system", "content": "system"}, + {"role": "user", "content": "do task"}, + ], + tools=tools, + model="test-model", + max_iterations=3, + max_tool_result_chars=_MAX_TOOL_RESULT_CHARS, + )) + + assert result.final_content == "done" + assistant_messages = [ + msg for msg in captured_second_call + if msg.get("role") == "assistant" and msg.get("tool_calls") + ] + assert len(assistant_messages) == 1 + assert assistant_messages[0]["reasoning_content"] == "hidden reasoning" + assert assistant_messages[0]["thinking_blocks"] == [{"type": "thinking", "thinking": "step"}] + + +@pytest.mark.asyncio +async def test_runner_emits_anthropic_thinking_blocks(): + from nanobot.agent.runner import AgentRunSpec, AgentRunner + + provider = MagicMock() + + async def chat_with_retry(**kwargs): + return LLMResponse( + content="The answer is 42.", + thinking_blocks=[ + {"type": "thinking", "thinking": "Let me analyze this step by step.", "signature": "sig1"}, + {"type": "thinking", "thinking": "After careful consideration.", "signature": "sig2"}, + ], + tool_calls=[], + usage={"prompt_tokens": 5, "completion_tokens": 3}, + ) + + provider.chat_with_retry = chat_with_retry + tools = MagicMock() + tools.get_definitions.return_value = [] + + hook = _RecordingHook() + runner = AgentRunner(provider) + result = await runner.run(AgentRunSpec( + initial_messages=[{"role": "user", "content": "question"}], + tools=tools, + model="test-model", + max_iterations=3, + max_tool_result_chars=_MAX_TOOL_RESULT_CHARS, + hook=hook, + )) + + assert result.final_content == "The answer is 42." + assert len(hook.emitted) == 1 + assert "Let me analyze this" in hook.emitted[0] + assert "After careful consideration" in hook.emitted[0] + + +@pytest.mark.asyncio +async def test_runner_emits_inline_think_content_as_reasoning(): + """Models embedding reasoning in ... blocks should have + that content extracted and emitted, and stripped from the answer.""" + from nanobot.agent.runner import AgentRunSpec, AgentRunner + + provider = MagicMock() + + async def chat_with_retry(**kwargs): + return LLMResponse( + content="Let me think about this...\nThe answer is 42.The answer is 42.", + tool_calls=[], + usage={"prompt_tokens": 5, "completion_tokens": 3}, + ) + + provider.chat_with_retry = chat_with_retry + tools = MagicMock() + tools.get_definitions.return_value = [] + + hook = _RecordingHook() + runner = AgentRunner(provider) + result = await runner.run(AgentRunSpec( + initial_messages=[{"role": "user", "content": "what is the answer?"}], + tools=tools, + model="test-model", + max_iterations=3, + max_tool_result_chars=_MAX_TOOL_RESULT_CHARS, + hook=hook, + )) + + assert result.final_content == "The answer is 42." + assert len(hook.emitted) == 1 + assert "Let me think about this" in hook.emitted[0] + + +@pytest.mark.asyncio +async def test_runner_prefers_reasoning_content_over_inline_think(): + """Fallback priority: dedicated reasoning_content wins; inline + is still scrubbed from the answer content.""" + from nanobot.agent.runner import AgentRunSpec, AgentRunner + + provider = MagicMock() + + async def chat_with_retry(**kwargs): + return LLMResponse( + content="inline thinkingThe answer.", + reasoning_content="dedicated reasoning field", + tool_calls=[], + usage={"prompt_tokens": 5, "completion_tokens": 3}, + ) + + provider.chat_with_retry = chat_with_retry + tools = MagicMock() + tools.get_definitions.return_value = [] + + hook = _RecordingHook() + runner = AgentRunner(provider) + result = await runner.run(AgentRunSpec( + initial_messages=[{"role": "user", "content": "question"}], + tools=tools, + model="test-model", + max_iterations=3, + max_tool_result_chars=_MAX_TOOL_RESULT_CHARS, + hook=hook, + )) + + assert result.final_content == "The answer." + assert hook.emitted == ["dedicated reasoning field"] + + +@pytest.mark.asyncio +async def test_runner_emits_reasoning_content_even_when_answer_was_streamed(): + """`reasoning_content` arrives only on the final response; streaming the + answer must not suppress it (the answer stream and the reasoning channel + are independent — only the reasoning-already-emitted bit matters).""" + from nanobot.agent.runner import AgentRunSpec, AgentRunner + + provider = MagicMock() + provider.supports_progress_deltas = True + + async def chat_stream_with_retry(*, on_content_delta=None, **kwargs): + if on_content_delta: + await on_content_delta("The ") + await on_content_delta("answer.") + return LLMResponse( + content="The answer.", + reasoning_content="step-by-step deduction", + tool_calls=[], + usage={"prompt_tokens": 5, "completion_tokens": 3}, + ) + + provider.chat_stream_with_retry = chat_stream_with_retry + tools = MagicMock() + tools.get_definitions.return_value = [] + + progress_calls: list[str] = [] + + async def _progress(content: str, **_kwargs): + progress_calls.append(content) + + hook = _RecordingHook() + runner = AgentRunner(provider) + result = await runner.run(AgentRunSpec( + initial_messages=[{"role": "user", "content": "question"}], + tools=tools, + model="test-model", + max_iterations=3, + max_tool_result_chars=_MAX_TOOL_RESULT_CHARS, + hook=hook, + stream_progress_deltas=True, + progress_callback=_progress, + )) + + assert result.final_content == "The answer." + assert progress_calls, "answer should have streamed via progress callback" + assert hook.emitted == ["step-by-step deduction"] + + +@pytest.mark.asyncio +async def test_runner_does_not_double_emit_when_inline_think_already_streamed(): + """Inline `` blocks streamed incrementally during the answer + stream must not be re-emitted from the final response.""" + from nanobot.agent.runner import AgentRunSpec, AgentRunner + + provider = MagicMock() + provider.supports_progress_deltas = True + + async def chat_stream_with_retry(*, on_content_delta=None, **kwargs): + if on_content_delta: + await on_content_delta("working...") + await on_content_delta("The answer.") + return LLMResponse( + content="working...The answer.", + tool_calls=[], + usage={"prompt_tokens": 5, "completion_tokens": 3}, + ) + + provider.chat_stream_with_retry = chat_stream_with_retry + tools = MagicMock() + tools.get_definitions.return_value = [] + + async def _progress(content: str, **_kwargs): + pass + + hook = _RecordingHook() + runner = AgentRunner(provider) + result = await runner.run(AgentRunSpec( + initial_messages=[{"role": "user", "content": "question"}], + tools=tools, + model="test-model", + max_iterations=3, + max_tool_result_chars=_MAX_TOOL_RESULT_CHARS, + hook=hook, + stream_progress_deltas=True, + progress_callback=_progress, + )) + + assert result.final_content == "The answer." + assert hook.emitted == ["working..."] + assert hook.end_calls >= 1, "reasoning stream must be closed once the answer starts" + + +@pytest.mark.asyncio +async def test_runner_closes_reasoning_stream_after_one_shot_response(): + """A non-streaming response carrying ``reasoning_content`` must emit + both a reasoning delta and an end marker so channels can finalize the + in-place bubble.""" + from nanobot.agent.runner import AgentRunSpec, AgentRunner + + provider = MagicMock() + + async def chat_with_retry(**kwargs): + return LLMResponse( + content="answer", + reasoning_content="hidden thought", + tool_calls=[], + usage={"prompt_tokens": 5, "completion_tokens": 3}, + ) + + provider.chat_with_retry = chat_with_retry + tools = MagicMock() + tools.get_definitions.return_value = [] + + hook = _RecordingHook() + runner = AgentRunner(provider) + result = await runner.run(AgentRunSpec( + initial_messages=[{"role": "user", "content": "q"}], + tools=tools, + model="test-model", + max_iterations=3, + max_tool_result_chars=_MAX_TOOL_RESULT_CHARS, + hook=hook, + )) + + assert result.final_content == "answer" + assert hook.emitted == ["hidden thought"] + assert hook.end_calls == 1 diff --git a/tests/channels/test_channel_manager_reasoning.py b/tests/channels/test_channel_manager_reasoning.py new file mode 100644 index 000000000..bc2a640c6 --- /dev/null +++ b/tests/channels/test_channel_manager_reasoning.py @@ -0,0 +1,228 @@ +"""Tests for ChannelManager routing of model reasoning content. + +Reasoning is delivered through plugin streaming primitives +(``send_reasoning_delta`` / ``send_reasoning_end``) so each channel +controls in-place rendering — mirroring the existing answer ``send_delta`` +/ ``stream_end`` pair. The manager forwards reasoning frames only to +channels that opt in via ``channel.show_reasoning``; plugins without a +low-emphasis UI primitive keep the base no-op and the content silently +drops at dispatch. + +One-shot ``_reasoning`` frames are accepted for back-compat with hooks +that haven't migrated yet — ``BaseChannel.send_reasoning`` expands them +to a single delta + end pair so plugins only implement the streaming +primitives. +""" + +from __future__ import annotations + +import asyncio +from unittest.mock import AsyncMock + +import pytest + +from nanobot.bus.events import OutboundMessage +from nanobot.bus.queue import MessageBus +from nanobot.channels.base import BaseChannel +from nanobot.channels.manager import ChannelManager +from nanobot.config.schema import Config + + +class _MockChannel(BaseChannel): + name = "mock" + display_name = "Mock" + + def __init__(self, config, bus): + super().__init__(config, bus) + self._send_mock = AsyncMock() + self._delta_mock = AsyncMock() + self._end_mock = AsyncMock() + + async def start(self): # pragma: no cover - not exercised + pass + + async def stop(self): # pragma: no cover - not exercised + pass + + async def send(self, msg): + return await self._send_mock(msg) + + async def send_reasoning_delta(self, chat_id, delta, metadata=None): + return await self._delta_mock(chat_id, delta, metadata) + + async def send_reasoning_end(self, chat_id, metadata=None): + return await self._end_mock(chat_id, metadata) + + +@pytest.fixture +def manager() -> ChannelManager: + mgr = ChannelManager(Config(), MessageBus()) + mgr.channels["mock"] = _MockChannel({}, mgr.bus) + return mgr + + +@pytest.mark.asyncio +async def test_reasoning_delta_routes_to_send_reasoning_delta(manager): + channel = manager.channels["mock"] + msg = OutboundMessage( + channel="mock", + chat_id="c1", + content="step-by-step", + metadata={"_progress": True, "_reasoning_delta": True, "_stream_id": "r1"}, + ) + await manager._send_once(channel, msg) + channel._delta_mock.assert_awaited_once() + args = channel._delta_mock.await_args.args + assert args[0] == "c1" + assert args[1] == "step-by-step" + channel._send_mock.assert_not_awaited() + channel._end_mock.assert_not_awaited() + + +@pytest.mark.asyncio +async def test_reasoning_end_routes_to_send_reasoning_end(manager): + channel = manager.channels["mock"] + msg = OutboundMessage( + channel="mock", + chat_id="c1", + content="", + metadata={"_progress": True, "_reasoning_end": True, "_stream_id": "r1"}, + ) + await manager._send_once(channel, msg) + channel._end_mock.assert_awaited_once() + channel._delta_mock.assert_not_awaited() + + +@pytest.mark.asyncio +async def test_legacy_one_shot_reasoning_expands_to_delta_plus_end(manager): + """`_reasoning` (no delta/end pair) falls back through `send_reasoning` + which the base class expands to a single delta + end. Hooks that haven't + migrated still surface in WebUI as a complete stream segment.""" + channel = manager.channels["mock"] + msg = OutboundMessage( + channel="mock", + chat_id="c1", + content="one-shot reasoning", + metadata={"_progress": True, "_reasoning": True}, + ) + await manager._send_once(channel, msg) + channel._delta_mock.assert_awaited_once() + channel._end_mock.assert_awaited_once() + + +@pytest.mark.asyncio +async def test_dispatch_drops_reasoning_when_channel_opts_out(manager): + channel = manager.channels["mock"] + channel.show_reasoning = False + msg = OutboundMessage( + channel="mock", + chat_id="c1", + content="hidden thinking", + metadata={"_progress": True, "_reasoning_delta": True}, + ) + await manager.bus.publish_outbound(msg) + + await _pump_one(manager) + + channel._delta_mock.assert_not_awaited() + channel._end_mock.assert_not_awaited() + channel._send_mock.assert_not_awaited() + + +@pytest.mark.asyncio +async def test_dispatch_delivers_reasoning_when_channel_opts_in(manager): + channel = manager.channels["mock"] + channel.show_reasoning = True + for chunk in ("first ", "second"): + await manager.bus.publish_outbound(OutboundMessage( + channel="mock", + chat_id="c1", + content=chunk, + metadata={"_progress": True, "_reasoning_delta": True, "_stream_id": "r1"}, + )) + await manager.bus.publish_outbound(OutboundMessage( + channel="mock", + chat_id="c1", + content="", + metadata={"_progress": True, "_reasoning_end": True, "_stream_id": "r1"}, + )) + + await _pump_one(manager) + + assert channel._delta_mock.await_count == 2 + channel._end_mock.assert_awaited_once() + + +@pytest.mark.asyncio +async def test_dispatch_silently_drops_reasoning_for_unknown_channel(manager): + msg = OutboundMessage( + channel="ghost", + chat_id="c1", + content="nobody home", + metadata={"_progress": True, "_reasoning_delta": True}, + ) + await manager.bus.publish_outbound(msg) + + await _pump_one(manager) + + manager.channels["mock"]._delta_mock.assert_not_awaited() + manager.channels["mock"]._send_mock.assert_not_awaited() + + +@pytest.mark.asyncio +async def test_base_channel_reasoning_primitives_are_noop_safe(): + """Plugins that don't override the streaming primitives must not blow up.""" + + class _Plain(BaseChannel): + name = "plain" + display_name = "Plain" + + async def start(self): # pragma: no cover + pass + + async def stop(self): # pragma: no cover + pass + + async def send(self, msg): # pragma: no cover + pass + + channel = _Plain({}, MessageBus()) + assert await channel.send_reasoning_delta("c", "x") is None + assert await channel.send_reasoning_end("c") is None + # And the one-shot wrapper translates without raising. + assert await channel.send_reasoning( + OutboundMessage(channel="plain", chat_id="c", content="x", metadata={}) + ) is None + + +@pytest.mark.asyncio +async def test_reasoning_routing_does_not_consult_send_progress(manager): + """`show_reasoning` is orthogonal to `send_progress` — turning off + progress streaming must not silence reasoning.""" + channel = manager.channels["mock"] + channel.send_progress = False + channel.show_reasoning = True + await manager.bus.publish_outbound(OutboundMessage( + channel="mock", + chat_id="c1", + content="still surfaces", + metadata={"_progress": True, "_reasoning_delta": True}, + )) + + await _pump_one(manager) + + channel._delta_mock.assert_awaited_once() + + +async def _pump_one(manager: ChannelManager) -> None: + """Drive the dispatcher until the outbound queue drains, then cancel.""" + task = asyncio.create_task(manager._dispatch_outbound()) + for _ in range(50): + await asyncio.sleep(0.01) + if manager.bus.outbound.qsize() == 0: + break + task.cancel() + try: + await task + except asyncio.CancelledError: + pass diff --git a/tests/channels/test_websocket_channel.py b/tests/channels/test_websocket_channel.py index 92b61f7d6..f11cb21b4 100644 --- a/tests/channels/test_websocket_channel.py +++ b/tests/channels/test_websocket_channel.py @@ -358,6 +358,87 @@ async def test_send_delta_emits_delta_and_stream_end() -> None: assert second["stream_id"] == "sid" +@pytest.mark.asyncio +async def test_send_reasoning_delta_emits_streaming_frame() -> None: + bus = MagicMock() + channel = WebSocketChannel({"enabled": True, "allowFrom": ["*"]}, bus) + mock_ws = AsyncMock() + channel._attach(mock_ws, "chat-1") + + await channel.send_reasoning_delta( + "chat-1", + "step-by-step thinking", + {"_reasoning_delta": True, "_stream_id": "r1"}, + ) + + mock_ws.send.assert_awaited_once() + payload = json.loads(mock_ws.send.await_args.args[0]) + assert payload["event"] == "reasoning_delta" + assert payload["chat_id"] == "chat-1" + assert payload["text"] == "step-by-step thinking" + assert payload["stream_id"] == "r1" + + +@pytest.mark.asyncio +async def test_send_reasoning_end_emits_close_frame() -> None: + bus = MagicMock() + channel = WebSocketChannel({"enabled": True, "allowFrom": ["*"]}, bus) + mock_ws = AsyncMock() + channel._attach(mock_ws, "chat-1") + + await channel.send_reasoning_end("chat-1", {"_reasoning_end": True, "_stream_id": "r1"}) + + payload = json.loads(mock_ws.send.await_args.args[0]) + assert payload == {"event": "reasoning_end", "chat_id": "chat-1", "stream_id": "r1"} + + +@pytest.mark.asyncio +async def test_send_reasoning_one_shot_expands_to_delta_plus_end() -> None: + """``send_reasoning`` is back-compat for hooks that haven't migrated: + the base implementation must produce one delta and one end so the + WebUI sees the same shape either way.""" + bus = MagicMock() + channel = WebSocketChannel({"enabled": True, "allowFrom": ["*"]}, bus) + mock_ws = AsyncMock() + channel._attach(mock_ws, "chat-1") + + await channel.send_reasoning(OutboundMessage( + channel="websocket", + chat_id="chat-1", + content="thinking", + metadata={"_reasoning": True}, + )) + + assert mock_ws.send.await_count == 2 + first = json.loads(mock_ws.send.call_args_list[0][0][0]) + second = json.loads(mock_ws.send.call_args_list[1][0][0]) + assert first["event"] == "reasoning_delta" + assert first["text"] == "thinking" + assert second["event"] == "reasoning_end" + + +@pytest.mark.asyncio +async def test_send_reasoning_delta_drops_empty_chunks() -> None: + bus = MagicMock() + channel = WebSocketChannel({"enabled": True, "allowFrom": ["*"]}, bus) + mock_ws = AsyncMock() + channel._attach(mock_ws, "chat-1") + + await channel.send_reasoning_delta("chat-1", "", {"_reasoning_delta": True}) + + mock_ws.send.assert_not_awaited() + + +@pytest.mark.asyncio +async def test_send_reasoning_without_subscribers_is_noop() -> None: + bus = MagicMock() + channel = WebSocketChannel({"enabled": True, "allowFrom": ["*"]}, bus) + + await channel.send_reasoning_delta("unattached", "thinking", None) + await channel.send_reasoning_end("unattached", None) + # No subscribers, no exception, no send. + + @pytest.mark.asyncio async def test_send_turn_end_emits_turn_end_event() -> None: bus = MagicMock() diff --git a/tests/cli/test_cli_input.py b/tests/cli/test_cli_input.py index e648e818c..34046e8d4 100644 --- a/tests/cli/test_cli_input.py +++ b/tests/cli/test_cli_input.py @@ -1,4 +1,6 @@ import asyncio +from contextlib import nullcontext +from io import StringIO from unittest.mock import AsyncMock, MagicMock, call, patch import pytest @@ -96,6 +98,66 @@ def test_print_cli_progress_line_pauses_spinner_before_printing(): assert order == ["start", "stop", "print", "start", "stop"] +def test_thinking_spinner_clears_status_line_when_paused(): + """Stopping the spinner should erase its transient line before output.""" + stream = StringIO() + stream.isatty = lambda: True # type: ignore[method-assign] + mock_console = MagicMock() + mock_console.file = stream + spinner = MagicMock() + mock_console.status.return_value = spinner + + thinking = stream_mod.ThinkingSpinner(console=mock_console) + with thinking: + with thinking.pause(): + pass + + assert "\r\x1b[2K" in stream.getvalue() + + +def test_stream_renderer_stops_spinner_even_after_header_printed(): + """A later answer delta must stop the spinner even when header already exists.""" + stream = StringIO() + stream.isatty = lambda: True # type: ignore[method-assign] + mock_console = MagicMock() + mock_console.file = stream + spinner = MagicMock() + mock_console.status.return_value = spinner + + with patch.object(stream_mod, "_make_console", return_value=mock_console): + renderer = stream_mod.StreamRenderer(show_spinner=True) + renderer._header_printed = True + renderer.ensure_header() + + spinner.stop.assert_called_once() + assert "\r\x1b[2K" in stream.getvalue() + + +def test_print_cli_progress_line_opens_renderer_header_before_trace(): + """Trace lines should appear under the assistant header, not under You.""" + order: list[str] = [] + renderer = MagicMock() + renderer.console.print.side_effect = lambda *_args, **_kwargs: order.append("print") + renderer.ensure_header.side_effect = lambda: order.append("header") + renderer.pause_spinner.return_value = nullcontext() + + commands._print_cli_progress_line("tool running", None, renderer) + + assert order == ["header", "print"] + + +def test_print_cli_progress_line_stops_live_before_trace(): + """A trace line should not leak the current transient Live frame.""" + mock_live = MagicMock() + renderer = stream_mod.StreamRenderer(show_spinner=False) + renderer._live = mock_live + + commands._print_cli_progress_line("tool running", None, renderer) + + mock_live.stop.assert_called_once() + assert renderer._live is None + + @pytest.mark.asyncio async def test_print_interactive_progress_line_pauses_spinner_before_printing(): """Interactive progress output should also pause spinner cleanly.""" @@ -156,17 +218,65 @@ def test_stream_renderer_stop_for_input_stops_spinner(): # Create renderer with mocked console with patch.object(stream_mod, "_make_console", return_value=mock_console): renderer = stream_mod.StreamRenderer(show_spinner=True) - + # Verify spinner started spinner.start.assert_called_once() - + # Stop for input renderer.stop_for_input() - + # Verify spinner stopped spinner.stop.assert_called_once() +@pytest.mark.asyncio +async def test_on_end_writes_final_content_to_stdout_after_stopping_live(): + """on_end should stop Live (transient erases it) then print final content to stdout.""" + mock_live = MagicMock() + mock_console = MagicMock() + mock_console.capture.return_value.__enter__ = MagicMock( + return_value=MagicMock(get=lambda: "final output\n") + ) + mock_console.capture.return_value.__exit__ = MagicMock(return_value=False) + + with patch.object(stream_mod, "_make_console", return_value=mock_console): + renderer = stream_mod.StreamRenderer(show_spinner=False) + renderer._live = mock_live + renderer._buf = "final output" + + written: list[str] = [] + with patch("sys.stdout") as mock_stdout: + mock_stdout.write = lambda s: written.append(s) + mock_stdout.flush = MagicMock() + await renderer.on_end() + + mock_live.stop.assert_called_once() + assert renderer._live is None + assert written == ["final output\n"] + + +@pytest.mark.asyncio +async def test_on_end_resuming_clears_buffer_and_restarts_spinner(): + """on_end(resuming=True) should reset state for the next iteration.""" + spinner = MagicMock() + mock_console = MagicMock() + mock_console.status.return_value = spinner + mock_console.capture.return_value.__enter__ = MagicMock( + return_value=MagicMock(get=lambda: "") + ) + mock_console.capture.return_value.__exit__ = MagicMock(return_value=False) + + with patch.object(stream_mod, "_make_console", return_value=mock_console): + renderer = stream_mod.StreamRenderer(show_spinner=True) + renderer._buf = "some content" + + await renderer.on_end(resuming=True) + + assert renderer._buf == "" + # Spinner should have been restarted (start called twice: __init__ + resuming) + assert spinner.start.call_count == 2 + + def test_make_console_force_terminal_when_stdout_is_tty(): """Console should set force_terminal=True when stdout is a TTY (rich output).""" import sys diff --git a/tests/cli/test_interactive_retry_wait.py b/tests/cli/test_interactive_retry_wait.py index 5cc217c56..52c27d2c9 100644 --- a/tests/cli/test_interactive_retry_wait.py +++ b/tests/cli/test_interactive_retry_wait.py @@ -17,7 +17,7 @@ async def test_interactive_retry_wait_is_rendered_as_progress_even_when_progress metadata={"_retry_wait": True}, ) - async def fake_print(text: str, active_thinking: object | None) -> None: + async def fake_print(text: str, active_thinking: object | None, renderer=None) -> None: calls.append((text, active_thinking)) with patch("nanobot.cli.commands._print_interactive_progress_line", side_effect=fake_print): @@ -29,3 +29,104 @@ async def test_interactive_retry_wait_is_rendered_as_progress_even_when_progress assert handled is True assert calls == [("Model request failed, retry in 2s (attempt 1).", thinking)] + + +@pytest.mark.asyncio +async def test_reasoning_displayed_when_show_reasoning_enabled(): + """Reasoning content should be displayed when show_reasoning is True.""" + calls: list[str] = [] + channels_config = SimpleNamespace( + send_progress=True, send_tool_hints=False, show_reasoning=True, + ) + msg = SimpleNamespace( + content="Let me think about this...", + metadata={"_progress": True, "_reasoning": True}, + ) + + with patch("nanobot.cli.commands._print_cli_reasoning", side_effect=lambda t, th, r=None: calls.append(t)): + handled = await commands._maybe_print_interactive_progress(msg, None, channels_config) + + assert handled is True + assert calls == ["Let me think about this..."] + + +@pytest.mark.asyncio +async def test_reasoning_delta_displayed_when_show_reasoning_enabled(): + """Streamed reasoning delta frames should use the reasoning renderer.""" + calls: list[str] = [] + channels_config = SimpleNamespace( + send_progress=True, send_tool_hints=False, show_reasoning=True, + ) + msg = SimpleNamespace( + content="I should search first.", + metadata={"_progress": True, "_reasoning_delta": True}, + ) + + with patch("nanobot.cli.commands._print_cli_reasoning", side_effect=lambda t, th, r=None: calls.append(t)): + handled = await commands._maybe_print_interactive_progress(msg, None, channels_config) + + assert handled is True + assert calls == ["I should search first."] + + +@pytest.mark.asyncio +async def test_reasoning_hidden_when_show_reasoning_disabled(): + """Reasoning content should be suppressed when show_reasoning is False.""" + channels_config = SimpleNamespace( + send_progress=True, send_tool_hints=False, show_reasoning=False, + ) + msg = SimpleNamespace( + content="Let me think about this...", + metadata={"_progress": True, "_reasoning": True}, + ) + + with patch("nanobot.cli.commands._print_cli_reasoning") as mock_reasoning: + handled = await commands._maybe_print_interactive_progress(msg, None, channels_config) + + assert handled is True + mock_reasoning.assert_not_called() + + +@pytest.mark.asyncio +async def test_non_reasoning_progress_not_affected_by_show_reasoning(): + """Regular progress lines should display regardless of show_reasoning.""" + calls: list[str] = [] + channels_config = SimpleNamespace( + send_progress=True, send_tool_hints=False, show_reasoning=False, + ) + msg = SimpleNamespace( + content="working on it...", + metadata={"_progress": True}, + ) + + async def fake_print(text: str, thinking=None, renderer=None): + calls.append(text) + + with patch("nanobot.cli.commands._print_interactive_progress_line", side_effect=fake_print): + handled = await commands._maybe_print_interactive_progress(msg, None, channels_config) + + assert handled is True + assert calls == ["working on it..."] + + +@pytest.mark.asyncio +async def test_reasoning_shown_when_send_progress_disabled(): + """Reasoning display is governed by `show_reasoning` alone, independent + of `send_progress` — the two knobs are orthogonal.""" + calls: list[str] = [] + channels_config = SimpleNamespace( + send_progress=False, send_tool_hints=False, show_reasoning=True, + ) + msg = SimpleNamespace( + content="Let me think about this...", + metadata={"_progress": True, "_reasoning": True}, + ) + + with patch( + "nanobot.cli.commands._print_cli_reasoning", + side_effect=lambda t, th, r=None: calls.append(t), + ): + handled = await commands._maybe_print_interactive_progress(msg, None, channels_config) + + assert handled is True + assert calls == ["Let me think about this..."] diff --git a/tests/utils/test_strip_think.py b/tests/utils/test_strip_think.py index 5db93e658..f1048f40c 100644 --- a/tests/utils/test_strip_think.py +++ b/tests/utils/test_strip_think.py @@ -1,4 +1,4 @@ -from nanobot.utils.helpers import strip_think +from nanobot.utils.helpers import extract_reasoning, extract_think, strip_think class TestStripThinkTag: @@ -144,3 +144,130 @@ class TestStripThinkConservativePreserve: def test_literal_channel_marker_in_code_block_preserved(self): text = "Example:\n```\nif line.startswith(''):\n skip()\n```" assert strip_think(text) == text + + +class TestExtractThink: + + def test_no_think_tags(self): + thinking, clean = extract_think("Hello World") + assert thinking is None + assert clean == "Hello World" + + def test_single_think_block(self): + text = "Hello reasoning content\nhere World" + thinking, clean = extract_think(text) + assert thinking == "reasoning content\nhere" + assert clean == "Hello World" + + def test_single_thought_block(self): + text = "Hello reasoning content World" + thinking, clean = extract_think(text) + assert thinking == "reasoning content" + assert clean == "Hello World" + + def test_multiple_think_blocks(self): + text = "AfirstBsecondC" + thinking, clean = extract_think(text) + assert thinking == "first\n\nsecond" + assert clean == "ABC" + + def test_think_only_no_content(self): + text = "just thinking" + thinking, clean = extract_think(text) + assert thinking == "just thinking" + assert clean == "" + + def test_unclosed_think_not_extracted(self): + # Unclosed blocks at start are stripped but NOT extracted + text = "unclosed thinking..." + thinking, clean = extract_think(text) + assert thinking is None + assert clean == "" + + def test_empty_think_block(self): + text = "Hello World" + thinking, clean = extract_think(text) + # Empty blocks result in empty string after strip + assert thinking == "" + assert clean == "Hello World" + + def test_think_with_whitespace_only(self): + text = "Hello \n World" + thinking, clean = extract_think(text) + assert thinking is None + assert clean == "Hello \n World" + + def test_mixed_think_and_thought(self): + text = "Startfirst reasoningmiddlesecond reasoningEnd" + thinking, clean = extract_think(text) + assert thinking == "first reasoning\n\nsecond reasoning" + assert clean == "StartmiddleEnd" + + def test_real_world_ollama_response(self): + text = """ +The user is asking about Python list comprehensions. +Let me explain the syntax and give examples. + + +List comprehensions in Python provide a concise way to create lists. Here's the syntax: + +```python +[expression for item in iterable if condition] +``` + +For example: +```python +squares = [x**2 for x in range(10)] +```""" + thinking, clean = extract_think(text) + assert "list comprehensions" in thinking.lower() + assert "Let me explain" in thinking + assert "List comprehensions in Python" in clean + assert "" not in clean + assert "" not in clean + + +class TestExtractReasoning: + """Single source of truth for reasoning extraction across all providers.""" + + def test_prefers_reasoning_content_and_strips_inline_think(self): + # Dedicated field wins; inline tags are still scrubbed from content. + reasoning, content = extract_reasoning( + "dedicated", + None, + "inlinevisible answer", + ) + assert reasoning == "dedicated" + assert content == "visible answer" + + def test_falls_back_to_thinking_blocks(self): + reasoning, content = extract_reasoning( + None, + [ + {"type": "thinking", "thinking": "step 1"}, + {"type": "thinking", "thinking": "step 2"}, + {"type": "redacted_thinking"}, + ], + "hello", + ) + assert reasoning == "step 1\n\nstep 2" + assert content == "hello" + + def test_falls_back_to_inline_think_tags(self): + reasoning, content = extract_reasoning( + None, None, "plananswer" + ) + assert reasoning == "plan" + assert content == "answer" + + def test_no_reasoning_returns_none(self): + reasoning, content = extract_reasoning(None, None, "plain answer") + assert reasoning is None + assert content == "plain answer" + + def test_empty_thinking_blocks_falls_through_to_inline(self): + reasoning, content = extract_reasoning( + None, [], "plananswer" + ) + assert reasoning == "plan" + assert content == "answer" diff --git a/webui/src/components/MessageBubble.tsx b/webui/src/components/MessageBubble.tsx index 3bd580567..bd1d8c93b 100644 --- a/webui/src/components/MessageBubble.tsx +++ b/webui/src/components/MessageBubble.tsx @@ -1,5 +1,5 @@ import { useCallback, useEffect, useRef, useState } from "react"; -import { Check, ChevronRight, Copy, FileIcon, ImageIcon, PlaySquare, Wrench } from "lucide-react"; +import { Check, ChevronRight, Copy, FileIcon, ImageIcon, PlaySquare, Sparkles, Wrench } from "lucide-react"; import { useTranslation } from "react-i18next"; import { ImageLightbox } from "@/components/ImageLightbox"; @@ -85,12 +85,18 @@ export function MessageBubble({ message }: MessageBubbleProps) { const empty = message.content.trim().length === 0; const media = message.media ?? []; + const reasoning = message.role === "assistant" ? message.reasoning ?? "" : ""; + const reasoningStreaming = !!(message.role === "assistant" && message.reasoningStreaming); + const hasReasoning = reasoning.length > 0 || reasoningStreaming; const showAssistantActions = message.role === "assistant" && !message.isStreaming && !empty; return (
- {empty && message.isStreaming ? ( + {hasReasoning ? ( + + ) : null} + {empty && message.isStreaming && !hasReasoning ? ( - ) : ( + ) : empty && message.isStreaming ? null : ( <> {message.content} {message.isStreaming && } @@ -380,14 +386,14 @@ interface TraceGroupProps { /** * Collapsible group of tool-call / progress breadcrumbs. Defaults to - * expanded for discoverability; a single click on the header folds the - * group down to a one-line summary so it never dominates the thread. + * collapsed because tool traces are supporting evidence, not the answer. + * A single click expands the exact calls when the user wants details. */ function TraceGroup({ message, animClass }: TraceGroupProps) { const { t } = useTranslation(); const lines = message.traces ?? [message.content]; const count = lines.length; - const [open, setOpen] = useState(true); + const [open, setOpen] = useState(false); return (
+ {open && text.length > 0 && ( +
+ {text} +
+ )} +
+ ); +} diff --git a/webui/src/components/thread/ThreadComposer.tsx b/webui/src/components/thread/ThreadComposer.tsx index 572ac3966..b95a7bbc4 100644 --- a/webui/src/components/thread/ThreadComposer.tsx +++ b/webui/src/components/thread/ThreadComposer.tsx @@ -1,6 +1,7 @@ import { useCallback, useEffect, + useLayoutEffect, useMemo, useRef, useState, @@ -77,6 +78,17 @@ const COMMAND_ICONS: Record = { type ImageAspectRatio = "auto" | "1:1" | "3:4" | "9:16" | "4:3" | "16:9"; const IMAGE_ASPECT_RATIOS: ImageAspectRatio[] = ["auto", "1:1", "3:4", "9:16", "4:3", "16:9"]; +const SLASH_PALETTE_GAP_PX = 8; +const SLASH_PALETTE_MAX_HEIGHT_PX = 288; +const SLASH_PALETTE_MIN_HEIGHT_PX = 144; +const SLASH_PALETTE_CHROME_PX = 64; + +type SlashPalettePlacement = "above" | "below"; + +interface SlashPaletteLayout { + placement: SlashPalettePlacement; + maxHeight: number; +} function slashCommandI18nKey(command: string): string { return command.replace(/^\//, "").replace(/-/g, "_"); @@ -96,6 +108,24 @@ function scrollNearestOverflowParent(target: EventTarget | null, deltaY: number) } } +function getVisibleBounds(el: HTMLElement): { top: number; bottom: number } { + let top = 0; + let bottom = window.innerHeight; + let parent = el.parentElement; + + while (parent) { + const style = window.getComputedStyle(parent); + if (/(auto|scroll|hidden|clip)/.test(style.overflowY)) { + const rect = parent.getBoundingClientRect(); + top = Math.max(top, rect.top); + bottom = Math.min(bottom, rect.bottom); + } + parent = parent.parentElement; + } + + return { top, bottom }; +} + export function ThreadComposer({ onSend, disabled, @@ -117,6 +147,7 @@ export function ThreadComposer({ const [imageAspectRatio, setImageAspectRatio] = useState("auto"); const [aspectMenuOpen, setAspectMenuOpen] = useState(false); const textareaRef = useRef(null); + const formRef = useRef(null); const fileInputRef = useRef(null); const aspectControlRef = useRef(null); const chipRefs = useRef(new Map()); @@ -221,6 +252,10 @@ export function ThreadComposer({ }, [slashCommands, slashQuery, t]); const showSlashMenu = filteredSlashCommands.length > 0; + const [slashPaletteLayout, setSlashPaletteLayout] = useState({ + placement: "above", + maxHeight: SLASH_PALETTE_MAX_HEIGHT_PX, + }); useEffect(() => { setSelectedCommandIndex(0); @@ -232,6 +267,56 @@ export function ThreadComposer({ } }, [filteredSlashCommands.length, selectedCommandIndex]); + useEffect(() => { + if (!showSlashMenu) return; + + const dismissOnPointerDown = (event: PointerEvent) => { + const target = event.target; + if (target instanceof Node && formRef.current?.contains(target)) return; + setSlashMenuDismissed(true); + }; + + document.addEventListener("pointerdown", dismissOnPointerDown, true); + return () => { + document.removeEventListener("pointerdown", dismissOnPointerDown, true); + }; + }, [showSlashMenu]); + + useLayoutEffect(() => { + if (!showSlashMenu) return; + + const updateLayout = () => { + const form = formRef.current; + if (!form) return; + const rect = form.getBoundingClientRect(); + if (rect.width === 0 && rect.height === 0) return; + + const bounds = getVisibleBounds(form); + const spaceAbove = Math.max(0, rect.top - bounds.top - SLASH_PALETTE_GAP_PX); + const spaceBelow = Math.max(0, bounds.bottom - rect.bottom - SLASH_PALETTE_GAP_PX); + const placement: SlashPalettePlacement = + spaceAbove >= SLASH_PALETTE_MIN_HEIGHT_PX || spaceAbove >= spaceBelow + ? "above" + : "below"; + const available = placement === "above" ? spaceAbove : spaceBelow; + const maxHeight = Math.min(SLASH_PALETTE_MAX_HEIGHT_PX, available); + + setSlashPaletteLayout((current) => + current.placement === placement && current.maxHeight === maxHeight + ? current + : { placement, maxHeight }, + ); + }; + + updateLayout(); + window.addEventListener("resize", updateLayout); + document.addEventListener("scroll", updateLayout, true); + return () => { + window.removeEventListener("resize", updateLayout); + document.removeEventListener("scroll", updateLayout, true); + }; + }, [filteredSlashCommands.length, showSlashMenu]); + useEffect(() => { if (!aspectMenuOpen) return; @@ -398,6 +483,7 @@ export function ThreadComposer({ return (
{ e.preventDefault(); submit(); @@ -412,6 +498,7 @@ export function ThreadComposer({ void; onChoose: (command: SlashCommand) => void; @@ -695,17 +783,24 @@ function ImageAspectMenu({ function SlashCommandPalette({ commands, selectedIndex, + layout, isHero, onHover, onChoose, }: SlashCommandPaletteProps) { const { t } = useTranslation(); + const listMaxHeight = Math.max( + 0, + layout.maxHeight - SLASH_PALETTE_CHROME_PX, + ); return (
{t("thread.composer.slash.label")}
-
+
{commands.map((command, index) => { const Icon = COMMAND_ICONS[command.icon] ?? CircleHelp; const selected = index === selectedIndex; diff --git a/webui/src/components/thread/ThreadMessages.tsx b/webui/src/components/thread/ThreadMessages.tsx index 1ef5c864b..3d3d068f3 100644 --- a/webui/src/components/thread/ThreadMessages.tsx +++ b/webui/src/components/thread/ThreadMessages.tsx @@ -1,4 +1,5 @@ import { MessageBubble } from "@/components/MessageBubble"; +import { cn } from "@/lib/utils"; import type { UIMessage } from "@/lib/types"; interface ThreadMessagesProps { @@ -7,10 +8,30 @@ interface ThreadMessagesProps { export function ThreadMessages({ messages }: ThreadMessagesProps) { return ( -
- {messages.map((message) => ( - - ))} +
+ {messages.map((message, index) => { + const prev = messages[index - 1]; + const compact = isAuxiliaryRow(message) && prev && isAuxiliaryRow(prev); + return ( +
0 && (compact ? "mt-2" : "mt-5"))} + > + +
+ ); + })}
); } + +function isAuxiliaryRow(message: UIMessage): boolean { + return ( + message.kind === "trace" + || ( + message.role === "assistant" + && message.content.trim().length === 0 + && (!!message.reasoning || !!message.reasoningStreaming) + ) + ); +} diff --git a/webui/src/components/thread/ThreadShell.tsx b/webui/src/components/thread/ThreadShell.tsx index c5c488de0..0d330c2a9 100644 --- a/webui/src/components/thread/ThreadShell.tsx +++ b/webui/src/components/thread/ThreadShell.tsx @@ -260,6 +260,7 @@ export function ThreadShell({ } modelLabel={toModelBadgeLabel(modelName)} variant="hero" + slashCommands={slashCommands} imageMode={heroImageMode} onImageModeChange={setHeroImageMode} /> diff --git a/webui/src/globals.css b/webui/src/globals.css index 802009ee7..a365e33b6 100644 --- a/webui/src/globals.css +++ b/webui/src/globals.css @@ -117,6 +117,34 @@ --cjk-line-height: 1.625; } + /* Shimmer band sweeping across the reasoning header while + ``reasoning_delta`` frames are arriving. Pure CSS, no JS animation, + respects ``prefers-reduced-motion``. */ + @keyframes reasoning-shimmer-sweep { + 0% { + background-position: -200% 0; + } + 100% { + background-position: 200% 0; + } + } + .reasoning-shimmer { + background-image: linear-gradient( + 90deg, + transparent 0%, + hsl(var(--muted-foreground) / 0.18) 50%, + transparent 100% + ); + background-size: 200% 100%; + background-repeat: no-repeat; + animation: reasoning-shimmer-sweep 2.2s linear infinite; + } + @media (prefers-reduced-motion: reduce) { + .reasoning-shimmer { + animation: none; + } + } + /* Subtle scrollbar that doesn't fight the dark background. */ .scrollbar-thin { scrollbar-width: thin; diff --git a/webui/src/hooks/useNanobotStream.ts b/webui/src/hooks/useNanobotStream.ts index 8ec1a9ac4..10f1e2400 100644 --- a/webui/src/hooks/useNanobotStream.ts +++ b/webui/src/hooks/useNanobotStream.ts @@ -18,6 +18,95 @@ interface StreamBuffer { parts: string[]; } +/** + * Append a reasoning chunk to the last open reasoning stream in ``prev``. + * + * Lookup rule: prefer the most recent assistant turn in the active UI tail. + * Most providers emit reasoning before answer text, but some only expose + * ``reasoning_content`` after the answer stream completes. In that post-hoc + * case the reasoning still belongs to the same assistant turn and must render + * above the answer, not as a new row below it. + */ +function attachReasoningChunk(prev: UIMessage[], chunk: string): UIMessage[] { + for (let i = prev.length - 1; i >= 0; i -= 1) { + const candidate = prev[i]; + // A user turn is a hard boundary: reasoning after it belongs to the new + // assistant turn, never to an earlier assistant reply. + if (candidate.role === "user") break; + // A trace row (e.g. Used tools) is also a phase boundary. Reasoning after + // tools belongs to the next assistant iteration, not the assistant turn + // that produced those tool calls. + if (candidate.kind === "trace") break; + if (candidate.role !== "assistant") continue; + const hasAnswer = candidate.content.length > 0; + if ( + candidate.reasoningStreaming + || candidate.reasoning !== undefined + || hasAnswer + || candidate.isStreaming + ) { + const merged: UIMessage = { + ...candidate, + reasoning: (candidate.reasoning ?? "") + chunk, + reasoningStreaming: true, + }; + return [...prev.slice(0, i), merged, ...prev.slice(i + 1)]; + } + if (!hasAnswer && candidate.isStreaming) { + const merged: UIMessage = { + ...candidate, + reasoning: chunk, + reasoningStreaming: true, + }; + return [...prev.slice(0, i), merged, ...prev.slice(i + 1)]; + } + break; + } + return [ + ...prev, + { + id: crypto.randomUUID(), + role: "assistant", + content: "", + isStreaming: true, + reasoning: chunk, + reasoningStreaming: true, + createdAt: Date.now(), + }, + ]; +} + +/** + * Find the most recent assistant placeholder that an incoming answer + * delta should adopt instead of spawning a parallel row. We look for an + * empty-content assistant turn that is still marked ``isStreaming`` — + * typically created earlier by ``reasoning_delta``. Anything else means + * the model already produced an answer in a previous turn, so the new + * delta belongs in a fresh row. + */ +function findActiveAssistantPlaceholder(prev: UIMessage[]): string | null { + const last = prev[prev.length - 1]; + if (!last) return null; + if (last.role !== "assistant" || last.kind === "trace") return null; + if (last.content.length > 0) return null; + if (!last.isStreaming) return null; + return last.id; +} + +/** + * Close the active reasoning stream segment, if any. Idempotent: a + * ``reasoning_end`` with no preceding deltas is a harmless no-op. + */ +function closeReasoningStream(prev: UIMessage[]): UIMessage[] { + for (let i = prev.length - 1; i >= 0; i -= 1) { + const candidate = prev[i]; + if (!candidate.reasoningStreaming) continue; + const merged: UIMessage = { ...candidate, reasoningStreaming: false }; + return [...prev.slice(0, i), merged, ...prev.slice(i + 1)]; + } + return prev; +} + /** * Subscribe to a chat by ID. Returns the in-memory message list for the chat, * a streaming flag, and a ``send`` function. Initial history must be seeded @@ -122,27 +211,42 @@ export function useNanobotStream( if (ev.event === "delta") { if (suppressStreamUntilTurnEndRef.current) return; - const id = buffer.current?.messageId ?? crypto.randomUUID(); - if (!buffer.current) { - buffer.current = { messageId: id, parts: [] }; - setMessages((prev) => [ - ...prev, - { - id, - role: "assistant", - content: "", - isStreaming: true, - createdAt: Date.now(), - }, - ]); - setIsStreaming(true); - } - buffer.current.parts.push(ev.text); - const combined = buffer.current.parts.join(""); - const targetId = buffer.current.messageId; - setMessages((prev) => - prev.map((m) => (m.id === targetId ? { ...m, content: combined } : m)), - ); + const chunk = ev.text; + setIsStreaming(true); + setMessages((prev) => { + // Reuse an in-flight assistant placeholder (typically created by + // ``reasoning_delta``) so the answer renders below its own + // thinking trace instead of in a parallel row. + const adopted = !buffer.current ? findActiveAssistantPlaceholder(prev) : null; + let targetId: string; + let next: UIMessage[]; + if (buffer.current) { + targetId = buffer.current.messageId; + next = prev; + } else if (adopted) { + targetId = adopted; + buffer.current = { messageId: targetId, parts: [] }; + next = prev; + } else { + targetId = crypto.randomUUID(); + buffer.current = { messageId: targetId, parts: [] }; + next = [ + ...prev, + { + id: targetId, + role: "assistant", + content: "", + isStreaming: true, + createdAt: Date.now(), + }, + ]; + } + buffer.current.parts.push(chunk); + const combined = buffer.current.parts.join(""); + return next.map((m) => + m.id === targetId ? { ...m, content: combined, isStreaming: true } : m, + ); + }); return; } @@ -159,6 +263,21 @@ export function useNanobotStream( return; } + if (ev.event === "reasoning_delta") { + if (suppressStreamUntilTurnEndRef.current) return; + const chunk = ev.text; + if (!chunk) return; + setMessages((prev) => attachReasoningChunk(prev, chunk)); + setIsStreaming(true); + return; + } + + if (ev.event === "reasoning_end") { + if (suppressStreamUntilTurnEndRef.current) return; + setMessages((prev) => closeReasoningStream(prev)); + return; + } + if (ev.event === "turn_end") { // Definitive signal that the turn is fully complete. Cancel any // pending debounce timer and stop the loading indicator immediately. @@ -183,10 +302,19 @@ export function useNanobotStream( if (ev.event === "message") { if ( suppressStreamUntilTurnEndRef.current && - (ev.kind === "tool_hint" || ev.kind === "progress") + (ev.kind === "tool_hint" || ev.kind === "progress" || ev.kind === "reasoning") ) { return; } + // Back-compat: a legacy ``kind: "reasoning"`` message (no streaming + // partner) is treated as one complete delta + immediate end so the + // bubble renders identically to the streaming path. + if (ev.kind === "reasoning") { + const line = ev.text; + if (!line) return; + setMessages((prev) => closeReasoningStream(attachReasoningChunk(prev, line))); + return; + } // Intermediate agent breadcrumbs (tool-call hints, raw progress). // Attach them to the last trace row if it was the last emitted item // so a sequence of calls collapses into one compact trace group. diff --git a/webui/src/hooks/useSessions.ts b/webui/src/hooks/useSessions.ts index e05e16a20..d1be437b7 100644 --- a/webui/src/hooks/useSessions.ts +++ b/webui/src/hooks/useSessions.ts @@ -14,6 +14,48 @@ import type { ChatSummary, UIMessage } from "@/lib/types"; const EMPTY_MESSAGES: UIMessage[] = []; +type HistoryMessage = Awaited>["messages"][number]; + +function reasoningFromHistory(message: HistoryMessage): string | undefined { + if (typeof message.reasoning_content === "string" && message.reasoning_content.trim()) { + return message.reasoning_content; + } + if (!Array.isArray(message.thinking_blocks)) return undefined; + const parts = message.thinking_blocks + .map((block) => { + if (!block || typeof block !== "object") return ""; + const thinking = (block as { thinking?: unknown }).thinking; + return typeof thinking === "string" ? thinking.trim() : ""; + }) + .filter(Boolean); + return parts.length > 0 ? parts.join("\n\n") : undefined; +} + +function formatToolCallTrace(call: unknown): string | null { + if (!call || typeof call !== "object") return null; + const item = call as { + name?: unknown; + function?: { name?: unknown; arguments?: unknown }; + }; + const name = + typeof item.function?.name === "string" + ? item.function.name + : typeof item.name === "string" + ? item.name + : ""; + if (!name) return null; + const args = item.function?.arguments; + if (typeof args === "string" && args.trim()) return `${name}(${args})`; + return `${name}()`; +} + +function toolTracesFromHistory(message: HistoryMessage): string[] { + if (!Array.isArray(message.tool_calls)) return []; + return message.tool_calls + .map(formatToolCallTrace) + .filter((trace): trace is string => !!trace); +} + /** Sidebar state: fetches the full session list and exposes create / delete actions. */ export function useSessions(): { sessions: ChatSummary[]; @@ -143,14 +185,28 @@ export function useSessionHistory(key: string | null): { m.role === "user" && media?.every((item) => item.kind === "image") ? media.map((item) => ({ url: item.url, name: item.name })) : undefined; + const row: UIMessage = { + id: `hist-${idx}`, + role: m.role, + content: m.content, + createdAt: m.timestamp ? Date.parse(m.timestamp) : Date.now(), + ...(images ? { images } : {}), + ...(media ? { media } : {}), + ...(m.role === "assistant" && reasoningFromHistory(m) + ? { reasoning: reasoningFromHistory(m), reasoningStreaming: false } + : {}), + }; + const traces = m.role === "assistant" ? toolTracesFromHistory(m) : []; + if (traces.length === 0) return [row]; return [ + ...(row.content.trim() || row.reasoning || row.media?.length ? [row] : []), { - id: `hist-${idx}`, - role: m.role, - content: m.content, + id: `hist-${idx}-tools`, + role: "tool" as const, + kind: "trace" as const, + content: traces[traces.length - 1], + traces, createdAt: m.timestamp ? Date.parse(m.timestamp) : Date.now(), - ...(images ? { images } : {}), - ...(media ? { media } : {}), }, ]; }); diff --git a/webui/src/i18n/locales/en/common.json b/webui/src/i18n/locales/en/common.json index 4cf1b6391..e82a8f5b7 100644 --- a/webui/src/i18n/locales/en/common.json +++ b/webui/src/i18n/locales/en/common.json @@ -332,6 +332,8 @@ "assistantTyping": "Assistant is typing", "toolSingle": "Using a tool", "toolMany": "Used {{count}} tools", + "reasoning": "Thinking", + "reasoningStreaming": "Thinking…", "imageAttachment": "Image attachment", "copyReply": "Copy reply", "copiedReply": "Copied reply" diff --git a/webui/src/i18n/locales/zh-CN/common.json b/webui/src/i18n/locales/zh-CN/common.json index fed932f29..18d4b5e16 100644 --- a/webui/src/i18n/locales/zh-CN/common.json +++ b/webui/src/i18n/locales/zh-CN/common.json @@ -320,6 +320,8 @@ "assistantTyping": "助手正在输入", "toolSingle": "正在使用工具", "toolMany": "已使用 {{count}} 个工具", + "reasoning": "思考过程", + "reasoningStreaming": "正在思考…", "imageAttachment": "图片附件", "copyReply": "复制回复", "copiedReply": "已复制回复" diff --git a/webui/src/lib/api.ts b/webui/src/lib/api.ts index 23a8c2a67..c27ebd3d6 100644 --- a/webui/src/lib/api.ts +++ b/webui/src/lib/api.ts @@ -89,6 +89,8 @@ export async function fetchSessionMessages( content: string; timestamp?: string; tool_calls?: unknown; + reasoning_content?: string | null; + thinking_blocks?: unknown; tool_call_id?: string; name?: string; /** Present on ``user`` turns that attached images. Paths have already diff --git a/webui/src/lib/types.ts b/webui/src/lib/types.ts index 5e7dc9288..25c317753 100644 --- a/webui/src/lib/types.ts +++ b/webui/src/lib/types.ts @@ -44,6 +44,13 @@ export interface UIMessage { images?: UIImage[]; /** Signed or local UI-renderable media attachments. */ media?: UIMediaAttachment[]; + /** Assistant turn: accumulated model reasoning / thinking text. Built up + * incrementally from ``reasoning_delta`` frames; finalized when + * ``reasoning_end`` arrives. */ + reasoning?: string; + /** True while ``reasoning_delta`` frames are still arriving for this turn. + * Drives the shimmer header on ``ReasoningBubble``. */ + reasoningStreaming?: boolean; } export interface ChatSummary { @@ -141,7 +148,7 @@ export type InboundEvent = media_urls?: Array<{ url: string; name?: string }>; /** Present when the frame is an agent breadcrumb (e.g. tool hint, * generic progress line) rather than a conversational reply. */ - kind?: "tool_hint" | "progress"; + kind?: "tool_hint" | "progress" | "reasoning"; } | { event: "delta"; @@ -154,6 +161,17 @@ export type InboundEvent = chat_id: string; stream_id?: string; } + | { + event: "reasoning_delta"; + chat_id: string; + text: string; + stream_id?: string; + } + | { + event: "reasoning_end"; + chat_id: string; + stream_id?: string; + } | { event: "runtime_model_updated"; model_name: string; diff --git a/webui/src/tests/message-bubble.test.tsx b/webui/src/tests/message-bubble.test.tsx index 35cdaed40..4f5d504dd 100644 --- a/webui/src/tests/message-bubble.test.tsx +++ b/webui/src/tests/message-bubble.test.tsx @@ -72,11 +72,12 @@ describe("MessageBubble", () => { render(); const toggle = screen.getByRole("button", { name: /used 2 tools/i }); - expect(screen.getByText('weather("get")')).toBeInTheDocument(); - expect(screen.getByText('search "hk weather"')).toBeInTheDocument(); + expect(screen.queryByText('weather("get")')).not.toBeInTheDocument(); + expect(screen.queryByText('search "hk weather"')).not.toBeInTheDocument(); fireEvent.click(toggle); - expect(screen.queryByText('weather("get")')).not.toBeInTheDocument(); + expect(screen.getByText('weather("get")')).toBeInTheDocument(); + expect(screen.getByText('search "hk weather"')).toBeInTheDocument(); }); it("renders video media as an inline player", () => { @@ -103,6 +104,45 @@ describe("MessageBubble", () => { expect(container.querySelector("video[controls]")).toBeInTheDocument(); }); + it("auto-expands the reasoning trace while streaming with a shimmer header", () => { + const message: UIMessage = { + id: "a-reasoning-streaming", + role: "assistant", + content: "", + createdAt: Date.now(), + reasoning: "Step 1: parse intent. Step 2: compute.", + reasoningStreaming: true, + }; + + const { container } = render(); + + expect(screen.getByText("Thinking…")).toBeInTheDocument(); + expect(screen.getByText(/Step 1: parse intent\./)).toBeInTheDocument(); + expect(container.querySelector(".reasoning-shimmer")).toBeInTheDocument(); + expect(screen.getByRole("button", { name: /thinking/i }).parentElement).not.toHaveClass("mb-2"); + }); + + it("collapses the reasoning section by default once streaming ends", () => { + const message: UIMessage = { + id: "a-reasoning-done", + role: "assistant", + content: "The answer is 42.", + createdAt: Date.now(), + reasoning: "hidden until expanded", + reasoningStreaming: false, + }; + + render(); + + expect(screen.getByText("Thinking")).toBeInTheDocument(); + expect(screen.getByText("The answer is 42.")).toBeInTheDocument(); + expect(screen.queryByText("hidden until expanded")).not.toBeInTheDocument(); + expect(screen.getByRole("button", { name: /thinking/i }).parentElement).toHaveClass("mb-2"); + + fireEvent.click(screen.getByRole("button", { name: /thinking/i })); + expect(screen.getByText("hidden until expanded")).toBeInTheDocument(); + }); + it("renders assistant image media as a larger generated result", () => { const message: UIMessage = { id: "a-image", diff --git a/webui/src/tests/thread-composer.test.tsx b/webui/src/tests/thread-composer.test.tsx index 7b147602a..015ff50ad 100644 --- a/webui/src/tests/thread-composer.test.tsx +++ b/webui/src/tests/thread-composer.test.tsx @@ -1,5 +1,5 @@ -import { fireEvent, render, screen } from "@testing-library/react"; -import { describe, expect, it, vi } from "vitest"; +import { fireEvent, render, screen, waitFor } from "@testing-library/react"; +import { afterEach, describe, expect, it, vi } from "vitest"; import { ThreadComposer } from "@/components/thread/ThreadComposer"; import type { SlashCommand } from "@/lib/types"; @@ -19,6 +19,33 @@ const COMMANDS: SlashCommand[] = [ argHint: "[n]", }, ]; +const ORIGINAL_INNER_HEIGHT = window.innerHeight; + +afterEach(() => { + vi.restoreAllMocks(); + Object.defineProperty(window, "innerHeight", { + value: ORIGINAL_INNER_HEIGHT, + configurable: true, + }); +}); + +function rect(init: Partial): DOMRect { + const top = init.top ?? 0; + const left = init.left ?? 0; + const width = init.width ?? 0; + const height = init.height ?? 0; + return { + x: init.x ?? left, + y: init.y ?? top, + top, + left, + width, + height, + right: init.right ?? left + width, + bottom: init.bottom ?? top + height, + toJSON: () => ({}), + }; +} describe("ThreadComposer", () => { it("renders a readonly hero model composer when provided", () => { @@ -74,7 +101,9 @@ describe("ThreadComposer", () => { const input = screen.getByLabelText("Message input"); fireEvent.change(input, { target: { value: "/" } }); - expect(screen.getByRole("listbox", { name: "Slash commands" })).toBeInTheDocument(); + const palette = screen.getByRole("listbox", { name: "Slash commands" }); + expect(palette).toBeInTheDocument(); + expect(palette).toHaveStyle({ maxHeight: "288px" }); expect(screen.getByRole("option", { name: /\/stop/i })).toHaveAttribute( "aria-selected", "true", @@ -92,6 +121,55 @@ describe("ThreadComposer", () => { expect(screen.queryByRole("listbox", { name: "Slash commands" })).not.toBeInTheDocument(); }); + it("opens the slash command palette downward when there is more room below", async () => { + vi.spyOn(HTMLFormElement.prototype, "getBoundingClientRect").mockReturnValue( + rect({ top: 40, bottom: 160, width: 800, height: 120 }), + ); + Object.defineProperty(window, "innerHeight", { + value: 330, + configurable: true, + }); + render( + , + ); + const input = screen.getByLabelText("Message input"); + + fireEvent.change(input, { target: { value: "/" } }); + + await waitFor(() => { + const palette = screen.getByRole("listbox", { name: "Slash commands" }); + expect(palette.className).toContain("top-full"); + expect(palette).toHaveStyle({ maxHeight: "162px" }); + }); + }); + + it("dismisses the slash command palette on outside click", () => { + render( +
+ + +
, + ); + + fireEvent.change(screen.getByLabelText("Message input"), { + target: { value: "/" }, + }); + expect(screen.getByRole("listbox", { name: "Slash commands" })).toBeInTheDocument(); + + fireEvent.pointerDown(screen.getByRole("button", { name: "outside" })); + + expect(screen.queryByRole("listbox", { name: "Slash commands" })).not.toBeInTheDocument(); + }); + it("sends image generation mode with automatic aspect ratio", () => { const onSend = vi.fn(); render( diff --git a/webui/src/tests/thread-messages.test.tsx b/webui/src/tests/thread-messages.test.tsx new file mode 100644 index 000000000..710b86298 --- /dev/null +++ b/webui/src/tests/thread-messages.test.tsx @@ -0,0 +1,52 @@ +import { render } from "@testing-library/react"; +import { describe, expect, it } from "vitest"; + +import { ThreadMessages } from "@/components/thread/ThreadMessages"; +import type { UIMessage } from "@/lib/types"; + +describe("ThreadMessages", () => { + it("uses compact spacing between consecutive auxiliary rows", () => { + const messages: UIMessage[] = [ + { + id: "r1", + role: "assistant", + content: "", + reasoning: "thinking", + reasoningStreaming: false, + isStreaming: true, + createdAt: Date.now(), + }, + { + id: "t1", + role: "tool", + kind: "trace", + content: "search()", + traces: ["search()"], + createdAt: Date.now(), + }, + { + id: "r2", + role: "assistant", + content: "", + reasoning: "more thinking", + reasoningStreaming: false, + isStreaming: true, + createdAt: Date.now(), + }, + { + id: "a1", + role: "assistant", + content: "final answer", + createdAt: Date.now(), + }, + ]; + + const { container } = render(); + const rows = Array.from(container.firstElementChild?.children ?? []); + + expect(rows[0]).not.toHaveClass("mt-2", "mt-5"); + expect(rows[1]).toHaveClass("mt-2"); + expect(rows[2]).toHaveClass("mt-2"); + expect(rows[3]).toHaveClass("mt-5"); + }); +}); diff --git a/webui/src/tests/thread-shell.test.tsx b/webui/src/tests/thread-shell.test.tsx index 8dd999d6b..f9bf7db0c 100644 --- a/webui/src/tests/thread-shell.test.tsx +++ b/webui/src/tests/thread-shell.test.tsx @@ -573,7 +573,7 @@ describe("ThreadShell", () => { await waitFor(() => expect(screen.getByText("live assistant reply")).toBeInTheDocument()); }); - it("does not open slash commands on the blank welcome page", async () => { + it("opens slash commands on the blank welcome page", async () => { const client = makeClient(); vi.stubGlobal( "fetch", @@ -583,10 +583,11 @@ describe("ThreadShell", () => { return httpJson({ commands: [ { - command: "/stop", - title: "Stop current task", - description: "Cancel the active agent turn.", - icon: "square", + command: "/history", + title: "Show conversation history", + description: "Print the last N persisted messages.", + icon: "history", + arg_hint: "[n]", }, ], }); @@ -622,7 +623,8 @@ describe("ThreadShell", () => { target: { value: "/" }, }); - expect(screen.queryByRole("listbox", { name: "Slash commands" })).not.toBeInTheDocument(); + expect(screen.getByRole("listbox", { name: "Slash commands" })).toBeInTheDocument(); + expect(screen.getByRole("option", { name: /\/history/i })).toBeInTheDocument(); }); it("switches welcome quick actions when image mode is enabled", async () => { diff --git a/webui/src/tests/useNanobotStream.test.tsx b/webui/src/tests/useNanobotStream.test.tsx index 60e6ada62..0aa069cfb 100644 --- a/webui/src/tests/useNanobotStream.test.tsx +++ b/webui/src/tests/useNanobotStream.test.tsx @@ -113,6 +113,208 @@ describe("useNanobotStream", () => { expect(result.current.messages[1].kind).toBeUndefined(); }); + it("accumulates reasoning_delta chunks on a placeholder until reasoning_end", () => { + const fake = fakeClient(); + const { result } = renderHook(() => useNanobotStream("chat-r", EMPTY_MESSAGES), { + wrapper: wrap(fake.client), + }); + + act(() => { + fake.emit("chat-r", { + event: "reasoning_delta", + chat_id: "chat-r", + text: "Let me think ", + }); + fake.emit("chat-r", { + event: "reasoning_delta", + chat_id: "chat-r", + text: "step by step.", + }); + }); + + expect(result.current.messages).toHaveLength(1); + expect(result.current.messages[0].role).toBe("assistant"); + expect(result.current.messages[0].reasoning).toBe("Let me think step by step."); + expect(result.current.messages[0].reasoningStreaming).toBe(true); + + act(() => { + fake.emit("chat-r", { event: "reasoning_end", chat_id: "chat-r" }); + }); + + expect(result.current.messages[0].reasoningStreaming).toBe(false); + expect(result.current.messages[0].reasoning).toBe("Let me think step by step."); + }); + + it("absorbs a streaming reasoning placeholder into the answer turn that follows", () => { + const fake = fakeClient(); + const { result } = renderHook(() => useNanobotStream("chat-r2", EMPTY_MESSAGES), { + wrapper: wrap(fake.client), + }); + + act(() => { + fake.emit("chat-r2", { + event: "reasoning_delta", + chat_id: "chat-r2", + text: "Plan first.", + }); + fake.emit("chat-r2", { event: "reasoning_end", chat_id: "chat-r2" }); + fake.emit("chat-r2", { + event: "delta", + chat_id: "chat-r2", + text: "The answer is 42.", + }); + fake.emit("chat-r2", { event: "stream_end", chat_id: "chat-r2" }); + }); + + expect(result.current.messages).toHaveLength(1); + expect(result.current.messages[0].content).toBe("The answer is 42."); + expect(result.current.messages[0].reasoning).toBe("Plan first."); + expect(result.current.messages[0].reasoningStreaming).toBe(false); + }); + + it("ignores empty reasoning_delta frames", () => { + const fake = fakeClient(); + const { result } = renderHook(() => useNanobotStream("chat-r3", EMPTY_MESSAGES), { + wrapper: wrap(fake.client), + }); + + act(() => { + fake.emit("chat-r3", { + event: "reasoning_delta", + chat_id: "chat-r3", + text: "", + }); + }); + + expect(result.current.messages).toHaveLength(0); + }); + + it("treats legacy kind=reasoning messages as a complete delta + end pair", () => { + const fake = fakeClient(); + const { result } = renderHook(() => useNanobotStream("chat-r4", EMPTY_MESSAGES), { + wrapper: wrap(fake.client), + }); + + act(() => { + fake.emit("chat-r4", { + event: "message", + chat_id: "chat-r4", + text: "one-shot reasoning", + kind: "reasoning", + }); + }); + + expect(result.current.messages).toHaveLength(1); + expect(result.current.messages[0].reasoning).toBe("one-shot reasoning"); + expect(result.current.messages[0].reasoningStreaming).toBe(false); + }); + + it("attaches post-hoc reasoning to the same assistant turn above the answer", () => { + const fake = fakeClient(); + const { result } = renderHook(() => useNanobotStream("chat-r5", EMPTY_MESSAGES), { + wrapper: wrap(fake.client), + }); + + act(() => { + fake.emit("chat-r5", { + event: "delta", + chat_id: "chat-r5", + text: "hi~", + }); + fake.emit("chat-r5", { event: "stream_end", chat_id: "chat-r5" }); + fake.emit("chat-r5", { + event: "reasoning_delta", + chat_id: "chat-r5", + text: "This reasoning arrived after the answer stream.", + }); + fake.emit("chat-r5", { event: "reasoning_end", chat_id: "chat-r5" }); + }); + + expect(result.current.messages).toHaveLength(1); + expect(result.current.messages[0].content).toBe("hi~"); + expect(result.current.messages[0].reasoning).toBe( + "This reasoning arrived after the answer stream.", + ); + expect(result.current.messages[0].reasoningStreaming).toBe(false); + }); + + it("does not attach a new turn's reasoning across the latest user boundary", () => { + const fake = fakeClient(); + const initialMessages = [ + { + id: "a-prev", + role: "assistant" as const, + content: "Previous answer.", + reasoning: "Previous thought.", + createdAt: Date.now(), + }, + { + id: "u-next", + role: "user" as const, + content: "Next question", + createdAt: Date.now(), + }, + ]; + const { result } = renderHook( + () => useNanobotStream("chat-r6", initialMessages), + { wrapper: wrap(fake.client) }, + ); + + act(() => { + fake.emit("chat-r6", { + event: "reasoning_delta", + chat_id: "chat-r6", + text: "New turn thinking.", + }); + }); + + expect(result.current.messages).toHaveLength(3); + expect(result.current.messages[0].reasoning).toBe("Previous thought."); + expect(result.current.messages[2].role).toBe("assistant"); + expect(result.current.messages[2].content).toBe(""); + expect(result.current.messages[2].reasoning).toBe("New turn thinking."); + expect(result.current.messages[2].reasoningStreaming).toBe(true); + }); + + it("does not attach reasoning across a tool trace boundary", () => { + const fake = fakeClient(); + const { result } = renderHook(() => useNanobotStream("chat-r7", EMPTY_MESSAGES), { + wrapper: wrap(fake.client), + }); + + act(() => { + fake.emit("chat-r7", { + event: "reasoning_delta", + chat_id: "chat-r7", + text: "First reasoning.", + }); + fake.emit("chat-r7", { event: "reasoning_end", chat_id: "chat-r7" }); + fake.emit("chat-r7", { + event: "message", + chat_id: "chat-r7", + text: "web_search({\"query\":\"OpenClaw\"})", + kind: "tool_hint", + }); + fake.emit("chat-r7", { + event: "reasoning_delta", + chat_id: "chat-r7", + text: "Second reasoning.", + }); + }); + + expect(result.current.messages).toHaveLength(3); + expect(result.current.messages.map((m) => m.kind ?? "message")).toEqual([ + "message", + "trace", + "message", + ]); + expect(result.current.messages[0].reasoning).toBe("First reasoning."); + expect(result.current.messages[1].traces).toEqual([ + "web_search({\"query\":\"OpenClaw\"})", + ]); + expect(result.current.messages[2].reasoning).toBe("Second reasoning."); + }); + it("attaches assistant media_urls to complete messages", () => { const fake = fakeClient(); const { result } = renderHook(() => useNanobotStream("chat-m", EMPTY_MESSAGES), { diff --git a/webui/src/tests/useSessions.test.tsx b/webui/src/tests/useSessions.test.tsx index 4805c6567..988b97252 100644 --- a/webui/src/tests/useSessions.test.tsx +++ b/webui/src/tests/useSessions.test.tsx @@ -170,6 +170,92 @@ describe("useSessions", () => { ]); }); + it("hydrates persisted assistant reasoning into the replayed message", async () => { + vi.mocked(api.fetchSessionMessages).mockResolvedValue({ + key: "websocket:chat-reasoning", + created_at: "2026-04-20T10:00:00Z", + updated_at: "2026-04-20T10:05:00Z", + messages: [ + { + role: "assistant", + content: "final answer", + timestamp: "2026-04-20T10:00:01Z", + reasoning_content: "hidden but persisted reasoning", + }, + ], + }); + + const { result } = renderHook(() => useSessionHistory("websocket:chat-reasoning"), { + wrapper: wrap(fakeClient()), + }); + + await waitFor(() => expect(result.current.loading).toBe(false)); + + expect(result.current.messages).toHaveLength(1); + expect(result.current.messages[0].role).toBe("assistant"); + expect(result.current.messages[0].content).toBe("final answer"); + expect(result.current.messages[0].reasoning).toBe("hidden but persisted reasoning"); + expect(result.current.messages[0].reasoningStreaming).toBe(false); + }); + + it("hydrates historical assistant tool calls into a replay trace row", async () => { + vi.mocked(api.fetchSessionMessages).mockResolvedValue({ + key: "websocket:chat-tools", + created_at: "2026-04-20T10:00:00Z", + updated_at: "2026-04-20T10:05:00Z", + messages: [ + { + role: "user", + content: "research this", + timestamp: "2026-04-20T10:00:00Z", + }, + { + role: "assistant", + content: "", + timestamp: "2026-04-20T10:00:01Z", + tool_calls: [ + { + id: "call-1", + type: "function", + function: { name: "web_search", arguments: "{\"query\":\"agents\"}" }, + }, + { + id: "call-2", + type: "function", + function: { name: "web_fetch", arguments: "{\"url\":\"https://example.com\"}" }, + }, + ], + }, + { + role: "tool", + content: "tool output that should not render directly", + timestamp: "2026-04-20T10:00:02Z", + tool_call_id: "call-1", + }, + { + role: "assistant", + content: "summary", + timestamp: "2026-04-20T10:00:03Z", + }, + ], + }); + + const { result } = renderHook(() => useSessionHistory("websocket:chat-tools"), { + wrapper: wrap(fakeClient()), + }); + + await waitFor(() => expect(result.current.loading).toBe(false)); + + expect(result.current.messages.map((m) => m.role)).toEqual(["user", "tool", "assistant"]); + const trace = result.current.messages[1]; + expect(trace.kind).toBe("trace"); + expect(trace.traces).toEqual([ + "web_search({\"query\":\"agents\"})", + "web_fetch({\"url\":\"https://example.com\"})", + ]); + expect(result.current.messages[2].content).toBe("summary"); + }); + it("flags history with trailing assistant tool calls as still pending", async () => { vi.mocked(api.fetchSessionMessages).mockResolvedValue({ key: "websocket:chat-pending",