Live rendering merged reasoning chunks by scanning backward to the latest
assistant row. That fixed late reasoning, but the scan skipped trace rows,
so reasoning after a tool call crossed the Used tools block and attached to
the previous assistant iteration. Refresh looked correct because persisted
history reconstructs assistant/tool boundaries.
Treat trace rows as hard phase boundaries, just like user messages. A
reasoning_delta after Used tools now starts a fresh assistant placeholder,
so live rendering matches replay: Thinking -> Used tools -> Thinking ->
Used tools / answer.
Add a regression for reasoning_delta -> reasoning_end -> tool_hint ->
reasoning_delta.
Co-authored-by: Cursor <cursoragent@cursor.com>
Live reasoning/tool frames were rendering correctly, but refreshing WebUI
replayed only role/content/media from `/api/sessions/:key/messages`.
Assistant `reasoning_content` / `thinking_blocks` and `tool_calls` were
already persisted by the backend and returned by the history endpoint, but
useSessionHistory discarded them.
Hydrate persisted assistant reasoning into `UIMessage.reasoning` and
reconstruct assistant tool calls as `kind: "trace"` rows so the replayed
thread keeps the same Thinking bubble and Used tools block as the live
stream. Tool result rows remain hidden from the conversation view to avoid
replaying raw tool output as chat text.
Adds regression coverage for both persisted reasoning and historical tool
call trace hydration.
Co-authored-by: Cursor <cursoragent@cursor.com>
The post-hoc reasoning fix allowed late reasoning frames to attach back to
the nearest assistant message, but the scan crossed a newer user message.
That made the next turn's Thinking bubble render above the previous
assistant reply.
Treat the latest user message as a hard boundary: reasoning after it must
start a new assistant placeholder and can no longer attach to earlier
assistant turns. Add a regression covering previous assistant -> new user
-> reasoning_delta.
Co-authored-by: Cursor <cursoragent@cursor.com>
Some providers only surface structured `reasoning_content` after answer
text has already streamed. The WebUI was treating those late
`reasoning_delta` frames as a fresh assistant placeholder, so the
Thinking bubble rendered below the already-visible answer.
Attach late reasoning back to the active assistant turn instead. The
bubble still renders above the message content, preserving the expected
Thinking -> answer order even when the provider protocol delivers the
reasoning post-hoc. Added a regression test for answer-first followed by
reasoning_delta/reasoning_end.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning now flows as its own stream — symmetric to the answer's
``delta`` / ``stream_end`` pair — instead of being shipped as one
oversized progress message. This lets WebUI render a live "Thinking…"
bubble that updates in place, then auto-collapses when the stream
closes. Other channels remain plugin no-ops by default.
## Protocol
New metadata: ``_reasoning_delta`` (chunk) and ``_reasoning_end``
(close marker). ChannelManager routes both to the dedicated plugin
hooks below; the legacy one-shot ``_reasoning`` is kept for back-compat
and BaseChannel expands it into a single delta + end pair so plugins
only ever implement the streaming primitives.
WebSocket emits two new events:
- ``reasoning_delta`` (event, chat_id, text, optional stream_id)
- ``reasoning_end`` (event, chat_id, optional stream_id)
## BaseChannel surface
- ``send_reasoning_delta(chat_id, delta, metadata)`` — no-op default
- ``send_reasoning_end(chat_id, metadata)`` — no-op default
- ``send_reasoning(msg)`` — back-compat wrapper, base impl forwards
to the streaming primitives
A channel adds reasoning support by overriding the two streaming
primitives. Telegram / Slack / Discord / Feishu / WeChat / Matrix keep
the base no-ops until their bubble UIs are adapted; reasoning silently
drops at dispatch, never as a stray text message.
## AgentHook
Adds ``emit_reasoning_end`` to the hook lifecycle. ``_LoopHook`` tracks
whether a reasoning segment is open and closes it on:
- the first answer delta arriving (so the UI locks the bubble before
the answer renders below),
- ``on_stream_end``,
- one-shot ``reasoning_content`` / ``thinking_blocks`` after a single
non-streaming response.
## WebUI
- ``UIMessage.reasoning`` is now a single accumulated string with a
companion ``reasoningStreaming`` flag.
- ``useNanobotStream`` consumes ``reasoning_delta`` / ``reasoning_end``;
legacy ``kind: "reasoning"`` is auto-translated to a delta + end.
- New ``ReasoningBubble``: shimmer header + auto-expanded while
streaming, collapses to a clickable "Thinking" pill once closed,
respects ``prefers-reduced-motion``.
- Answer deltas adopt the reasoning placeholder so the bubble and the
answer share one assistant row.
## Tests
- ``tests/channels/test_channel_manager_reasoning.py`` — manager routes
delta + end, drops on channel opt-out, expands one-shot back-compat.
- ``tests/channels/test_websocket_channel.py`` — new ``reasoning_delta``
/ ``reasoning_end`` frames, empty-chunk safety, no-subscriber safety,
back-compat expansion.
- ``tests/agent/test_runner_reasoning.py`` — runner closes the segment
on streaming answer start and after one-shot reasoning.
- WebUI ``useNanobotStream`` + ``message-bubble`` cover the new
protocol and the shimmer styling.
## Docs
``docs/configuration.md`` and ``docs/websocket.md`` document the new
events and the plugin contract.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning was being shipped to every channel as a generic progress
message with a `_reasoning: true` flag. Two problems with that:
1. Channels without a low-emphasis UI primitive (Telegram, Slack,
Discord, Feishu...) would dump raw model thoughts as ordinary
replies, polluting the conversation.
2. The agent loop double-gated by inspecting `channels_config`, which
coupled the loop to display policy.
Treat reasoning as its own plugin action — `BaseChannel.send_reasoning`
defaults to a documented no-op; channels that have a fitting affordance
override. ChannelManager routes `_reasoning` outbounds to that method
only when the channel opts in via `show_reasoning` (camelCase alias
`showReasoning` mirrors `sendProgress`). Plugins that don't override
silently drop reasoning — "no fit, no leak" is the contract.
Reference implementation lands for WebSocket / WebUI: a new
`kind: "reasoning"` frame, parked on the active assistant bubble as a
collapsible `Thinking` group above the answer. CLI keeps its existing
direct path (it doesn't go through the bus). `ChannelsConfig.show_reasoning`
flips to `true` by default — only adapted channels surface anything,
others stay quiet.
Loop net diff is -3 lines: the `channels_config.show_reasoning` check
moves out, leaving emit_reasoning a one-liner that publishes and trusts
the channel to decide.
Co-authored-by: Cursor <cursoragent@cursor.com>
The ask_user tool used AskUserInterrupt(BaseException) for mid-turn
blocking, creating heavy coupling across runner, loop, and session
management. The model now asks questions naturally in response text,
the turn ends normally, and the user's next message starts a new turn
with session history providing continuity.
Removed:
- nanobot/agent/tools/ask.py (tool, interrupt, helpers)
- tests/agent/test_ask_user.py
- webui/src/components/thread/AskUserPrompt.tsx
- AskUserInterrupt handling in runner.py
- Dual-path message building in loop.py
- Pending ask detection via history scanning
- button_prompt/buttons emission in WebSocket channel
- ask_user references in Slack channel docstrings
Preserved (MessageTool uses these independently):
- OutboundMessage.buttons field
- Channel button rendering (Telegram, Slack, WebSocket)
Align the WebUI sidebar and chat chrome with the updated design, and generate WebUI session titles asynchronously without blocking turns.
Co-authored-by: Cursor <cursoragent@cursor.com>
Keep the new turn-end signal scoped to WebSocket clients, preserve pending tool-call state across trailing tool result rows, and drop the accidental npm lockfile from the Bun-based WebUI.
Co-authored-by: Cursor <cursoragent@cursor.com>
Add signed media URLs to live WebSocket replies and teach the WebUI to classify and render video attachments, so bot-sent videos can play inline in both live chats and session history.
Made-with: Cursor