Feishu was doing its own is_allowed check before _handle_message
without considering is_dm, so unrecognised p2p senders were silently
ignored instead of receiving a pairing code.
- Remove the early self.is_allowed() return so BaseChannel can handle
permission checks and pairing uniformly.
- Pass is_dm=chat_type == "p2p" to _handle_message so DM pairing
works for Feishu/Lark private chats.
WebSocket already authenticates clients at handshake time via token
or issued-token validation. Setting is_dm=True caused unrecognised
clients to receive a pairing code after they had already passed
token auth, which is nonsensical for a browser-tab client.
Treat WebSocket as non-DM so pairing is never offered; access control
remains at the WS handshake level (allow_from + token gate).
- Extract format_pairing_reply() and format_expiry() to eliminate
duplication between BaseChannel and SlackChannel.
- Use _write_text_atomic() from helpers.py instead of hand-rolled
fsync logic in pairing store.
- Convert approved lists to in-memory sets for O(1) lookup.
- Remove collision retry loop (8-char entropy is sufficient).
- Fix /pairing command parsing to split prefix exactly.
- Remove unused import time from base.py.
- Fix tests to pass subcommand_text, not full /pairing string.
- Add os.fsync with Windows-compatible directory flush in pairing store
- Increase pairing code length from 6 -> 8 characters for higher entropy
- Remove SystemExit on empty allowFrom; empty list now defers to pairing
- Update is_allowed docstring to document pairing fallback semantics
- Propagate is_dm to Matrix (direct rooms) and Slack (im channels)
- Slack _is_allowed now checks pairing store for DM allowlist mode
- Fix /pairing revoke to accept optional channel argument
- Move inline import time to module top-level
- Add WebSocket comment explaining is_dm=True assumption
- Add comprehensive tests for store and BaseChannel pairing integration
- Fix existing tests that expected empty allowFrom to hard-exit
Refs #3774
Replace the file-editing onboarding workflow with a chat-native pairing flow:
- New pairing store (nanobot/pairing/store.py) persists approved senders
and pending codes in ~/.nanobot/pairing.json.
- DM messages from unknown senders receive a short pairing code instead of
silent denial. Group chats remain silently ignored.
- Existing allowFrom semantics are fully preserved; approved pairing users
are merged at runtime so no config migration is needed.
- nanobot pairing list/approve/deny/revoke CLI commands for bootstrap and
emergency management.
- /pairing slash commands intercepted in-channel so owners can approve
senders without leaving the chat.
- is_dm flag added to BaseChannel._handle_message; Telegram, Discord and
WebSocket updated to pass it.
Closes#3768
Shortcut commands (e.g. /help, /pairing) skip BUILD and SAVE states,
so their turns were never persisted to the session. This caused WebUI
chats to appear empty after _turn_end because history hydration reads
from the session file.
Fix by persisting the user message and assistant response inside
_state_command, but tag them with _command=True so Session.get_history
filters them out of LLM context. /new is excluded because it
intentionally clears the session.
- AgentLoop._persist_user_message_early now accepts **kwargs so
_state_command can pass _command=True for the user turn.
- Session.get_history skips messages with _command=True.
Register handlers for im.chat.member.bot.added_v1 and
im.chat.member.bot.deleted_v1 to silence "processor not found"
errors that appear when any bot is added to or removed from a group.
Closes#3772
When an MCP server configured as streamableHttp or SSE is unreachable,
streamable_http_client's anyio task group cleanup raises RuntimeError /
ExceptionGroup that escapes the caller's try/except and crashes the
event loop with "Unhandled exception in event loop".
Fix: add a lightweight TCP probe (_probe_http_url) before entering the
MCP SDK transport. If the port is closed, the server is skipped with a
warning instead of crashing. stdio transport is not probed (local
process).
Closes#3739
Resolve fallbackModels as preset references or explicit inline provider configs so failover uses complete model settings without exposing fallback logic to the agent loop.
Co-authored-by: Cursor <cursoragent@cursor.com>
Bind fallback model chains to the active model configuration so defaults and presets do not inherit or merge fallback behavior implicitly. Require explicit fallback providers while preserving per-fallback generation overrides and context-window safety.
Co-authored-by: Cursor <cursoragent@cursor.com>
When the primary model returns a non-transient error and no content
has been streamed yet, the runner now tries each model listed in the
active preset's fallback_models in order. Each fallback model may
reside on a different provider — a temporary provider instance is
created on-the-fly via make_provider(config, model=...).
Key design:
- Failover is request-scoped (does not affect subagents/dream/consolidator)
- Provider is restored via try/finally after each fallback attempt
- Skipped when content was already streamed to avoid duplicate output
- Recursive failover prevented by clearing fallback_models on fallback spec
- Circuit breaker trips open after 3 consecutive primary failures (60s cooldown)
- Cross-provider routing: fallback model prefix (e.g. groq/) determines provider
Fixes: cross-provider fallback was broken because the factory passed the
original preset (with provider forced to primary's provider) when creating
fallback providers. Now uses provider="auto" so the model string prefix
correctly routes to the right provider.
Also fixes: log messages now distinguish between primary-failed,
previous-fallback-failed, and circuit-open scenarios.
closes: https://github.com/HKUDS/nanobot/issues/3376
Thinking and Used tools are both auxiliary rows, but Thinking still carried
an internal mb-2 even when it was standalone. That made collapsed Thinking
rows visually taller than tool trace rows despite the shared thread spacing.
Only add the extra bottom margin when a Thinking bubble has answer content
below it in the same assistant message. Standalone Thinking rows now share
the same outer box model as Used tools. Tests lock both standalone and
answer-backed cases.
Co-authored-by: Cursor <cursoragent@cursor.com>
Thinking and Used tools are both auxiliary trace rows, but the thread list
was applying the same large gap used between full chat turns. That made
alternating Thinking / Used tools sequences look uneven and too airy.
Move row spacing from a fixed flex gap to per-row margins: full chat turns
keep mt-5, while consecutive auxiliary rows use mt-2. Add coverage for
Thinking -> Used tools -> Thinking spacing.
Co-authored-by: Cursor <cursoragent@cursor.com>
Tool trace groups are supporting details, so default them to collapsed.
Match the Thinking bubble's expanded body to the tool trace affordance by
using the same grouped header and animated fade/slide body treatment.
Update MessageBubble tests to assert tool traces start collapsed and expand
on click.
Co-authored-by: Cursor <cursoragent@cursor.com>
Live rendering merged reasoning chunks by scanning backward to the latest
assistant row. That fixed late reasoning, but the scan skipped trace rows,
so reasoning after a tool call crossed the Used tools block and attached to
the previous assistant iteration. Refresh looked correct because persisted
history reconstructs assistant/tool boundaries.
Treat trace rows as hard phase boundaries, just like user messages. A
reasoning_delta after Used tools now starts a fresh assistant placeholder,
so live rendering matches replay: Thinking -> Used tools -> Thinking ->
Used tools / answer.
Add a regression for reasoning_delta -> reasoning_end -> tool_hint ->
reasoning_delta.
Co-authored-by: Cursor <cursoragent@cursor.com>
Live reasoning/tool frames were rendering correctly, but refreshing WebUI
replayed only role/content/media from `/api/sessions/:key/messages`.
Assistant `reasoning_content` / `thinking_blocks` and `tool_calls` were
already persisted by the backend and returned by the history endpoint, but
useSessionHistory discarded them.
Hydrate persisted assistant reasoning into `UIMessage.reasoning` and
reconstruct assistant tool calls as `kind: "trace"` rows so the replayed
thread keeps the same Thinking bubble and Used tools block as the live
stream. Tool result rows remain hidden from the conversation view to avoid
replaying raw tool output as chat text.
Adds regression coverage for both persisted reasoning and historical tool
call trace hydration.
Co-authored-by: Cursor <cursoragent@cursor.com>
The post-hoc reasoning fix allowed late reasoning frames to attach back to
the nearest assistant message, but the scan crossed a newer user message.
That made the next turn's Thinking bubble render above the previous
assistant reply.
Treat the latest user message as a hard boundary: reasoning after it must
start a new assistant placeholder and can no longer attach to earlier
assistant turns. Add a regression covering previous assistant -> new user
-> reasoning_delta.
Co-authored-by: Cursor <cursoragent@cursor.com>
Some providers only surface structured `reasoning_content` after answer
text has already streamed. The WebUI was treating those late
`reasoning_delta` frames as a fresh assistant placeholder, so the
Thinking bubble rendered below the already-visible answer.
Attach late reasoning back to the active assistant turn instead. The
bubble still renders above the message content, preserving the expected
Thinking -> answer order even when the provider protocol delivers the
reasoning post-hoc. Added a regression test for answer-first followed by
reasoning_delta/reasoning_end.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning now flows as its own stream — symmetric to the answer's
``delta`` / ``stream_end`` pair — instead of being shipped as one
oversized progress message. This lets WebUI render a live "Thinking…"
bubble that updates in place, then auto-collapses when the stream
closes. Other channels remain plugin no-ops by default.
## Protocol
New metadata: ``_reasoning_delta`` (chunk) and ``_reasoning_end``
(close marker). ChannelManager routes both to the dedicated plugin
hooks below; the legacy one-shot ``_reasoning`` is kept for back-compat
and BaseChannel expands it into a single delta + end pair so plugins
only ever implement the streaming primitives.
WebSocket emits two new events:
- ``reasoning_delta`` (event, chat_id, text, optional stream_id)
- ``reasoning_end`` (event, chat_id, optional stream_id)
## BaseChannel surface
- ``send_reasoning_delta(chat_id, delta, metadata)`` — no-op default
- ``send_reasoning_end(chat_id, metadata)`` — no-op default
- ``send_reasoning(msg)`` — back-compat wrapper, base impl forwards
to the streaming primitives
A channel adds reasoning support by overriding the two streaming
primitives. Telegram / Slack / Discord / Feishu / WeChat / Matrix keep
the base no-ops until their bubble UIs are adapted; reasoning silently
drops at dispatch, never as a stray text message.
## AgentHook
Adds ``emit_reasoning_end`` to the hook lifecycle. ``_LoopHook`` tracks
whether a reasoning segment is open and closes it on:
- the first answer delta arriving (so the UI locks the bubble before
the answer renders below),
- ``on_stream_end``,
- one-shot ``reasoning_content`` / ``thinking_blocks`` after a single
non-streaming response.
## WebUI
- ``UIMessage.reasoning`` is now a single accumulated string with a
companion ``reasoningStreaming`` flag.
- ``useNanobotStream`` consumes ``reasoning_delta`` / ``reasoning_end``;
legacy ``kind: "reasoning"`` is auto-translated to a delta + end.
- New ``ReasoningBubble``: shimmer header + auto-expanded while
streaming, collapses to a clickable "Thinking" pill once closed,
respects ``prefers-reduced-motion``.
- Answer deltas adopt the reasoning placeholder so the bubble and the
answer share one assistant row.
## Tests
- ``tests/channels/test_channel_manager_reasoning.py`` — manager routes
delta + end, drops on channel opt-out, expands one-shot back-compat.
- ``tests/channels/test_websocket_channel.py`` — new ``reasoning_delta``
/ ``reasoning_end`` frames, empty-chunk safety, no-subscriber safety,
back-compat expansion.
- ``tests/agent/test_runner_reasoning.py`` — runner closes the segment
on streaming answer start and after one-shot reasoning.
- WebUI ``useNanobotStream`` + ``message-bubble`` cover the new
protocol and the shimmer styling.
## Docs
``docs/configuration.md`` and ``docs/websocket.md`` document the new
events and the plugin contract.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning was being shipped to every channel as a generic progress
message with a `_reasoning: true` flag. Two problems with that:
1. Channels without a low-emphasis UI primitive (Telegram, Slack,
Discord, Feishu...) would dump raw model thoughts as ordinary
replies, polluting the conversation.
2. The agent loop double-gated by inspecting `channels_config`, which
coupled the loop to display policy.
Treat reasoning as its own plugin action — `BaseChannel.send_reasoning`
defaults to a documented no-op; channels that have a fitting affordance
override. ChannelManager routes `_reasoning` outbounds to that method
only when the channel opts in via `show_reasoning` (camelCase alias
`showReasoning` mirrors `sendProgress`). Plugins that don't override
silently drop reasoning — "no fit, no leak" is the contract.
Reference implementation lands for WebSocket / WebUI: a new
`kind: "reasoning"` frame, parked on the active assistant bubble as a
collapsible `Thinking` group above the answer. CLI keeps its existing
direct path (it doesn't go through the bus). `ChannelsConfig.show_reasoning`
flips to `true` by default — only adapted channels surface anything,
others stay quiet.
Loop net diff is -3 lines: the `channels_config.show_reasoning` check
moves out, leaving emit_reasoning a one-liner that publishes and trusts
the channel to decide.
Co-authored-by: Cursor <cursoragent@cursor.com>
Resolves conflicts after main landed the state-machine turn refactor
and the test_runner.py 9-file split:
- nanobot/agent/loop.py: take main's `_state_build`/`_persist_user_message_early`
flow; restore the `reasoning: bool` parameter on `_build_bus_progress_callback`
so the loop hook can mark progress as reasoning-channel without coupling to
the answer stream.
- nanobot/cli/stream.py: keep main's configurable `bot_name`/`bot_icon` header
while preserving the PR's `transient=True` Live + `self._console` routing
+ `_renderable()` final-render path that fixed TUI duplication.
- tests/agent/test_runner.py was deleted on main and split into 9 focused
files; relocated all 6 reasoning tests into a new `test_runner_reasoning.py`
matching the new layout, deduplicated the per-test `ReasoningHook` boilerplate
through a shared `_RecordingHook` helper.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning surfacing was split across three branches in runner.py plus
two separate streaming buffers (loop hook and runner progress stream),
with three independent display-side gates in the CLI. This collapsed
the policy into one source of truth and fixed two real bugs:
- Structured `reasoning_content` was suppressed whenever the answer was
streamed, because the runner gated emission on `streamed_content`.
Providers don't stream `reasoning_content`; it only arrives on the
final response, so the answer stream and the reasoning channel are
independent. Added `streamed_reasoning` to `AgentHookContext` to track
the right bit.
- `channels.showReasoning` was subordinated to `sendProgress`. They are
orthogonal — turning off progress streaming shouldn't silence
reasoning. Reworked the CLI gates accordingly.
Single-helper consolidation:
- `extract_reasoning(reasoning_content, thinking_blocks, content)`
returns `(reasoning_text, cleaned_content)` with a defined fallback
order: dedicated field → Anthropic thinking_blocks → inline
`<think>`/`<thought>` tags. Models that expose none of these
short-circuit to `(None, content)` — zero overhead.
- `IncrementalThinkExtractor` replaces the ad-hoc `emit_incremental_think`
function and its hand-rolled "emitted cursor" state in both the loop
hook and the runner progress stream.
Also documented the new `showReasoning` channel option in
docs/configuration.md and noted its independence from sendProgress.
Co-authored-by: Cursor <cursoragent@cursor.com>
- Remove auto-selection of the most recent session on initial load,
so the app opens to a blank new-chat page instead of the last session.
- Preserve active session state when navigating to/from settings:
keep ThreadShell mounted (hidden via CSS) so scroll position, message
cache, and streaming state are not lost.
- Update onBackToChat to return to blank page when no session was active
instead of falling back to the most recent session.
- Update related test expectations to match the new navigation behavior.
Add extract_think() and emit_incremental_think() helpers to extract thinking content from inline <think> and <thought> tags in the content field. This handles models served via Ollama, self-hosted vLLM, or other compatible endpoints that embed reasoning as inline tags instead of using the dedicated reasoning_content API field.
Also adds Anthropic thinking_blocks support for extended thinking via the thinking content blocks array.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
The ask_user tool used AskUserInterrupt(BaseException) for mid-turn
blocking, creating heavy coupling across runner, loop, and session
management. The model now asks questions naturally in response text,
the turn ends normally, and the user's next message starts a new turn
with session history providing continuity.
Removed:
- nanobot/agent/tools/ask.py (tool, interrupt, helpers)
- tests/agent/test_ask_user.py
- webui/src/components/thread/AskUserPrompt.tsx
- AskUserInterrupt handling in runner.py
- Dual-path message building in loop.py
- Pending ask detection via history scanning
- button_prompt/buttons emission in WebSocket channel
- ask_user references in Slack channel docstrings
Preserved (MessageTool uses these independently):
- OutboundMessage.buttons field
- Channel button rendering (Telegram, Slack, WebSocket)
Remove unused code confirmed dead via vulture scan, grep verification,
and coverage analysis:
- _get_bridge_dir (cli/commands.py): 82-line function with zero callers
- add_assistant_message (agent/context.py): method body never executed,
also removed now-unused build_assistant_message import
- _tool_parameters_schema (agent/tools/base.py): redundant copy of schema
already exposed via the `parameters` property
- MSTEAMS_REF_TTL_S (channels/msteams.py): unused constant (production
uses config.ref_ttl_days directly); inlined in test
- MESSAGE_TYPE_USER (channels/weixin.py): unused constant