Previously _validate_allow_from raised SystemExit when allowFrom was
missing, forcing every channel to declare an explicit allowlist.
With the pairing feature this is no longer necessary: a channel with
no allowFrom simply operates in pairing-only mode, letting users
approve senders via /pairing approve <code> from the WebUI or CLI.
- Replace SystemExit with an info log in _validate_allow_from
- Add test_validate_allow_from_allows_missing_allow_from
- Assert pending_user_turn is cleared from session metadata after
shortcut commands (e.g. /help) in test_auto_compact.py.
- Add test for None allow_from / allowFrom values in
test_base_channel.py to prevent TypeError regressions.
- AgentLoop._state_command now persists user message and assistant
response for shortcut commands (e.g. /pairing) so WebUI history
hydration after _turn_end no longer shows an empty chat. /new is
excluded because it intentionally clears the session.
- Feishu _on_message sends pairing codes for unauthorized DMs before
any media side effects (reactions, downloads, transcription).
Group chat unauthorized senders are still silently ignored early.
- Update test_feishu_reply to assert the new DM pairing behavior.
/pairing is now a first-class built-in command dispatched through
CommandRouter, just like /status, /model, /dream, etc.
Benefits:
- WebUI automatically shows /pairing in the slash command palette
(because builtin_command_palette() feeds /api/commands).
- All channels (Telegram, Discord, WebSocket, etc.) use the same
dispatch path for /pairing; no more channel-level interception.
- The command still only works for already-authorised users because
is_allowed() gates message ingestion before the bus.
Changes:
- Add handle_pairing_command() to nanobot.pairing.store — pure
function callable from CLI, CommandRouter, and tests.
- Add cmd_pairing to nanobot.command.builtin and register in
BUILTIN_COMMAND_SPECS + register_builtin_commands().
- Remove BaseChannel._handle_pairing_command() and the /pairing
interception logic from _handle_message().
- Clean up unused pairing imports from base.py.
- Add unit tests for handle_pairing_command and cmd_pairing dispatch.
- Extract format_pairing_reply() and format_expiry() to eliminate
duplication between BaseChannel and SlackChannel.
- Use _write_text_atomic() from helpers.py instead of hand-rolled
fsync logic in pairing store.
- Convert approved lists to in-memory sets for O(1) lookup.
- Remove collision retry loop (8-char entropy is sufficient).
- Fix /pairing command parsing to split prefix exactly.
- Remove unused import time from base.py.
- Fix tests to pass subcommand_text, not full /pairing string.
- Add os.fsync with Windows-compatible directory flush in pairing store
- Increase pairing code length from 6 -> 8 characters for higher entropy
- Remove SystemExit on empty allowFrom; empty list now defers to pairing
- Update is_allowed docstring to document pairing fallback semantics
- Propagate is_dm to Matrix (direct rooms) and Slack (im channels)
- Slack _is_allowed now checks pairing store for DM allowlist mode
- Fix /pairing revoke to accept optional channel argument
- Move inline import time to module top-level
- Add WebSocket comment explaining is_dm=True assumption
- Add comprehensive tests for store and BaseChannel pairing integration
- Fix existing tests that expected empty allowFrom to hard-exit
Refs #3774
Reasoning now flows as its own stream — symmetric to the answer's
``delta`` / ``stream_end`` pair — instead of being shipped as one
oversized progress message. This lets WebUI render a live "Thinking…"
bubble that updates in place, then auto-collapses when the stream
closes. Other channels remain plugin no-ops by default.
## Protocol
New metadata: ``_reasoning_delta`` (chunk) and ``_reasoning_end``
(close marker). ChannelManager routes both to the dedicated plugin
hooks below; the legacy one-shot ``_reasoning`` is kept for back-compat
and BaseChannel expands it into a single delta + end pair so plugins
only ever implement the streaming primitives.
WebSocket emits two new events:
- ``reasoning_delta`` (event, chat_id, text, optional stream_id)
- ``reasoning_end`` (event, chat_id, optional stream_id)
## BaseChannel surface
- ``send_reasoning_delta(chat_id, delta, metadata)`` — no-op default
- ``send_reasoning_end(chat_id, metadata)`` — no-op default
- ``send_reasoning(msg)`` — back-compat wrapper, base impl forwards
to the streaming primitives
A channel adds reasoning support by overriding the two streaming
primitives. Telegram / Slack / Discord / Feishu / WeChat / Matrix keep
the base no-ops until their bubble UIs are adapted; reasoning silently
drops at dispatch, never as a stray text message.
## AgentHook
Adds ``emit_reasoning_end`` to the hook lifecycle. ``_LoopHook`` tracks
whether a reasoning segment is open and closes it on:
- the first answer delta arriving (so the UI locks the bubble before
the answer renders below),
- ``on_stream_end``,
- one-shot ``reasoning_content`` / ``thinking_blocks`` after a single
non-streaming response.
## WebUI
- ``UIMessage.reasoning`` is now a single accumulated string with a
companion ``reasoningStreaming`` flag.
- ``useNanobotStream`` consumes ``reasoning_delta`` / ``reasoning_end``;
legacy ``kind: "reasoning"`` is auto-translated to a delta + end.
- New ``ReasoningBubble``: shimmer header + auto-expanded while
streaming, collapses to a clickable "Thinking" pill once closed,
respects ``prefers-reduced-motion``.
- Answer deltas adopt the reasoning placeholder so the bubble and the
answer share one assistant row.
## Tests
- ``tests/channels/test_channel_manager_reasoning.py`` — manager routes
delta + end, drops on channel opt-out, expands one-shot back-compat.
- ``tests/channels/test_websocket_channel.py`` — new ``reasoning_delta``
/ ``reasoning_end`` frames, empty-chunk safety, no-subscriber safety,
back-compat expansion.
- ``tests/agent/test_runner_reasoning.py`` — runner closes the segment
on streaming answer start and after one-shot reasoning.
- WebUI ``useNanobotStream`` + ``message-bubble`` cover the new
protocol and the shimmer styling.
## Docs
``docs/configuration.md`` and ``docs/websocket.md`` document the new
events and the plugin contract.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning was being shipped to every channel as a generic progress
message with a `_reasoning: true` flag. Two problems with that:
1. Channels without a low-emphasis UI primitive (Telegram, Slack,
Discord, Feishu...) would dump raw model thoughts as ordinary
replies, polluting the conversation.
2. The agent loop double-gated by inspecting `channels_config`, which
coupled the loop to display policy.
Treat reasoning as its own plugin action — `BaseChannel.send_reasoning`
defaults to a documented no-op; channels that have a fitting affordance
override. ChannelManager routes `_reasoning` outbounds to that method
only when the channel opts in via `show_reasoning` (camelCase alias
`showReasoning` mirrors `sendProgress`). Plugins that don't override
silently drop reasoning — "no fit, no leak" is the contract.
Reference implementation lands for WebSocket / WebUI: a new
`kind: "reasoning"` frame, parked on the active assistant bubble as a
collapsible `Thinking` group above the answer. CLI keeps its existing
direct path (it doesn't go through the bus). `ChannelsConfig.show_reasoning`
flips to `true` by default — only adapted channels surface anything,
others stay quiet.
Loop net diff is -3 lines: the `channels_config.show_reasoning` check
moves out, leaving emit_reasoning a one-liner that publishes and trusts
the channel to decide.
Co-authored-by: Cursor <cursoragent@cursor.com>
The ask_user tool used AskUserInterrupt(BaseException) for mid-turn
blocking, creating heavy coupling across runner, loop, and session
management. The model now asks questions naturally in response text,
the turn ends normally, and the user's next message starts a new turn
with session history providing continuity.
Removed:
- nanobot/agent/tools/ask.py (tool, interrupt, helpers)
- tests/agent/test_ask_user.py
- webui/src/components/thread/AskUserPrompt.tsx
- AskUserInterrupt handling in runner.py
- Dual-path message building in loop.py
- Pending ask detection via history scanning
- button_prompt/buttons emission in WebSocket channel
- ask_user references in Slack channel docstrings
Preserved (MessageTool uses these independently):
- OutboundMessage.buttons field
- Channel button rendering (Telegram, Slack, WebSocket)
Let WebUI users configure the single web search provider credential from BYOK while keeping saved secrets masked and hot-reloaded for new searches.
Co-authored-by: Cursor <cursoragent@cursor.com>
Add a session-scoped slash command palette sourced from backend command metadata, and keep welcome-page quick actions localized across all WebUI languages.
Co-authored-by: Cursor <cursoragent@cursor.com>
_send_text() swallowed API errors (non-zero errcode) with just a
warning log, and send() had three silent return paths (no client,
session paused, no context_token). Neither triggered ChannelManager's
retry logic, causing persistent message loss until a new inbound
message refreshed the context_token.
Now all failure paths raise RuntimeError, matching BaseChannel's
contract and enabling proper retry behavior.
When host is set to 0.0.0.0, the gateway now enforces that either token
or token_issue_secret must be configured — it refuses to start otherwise.
Bootstrap endpoint behavior:
- token_issue_secret configured: always validate regardless of source IP
(handles reverse-proxy scenarios where all connections appear as localhost)
- No secret: only localhost can bootstrap (local dev mode)
The frontend shows an authentication form when bootstrap returns 401/403,
persists the secret in localStorage, and retries automatically on reload.
The previous LAN-access fix (PR #3656) relaxed the bootstrap localhost
check when host was 0.0.0.0, but did not require any authentication —
any device on the network could obtain a token without credentials.
New behavior:
- token_issue_secret configured: always validate, regardless of source
IP (handles reverse-proxy scenarios where all connections appear as
localhost).
- No secret configured: only localhost can bootstrap (local dev mode).
This supersedes the host-based check from PR #3656.
The webui bootstrap endpoint (/webui/bootstrap) rejected all non-localhost
connections with HTTP 403, preventing the embedded webui from working when
accessed from another device on the LAN — even when host was set to 0.0.0.0.
Skip the localhost check when the server is explicitly bound to 0.0.0.0 or ::,
since that signals intent to accept external connections.
Align the WebUI sidebar and chat chrome with the updated design, and generate WebUI session titles asynchronously without blocking turns.
Co-authored-by: Cursor <cursoragent@cursor.com>
A single transient failure between the agent and an OpenAI/Groq Whisper
endpoint currently vanishes as `return ""` in transcribe(). The voice
message arrives as the empty string and there is no way to tell real
silence apart from a failed upload. A malformed but successful response
body is even worse: the JSON-decode error escapes the helper unhandled.
Add a shared `_post_transcription_with_retry` used by both providers.
Retry behaviour:
- exponential backoff 1s -> 2s -> 4s, up to 3 retries (4 attempts)
- retryable HTTP statuses: 408, 429, 500, 502, 503, 504
- retryable exceptions: TimeoutException, ConnectError, ReadError,
WriteError, RemoteProtocolError
Non-transient failures short-circuit to "" on the first attempt --
retrying a misconfigured key or a broken upload only burns rate-limit
quota. Branches that short-circuit:
- missing API key, missing audio file
- file-read errors (PermissionError, OSError) on the audio path,
preserving the nightly contract for direct provider callers
- HTTP auth/4xx body issues via raise_for_status()
- response.json() parse failures
- non-dict JSON payloads
Sharing one helper means OpenAI and Groq cannot drift apart silently.
Thread `language` through the helper. The multipart files dict is rebuilt
inside the per-attempt loop, so when a caller sets self.language the
`language` field is sent on every attempt -- not just the first.
Tests cover:
- every advertised retryable status and exception, parameterized
- language present on attempts 1 and 2 of a 503->200 sequence
- language absent when unset; present when set (both providers)
- malformed JSON body and non-dict JSON body short-circuit to ""
- PermissionError on file read short-circuits with no HTTP attempt
- max-attempts give-up, exponential-backoff schedule, auth no-retry,
missing-key / missing-file short-circuit
Test stub fix: the _StubResponse in tests/channels/test_channel_plugins.py
declared no status_code, which the new helper reads for retry classification.
Set status_code = 200 so the stub advertises the successful response that
those tests already simulate. Also moved the two transcription-provider
imports to the top of that file (previously placed mid-file) so the file
is ruff-clean (E402).
When the Matrix homeserver returns M_UNKNOWN_TOKEN / M_FORBIDDEN /
M_UNAUTHORIZED (or soft_logout), the previous _sync_loop kept retrying
sync_forever every 2 seconds forever, spamming the homeserver and
filling logs (#1851). The auth state cannot recover by retrying, so
this is pure noise and a soft DoS on the homeserver.
- Extract `_is_fatal_auth_response()` helper
- In `_on_sync_error`, on fatal auth: set `_running=False` and call
`stop_sync_forever()` so the loop exits cleanly
- Add exponential backoff (2s → 60s cap) to the generic exception path
in `_sync_loop` so transient network blips also stop hammering
Closes#1851
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matrix sync replays the room timeline on each startup or `/restart`,
causing already-handled messages to be reprocessed (#3553). Even with
`store_sync_tokens=True`, the sync token isn't reliably re-injected
when restoring a session via access_token + load_store(), so the
client re-reads recent timeline entries.
Filter `event.server_timestamp` against the process start time so old
events are dropped at the `_on_message` / `_on_media_message` entry
points. Trade-off: messages received during downtime won't be
processed, which matches the issue reporter's expectation.
Closes#3553
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stream-end events are emitted at the end of every assistant turn. When
the agent has more tool-call rounds queued, the runner sets
`_resuming=True` on the metadata. Without a guard, every intermediate
stream end removed the OnIt reaction (the first one wins, since
`_reaction_ids.pop` empties the slot) and re-added `done_emoji`,
producing a DONE reaction after every tool call instead of only at
final completion.
Wrap the OnIt removal and `done_emoji` add in a `not _resuming` guard
so the OnIt indicator persists across tool-call rounds and DONE fires
exactly once when the agent's final response lands.
`_resuming` already flows through outbound metadata
(`nanobot/agent/loop.py:747`) and survives `_coalesce_stream_deltas`
because pure `_stream_end` messages without `_stream_delta` skip the
merge branch.
Tests:
- test_no_removal_when_resuming
- test_done_emoji_only_on_final_stream_end