Resolve fallbackModels as preset references or explicit inline provider configs so failover uses complete model settings without exposing fallback logic to the agent loop.
Co-authored-by: Cursor <cursoragent@cursor.com>
Bind fallback model chains to the active model configuration so defaults and presets do not inherit or merge fallback behavior implicitly. Require explicit fallback providers while preserving per-fallback generation overrides and context-window safety.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning now flows as its own stream — symmetric to the answer's
``delta`` / ``stream_end`` pair — instead of being shipped as one
oversized progress message. This lets WebUI render a live "Thinking…"
bubble that updates in place, then auto-collapses when the stream
closes. Other channels remain plugin no-ops by default.
## Protocol
New metadata: ``_reasoning_delta`` (chunk) and ``_reasoning_end``
(close marker). ChannelManager routes both to the dedicated plugin
hooks below; the legacy one-shot ``_reasoning`` is kept for back-compat
and BaseChannel expands it into a single delta + end pair so plugins
only ever implement the streaming primitives.
WebSocket emits two new events:
- ``reasoning_delta`` (event, chat_id, text, optional stream_id)
- ``reasoning_end`` (event, chat_id, optional stream_id)
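For illustration, the two frames could be serialized roughly as below. This is a minimal sketch: the field names come from the list above, but the helper names ``reasoning_delta_frame`` and ``reasoning_end_frame`` are hypothetical, and the assumption that ``stream_id`` is omitted (rather than sent as null) when absent is ours.

```python
import json
from typing import Optional


def reasoning_delta_frame(chat_id: str, text: str, stream_id: Optional[str] = None) -> str:
    """Serialize one reasoning chunk as a JSON text frame (hypothetical helper)."""
    frame = {"event": "reasoning_delta", "chat_id": chat_id, "text": text}
    if stream_id is not None:
        # Assumption: the optional field is left out entirely when unknown.
        frame["stream_id"] = stream_id
    return json.dumps(frame)


def reasoning_end_frame(chat_id: str, stream_id: Optional[str] = None) -> str:
    """Serialize the close marker for a reasoning stream (hypothetical helper)."""
    frame = {"event": "reasoning_end", "chat_id": chat_id}
    if stream_id is not None:
        frame["stream_id"] = stream_id
    return json.dumps(frame)
```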
## BaseChannel surface
- ``send_reasoning_delta(chat_id, delta, metadata)`` — no-op default
- ``send_reasoning_end(chat_id, metadata)`` — no-op default
- ``send_reasoning(msg)`` — back-compat wrapper, base impl forwards
to the streaming primitives
A channel adds reasoning support by overriding the two streaming
primitives. Telegram / Slack / Discord / Feishu / WeChat / Matrix keep
the base no-ops until their bubble UIs are adapted; reasoning silently
drops at dispatch, never as a stray text message.
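The surface above can be sketched as follows. ``BaseChannelSketch``, ``ReasoningMessage`` and ``RecordingChannel`` are hypothetical stand-ins, not the real classes; the message fields (``chat_id``, ``content``, ``metadata``) are assumed for illustration only.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class ReasoningMessage:
    """Hypothetical one-shot reasoning payload (the legacy path)."""
    chat_id: str
    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class BaseChannelSketch:
    """Models only the reasoning surface; both primitives default to no-ops."""

    async def send_reasoning_delta(
        self, chat_id: str, delta: str, metadata: Optional[Dict[str, Any]] = None
    ) -> None:
        return None  # dropped silently, never sent as a normal text message

    async def send_reasoning_end(
        self, chat_id: str, metadata: Optional[Dict[str, Any]] = None
    ) -> None:
        return None

    async def send_reasoning(self, msg: ReasoningMessage) -> None:
        # Back-compat wrapper: expand a one-shot payload into delta + end.
        if msg.content:
            await self.send_reasoning_delta(msg.chat_id, msg.content, msg.metadata)
        await self.send_reasoning_end(msg.chat_id, msg.metadata)


class RecordingChannel(BaseChannelSketch):
    """A channel that opts in by overriding just the two primitives."""

    def __init__(self) -> None:
        self.frames: List[Tuple[str, str, str]] = []

    async def send_reasoning_delta(self, chat_id, delta, metadata=None) -> None:
        self.frames.append(("delta", chat_id, delta))

    async def send_reasoning_end(self, chat_id, metadata=None) -> None:
        self.frames.append(("end", chat_id, ""))


def demo() -> List[Tuple[str, str, str]]:
    channel = RecordingChannel()
    asyncio.run(channel.send_reasoning(ReasoningMessage("chat-1", "weighing options")))
    return channel.frames
```

Because the wrapper only calls the primitives, a plugin that overrides nothing still accepts legacy one-shot reasoning and simply drops it.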
## AgentHook
Adds ``emit_reasoning_end`` to the hook lifecycle. ``_LoopHook`` tracks
whether a reasoning segment is open and closes it on:
- the first answer delta arriving (so the UI locks the bubble before
the answer renders below),
- ``on_stream_end``,
- one-shot ``reasoning_content`` / ``thinking_blocks`` after a single
non-streaming response.
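A minimal sketch of the open/close bookkeeping described above, assuming a synchronous callback for brevity. ``ReasoningSegmentTracker`` and its method names are hypothetical; only the three closing conditions are taken from the list.

```python
from typing import Callable


class ReasoningSegmentTracker:
    """Tracks whether a reasoning segment is open and closes it exactly once."""

    def __init__(self, emit_reasoning_end: Callable[[], None]) -> None:
        self._emit_reasoning_end = emit_reasoning_end
        self._open = False

    def on_reasoning_delta(self, text: str) -> None:
        if text:
            self._open = True

    def on_answer_delta(self, text: str) -> None:
        # First answer delta: lock the bubble before the answer renders.
        self._close()

    def on_stream_end(self) -> None:
        self._close()

    def on_one_shot_reasoning(self, text: str) -> None:
        # Non-streaming response: the segment opens and closes in one step.
        if text:
            self._open = True
        self._close()

    def _close(self) -> None:
        if self._open:
            self._open = False
            self._emit_reasoning_end()
```

Guarding ``_close`` on the open flag keeps the end marker idempotent, so an answer delta followed by ``on_stream_end`` emits only one close.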
## WebUI
- ``UIMessage.reasoning`` is now a single accumulated string with a
companion ``reasoningStreaming`` flag.
- ``useNanobotStream`` consumes ``reasoning_delta`` / ``reasoning_end``;
legacy ``kind: "reasoning"`` is auto-translated to a delta + end.
- New ``ReasoningBubble``: shimmer header + auto-expanded while
streaming, collapses to a clickable "Thinking" pill once closed,
respects ``prefers-reduced-motion``.
- Answer deltas adopt the reasoning placeholder so the bubble and the
answer share one assistant row.
## Tests
- ``tests/channels/test_channel_manager_reasoning.py`` — manager routes
delta + end, drops on channel opt-out, expands one-shot back-compat.
- ``tests/channels/test_websocket_channel.py`` — new ``reasoning_delta``
/ ``reasoning_end`` frames, empty-chunk safety, no-subscriber safety,
back-compat expansion.
- ``tests/agent/test_runner_reasoning.py`` — runner closes the segment
on streaming answer start and after one-shot reasoning.
- WebUI ``useNanobotStream`` + ``message-bubble`` cover the new
protocol and the shimmer styling.
## Docs
``docs/configuration.md`` and ``docs/websocket.md`` document the new
events and the plugin contract.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning was being shipped to every channel as a generic progress
message with a `_reasoning: true` flag. Two problems with that:
1. Channels without a low-emphasis UI primitive (Telegram, Slack,
Discord, Feishu...) would dump raw model thoughts as ordinary
replies, polluting the conversation.
2. The agent loop double-gated by inspecting `channels_config`, which
coupled the loop to display policy.
Treat reasoning as its own plugin action — `BaseChannel.send_reasoning`
defaults to a documented no-op; channels that have a fitting affordance
override. ChannelManager routes `_reasoning` outbounds to that method
only when the channel opts in via `show_reasoning` (camelCase alias
`showReasoning` mirrors `sendProgress`). Plugins that don't override
silently drop reasoning — "no fit, no leak" is the contract.
Reference implementation lands for WebSocket / WebUI: a new
`kind: "reasoning"` frame, parked on the active assistant bubble as a
collapsible `Thinking` group above the answer. CLI keeps its existing
direct path (it doesn't go through the bus). `ChannelsConfig.show_reasoning`
flips to `true` by default — only adapted channels surface anything,
others stay quiet.
Loop net diff is -3 lines: the `channels_config.show_reasoning` check
moves out, leaving `emit_reasoning` as a one-liner that publishes and trusts
the channel to decide.
Co-authored-by: Cursor <cursoragent@cursor.com>
Resolves conflicts after main landed the state-machine turn refactor
and the test_runner.py 9-file split:
- nanobot/agent/loop.py: take main's `_state_build`/`_persist_user_message_early`
flow; restore the `reasoning: bool` parameter on `_build_bus_progress_callback`
so the loop hook can mark progress as reasoning-channel without coupling to
the answer stream.
- nanobot/cli/stream.py: keep main's configurable `bot_name`/`bot_icon` header
while preserving the PR's `transient=True` Live + `self._console` routing
+ `_renderable()` final-render path that fixed TUI duplication.
- tests/agent/test_runner.py was deleted on main and split into 9 focused
files; relocated all 6 reasoning tests into a new `test_runner_reasoning.py`
matching the new layout, deduplicated the per-test `ReasoningHook` boilerplate
through a shared `_RecordingHook` helper.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning surfacing was split across three branches in runner.py plus
two separate streaming buffers (loop hook and runner progress stream),
with three independent display-side gates in the CLI. This collapsed
the policy into one source of truth and fixed two real bugs:
- Structured `reasoning_content` was suppressed whenever the answer was
streamed, because the runner gated emission on `streamed_content`.
Providers don't stream `reasoning_content`; it only arrives on the
final response, so the answer stream and the reasoning channel are
independent. Added `streamed_reasoning` to `AgentHookContext` to track
the right bit.
- `channels.showReasoning` was subordinated to `sendProgress`. They are
orthogonal — turning off progress streaming shouldn't silence
reasoning. Reworked the CLI gates accordingly.
Single-helper consolidation:
- `extract_reasoning(reasoning_content, thinking_blocks, content)`
returns `(reasoning_text, cleaned_content)` with a defined fallback
order: dedicated field → Anthropic thinking_blocks → inline
`<think>`/`<thought>` tags. Models that expose none of these
short-circuit to `(None, content)` — zero overhead.
- `IncrementalThinkExtractor` replaces the ad-hoc `emit_incremental_think`
function and its hand-rolled "emitted cursor" state in both the loop
hook and the runner progress stream.
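The fallback order can be sketched as below. This is not the project's implementation: the shape of `thinking_blocks` (a list of dicts with a `thinking` key) and the choice to leave `content` untouched when a dedicated field is present are assumptions made for the example.

```python
import re
from typing import Any, Dict, List, Optional, Tuple

# Matches inline <think>...</think> or <thought>...</thought> segments.
_INLINE_RE = re.compile(r"<(think|thought)>(.*?)</\1>", re.DOTALL | re.IGNORECASE)


def extract_reasoning(
    reasoning_content: Optional[str],
    thinking_blocks: Optional[List[Dict[str, Any]]],
    content: Optional[str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning_text, cleaned_content) using a fixed fallback order."""
    # 1. Dedicated field wins.
    if reasoning_content:
        return reasoning_content, content
    # 2. Structured thinking blocks (assumed shape: {"thinking": "..."}).
    if thinking_blocks:
        parts = [b.get("thinking", "") for b in thinking_blocks if isinstance(b, dict)]
        joined = "\n".join(p for p in parts if p)
        if joined:
            return joined, content
    # 3. Inline tags: pull them out and strip them from the answer.
    if content:
        matches = _INLINE_RE.findall(content)
        if matches:
            reasoning = "\n".join(m[1].strip() for m in matches)
            cleaned = _INLINE_RE.sub("", content).strip()
            return (reasoning or None), cleaned
    # Nothing exposed: short-circuit with the content unchanged.
    return None, content
```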
Also documented the new `showReasoning` channel option in
docs/configuration.md and noted its independence from sendProgress.
Co-authored-by: Cursor <cursoragent@cursor.com>
The tool_hint_max_length config field was added but never passed from
config to AgentLoop, so the value always fell back to the default (40)
regardless of what was set in config.json.
Now passes tool_hint_max_length through all AgentLoop() call sites:
- nanobot/nanobot.py (main bot)
- nanobot/cli/commands.py (CLI agent, dev, webui commands)
Also adds documentation in docs/configuration.md.
Replace the asyncio.Semaphore queueing approach with a simple count
check in SpawnTool.execute(). When the concurrency limit is reached,
the tool returns an error string so the agent can perceive the reason
and adjust its behavior instead of silently queueing.
- Remove max_concurrent_subagents parameter threading through
AgentLoop, commands.py, and nanobot.py
- SubagentManager reads the limit directly from AgentDefaults
- SpawnTool checks get_running_count() before calling spawn()
- Simplify tests to verify rejection behavior
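A rough sketch of the count check, written synchronously for brevity. The class names, the `max_concurrent` attribute, and the wording of the error string are hypothetical; only `get_running_count()`, `spawn()` and the reject-instead-of-queue behavior come from the description above.

```python
from typing import Set


class SubagentManagerSketch:
    """Hypothetical stand-in that tracks running subagent task ids."""

    def __init__(self, max_concurrent: int) -> None:
        self.max_concurrent = max_concurrent
        self._running: Set[str] = set()

    def get_running_count(self) -> int:
        return len(self._running)

    def spawn(self, task_id: str) -> str:
        self._running.add(task_id)
        return f"Started subagent {task_id}"

    def finish(self, task_id: str) -> None:
        self._running.discard(task_id)


class SpawnToolSketch:
    def __init__(self, manager: SubagentManagerSketch) -> None:
        self._manager = manager

    def execute(self, task_id: str) -> str:
        # Reject up front so the agent sees the reason instead of waiting in a queue.
        running = self._manager.get_running_count()
        if running >= self._manager.max_concurrent:
            return (
                f"Error: concurrency limit reached "
                f"({running}/{self._manager.max_concurrent} subagents running)"
            )
        return self._manager.spawn(task_id)
```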
Adds Olostep (https://www.olostep.com) as an optional web_search backend
using the official olostep Python SDK (client.answers.create()).
Changes:
- pyproject.toml: adds olostep>=0.1.0 optional dependency
- schema.py: adds olostep to provider comment in WebSearchConfig
- web.py: adds _search_olostep() with lazy import and provider branching
- docs/configuration.md: documents Olostep setup under web search config
- tests: unit tests for the new provider
Backward compatible: existing users see no behavior change unless they
opt into provider: "olostep". The SDK is imported lazily, so it is not a
hard runtime dependency.
Co-authored-by: umerkay <umerkk164@gmail.com>
Resolve the MSTeams stale-reference cleanup conflict by keeping the PR's locked, atomic sidecar-meta implementation and aligning the merged test expectation locally.
Made-with: Cursor
Slack inbound events with subtype=file_share were silently dropped, so
nanobot never saw messages that included attachments. Allow file_share
through, download Slack-private files using the bot token into the
local media dir, and pass them to the agent as media paths plus a
"[file: name]" / "[image: name]" placeholder in the content. Reject
responses that look like Slack's login HTML so an auth page is never
saved as if it were the user's file. Document the required files:read
scope alongside files:write so installs that read attachments are not
quietly missing the permission.
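One way to guard against saving a login page is sketched below. The helper name and the heuristic (Content-Type plus a peek at the first bytes) are assumptions for illustration, not the check used in this change.

```python
def looks_like_html_page(content_type: str, body: bytes) -> bool:
    """Heuristic: treat HTML responses as an auth page, not a file download."""
    if "text/html" in content_type.lower():
        return True
    # Some responses carry a generic content type; sniff the leading bytes.
    head = body[:512].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

A heuristic like this would also reject a genuine HTML attachment, so a real check may need to compare against the file type Slack reports for the upload.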
Capture Slack thread metadata for cron and message-tool deliveries so replies stay in the originating thread, and hydrate first thread mentions with recent Slack context.
Made-with: Cursor
- Separate updated_at into a meta sidecar file (msteams_conversations_meta.json)
to keep backward compatibility with legacy data that never had updated_at.
On first upgrade, legacy refs are kept alive by initializing updated_at to now
instead of purging them immediately.
- Add cross-process locking via fcntl (with Windows fallback) to prevent
concurrent writes from different gateway processes overwriting each other.
- Add ref_touch_interval_s config (default 300s) to throttle how often
successful sends refresh updated_at, preventing unnecessary I/O.
- Touch active refs on send success to prevent them from expiring while in use.
- Add _safe_float and _normalize_ref_record for robust schema migration.
- All ref operations now use a threading.RLock within a process.
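The legacy migration and touch throttling can be sketched as follows. The function names here are hypothetical (they are not the private helpers named above), and locking is left out to keep the example small.

```python
import math
from typing import Any, Dict, Optional


def safe_float(value: Any, default: float) -> float:
    """Parse a float, falling back to default on bad or non-finite input."""
    try:
        result = float(value)
    except (TypeError, ValueError):
        return default
    return result if math.isfinite(result) else default


def normalize_meta(meta: Optional[Dict[str, Any]], now: float) -> Dict[str, float]:
    # Legacy refs have no updated_at: initialize to now so they are not purged.
    return {"updated_at": safe_float((meta or {}).get("updated_at"), now)}


def should_touch(updated_at: float, now: float, ref_touch_interval_s: float = 300.0) -> bool:
    # Throttle: refresh updated_at only once the interval has elapsed.
    return (now - updated_at) >= ref_touch_interval_s
```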
Discord threads use their own channel IDs, so allowChannels was blocking
thread replies unless each thread ID was listed explicitly.
- Include the thread parent channel ID as an allowlist candidate
- Enforce allow_channels on slash commands (previously bypassed)
- Show the parent channel ID in runtime context while still replying to the thread
- Fix subagent cancel key via effective_key propagation
- Detect bot mentions via raw_mentions and reply-to-bot references
- Cache seen thread channels for outbound delivery
- Ignore system messages that become empty prompts
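A minimal sketch of the allowlist check with the thread parent as an extra candidate. The function name is hypothetical, and treating an empty allowlist as "allow everything" is an assumption, not confirmed behavior.

```python
from typing import Iterable, Optional


def is_channel_allowed(
    channel_id: str,
    parent_channel_id: Optional[str],
    allow_channels: Iterable[str],
) -> bool:
    """Allow a message if its channel or its thread parent is allowlisted."""
    allowed = set(allow_channels)
    if not allowed:
        return True  # assumption: an empty allowlist means no restriction
    candidates = {channel_id}
    if parent_channel_id is not None:
        # Threads have their own channel id; fall back to the parent channel.
        candidates.add(parent_channel_id)
    return bool(candidates & allowed)
```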
Non-priority slash commands (e.g. /new, /help, /dream-log) arriving
while a session has an active LLM turn were silently queued into the
pending injection buffer and later injected as raw user messages into
the LLM conversation. This caused the model to respond to "/new" as
plain text instead of executing the command.
Root cause: the run() loop only checked priority commands (/stop,
/restart, /status) before routing messages to the pending queue. All
other command tiers (exact, prefix) bypassed command dispatch entirely.
Changes:
- Add CommandRouter.is_dispatchable_command() to match exact/prefix
tiers, mirroring the existing is_priority() pattern.
- In run(), intercept dispatchable commands before pending queue
insertion and dispatch them directly via _dispatch_command_inline().
- Extract _cancel_active_tasks() from cmd_stop for reuse; cmd_new now
cancels active tasks before clearing the session to prevent shared
mutable state corruption from concurrent asyncio coroutines.
- Update /new semantics: stops active task first, then clears session.
- Update documentation in help text, docs, and Discord command list.
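The routing decision can be sketched as below. The tier membership of each command (which ones are exact and which are prefix) and the `route_inbound` helper with its return labels are assumptions for illustration; `is_priority` and `is_dispatchable_command` follow the names used above.

```python
class CommandRouterSketch:
    """Hypothetical router with three tiers: priority, exact, prefix."""

    PRIORITY = frozenset({"/stop", "/restart", "/status"})
    EXACT = frozenset({"/new", "/help"})   # assumed tier membership
    PREFIX = ("/dream-log",)               # assumed tier membership

    def is_priority(self, text: str) -> bool:
        return text.strip().lower() in self.PRIORITY

    def is_dispatchable_command(self, text: str) -> bool:
        cmd = text.strip().lower()
        if cmd in self.EXACT:
            return True
        # Prefix commands match bare or followed by arguments.
        return any(cmd == p or cmd.startswith(p + " ") for p in self.PREFIX)


def route_inbound(router: CommandRouterSketch, text: str, turn_active: bool) -> str:
    """Decide where an inbound message goes while a turn may be running."""
    if router.is_priority(text):
        return "priority"
    if turn_active:
        # Intercept commands before they reach the pending injection buffer.
        if router.is_dispatchable_command(text):
            return "dispatch_inline"
        return "pending_queue"
    return "normal_turn"
```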
The PyPI package `nanobot` is a different project ("Minimalist robot
navigation framework"), not this one. This project publishes as
`nanobot-ai` (see pyproject.toml). Following the guide as-written would
pull down the wrong package — flagged by vansatchen in #3188.
Same toml block as the build-backend fix, one-word change.
Made-with: Cursor
The previous build backend, setuptools.backends._legacy:_Backend, has been
removed in Python 3.14 and newer setuptools, causing a
'Cannot import setuptools.backends.legacy' error.
Using hatchling (the same backend as the main project) ensures compatibility
across Python versions.
Closes #3188
Add a built-in tool that lets the agent inspect and modify its own
runtime state (model, iterations, context window, etc.).
Key features:
- inspect: view current config, usage stats, and subagent status
- modify: adjust parameters at runtime (protected by type/range validation)
- Subagent observability: inspect running subagent tasks (phase,
iteration, tool events, errors) — subagents are no longer a black box
- Watchdog corrects out-of-bounds values on each iteration
- Enabled by default in read-only mode (self_modify: false)
- All changes are in-memory only; restart restores defaults
- Comprehensive test suite (90 tests)
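As an illustration of range validation on modify plus a watchdog that clamps drifted values, consider the sketch below. The parameter names and bounds in `BOUNDS` are invented for the example and do not reflect the tool's real limits.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical bounds table; the real keys and ranges may differ.
BOUNDS: Dict[str, Tuple[int, int]] = {
    "max_iterations": (1, 100),
    "context_window_tokens": (4096, 1_000_000),
}


def validate_modify(key: str, value: Any) -> Optional[str]:
    """Return an error string for a rejected change, or None if it is allowed."""
    if key not in BOUNDS:
        return f"Error: '{key}' is not modifiable"
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return f"Error: '{key}' must be an integer"
    low, high = BOUNDS[key]
    if not low <= value <= high:
        return f"Error: '{key}' must be between {low} and {high}"
    return None


def watchdog_correct(state: Dict[str, Any]) -> Dict[str, Any]:
    """Clamp any out-of-bounds value back into range; leave other keys alone."""
    corrected = dict(state)
    for key, (low, high) in BOUNDS.items():
        if key in corrected:
            corrected[key] = min(max(corrected[key], low), high)
    return corrected
```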
Includes a self-awareness skill (always-on) with progressive disclosure:
SKILL.md for core rules, references/examples.md for detailed scenarios.