The hosted Xiaomi MiMo API accepts {"thinking": {"type": "enabled"|"disabled"}}
to toggle reasoning, which is exactly the shape produced by the existing
thinking_type style. The xiaomi_mimo ProviderSpec just needed to opt in.
Before this fix, setting reasoning_effort="none" had no effect on MiMo
because no thinking_style was configured, so the disable signal never
reached the server. Default-on models (mimo-v2.5-pro and friends) kept
reasoning regardless of user configuration.
Source: https://platform.xiaomimimo.com/docs/en-US/api/chat/openai-api
Co-authored with Claude Opus 4.7. Strategy and review via Claude Desktop,
implementation via Claude Code.
Both StreamRenderer instantiations in the agent command (single-message
mode and interactive mode) now read bot_name and bot_icon from
config.agents.defaults and forward them to the renderer.
This is the wiring step that makes the schema fields actually take
effect at runtime. With safe defaults of "nanobot" and "🐈", existing
users see no change.
Threads bot_name/bot_icon through ThinkingSpinner and StreamRenderer
with safe defaults that preserve current behavior.
- ThinkingSpinner uses bot_name in its status text
- StreamRenderer header is "<icon> <name>" when icon is set,
or just "<name>" when icon is empty
- Removes the now-unused __logo__ import (the cat emoji is the
default value of bot_icon, not a hardcoded constant)
Two new fields with safe defaults that preserve current branding:
- bot_name: str = "nanobot"
- bot_icon: str = "🐈"
Empty string for bot_icon is allowed and lets users opt out of the
leading icon. camelCase keys (botName, botIcon) bind via the existing
to_camel alias generator.
Six tests covering:
- AgentDefaults preserves 'nanobot' and the cat icon by default
- camelCase config keys (botName/botIcon) bind to the new fields
- Empty bot_icon is accepted (opt-out of the leading icon)
- ThinkingSpinner uses bot_name in its status text
- StreamRenderer header combines icon and name when icon is set
- StreamRenderer header is just the name when icon is empty
- Append [Archived Context Summary] to system prompt instead of injecting
it into the user message runtime context, improving KV cache reuse across
turns and avoiding consecutive same-role messages.
- _last_summary persists in metadata (no pop) for restart survival;
summary is re-injected every turn via the stable system prompt.
- Remove dynamic "Inactive for X minutes" from _format_summary — use
static last_active timestamp instead to preserve KV cache stability.
- Pass session_summary through build_messages() so both normal and
ask_user paths receive the archived summary in the system prompt.
- estimate_session_prompt_tokens now reads _last_summary from metadata
to include the summary in token budget estimation.
- Remove obsolete session_summary parameter from
maybe_consolidate_by_tokens and estimate_session_prompt_tokens
call sites in loop.py (summary flows through build_messages instead).
- Ensure /new (session.clear()) clears _last_summary from metadata.
The for loop at line 168 never executes because start is assigned
i + 1 immediately before slicing messages[start : i + 1], which
is always an empty list. Remove the dead code.
Fixes#3716
- TurnContext now carries a turn_id (session_key:time_ns)
- All state transition debug logs include [turn_id] prefix
- RuntimeError messages also include turn_id for observability
- State handlers now return event strings ('ok', 'dispatch', 'shortcut')
- Driver loop uses _TRANSITIONS lookup table: (state, event) -> next_state
- State graph is centralized and visible at a glance
- Added StateTraceEntry to record per-state timing and events
- Driver loop logs state duration + event at debug level
- Exception paths are traced with error field for observability
- Fix _assemble_outbound on_stream type annotation (Callable[[str], Awaitable[None]] | None)
- Use last_msg consistently in _state_save instead of re-indexing
- Remove dead fallback in _state_respond (guaranteed non-None by _state_save)
- Change pending_summary type from Any to str | None
- Make session optional in TurnContext to avoid redundant fetch
- Add defensive dispatch with RuntimeError for missing handlers
Re-applies the safe portion of c01f8599 after the revert in 2e8e674e.
Drops the uv cache which broke last time because uv.lock is gitignored
in this repo, and keeps lint as a step inside the test job (matching
the pre-c01f8599 layout).
What's added (all metadata-only, no external dependencies):
- concurrency: cancel superseded runs on the same ref
- permissions: tighten GITHUB_TOKEN to contents: read
- timeout-minutes: 20 to bound runaway jobs
- fail-fast: false so all matrix combinations surface failures
- matrix conditional: PRs run Linux x {3.11, 3.14} for fast feedback;
push to main/nightly still runs the full 2-OS x 4-Python matrix
What's intentionally NOT added (each removed for a reason):
- uv cache: depends on uv.lock which is gitignored
- separate lint job: kept inline as a step, matches original
- workflow_dispatch / paths-ignore: scope creep, not needed now
All jobs continue to run on standard GitHub-hosted runners
(ubuntu-latest, windows-latest), keeping CI within the free tier.
Co-authored-by: Cursor <cursoragent@cursor.com>
The optimized workflow in c01f8599 set astral-sh/setup-uv@v4 with
cache-dependency-glob: "uv.lock", but uv.lock is gitignored in this
repo, so the hosted runner's checkout never contains it and the
Install uv step fails with:
Error: No file matched to [uv.lock], make sure you have
checked out the target repository
Reverting the workflow to the pre-c01f8599 version to unbreak CI.
The "Modifying CI Workflows" section added to CONTRIBUTING.md in the
same commit is left in place; it documents general guidance and is
independent of this specific implementation choice.
Co-authored-by: Cursor <cursoragent@cursor.com>
Workflow changes (.github/workflows/ci.yml):
- Add concurrency to cancel superseded runs on the same ref
- Enable uv dependency caching keyed on uv.lock
- Split lint into a dedicated job; gate test on lint via needs
- Split matrix: PRs run Linux x {3.11, 3.14} for fast feedback;
push to main/nightly still runs the full 2-OS x 4-Python matrix
- Add fail-fast: false so all platforms surface failures together
- Add timeouts (lint: 5m, test: 20m) to bound runaway jobs
- Tighten GITHUB_TOKEN to contents: read
Docs (CONTRIBUTING.md):
- Add a short "Modifying CI Workflows" section so contributors know
to stay within standard runners / no metered storage / no paid
actions before touching .github/workflows/
All jobs continue to run on standard GitHub-hosted runners
(ubuntu-latest, windows-latest), keeping CI within the free tier.
Co-authored-by: Cursor <cursoragent@cursor.com>
- Pop model and context_window_tokens from extra kwargs before
forwarding to __init__, allowing callers like _run_gateway to
pass snapshot-derived values instead of config defaults
- _run_gateway now explicitly passes model/context_window_tokens
from provider_snapshot to preserve pre-refactor behavior
Extract duplicated bus/provider/loop initialization from CLI commands
(serve, _run_gateway, agent) and Nanobot facade into a single
AgentLoop.from_config() classmethod.
- Remove _make_provider() from cli/commands.py and nanobot.py
- Remove inline provider creation in all three CLI entry points
- AgentLoop.from_config() creates MessageBus, calls make_provider(),
and assembles AgentLoop with all standard config-derived parameters
- Supports **extra overrides for callers that need custom args
(e.g. cron_service, session_manager, provider_snapshot_loader)
- Update tests to mock make_provider at nanobot.providers.factory
and add from_config classmethod to _FakeAgentLoop fixtures
This is PR 1/4 of the model-preset feature decomposition.
The previous implementation popped _last_summary from session.metadata
after injecting it into the prompt, then saved the session. This caused
the summary to be permanently lost after a process restart, making the
AI forget archived context and appear to ignore memory or reference
non-existent previous messages.
Replace the destructive pop with a _last_summary_used sentinel:
- _last_summary stays in metadata for restart survival
- _last_summary_used prevents duplicate injection within the same turn
- Clear the sentinel whenever a new summary is generated
Updates tests to match the new persistence behavior.
Let WebUI users configure the single web search provider credential from BYOK while keeping saved secrets masked and hot-reloaded for new searches.
Co-authored-by: Cursor <cursoragent@cursor.com>
Clarify nanobot's preference for small core changes, reviewable PR boundaries, and careful handling of prompt/context surfaces so AI contributors preserve the project's maintenance philosophy.
Co-authored-by: Cursor <cursoragent@cursor.com>
Add CLAUDE.md at the repository root to orient future Claude Code
instances, and split detailed constraints into .agent/:
- .agent/design.md — architectural constraints (core small, duplication
over abstraction, minimal changes, explicit over magical)
- .agent/security.md — workspace/SSRF/shell sandbox boundaries
- .agent/gotchas.md — config ${VAR}, Windows compat, templates,
heartbeat virtual tool call, atomic writes, ruff format warning,
skills extension point
Also updates .gitignore to not ignore .agent/.
The previous version changed return fail/pass to raise, which broke
graceful degradation — tests expect upload/content failures to be
caught and handled, not propagated.
Now logs errors with exc_info=True while preserving existing control
flow (return fail for upload/content send, stop typing for stream).
The Matrix channel had 4 bare except blocks that silently swallowed
transport errors with no logging — stream send/edit failures, media
upload failures, server config fetch failures, and room content send
failures. The Weixin channel had 1 silent state-load failure.
This mirrors commit 98c2f7cc ('fix(weixin): raise exceptions instead
of silently dropping messages') for the Matrix channel and adds a
warning for the remaining silent catch in Weixin's _load_state.
All failures now log at warning level with exc_info=True so operators
can diagnose intermittent Matrix/Weixin transport issues.
On Windows, prompt_toolkit produces lone surrogate code points (e.g.
🐈) for emoji input. These propagate through the message bus
and crash at json.dumps() / file write time because surrogates cannot
be encoded as UTF-8.
Extract _sanitize_surrogates() that round-trips through UTF-16 to
reconstruct paired surrogates into real characters (e.g. 🐈
→ 🐈), replacing unpaired surrogates with U+FFFD. Apply it at the CLI
input path and reuse in SafeFileHistory.
Fixes two related input-handling bugs in the onboard wizard:
1. _input_text treated "" as None, preventing users from clearing
optional string fields or entering empty strings intentionally.
2. _input_model_with_autocomplete used `if value else None`, which
discarded falsy values such as empty strings or 0.
To support clearing optional string fields, add _is_str_or_none() and
normalize empty strings to None inside _configure_pydantic_model only
when the field annotation is `str | None`. Required str fields keep
"" as a valid value.
Also included:
- Remember last selected item in provider/channel/model menus for
better UX when configuring multiple items.
- Rename _SIMPLE_TYPES and _MENU_DISPATCH to lowercase to follow
Python naming conventions (they are local variables, not constants).
- Remove unused imports in test file.
Extracted from PR #3358.
The HTTP compression buffer in aiohttp held all SSE chunks until
the stream ended, making streaming appear batched instead of
incremental. SSE payloads are small and frequent, so compression
provides negligible benefit while breaking real-time delivery.