- Add `ModelPresetConfig` schema for named model presets
- Add `model_presets` dict to `Config` and `model_preset` field to `AgentDefaults`
- Add `resolve_preset()` to return effective model params from preset or defaults
- Add `@model_validator` to reject unknown preset names
- Update `_match_provider()` to use resolved preset model/provider
- Update `make_provider()` and `provider_signature()` to use `resolve_preset()`
- Add `model_preset` property to `AgentLoop` for atomic runtime switching
- Update `AgentLoop.from_config()` to inject a runtime `default` preset
- Wire self-tool to inspect/clear preset state
- Update CLI display strings to show active preset
VolcEngine's OpenAI-compatible gateway rejects requests when both
max_tokens and max_completion_tokens are present (the latter added
by openai-python SDK v2.x serialization). Set the flag so nanobot
sends max_completion_tokens instead of max_tokens for volcengine,
volcengine_coding_plan, and by extension byteplus variants.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit implements a progressive refactoring of the tool system to support
plugin discovery, scoped loading, and protocol-driven runtime context injection.
Key changes:
- Add Tool ABC metadata (tool_name, _scopes) and ToolContext dataclass for
dependency injection.
- Introduce ToolLoader with pkgutil-based builtin discovery and
entry_points-based third-party plugin loading.
- Add scope filtering (core/subagent/memory) so different contexts load
appropriate tool sets.
- Introduce ContextAware protocol and RequestContext dataclass to replace
hardcoded per-tool context injection in AgentLoop.
- Add RuntimeState / MutableRuntimeState protocols to decouple MyTool from
AgentLoop.
- Migrate all built-in tools to declare scopes and implement create()/enabled()
hooks.
- Migrate MessageTool, SpawnTool, CronTool, and MyTool to ContextAware.
- Refactor AgentLoop to use ToolLoader and protocol-driven context injection.
- Refactor SubagentManager to use ToolLoader(scope="subagent") with per-run
FileStates isolation.
- Register all built-in tools via pyproject.toml entry_points.
- Add comprehensive tests for loader scopes, entry_points, ContextAware,
subagent tools, and runtime state sync.
The hosted Xiaomi MiMo API accepts {"thinking": {"type": "enabled"|"disabled"}}
to toggle reasoning, which is exactly the shape produced by the existing
thinking_type style. The xiaomi_mimo ProviderSpec just needed to opt in.
Before this fix, setting reasoning_effort="none" had no effect on MiMo
because no thinking_style was configured, so the disable signal never
reached the server. Default-on models (mimo-v2.5-pro and friends) kept
reasoning regardless of user configuration.
Source: https://platform.xiaomimimo.com/docs/en-US/api/chat/openai-api
Co-authored with Claude Opus 4.7. Strategy and review via Claude Desktop,
implementation via Claude Code.
Both StreamRenderer instantiations in the agent command (single-message
mode and interactive mode) now read bot_name and bot_icon from
config.agents.defaults and forward them to the renderer.
This is the wiring step that makes the schema fields actually take
effect at runtime. With safe defaults of "nanobot" and "🐈", existing
users see no change.
Threads bot_name/bot_icon through ThinkingSpinner and StreamRenderer
with safe defaults that preserve current behavior.
- ThinkingSpinner uses bot_name in its status text
- StreamRenderer header is "<icon> <name>" when icon is set,
or just "<name>" when icon is empty
- Removes the now-unused __logo__ import (the cat emoji is the
default value of bot_icon, not a hardcoded constant)
Two new fields with safe defaults that preserve current branding:
- bot_name: str = "nanobot"
- bot_icon: str = "🐈"
Empty string for bot_icon is allowed and lets users opt out of the
leading icon. camelCase keys (botName, botIcon) bind via the existing
to_camel alias generator.
- Append [Archived Context Summary] to system prompt instead of injecting
it into the user message runtime context, improving KV cache reuse across
turns and avoiding consecutive same-role messages.
- _last_summary persists in metadata (no pop) for restart survival;
summary is re-injected every turn via the stable system prompt.
- Remove dynamic "Inactive for X minutes" from _format_summary — use
static last_active timestamp instead to preserve KV cache stability.
- Pass session_summary through build_messages() so both normal and
ask_user paths receive the archived summary in the system prompt.
- estimate_session_prompt_tokens now reads _last_summary from metadata
to include the summary in token budget estimation.
- Remove obsolete session_summary parameter from
maybe_consolidate_by_tokens and estimate_session_prompt_tokens
call sites in loop.py (summary flows through build_messages instead).
- Ensure /new (session.clear()) clears _last_summary from metadata.
The for loop at line 168 never executes because start is assigned
i + 1 immediately before slicing messages[start : i + 1], which
is always an empty list. Remove the dead code.
Fixes#3716
- TurnContext now carries a turn_id (session_key:time_ns)
- All state transition debug logs include [turn_id] prefix
- RuntimeError messages also include turn_id for observability
- State handlers now return event strings ('ok', 'dispatch', 'shortcut')
- Driver loop uses _TRANSITIONS lookup table: (state, event) -> next_state
- State graph is centralized and visible at a glance
- Added StateTraceEntry to record per-state timing and events
- Driver loop logs state duration + event at debug level
- Exception paths are traced with error field for observability
- Fix _assemble_outbound on_stream type annotation (Callable[[str], Awaitable[None]] | None)
- Use last_msg consistently in _state_save instead of re-indexing
- Remove dead fallback in _state_respond (guaranteed non-None by _state_save)
- Change pending_summary type from Any to str | None
- Make session optional in TurnContext to avoid redundant fetch
- Add defensive dispatch with RuntimeError for missing handlers
- Pop model and context_window_tokens from extra kwargs before
forwarding to __init__, allowing callers like _run_gateway to
pass snapshot-derived values instead of config defaults
- _run_gateway now explicitly passes model/context_window_tokens
from provider_snapshot to preserve pre-refactor behavior
Extract duplicated bus/provider/loop initialization from CLI commands
(serve, _run_gateway, agent) and Nanobot facade into a single
AgentLoop.from_config() classmethod.
- Remove _make_provider() from cli/commands.py and nanobot.py
- Remove inline provider creation in all three CLI entry points
- AgentLoop.from_config() creates MessageBus, calls make_provider(),
and assembles AgentLoop with all standard config-derived parameters
- Supports **extra overrides for callers that need custom args
(e.g. cron_service, session_manager, provider_snapshot_loader)
- Update tests to mock make_provider at nanobot.providers.factory
and add from_config classmethod to _FakeAgentLoop fixtures
This is PR 1/4 of the model-preset feature decomposition.
The previous implementation popped _last_summary from session.metadata
after injecting it into the prompt, then saved the session. This caused
the summary to be permanently lost after a process restart, making the
AI forget archived context and appear to ignore memory or reference
non-existent previous messages.
Replace the destructive pop with a _last_summary_used sentinel:
- _last_summary stays in metadata for restart survival
- _last_summary_used prevents duplicate injection within the same turn
- Clear the sentinel whenever a new summary is generated
Updates tests to match the new persistence behavior.
Let WebUI users configure the single web search provider credential from BYOK while keeping saved secrets masked and hot-reloaded for new searches.
Co-authored-by: Cursor <cursoragent@cursor.com>
The previous version changed return fail/pass to raise, which broke
graceful degradation — tests expect upload/content failures to be
caught and handled, not propagated.
Now logs errors with exc_info=True while preserving existing control
flow (return fail for upload/content send, stop typing for stream).
The Matrix channel had 4 bare except blocks that silently swallowed
transport errors with no logging — stream send/edit failures, media
upload failures, server config fetch failures, and room content send
failures. The Weixin channel had 1 silent state-load failure.
This mirrors commit 98c2f7cc ('fix(weixin): raise exceptions instead
of silently dropping messages') for the Matrix channel and adds a
warning for the remaining silent catch in Weixin's _load_state.
All failures now log at warning level with exc_info=True so operators
can diagnose intermittent Matrix/Weixin transport issues.
On Windows, prompt_toolkit produces lone surrogate code points (e.g.
🐈) for emoji input. These propagate through the message bus
and crash at json.dumps() / file write time because surrogates cannot
be encoded as UTF-8.
Extract _sanitize_surrogates() that round-trips through UTF-16 to
reconstruct paired surrogates into real characters (e.g. 🐈
→ 🐈), replacing unpaired surrogates with U+FFFD. Apply it at the CLI
input path and reuse in SafeFileHistory.