Replace the asyncio.Semaphore queueing approach with a simple count
check in SpawnTool.execute(). When the concurrency limit is reached,
the tool returns an error string so the agent can perceive the reason
and adjust its behavior instead of silently queueing.
- Remove max_concurrent_subagents parameter threading through
AgentLoop, commands.py, and nanobot.py
- SubagentManager reads the limit directly from AgentDefaults
- SpawnTool checks get_running_count() before calling spawn()
- Simplify tests to verify rejection behavior
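A minimal sketch of the count check described above, not the actual implementation. `SubagentManager` here is a stand-in that only tracks running tasks, and `MAX_CONCURRENT`, the error wording, and `demo()` are assumptions for illustration.

```python
import asyncio

MAX_CONCURRENT = 2  # hypothetical limit; the commit reads it from AgentDefaults


class SubagentManager:
    """Hypothetical stand-in that only tracks running tasks."""

    def __init__(self) -> None:
        self._running = set()

    def get_running_count(self) -> int:
        return len(self._running)

    async def spawn(self, task: str) -> str:
        # Simulate a background subagent with a short sleep.
        job = asyncio.create_task(asyncio.sleep(0.05))
        self._running.add(job)
        job.add_done_callback(self._running.discard)
        return f"started: {task}"


class SpawnTool:
    def __init__(self, manager: SubagentManager, limit: int = MAX_CONCURRENT) -> None:
        self._manager = manager
        self._limit = limit

    async def execute(self, task: str) -> str:
        # Reject instead of queueing so the agent sees why nothing started.
        if self._manager.get_running_count() >= self._limit:
            return f"Error: concurrency limit reached ({self._limit} subagents running)."
        return await self._manager.spawn(task)


async def demo() -> list:
    tool = SpawnTool(SubagentManager())
    return [await tool.execute(f"job-{i}") for i in range(3)]
```

With a limit of 2, the third call in `demo()` returns the error string rather than waiting for a slot.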
Logout previously claimed to support github-copilot in --help text but had
no registered handler, so `provider logout github-copilot` failed with
"Logout not implemented". Add the handler, sharing token deletion with the
codex flow via `_delete_oauth_files`. Tighten handler-table types, fix the
codex test fixture filename, and cover github-copilot plus the unknown
provider path.
- Implement `nanobot provider logout <provider>` to clear OAuth credentials.
- Add `_LOGOUT_HANDLERS` registration mechanism mirroring login.
- Implement logout for `openai-codex` by deleting local `oauth-cli-kit` token and lock files.
- Fall back gracefully when attempting to log out from providers lacking local credentials or implementations.
- Fixes #2665
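The handler-table pattern could look roughly like the sketch below. Only `_LOGOUT_HANDLERS` and `_delete_oauth_files` are names from the commit; the decorator, the handler signatures, the messages, and the file names are placeholders and do not reflect the real `oauth-cli-kit` layout.

```python
from pathlib import Path
from typing import Callable, Dict, List

_LOGOUT_HANDLERS: Dict[str, Callable[[Path], List[str]]] = {}


def register_logout(name: str):
    """Hypothetical decorator that adds a handler to the table."""
    def decorator(fn):
        _LOGOUT_HANDLERS[name] = fn
        return fn
    return decorator


def _delete_oauth_files(directory: Path, filenames: List[str]) -> List[str]:
    """Delete the named files if present and return the names removed."""
    removed = []
    for filename in filenames:
        path = directory / filename
        if path.exists():
            path.unlink()
            removed.append(filename)
    return removed


@register_logout("openai-codex")
def _logout_codex(directory: Path) -> List[str]:
    # Placeholder file names, not the real token and lock file names.
    return _delete_oauth_files(directory, ["codex.json", "codex.lock"])


@register_logout("github-copilot")
def _logout_copilot(directory: Path) -> List[str]:
    return _delete_oauth_files(directory, ["copilot.json"])


def logout(provider: str, directory: Path) -> str:
    handler = _LOGOUT_HANDLERS.get(provider)
    if handler is None:
        return f"Logout not implemented for {provider}"
    removed = handler(directory)
    if removed:
        return f"Removed {len(removed)} file(s)"
    return "No local credentials found"
```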
The max_messages config field in AgentDefaults was accepted by the
schema but never threaded through to the actual get_history() calls
in the agent loop. Both call sites in _process_message hardcoded the
default, so sessions with slow or local models accumulated unbounded
history that inflated prompt tokens and caused LLM timeouts.
Changes:
- Add max_messages field to AgentDefaults (default 0 = use built-in
constant, any positive value caps history replay)
- Store the value on AgentLoop and pass it to get_history() when
non-zero
- Wire the config through all three AgentLoop construction sites in
commands.py (gateway, API server, CLI chat)
- 14 focused tests covering schema validation, init storage, history
slicing, boundary alignment, integration wiring, and the
zero/default path
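As an illustration of the zero/default path and boundary alignment, here is a small sketch. `DEFAULT_HISTORY_LIMIT`, `history_for_prompt`, and the message shape are assumptions; the real `get_history()` may align boundaries differently.

```python
DEFAULT_HISTORY_LIMIT = 500  # hypothetical built-in constant


def get_history(messages: list, max_messages: int = DEFAULT_HISTORY_LIMIT) -> list:
    """Return the most recent messages, starting on a user turn."""
    window = messages[-max_messages:]
    # Boundary alignment: drop leading non-user turns so replay
    # never starts in the middle of an exchange.
    for index, message in enumerate(window):
        if message["role"] == "user":
            return window[index:]
    return []


def history_for_prompt(messages: list, configured_max: int) -> list:
    # 0 means "not configured": fall back to the built-in default.
    if configured_max > 0:
        return get_history(messages, max_messages=configured_max)
    return get_history(messages)
```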
Three failure modes addressed:
1. Model reflects HEARTBEAT.md instructions back as output instead of
executing them ("HEARTBEAT.md has active tasks listed...")
2. Model narrates decision logic ("Best judgment call: stay quiet")
3. Model produces empty output for silence, runner treats it as failure,
finalization retry generates "couldn't produce a final answer" which
gets delivered to the user
Changes:
- Add _is_deliverable() pre-filter in HeartbeatService._tick() that catches
finalization fallback messages and leaked reasoning patterns before they
reach the evaluator
- Wrap Phase 2 task input with a delivery-awareness preamble telling the
model its output goes directly to the user's messaging app
- Add meta-reasoning suppression criterion to evaluator template
No changes to agent/loop.py, runner.py, providers, or config schema.
The old prompt framed cron firing as a "task triggered" status report,
which led the agent to reply with things like "Done ✅ 已提醒
U0AV8BJPV8D 喝水" — exposing the user id and reading like a system log
instead of a friendly reminder. Reword it to instruct the agent to
speak directly to the user and forbid status-style language.
Made-with: Cursor
Capture Slack thread metadata for cron and message-tool deliveries so replies stay in the originating thread, and hydrate first thread mentions with recent Slack context.
Made-with: Cursor
Only mark message-tool deliveries for channel-session recording while cron jobs are running, avoiding duplicate session writes during normal user turns.
Made-with: Cursor
Route heartbeat, cron, and message-tool deliveries through one gateway helper so user-visible proactive messages are available when the channel replies.
Made-with: Cursor
When heartbeat delivers output to a channel (e.g. Telegram), the message
is a raw OutboundMessage that bypasses the channel's session. If the user
replies, their reply enters a different session with no context about the
heartbeat message, so the agent cannot follow through.
This change injects the delivered heartbeat message as an assistant turn
into the target channel's session before publishing the outbound. When
the user replies, the channel session has conversational context.
Handles unified_session mode by resolving to UNIFIED_SESSION_KEY when
enabled, matching the agent loop's own session routing.
No changes to agent/loop.py, session/manager.py, channels, providers,
or config schema — uses existing add_message() and save() APIs.
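The injection step can be sketched as below. `add_message()`, `save()`, and `UNIFIED_SESSION_KEY` are named in the commit; the `Session` and `SessionManager` classes, `get_or_create()`, `record_delivery()`, the key value, and the `channel:chat_id` key format are assumptions made for a self-contained example.

```python
UNIFIED_SESSION_KEY = "unified"  # placeholder value for illustration


class Session:
    def __init__(self, key: str) -> None:
        self.key = key
        self.messages = []

    def add_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})


class SessionManager:
    """Hypothetical in-memory stand-in for the real session manager."""

    def __init__(self) -> None:
        self._sessions = {}
        self.saved = []

    def get_or_create(self, key: str) -> Session:
        if key not in self._sessions:
            self._sessions[key] = Session(key)
        return self._sessions[key]

    def save(self, session: Session) -> None:
        self.saved.append(session.key)


def record_delivery(sessions, channel, chat_id, content, *, unified_session=False):
    # Resolve the same session the user's reply will land in.
    key = UNIFIED_SESSION_KEY if unified_session else f"{channel}:{chat_id}"
    session = sessions.get_or_create(key)
    # Store the heartbeat output as an assistant turn before publishing.
    session.add_message("assistant", content)
    sessions.save(session)
    return session
```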
Extend the existing on_progress callback to carry structured tool-event
payloads alongside the plain-text hint, so channels can render rich
tool execution state (start/finish/error, arguments, results, file
attachments) rather than only the pre-formatted hint string.
Changes
-------
- AgentLoop._tool_event_start_payload() — builds a version-1 start
payload from a ToolCallRequest
- AgentLoop._tool_event_result_extras() — extracts files/embeds from a
tool result dict
- AgentLoop._tool_event_finish_payloads() — maps tool_calls +
tool_results + tool_events from AgentHookContext into finish payloads
- _LoopHook.before_execute_tools() — passes tool_events=[...] to
on_progress together with the existing tool_hint flag
- _LoopHook.after_iteration() — emits a second on_progress call with
the finish payloads once tool results are available
- _bus_progress() — forwards tool_events as _tool_events in OutboundMessage
metadata so channel implementations can read them
- on_progress type widened to Callable[..., Awaitable[None]] on all
public entry points; _cli_progress updated to accept and ignore
tool_events
The contract is additive: callers that only accept (content, *, tool_hint)
continue to work unchanged. Callers that also accept tool_events receive
the structured data.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On filesystems with write-back caching (rclone VFS, NFS, FUSE mounts)
the OS page cache may buffer recent session writes. If the process is
killed before the cache flushes, the most recent conversation turns are
silently lost — causing the agent to "forget" recent context and
respond to stale history on the next startup.
Changes:
- session/manager.py: add fsync=True option to save() that flushes the
file and its parent directory to durable storage. Add flush_all() that
re-saves every cached session with fsync. Default save() behavior is
unchanged (no fsync) to avoid performance regression in normal
operation.
- cli/commands.py: call agent.sessions.flush_all() in the gateway
shutdown finally block, after stopping heartbeat/cron/channels.
- tests/session/test_session_fsync.py: 8 tests covering fsync flag
behavior, flush_all with empty/multiple/errored sessions, and
data survival across simulated process restart.
- tests/cli/test_commands.py: add sessions attribute to _FakeAgentLoop
so the gateway health endpoint test passes with the new shutdown
flush.
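The durable-write path can be sketched as follows, assuming a JSON-lines session file (the on-disk format is not stated in this message). `save_session` is a hypothetical free function rather than the real `save()` method.

```python
import json
import os
from pathlib import Path


def save_session(path: Path, messages: list, *, fsync: bool = False) -> None:
    """Write messages as JSON lines; optionally force them to durable storage."""
    with open(path, "w", encoding="utf-8") as handle:
        for message in messages:
            handle.write(json.dumps(message, ensure_ascii=False) + "\n")
        if fsync:
            # Push Python's buffer to the OS, then the OS cache to disk.
            handle.flush()
            os.fsync(handle.fileno())
    if fsync and os.name == "posix":
        # Sync the parent directory so the directory entry is durable too.
        dir_fd = os.open(path.parent, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Keeping `fsync=False` as the default mirrors the commit's choice to pay the sync cost only at shutdown.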
Cron jobs now pass on_progress=_silent to process_direct, matching
the heartbeat pattern. Previously, tool hints and streaming deltas
were published to the user channel via bus during execution, but the
final response could be rejected by evaluate_response — leaving users
with confusing partial output and no conclusion.
Closes #3319
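A rough sketch of the silent-progress pattern, with `process_direct`, `run_cron_job`, and the `evaluate` callback simplified to hypothetical stand-ins: progress updates are discarded, and only a response that passes evaluation is published.

```python
import asyncio


async def _silent(*args, **kwargs) -> None:
    """Discard tool hints and streaming deltas."""


async def process_direct(prompt: str, on_progress) -> str:
    # Stand-in for the agent run: emits one progress update, then a result.
    await on_progress("running tool...", tool_hint=True)
    return f"result for {prompt}"


async def run_cron_job(prompt: str, published: list, evaluate) -> None:
    response = await process_direct(prompt, on_progress=_silent)
    # Nothing has reached the user yet, so a rejected response leaves no trace.
    if evaluate(response):
        published.append(response)
```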
Follow-up to #3212, fully backward compatible:
- Extract the 14-day staleness threshold as `_STALE_THRESHOLD_DAYS` module
constant and pass it into the Phase 1 prompt template as
`{{ stale_threshold_days }}`. The number lived in three places before
(code threshold, prompt instruction, docstring); now there is one.
- Add `DreamConfig.annotate_line_ages` (default True = current behavior)
and propagate it through `Dream.__init__` and the gateway wiring in
cli/commands.py. Gives users a knob to disable the feature without a
code patch if an LLM reacts poorly to the `← Nd` suffix.
- Harden `_annotate_with_ages` against dirty working trees: when HEAD
blob line count disagrees with the working-tree content length, skip
annotation entirely instead of assigning ages to the wrong lines. The
previous `i >= len(ages)` guard only handled one direction of the
mismatch.
- Inline-comment the `max_iterations` 10→15 bump with a pointer to
exp002 so future blame has context.
- Add 4 regression tests: end-to-end `← 30d` reaches prompt, 14/15
threshold boundary, `annotate_line_ages=False` bypasses git entirely
(verified via `assert_not_called`), length-mismatch defense, and
template-var rendering.
Made-with: Cursor
Add a built-in tool that lets the agent inspect and modify its own
runtime state (model, iterations, context window, etc.).
Key features:
- inspect: view current config, usage stats, and subagent status
- modify: adjust parameters at runtime (protected by type/range validation)
- Subagent observability: inspect running subagent tasks (phase,
iteration, tool events, errors) — subagents are no longer a black box
- Watchdog corrects out-of-bounds values on each iteration
- Enabled by default in read-only mode (self_modify: false)
- All changes are in-memory only; restart restores defaults
- Comprehensive test suite (90 tests)
Includes a self-awareness skill (always-on) with progressive disclosure:
SKILL.md for core rules, references/examples.md for detailed scenarios.
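To illustrate type/range validation and the watchdog, here is a minimal sketch. The parameter names, bounds, and error strings are invented for the example and are not taken from the tool itself.

```python
# Hypothetical bounds: name -> (type, minimum, maximum).
_LIMITS = {
    "max_iterations": (int, 1, 100),
    "context_window_tokens": (int, 1_000, 1_000_000),
}


def modify(state: dict, key: str, value, *, self_modify: bool = False) -> str:
    if not self_modify:
        return "Error: read-only mode (self_modify is false)."
    if key not in _LIMITS:
        return f"Error: {key} is not modifiable."
    expected, low, high = _LIMITS[key]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, expected):
        return f"Error: {key} expects {expected.__name__}."
    if not low <= value <= high:
        return f"Error: {key} must be between {low} and {high}."
    state[key] = value  # in-memory only; nothing is persisted
    return f"{key} set to {value}."


def watchdog(state: dict) -> None:
    """Clamp any out-of-bounds value back into range."""
    for key, (_, low, high) in _LIMITS.items():
        if key in state:
            state[key] = min(max(state[key], low), high)
```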
Bind the gateway health listener to localhost by default and reduce the probe response to a minimal status payload so accidental public exposure leaks less information.
Made-with: Cursor
Keep the gateway health endpoint patch current with the latest gateway runtime changes, and lock the new HTTP routes in with CLI regression coverage and README guidance.
Made-with: Cursor
When a user is idle for longer than a configured TTL, nanobot **proactively** compresses the session context into a summary. This reduces token cost and first-token latency when the user returns — instead of re-processing a long stale context with an expired KV cache, the model receives a compact summary and fresh input.
Introduce a disabled_skills option in the config schema that allows
users to specify a list of skill names to be excluded. The setting is
threaded from config through Nanobot -> AgentLoop -> ContextBuilder ->
SkillsLoader. Disabled skills are filtered out from list_skills,
get_always_skills, and build_skills_summary. Four new test cases cover
the filtering behavior.
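The filtering could be sketched as follows. The method names come from the commit, while the constructor signature and the `name -> always-on flag` mapping are assumptions to keep the example self-contained.

```python
from typing import Iterable


class SkillsLoader:
    def __init__(self, skills: dict, disabled_skills: Iterable = ()) -> None:
        # skills maps a skill name to its always-on flag (hypothetical shape).
        self._skills = skills
        self._disabled = set(disabled_skills)

    def list_skills(self) -> list:
        return [name for name in self._skills if name not in self._disabled]

    def get_always_skills(self) -> list:
        # Built on list_skills() so disabled skills are excluded here too.
        return [name for name in self.list_skills() if self._skills[name]]

    def build_skills_summary(self) -> str:
        return "\n".join(f"- {name}" for name in self.list_skills())
```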
The Enabled column in the channels status and plugins list commands had a default green style that overrode the dim markup for disabled items, so "no" values appeared green instead of dimmed. Remove the default style so cell-level markup controls the display.
On Windows, certain Unicode input (emoji, mixed-script text, surrogate
pairs) causes prompt_toolkit's FileHistory to crash with
UnicodeEncodeError when writing the history file.
Fix: wrap FileHistory with a _SafeFileHistory subclass that sanitizes
surrogate characters before writing, replacing invalid sequences instead
of crashing.
Fixes #2846
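The sanitizing step alone can be sketched without prompt_toolkit. `sanitize_surrogates` is a hypothetical helper; the actual fix applies similar logic inside the `_SafeFileHistory` subclass, and its replacement character may differ.

```python
def sanitize_surrogates(text: str) -> str:
    """Replace lone surrogates that cannot be encoded as UTF-8."""
    # errors="replace" swaps each unencodable code point for "?",
    # while valid characters (including emoji) round-trip unchanged.
    return text.encode("utf-8", errors="replace").decode("utf-8")
```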
Allow config.json to reference environment variables via ${VAR_NAME}
syntax. Variables are resolved at runtime by resolve_config_env_vars(),
keeping the raw templates in the Pydantic model so save_config()
preserves them. This lets secrets live in a separate env file
(e.g. loaded by systemd EnvironmentFile=) instead of plain text
in config.json.
Fixes #2591
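A minimal sketch of `${VAR_NAME}` resolution over a nested config, assuming plain dicts and lists rather than the Pydantic model. `resolve_env_vars` is a hypothetical name, and raising on a missing variable is an assumption about the desired behavior.

```python
import os
import re

_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_vars(value, env=None):
    """Return a copy of value with ${VAR} references substituted."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        def _substitute(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _ENV_REF.sub(_substitute, value)
    if isinstance(value, dict):
        return {key: resolve_env_vars(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env_vars(item, env) for item in value]
    return value
```

Because the function builds a new structure, the raw templates in the input stay untouched and can be saved back unchanged.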
The "nanobot is thinking..." spinner was printing ANSI escape codes
literally in some terminals, causing garbled output like:
?[2K?[32m⠧?[0m ?[2mnanobot is thinking...?[0m
Root causes:
1. Console created without force_terminal=True, so Rich couldn't
reliably detect terminal capabilities
2. Spinner continued running during user input prompt, conflicting
with prompt_toolkit
Changes:
- Set force_terminal=True in _make_console() for proper ANSI handling
- Add stop_for_input() method to StreamRenderer
- Call stop_for_input() before reading user input in interactive mode
- Add tests for the new functionality
Replace single-stage MemoryConsolidator with a two-stage architecture:
- Consolidator: lightweight token-budget triggered summarization,
appends to HISTORY.md with cursor-based tracking
- Dream: cron-scheduled two-phase processor that analyzes HISTORY.md
and updates SOUL.md, USER.md, MEMORY.md via AgentRunner with
edit_file tools for surgical, fault-tolerant updates
New files: MemoryStore (pure file I/O), Dream class, DreamConfig,
/dream and /dream-log commands. 89 tests covering all components.