Move the fenced-code-block-aware splitting logic out of the shared
split_message helper (used by Signal, Slack, Discord, Weixin, etc.)
and into a Telegram-specific _split_telegram_markdown function.
The shared split_message remains a plain-text chunker. The Telegram
channel now uses _split_telegram_markdown for its raw Markdown paths
that feed _markdown_to_telegram_html, preventing broken HTML rendering
when splits fall inside fenced code blocks.
Also fixes a regression where content beginning with whitespace before
a fence could emit a whitespace-only chunk.
Addresses review feedback on #4257.
When split_message splits a long message, it now checks whether the
split point falls inside a fenced code block. If so, it either moves
the split to before the opening fence or closes/reopens the fence
across chunks, preventing broken HTML rendering.
Addresses #4250
Remove standalone nanobot/heartbeat/ service and replace it with an
auto-registered system cron job on gateway startup. Key behaviors preserved:
- HeartbeatConfig (enabled, interval_s, keep_recent_messages) remains in
GatewayConfig for backward compatibility.
- On startup, if enabled, a system cron job "heartbeat" is registered with
schedule derived from interval_s.
- HEARTBEAT.md is checked on each tick; empty/template-identical files skip
to avoid wasting LLM calls.
- Post-run evaluate_response and session history truncation
(keep_recent_messages) are retained.
- Delivery target selection, deliverable filtering, and preamble guidance
are preserved.
Files removed:
- nanobot/heartbeat/__init__.py
- nanobot/heartbeat/service.py
- tests/heartbeat/*
- tests/agent/test_heartbeat_service.py
Templates and docs updated to reflect cron-based usage.
Resolves conflicts after main landed the state-machine turn refactor
and the test_runner.py 9-file split:
- nanobot/agent/loop.py: take main's `_state_build`/`_persist_user_message_early`
flow; restore the `reasoning: bool` parameter on `_build_bus_progress_callback`
so the loop hook can mark progress as reasoning-channel without coupling to
the answer stream.
- nanobot/cli/stream.py: keep main's configurable `bot_name`/`bot_icon` header
while preserving the PR's `transient=True` Live + `self._console` routing
+ `_renderable()` final-render path that fixed TUI duplication.
- tests/agent/test_runner.py was deleted on main and split into 9 focused
files; relocated all 6 reasoning tests into a new `test_runner_reasoning.py`
matching the new layout, deduplicated the per-test `ReasoningHook` boilerplate
through a shared `_RecordingHook` helper.
Co-authored-by: Cursor <cursoragent@cursor.com>
Reasoning surfacing was split across three branches in runner.py plus
two separate streaming buffers (loop hook and runner progress stream),
with three independent display-side gates in the CLI. This collapsed
the policy into one source of truth and fixed two real bugs:
- Structured `reasoning_content` was suppressed whenever the answer was
streamed, because the runner gated emission on `streamed_content`.
Providers don't stream `reasoning_content`; it only arrives on the
final response, so the answer stream and the reasoning channel are
independent. Added `streamed_reasoning` to `AgentHookContext` to track
the right bit.
- `channels.showReasoning` was subordinated to `sendProgress`. They are
orthogonal — turning off progress streaming shouldn't silence
reasoning. Reworked the CLI gates accordingly.
Single-helper consolidation:
- `extract_reasoning(reasoning_content, thinking_blocks, content)`
returns `(reasoning_text, cleaned_content)` with a defined fallback
order: dedicated field → Anthropic thinking_blocks → inline
`<think>`/`<thought>` tags. Models that expose none of these
short-circuit to `(None, content)` — zero overhead.
- `IncrementalThinkExtractor` replaces the ad-hoc `emit_incremental_think`
function and its hand-rolled "emitted cursor" state in both the loop
hook and the runner progress stream.
Also documented the new `showReasoning` channel option in
docs/configuration.md and noted its independence from sendProgress.
Co-authored-by: Cursor <cursoragent@cursor.com>
Add extract_think() and emit_incremental_think() helpers to extract thinking content from inline <think> and <thought> tags in the content field. This handles models served via Ollama, self-hosted vLLM, or other compatible endpoints that embed reasoning as inline tags instead of using the dedicated reasoning_content API field.
Also adds Anthropic thinking_blocks support for extended thinking via the thinking content blocks array.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
The for loop at line 168 never executes because start is assigned
i + 1 immediately before slicing messages[start : i + 1], which
is always an empty list. Remove the dead code.
Fixes#3716
Some models / Ollama renderers occasionally emit tokenizer-level template
leaks that the existing regexes miss:
1. Malformed opening tags with no closing `>`, running straight into
user-facing content — e.g. `<think广场照明灯目前…` (observed with
Gemma 4 via Ollama). The earlier `<think>[\s\S]*?</think>` and
`^\s*<think>[\s\S]*$` patterns both require `>`, so these leak into
rendered messages.
2. Harmony-style channel markers like `<channel|>` / `<|channel|>` at
the start of a response.
3. Orphan `</think>` / `</thought>` closing tags left behind when only
the opener was consumed upstream.
Handles each case conservatively:
- Malformed `<think` / `<thought` only match when the next char is NOT
a tag-name continuation (`[A-Za-z0-9_\-:>/]`). Explicit ASCII class
instead of `\w` because Python's Unicode `\w` matches CJK and would
defeat the primary fix.
- Orphan closing tags and channel markers are stripped **only at the
start or end of the text**. `strip_think` is also applied before
persisting history (memory.py), so mid-text stripping would silently
rewrite transcripts where the tokens themselves are discussed.
Preserves: `<thinker>`, `<think-foo>`, `<think_foo>`, `<think1>`,
`<think:foo>`, `<thought/>`, literal `` `</think>` `` / `` `<channel|>` ``
inside prose or code blocks.
Adds 16 new regression tests covering both the leak cases and the
preserved-prose cases.
- Pass resolved self.context_window_tokens to Consolidator instead of
raw parameter that could be None, preventing consolidation failures
- Calculate percentage against input budget (ctx - max_completion - 1024)
instead of raw context window, consistent with Consolidator/snip formulas
- Pass actual max_completion_tokens from provider to build_status_content
- Cap percentage display at 999 to prevent runaway values
- Add tests for budget-based percentage and cap behavior
Fix accidental line corruption in split_message() where 'break' was
merged with unrelated code during manual editing.
The actual fix: build_assistant_message() now returns content or ""
instead of content (which could be None), preventing providers like
MiMo V2 Omni from rejecting tool-call messages with missing text field.
Fixes#2519
The /status command divided context_used by 1000 but context_total by
1024, producing inconsistent values. For example a 128000-token window
displayed as 125k instead of 128k. Tokens are not a binary unit, so
both should use 1000.
- Add GitStore class wrapping dulwich for memory file versioning
- Auto-commit memory changes during Dream consolidation
- Add /dream-log and /dream-restore commands for history browsing
- Pass tracked_files as constructor param, generate .gitignore dynamically
Replace single-stage MemoryConsolidator with a two-stage architecture:
- Consolidator: lightweight token-budget triggered summarization,
appends to HISTORY.md with cursor-based tracking
- Dream: cron-scheduled two-phase processor that analyzes HISTORY.md
and updates SOUL.md, USER.md, MEMORY.md via AgentRunner with
edit_file tools for surgical, fault-tolerant updates
New files: MemoryStore (pure file I/O), Dream class, DreamConfig,
/dream and /dream-log commands. 89 tests covering all components.
Add agent-level timezone configuration with a UTC default, propagate it into runtime context and heartbeat prompts, and document valid IANA timezone usage in the README.
- Add strip_think() to helpers.py as single source of truth
- Filter deltas in agent loop before dispatching to consumers
- Implement send_delta in TelegramChannel with progressive edit_message_text
- Remove duplicate think filtering from CLI stream.py and telegram.py
- Remove legacy fake streaming (send_message_draft) from Telegram
- Default Telegram streaming to true
- Update CHANNEL_PLUGIN_GUIDE.md with streaming documentation
Made-with: Cursor
estimate_prompt_tokens() only counted the `content` text field, completely
missing tool_calls JSON (~72% of actual payload), reasoning_content,
tool_call_id, name, and per-message framing overhead. This caused the
memory consolidator to never trigger for tool-heavy sessions (e.g. cron
jobs), leading to context window overflow errors from the LLM provider.
Also adds reasoning_content counting and proper per-message overhead to
estimate_message_tokens() for consistent boundary detection.
Made-with: Cursor
Merge process_direct() and process_direct_outbound() into a single
interface returning OutboundMessage | None. This eliminates the
dual-path detection logic in CLI single-message mode that relied on
inspect.iscoroutinefunction to distinguish between the two APIs.
Extract status rendering into a pure function build_status_content()
in utils/helpers.py, decoupling it from AgentLoop internals.
Made-with: Cursor
Keep multimodal tool outputs on the native content-block path while
restoring redirect SSRF checks for web_fetch image responses. Also share
image block construction, simplify persisted history sanitization, and
add regression tests for image reads and blocked private redirects.
Made-with: Cursor
Share assistant message construction between the main agent and subagents, and add a regression test to keep reasoning_content and thinking_blocks in follow-up tool rounds.
Move consolidation policy into MemoryConsolidator, keep backward compatibility for legacy config, and compress history by token budget instead of message count.
Add support for running multiple nanobot instances with complete isolation:
- Add --config parameter to gateway command for custom config file path
- Implement set_config_path() in config/loader.py for dynamic config path
- Derive data directory from config file location (e.g., ~/.nanobot-xxx/)
- Update get_data_path() to use unified data directory from config loader
- Ensure cron jobs use instance-specific data directory
This enables running multiple isolated nanobot instances by specifying
different config files, with each instance maintaining separate:
- Configuration files
- Workspace (memory, sessions, skills)
- Cron jobs
- Logs and media
Example usage:
nanobot gateway --config ~/.nanobot-instance2/config.json --port 18791
Extract the _split_message function from discord.py and telegram.py
into a shared utility function in utils/helpers.py.
Changes:
- Add split_message() to nanobot/utils/helpers.py with configurable max_len
- Update Discord channel to use shared utility (2000 char limit)
- Update Telegram channel to use shared utility (4000 char limit)
- Remove duplicate implementations from both channels
Benefits:
- Reduces code duplication
- Centralizes message splitting logic for easier maintenance
- Makes the function reusable for future channels
The function splits content into chunks within max_len, preferring
to break at newlines or spaces rather than mid-word.
- Remove trailing whitespace and normalize blank lines
- Unify string quotes and line breaks for long lines
- Sort imports alphabetically across modules