nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-06-15 15:24:06 +00:00

Author	SHA1	Message	Date
axelray-dev	a5a816abaf	fix(telegram): move fenced-code-block splitting into Telegram-specific helper Move the fenced-code-block-aware splitting logic out of the shared split_message helper (used by Signal, Slack, Discord, Weixin, etc.) and into a Telegram-specific _split_telegram_markdown function. The shared split_message remains a plain-text chunker. The Telegram channel now uses _split_telegram_markdown for its raw Markdown paths that feed _markdown_to_telegram_html, preventing broken HTML rendering when splits fall inside fenced code blocks. Also fixes a regression where content beginning with whitespace before a fence could emit a whitespace-only chunk. Addresses review feedback on #4257.	2026-06-11 13:52:19 +08:00
axelray-dev	131446fa61	fix(utils): make split_message fenced-code-block-aware When split_message splits a long message, it now checks whether the split point falls inside a fenced code block. If so, it either moves the split to before the opening fence or closes/reopens the fence across chunks, preventing broken HTML rendering. Addresses #4250	2026-06-11 13:52:19 +08:00
chengyongru	fe2af64e04	refactor(heartbeat): migrate heartbeat service to cron-based auto-registration Remove standalone nanobot/heartbeat/ service and replace it with an auto-registered system cron job on gateway startup. Key behaviors preserved: - HeartbeatConfig (enabled, interval_s, keep_recent_messages) remains in GatewayConfig for backward compatibility. - On startup, if enabled, a system cron job "heartbeat" is registered with schedule derived from interval_s. - HEARTBEAT.md is checked on each tick; empty/template-identical files skip to avoid wasting LLM calls. - Post-run evaluate_response and session history truncation (keep_recent_messages) are retained. - Delivery target selection, deliverable filtering, and preamble guidance are preserved. Files removed: - nanobot/heartbeat/__init__.py - nanobot/heartbeat/service.py - tests/heartbeat/* - tests/agent/test_heartbeat_service.py Templates and docs updated to reflect cron-based usage.	2026-05-28 20:20:28 +08:00
Xubin Ren	d29fcaf5d1	refactor(agent): internalize tool contract prompt	2026-05-21 15:21:39 +08:00
Xubin Ren	01fa362c03	Merge origin/main into feat/show-reasoning Resolves conflicts after main landed the state-machine turn refactor and the test_runner.py 9-file split: - nanobot/agent/loop.py: take main's `_state_build`/`_persist_user_message_early` flow; restore the `reasoning: bool` parameter on `_build_bus_progress_callback` so the loop hook can mark progress as reasoning-channel without coupling to the answer stream. - nanobot/cli/stream.py: keep main's configurable `bot_name`/`bot_icon` header while preserving the PR's `transient=True` Live + `self._console` routing + `_renderable()` final-render path that fixed TUI duplication. - tests/agent/test_runner.py was deleted on main and split into 9 focused files; relocated all 6 reasoning tests into a new `test_runner_reasoning.py` matching the new layout, deduplicated the per-test `ReasoningHook` boilerplate through a shared `_RecordingHook` helper. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 05:07:14 +00:00
Xubin Ren	352aaf0627	refactor(reasoning): unify reasoning extraction across providers Reasoning surfacing was split across three branches in runner.py plus two separate streaming buffers (loop hook and runner progress stream), with three independent display-side gates in the CLI. This collapsed the policy into one source of truth and fixed two real bugs: - Structured `reasoning_content` was suppressed whenever the answer was streamed, because the runner gated emission on `streamed_content`. Providers don't stream `reasoning_content`; it only arrives on the final response, so the answer stream and the reasoning channel are independent. Added `streamed_reasoning` to `AgentHookContext` to track the right bit. - `channels.showReasoning` was subordinated to `sendProgress`. They are orthogonal — turning off progress streaming shouldn't silence reasoning. Reworked the CLI gates accordingly. Single-helper consolidation: - `extract_reasoning(reasoning_content, thinking_blocks, content)` returns `(reasoning_text, cleaned_content)` with a defined fallback order: dedicated field → Anthropic thinking_blocks → inline `<think>`/`<thought>` tags. Models that expose none of these short-circuit to `(None, content)` — zero overhead. - `IncrementalThinkExtractor` replaces the ad-hoc `emit_incremental_think` function and its hand-rolled "emitted cursor" state in both the loop hook and the runner progress stream. Also documented the new `showReasoning` channel option in docs/configuration.md and noted its independence from sendProgress. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 17:14:19 +00:00
Flinn Xie	3a851f8f8d	feat(reasoning): add inline think tag extraction and Anthropic thinking_blocks support Add extract_think() and emit_incremental_think() helpers to extract thinking content from inline <think> and <thought> tags in the content field. This handles models served via Ollama, self-hosted vLLM, or other compatible endpoints that embed reasoning as inline tags instead of using the dedicated reasoning_content API field. Also adds Anthropic thinking_blocks support for extended thinking via the thinking content blocks array. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-12 23:02:59 +08:00
chengyongru	73a8d8a875	fix(utils): remove unreachable dead code in find_legal_message_start The for loop at line 168 never executes because start is assigned i + 1 immediately before slicing messages[start : i + 1], which is always an empty list. Remove the dead code. Fixes #3716	2026-05-09 18:53:13 +08:00
chengyongru	05e0106592	refactor(logging): preserve tracebacks and add channel context - Preserve tracebacks: logger.error in except blocks → logger.exception - Channel context: BaseChannel injects self.logger = logger.bind(channel=name) - Third-party bridge: redirect_lib_logging() replaces ad-hoc stdlib-to-loguru bridges - Log levels: network timeouts downgraded from ERROR → WARNING - Fix --verbose flag to actually work with loguru (set handler to DEBUG)	2026-05-06 21:17:45 +08:00
yorkhellen	ee364c6ac1	fix(helpers): restore tiktoken fallback in estimate_prompt_tokens_chain	2026-05-02 00:07:45 +08:00
Xubin Ren	188e6df757	fix(utils): cover complete trailing think markers Made-with: Cursor	2026-05-01 20:09:59 +08:00
bravel	2c397ad442	fix: strip partial think tags in streaming output	2026-05-01 20:09:59 +08:00
Jack Lu	d9800ecdd2	refactor: replace try-except blocks with contextlib.suppress for cleaner error handling across multiple files	2026-05-01 19:30:11 +08:00
hlg	8e7d8bef6a	fix(utils): handle malformed think tags and channel markers in strip_think Some models / Ollama renderers occasionally emit tokenizer-level template leaks that the existing regexes miss: 1. Malformed opening tags with no closing `>`, running straight into user-facing content, e.g. `<think广场照明灯目前…` (observed with Gemma 4 via Ollama). The earlier `<think>[\s\S]*?</think>` and `^\s*<think>[\s\S]*$` patterns both require `>`, so these leak into rendered messages. 2. Harmony-style channel markers like `<channel|>` / `<|channel|>` at the start of a response. 3. Orphan `</think>` / `</thought>` closing tags left behind when only the opener was consumed upstream. Handles each case conservatively: - Malformed `<think` / `<thought` only match when the next char is NOT a tag-name continuation (`[A-Za-z0-9_\-:>/]`). Explicit ASCII class instead of `\w` because Python's Unicode `\w` matches CJK and would defeat the primary fix. - Orphan closing tags and channel markers are stripped only at the start or end of the text. `strip_think` is also applied before persisting history (memory.py), so mid-text stripping would silently rewrite transcripts where the tokens themselves are discussed. Preserves: `<thinker>`, `<think-foo>`, `<think_foo>`, `<think1>`, `<think:foo>`, `<thought/>`, literal `` `</think>` `` / `` `<channel|>` `` inside prose or code blocks. Adds 16 new regression tests covering both the leak cases and the preserved-prose cases.	2026-04-20 17:04:48 +08:00
chengyongru	e1fdca7d40	fix(status): correct context percentage calculation and sync consolidator - Pass resolved self.context_window_tokens to Consolidator instead of raw parameter that could be None, preventing consolidation failures - Calculate percentage against input budget (ctx - max_completion - 1024) instead of raw context window, consistent with Consolidator/snip formulas - Pass actual max_completion_tokens from provider to build_status_content - Cap percentage display at 999 to prevent runaway values - Add tests for budget-based percentage and cap behavior	2026-04-16 20:30:39 +08:00
aiguozhi123456	634f4b45c1	feat: show active task count in /status output	2026-04-15 01:49:42 +08:00
04cb	e392c27f7e	fix(utils): anchor unclosed think-tag regex to string start (#3004)	2026-04-11 13:46:15 +08:00
flobo3	6b7e78a8e0	fix: strip <thought> blocks from Gemma 4 and similar models	2026-04-10 12:10:23 +08:00
Alfredo Arenas	6445b3b0cf	fix(helpers): repair corrupted split_message and ensure content never None Fix accidental line corruption in split_message() where 'break' was merged with unrelated code during manual editing. The actual fix: build_assistant_message() now returns content or "" instead of content (which could be None), preventing providers like MiMo V2 Omni from rejecting tool-call messages with missing text field. Fixes #2519	2026-04-09 11:42:53 +08:00
Alfredo Arenas	6d74c88014	fix(helpers): ensure assistant message content is never None	2026-04-09 11:42:53 +08:00
Leo fu	66409784f4	fix(status): use consistent divisor (1000) for token count display The /status command divided context_used by 1000 but context_total by 1024, producing inconsistent values. For example a 128000-token window displayed as 125k instead of 128k. Tokens are not a binary unit, so both should use 1000.	2026-04-09 10:40:20 +08:00
whs	bc0ff7f214	feat(status): add web search provider usage to /status command	2026-04-06 13:37:55 +08:00
Xubin Ren	0a3a60a7a4	refactor(memory): simplify Dream config naming and rename gitstore module	2026-04-04 10:01:45 +00:00
Xubin Ren	7e0c196797	fix(memory): repair Dream follow-up paths and move GitStore to utils Made-with: Cursor	2026-04-04 04:49:42 +00:00
chengyongru	f824a629a8	feat(memory): add git-backed version control for dream memory files - Add GitStore class wrapping dulwich for memory file versioning - Auto-commit memory changes during Dream consolidation - Add /dream-log and /dream-restore commands for history browsing - Pass tracked_files as constructor param, generate .gitignore dynamically	2026-04-03 00:32:54 +08:00
chengyongru	b9616674f0	feat(agent): two-stage memory system with Dream consolidation Replace single-stage MemoryConsolidator with a two-stage architecture: - Consolidator: lightweight token-budget triggered summarization, appends to HISTORY.md with cursor-based tracking - Dream: cron-scheduled two-phase processor that analyzes HISTORY.md and updates SOUL.md, USER.md, MEMORY.md via AgentRunner with edit_file tools for surgical, fault-tolerant updates New files: MemoryStore (pure file I/O), Dream class, DreamConfig, /dream and /dream-log commands. 89 tests covering all components.	2026-04-02 22:42:25 +08:00
Xubin Ren	e4b335ce81	refactor: extract runtime response guards into utils runtime module	2026-04-02 13:54:40 +00:00
Xubin Ren	714a4c7bb6	fix(runtime): address review feedback on retry and cleanup	2026-04-02 10:57:12 +00:00
Xubin Ren	eefd7e60f2	Merge remote-tracking branch 'origin/main' into feat/runtime-hardening	2026-04-02 10:40:49 +00:00
chengyongru	da08dee144	feat(provider): show cache hit rate in /status (#2645)	2026-04-02 12:51:45 +08:00
Xubin Ren	fbedf7ad77	feat: harden agent runtime for long-running tasks	2026-04-01 19:12:49 +00:00
04cb	929ee09499	fix(utils): ensure reasoning_content present with thinking_blocks (#2579)	2026-03-31 11:49:23 +08:00
Xubin Ren	13d6c0ae52	feat(config): add configurable timezone for runtime context Add agent-level timezone configuration with a UTC default, propagate it into runtime context and heartbeat prompts, and document valid IANA timezone usage in the README.	2026-03-25 22:07:14 +08:00
Xubin Ren	9d5e511a6e	feat(streaming): centralize think-tag filtering and add Telegram streaming - Add strip_think() to helpers.py as single source of truth - Filter deltas in agent loop before dispatching to consumers - Implement send_delta in TelegramChannel with progressive edit_message_text - Remove duplicate think filtering from CLI stream.py and telegram.py - Remove legacy fake streaming (send_message_draft) from Telegram - Default Telegram streaming to true - Update CHANNEL_PLUGIN_GUIDE.md with streaming documentation Made-with: Cursor	2026-03-23 10:20:41 +08:00
Xubin Ren	1c71489121	fix(agent): count all message fields in token estimation estimate_prompt_tokens() only counted the `content` text field, completely missing tool_calls JSON (~72% of actual payload), reasoning_content, tool_call_id, name, and per-message framing overhead. This caused the memory consolidator to never trigger for tool-heavy sessions (e.g. cron jobs), leading to context window overflow errors from the LLM provider. Also adds reasoning_content counting and proper per-message overhead to estimate_message_tokens() for consistent boundary detection. Made-with: Cursor	2026-03-22 12:19:44 +08:00
Xubin Ren	48c71bb61e	refactor(agent): unify process_direct to return OutboundMessage Merge process_direct() and process_direct_outbound() into a single interface returning OutboundMessage | None. This eliminates the dual-path detection logic in CLI single-message mode that relied on inspect.iscoroutinefunction to distinguish between the two APIs. Extract status rendering into a pure function build_status_content() in utils/helpers.py, decoupling it from AgentLoop internals. Made-with: Cursor	2026-03-22 00:39:38 +08:00
Xubin Ren	445a96ab55	fix(agent): harden multimodal tool result flow Keep multimodal tool outputs on the native content-block path while restoring redirect SSRF checks for web_fetch image responses. Also share image block construction, simplify persisted history sanitization, and add regression tests for image reads and blocked private redirects. Made-with: Cursor	2026-03-21 05:34:56 +00:00
Xubin Ren	5d1528a5f3	fix(heartbeat): inject shared current time context into phase 1	2026-03-16 10:52:26 +08:00
Re-bin	76c6063141	chore: normalize helpers.py file mode	2026-03-11 03:50:54 +00:00
Re-bin	f339f505cd	Merge remote-tracking branch 'origin/main' into pr-1856	2026-03-11 03:48:05 +00:00
Re-bin	ddccf25bb1	fix(subagent): preserve reasoning fields across tool turns Share assistant message construction between the main agent and subagents, and add a regression test to keep reasoning_content and thinking_blocks in follow-up tool rounds.	2026-03-11 03:47:24 +00:00
YinAnPing	d1df53aaf7	fix: exclude hidden files when syncing workspace templates Skip files starting with '.' (e.g., macOS extended attributes like ._AGENTS.md) to prevent UnicodeDecodeError during template synchronization.	2026-03-11 09:30:33 +08:00
Re-bin	62ccda43b9	refactor(memory): switch consolidation to token-based context windows Move consolidation policy into MemoryConsolidator, keep backward compatibility for legacy config, and compress history by token budget instead of message count.	2026-03-10 19:55:06 +00:00
Re-bin	20dfaa5d34	refactor: unify instance path resolution and preserve workspace override	2026-03-08 02:58:25 +00:00
Re-bin	bdac08161b	Merge branch 'main' into pr-1581	2026-03-08 02:05:23 +00:00
Re-bin	7e9616cbd3	merge origin/main into pr-1567	2026-03-06 06:51:28 +00:00
Re-bin	3a01fe536a	refactor: move detect_image_mime to utils/helpers for reuse	2026-03-06 06:49:09 +00:00
samsonchoi	4e4d40ef33	feat: multi-instance support with --config parameter Add support for running multiple nanobot instances with complete isolation: - Add --config parameter to gateway command for custom config file path - Implement set_config_path() in config/loader.py for dynamic config path - Derive data directory from config file location (e.g., ~/.nanobot-xxx/) - Update get_data_path() to use unified data directory from config loader - Ensure cron jobs use instance-specific data directory This enables running multiple isolated nanobot instances by specifying different config files, with each instance maintaining separate: - Configuration files - Workspace (memory, sessions, skills) - Cron jobs - Logs and media Example usage: nanobot gateway --config ~/.nanobot-instance2/config.json --port 18791	2026-03-05 23:48:45 +08:00
suger-m	323e5f22cc	refactor(channels): extract split_message utility to reduce code duplication Extract the _split_message function from discord.py and telegram.py into a shared utility function in utils/helpers.py. Changes: - Add split_message() to nanobot/utils/helpers.py with configurable max_len - Update Discord channel to use shared utility (2000 char limit) - Update Telegram channel to use shared utility (4000 char limit) - Remove duplicate implementations from both channels Benefits: - Reduces code duplication - Centralizes message splitting logic for easier maintenance - Makes the function reusable for future channels The function splits content into chunks within max_len, preferring to break at newlines or spaces rather than mid-word.	2026-03-05 17:16:47 +08:00
JK_Lu	977ca725f2	style: unify code formatting and import order - Remove trailing whitespace and normalize blank lines - Unify string quotes and line breaks for long lines - Sort imports alphabetically across modules	2026-02-28 20:55:43 +08:00

57 Commits