nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-31 14:01:17 +00:00

Author	SHA1	Message	Date
chengyongru	dec26396ed	fix(feishu): remove resuming to avoid 10-min streaming card timeout Feishu streaming cards auto-close after 10 minutes from creation, regardless of update activity. With resuming enabled, a single card lives across multiple tool-call rounds and can exceed this limit, causing the final response to be silently lost. Remove the _resuming logic from send_delta so each tool-call round gets its own short-lived streaming card (well under 10 min). Add a fallback that sends a regular interactive card when the final streaming update fails.	2026-04-14 17:01:26 +08:00
chengyongru	4c684540c5	Merge remote-tracking branch 'origin/main' into nightly	2026-04-14 00:37:21 +08:00
Xubin Ren	a38bc637bd	fix(runner): preserve injection flag after max-iteration drain Keep late follow-up injections observable when they are drained during max-iteration shutdown so loop-level response suppression still makes the right decision. Made-with: Cursor	2026-04-14 00:30:30 +08:00
chengyongru	a1e1eed2f1	refactor(runner): consolidate all injection drain paths and deduplicate tests - Migrate "after tools" inline drain to use _try_drain_injections, completing the refactoring (all 6 drain sites now use the helper). - Move checkpoint emission into _try_drain_injections via optional iteration parameter, eliminating the leaky split between helper and caller for the final-response path. - Extract _make_injection_callback() test helper to replace 7 identical inject_cb function bodies. - Add test_injection_cycle_cap_on_error_path to verify the cycle cap is enforced on error exit paths.	2026-04-14 00:30:30 +08:00
chengyongru	d849a3fa06	fix(agent): drain injection queue on error/edge-case exit paths When the agent runner exits due to LLM error, tool error, empty response, or max_iterations, it breaks out of the iteration loop without draining the pending injection queue. This causes leftover messages to be re-published as independent inbound messages, resulting in duplicate or confusing replies to the user. Extract the injection drain logic into a `_try_drain_injections` helper and call it before each break in the error/edge-case paths. If injections are found, continue the loop instead of breaking. For max_iterations (where the loop is exhausted), drain injections to prevent re-publish without continuing.	2026-04-14 00:30:30 +08:00
chengyongru	3c06db7e4e	fix(log): remove noisy no-op logs from auto-compact Remove two debug log lines that fire on every idle channel check: - "scheduling archival" (logged before knowing if there's work) - "skipping, no un-consolidated messages" (the common no-op path) The meaningful "archived" info log (only on real work) is preserved.	2026-04-13 20:14:58 +08:00
chengyongru	b3288fbc87	fix(log): only log auto-compact when messages are actually archived	2026-04-13 16:52:47 +08:00
chengyongru	b311759e87	fix(log): remove noisy no-op logs from auto-compact Remove two debug log lines that fire on every idle channel check: - "scheduling archival" (logged before knowing if there's work) - "skipping, no un-consolidated messages" (the common no-op path) The meaningful "archived" info log (only on real work) is preserved.	2026-04-13 16:09:42 +08:00
haosenwang1018	d33bf22e91	docs(provider): clarify responses api routing	2026-04-13 15:59:36 +08:00
haosenwang1018	85c7996766	docs(api): clarify cross-channel message delivery	2026-04-13 15:59:36 +08:00
chengyongru	ac714803f6	fix(provider): recover trailing assistant message as user to prevent empty request When a subagent result is injected with current_role="assistant", _enforce_role_alternation drops the trailing assistant message, leaving only the system prompt. Providers like Zhipu/GLM reject such requests with error 1214 ("messages parameter invalid"). Now the last popped assistant message is recovered as a user message when no user/tool messages remain.	2026-04-13 12:54:39 +08:00
chengyongru	becaff3e9d	fix(agent): skip auto-compact for sessions with active agent tasks Prevent proactive compaction from archiving sessions that have an in-flight agent task, avoiding mid-turn context truncation when a task runs longer than the idle TTL.	2026-04-13 12:51:37 +08:00
chengyongru	89ea2375fd	fix(provider): recover trailing assistant message as user to prevent empty request When a subagent result is injected with current_role="assistant", _enforce_role_alternation drops the trailing assistant message, leaving only the system prompt. Providers like Zhipu/GLM reject such requests with error 1214 ("messages parameter invalid"). Now the last popped assistant message is recovered as a user message when no user/tool messages remain.	2026-04-13 12:01:45 +08:00
chengyongru	62bd54ac4a	fix(agent): skip auto-compact for sessions with active agent tasks Prevent proactive compaction from archiving sessions that have an in-flight agent task, avoiding mid-turn context truncation when a task runs longer than the idle TTL.	2026-04-13 12:01:29 +08:00
Xubin Ren	6484c7c47a	fix(agent): close interrupted early-persisted user turns Track text-only user messages that were flushed before the turn loop completes, then materialize an interrupted assistant placeholder on the next request so session history stays legal and later turns do not skip their own assistant reply. Made-with: Cursor	2026-04-13 10:26:09 +08:00
Xubin Ren	b964a894d2	test(agent): cover early user-message persistence Use session.add_message for the pre-turn user-message flush and add focused regression tests for crash-time persistence and duplicate-free successful saves. Made-with: Cursor	2026-04-13 10:26:09 +08:00
nikube	ea94a9c088	fix(agent): persist user message before running turn loop The existing runtime_checkpoint mechanism preserves the in-flight assistant/tool state if the process dies mid-turn, but the triggering user message is only written to session history at the end of the turn via _save_turn(). If the worker is killed (OOM, SIGKILL, a self-triggered systemctl restart, container eviction, etc.) before the turn completes, the user's message is silently lost: on restart, the session log only shows the interrupted assistant turn without any record of what the user asked. Any recovery tooling built on top of session logs cannot reply because it has no prompt to reply to. This patch appends the incoming user message to the session and flushes it to disk immediately after the session is loaded and before the agent loop runs, then adjusts the _save_turn skip offset so the final persistence step does not duplicate it. Limited to textual content (isinstance(msg.content, str)); list-shaped content (media blocks) still flows through _save_turn's sanitization at end of turn, preserving existing behavior for those cases.	2026-04-13 10:26:09 +08:00
Xubin Ren	49355b2bd6	test(tools): lock non-object parameter validation Add focused registry coverage so the new read_file/read_write parameter guard stays actionable without changing generic validation behavior for other tools. Made-with: Cursor	2026-04-13 09:55:05 +08:00
ramonpaolo	830644c352	fix: add guard for non-dict tool call parameters - Add type validation in registry.prepare_call() to catch list/other invalid params - Add logger.warning() in provider layer when non-dict args detected - Works for OpenAI-compatible and Anthropic providers - Registry returns clear error hint for model to self-correct	2026-04-13 09:55:05 +08:00
haosenwang1018	92ef594b6a	fix(mcp): hint on stdio protocol pollution	2026-04-13 09:41:55 +08:00
haosenwang1018	3573109408	fix(provider): preserve static error helper compatibility	2026-04-13 09:37:31 +08:00
haosenwang1018	c68b3edb9d	fix(provider): clarify local 502 recovery hints	2026-04-13 09:37:31 +08:00
bahtya	f879d81b28	fix(channels/qq): propagate network errors in send() instead of swallowing The catch-all except Exception in QQ send() was swallowing aiohttp.ClientError and OSError that _send_media correctly re-raises. Add explicit catch for network errors before the generic handler.	2026-04-13 00:30:45 +08:00
bahtya	fa98524944	fix(channels): prevent retry amplification and silent message loss across channels Audited all channel implementations for overly broad exception handling that causes retry amplification or silent message loss during network errors. This is the same class of bug as #3050 (Telegram _send_text). Fixes by channel: Telegram (send_delta): - _stream_end path used except Exception for HTML edit fallback - Network errors (TimedOut, NetworkError) triggered redundant plain text edit, doubling connection demand during pool exhaustion - Changed to except BadRequest, matching the _send_text fix Discord: - send() caught all exceptions without re-raising - ChannelManager._send_with_retry() saw successful return, never retried - Messages silently dropped on any send failure - Added raise after error logging DingTalk: - _send_batch_message() returned False on all exceptions including network errors — no retry, fallback text sent unnecessarily - _read_media_bytes() and _upload_media() swallowed transport errors, causing _send_media_ref() to cascade through doomed fallback attempts - Added except httpx.TransportError handlers that re-raise immediately WeChat: - Media send failure triggered text fallback even for network errors - During network issues: 3×(media + text) = 6 API calls per message - Added specific catches: TimeoutException/TransportError re-raise, 5xx HTTPStatusError re-raises, 4xx falls back to text QQ: - _send_media() returned False on all exceptions - Network errors triggered fallback text instead of retry - Added except (aiohttp.ClientError, OSError) that re-raises Tests: 331 passed (283 existing + 48 new across 5 channel test files) Fixes: #3054 Related: #3050, #3053	2026-04-13 00:30:45 +08:00
bahtya	7e91aecd7d	fix(telegram): narrow exception catch in _send_text to prevent retry amplification Previously _send_text() caught all exceptions (except Exception) when sending HTML-formatted messages, falling back to plain text even for network errors like TimedOut and NetworkError. This caused connection demand to double during pool exhaustion scenarios (3 retries × 2 fallback attempts = 6 calls per message instead of 3). Now only catches BadRequest (HTML parse errors), letting network errors propagate immediately to the retry layer where they belong. Fixes: HKUDS/nanobot#3050	2026-04-13 00:30:45 +08:00
Xubin Ren	217e1fc957	test(retry): lock in-place image fallback behavior Add a focused regression test for the successful no-image retry path so the original message history stays stripped after fallback and the repeated retry loop cannot silently return. Made-with: Cursor	2026-04-12 20:10:06 +08:00
yanghan-cyber	b261201985	fix(retry): strip images in-place to prevent repeated error-retry cycles When a non-transient LLM error occurs with image content, the retry mechanism strips images from a copy but never updates the original conversation history. Subsequent iterations rebuild context from the unmodified history, causing the same error-retry cycle to repeat every iteration until max_iterations is reached. Add _strip_image_content_inplace() that mutates the original message content lists in-place after a successful no-image retry, so callers sharing those references (e.g. the runner's conversation history) also see the stripped version.	2026-04-12 20:10:06 +08:00
Xubin Ren	7a7f5c9689	fix(dream): use valid builtin skill template paths Point Dream skill creation at a readable builtin skill-creator template, keep skill writes rooted at the workspace, and document the new skill discovery behavior in README. Made-with: Cursor	2026-04-12 16:49:55 +08:00
chengyongru	2a243bfe4f	feat(agent): integrate skill discovery into Dream consolidation Instead of a separate skill discovery system, extend Dream's two-phase pipeline to also detect reusable behavioral patterns from conversation history and generate SKILL.md files. Phase 1 gains a [SKILL] output type for pattern detection. Phase 2 gains write_file (scoped to skills/) and read access to builtin skills, enabling it to check for duplicates and follow skill-creator's format conventions before creating new skills. Inspired by PR #3039 by @wanghesong2019. Co-authored-by: wanghesong2019 <wanghesong2019@users.noreply.github.com>	2026-04-12 16:49:55 +08:00
Xubin Ren	5dc238c7ef	fix(shell): allow read-only copies from internal state files Keep the new exec guard focused on writes to history.jsonl and .dream_cursor while still allowing read-only copy operations out of those files. Made-with: Cursor	2026-04-12 16:38:55 +08:00
04cb	3f59bd1443	fix(shell): reject LLM-supplied working_dir outside workspace (#2826)	2026-04-12 16:38:55 +08:00
04cb	00fb491bc9	fix(shell): block exec writes to history.jsonl and cursor files (#2989)	2026-04-12 16:38:55 +08:00
Xubin Ren	a81e4c1791	Merge PR #2959: feat(skills): add disabled_skills config to exclude skills from loading feat(skills): add disabled_skills config to exclude skills from loading	2026-04-12 10:46:50 +08:00
Xubin Ren	a142788da9	docs(readme): document disabledSkills config Explain the new agents.defaults.disabledSkills option so users can discover and configure skill exclusion from the main agent and subagents. Made-with: Cursor	2026-04-12 02:42:52 +00:00
Xubin Ren	e229c2ebc0	fix(pr): remove internal .docs file from PR Keep the local review note out of the GitHub diff while preserving the actual code and test changes for this PR. Made-with: Cursor	2026-04-12 02:21:46 +00:00
Xubin Ren	09c238ca0f	Merge origin/main into pr-2959 Resolve the config plumbing conflicts and keep disabled skill filtering consistent for subagent prompts after syncing with main. Made-with: Cursor	2026-04-12 02:02:39 +00:00
Dianqi Ji	ee946d96ca	feat(channels/feishu): add domain config for Lark global support Add 'domain' field to FeishuConfig (Literal['feishu', 'lark'], default 'feishu'). Pass domain to lark.Client.builder() and lark.ws.Client to support Lark global (open.larksuite.com) in addition to Feishu China (open.feishu.cn). Existing configs default to 'feishu' for backward compatibility. Also add documentation for domain field in README.md and add tests for domain config.	2026-04-12 09:56:17 +08:00
Xubin Ren	a70928cc5c	Merge PR #3045: fix(agent): preserve tool results on fatal error to prevent orphan tool_calls fix(agent): preserve tool results on fatal error to prevent orphan tool_calls (#2943)	2026-04-11 23:08:03 +08:00
layla	f25cdb7138	Merge branch 'main' into fix/tool-call-result-order-2943	2026-04-11 22:00:07 +08:00
04cb	4cd4ed8ada	fix(agent): preserve tool results on fatal error to prevent orphan tool_calls (#2943)	2026-04-11 21:50:44 +08:00
chengyongru	9f433cab01	fix(wecom): use reply_stream for progress messages to avoid errcode=40008 The plain reply() uses cmd="reply" which does not support "text" msgtype and causes WeCom API to return errcode=40008 (invalid message type). Unify both progress and final text messages to use reply_stream() (cmd="aibot_respond_msg"), differentiating via finish flag. Fixes #2999	2026-04-11 21:47:19 +08:00
chengyongru	0d03f10fa0	test(channels): add media support tests for QQ and WeCom channels Cover helpers (sanitize_filename, guess media type), outbound send (exception handling, media-then-text order, fallback), inbound message processing (attachments, dedup, empty content), _post_base64file payload filtering, and WeCom upload/download flows.	2026-04-11 21:47:19 +08:00
chengyongru	f6f712a2ae	fix(wecom): harden upload/download, extract media type helper - Use asyncio.to_thread for file I/O to avoid blocking event loop - Add 200MB upload size limit with early rejection - Fix file handle leak by using context manager - Use memoryview for upload chunking to reduce peak memory - Add inbound download size check to prevent OOM - Use asyncio.to_thread for write_bytes in download path - Extract inline media_type detection to _guess_wecom_media_type()	2026-04-11 21:47:19 +08:00
chengyongru	f900e4f259	fix(wecom): harden upload and inbound media handling - Use asyncio.to_thread for file I/O to avoid blocking event loop - Add 200MB upload size limit with early rejection - Fix file handle leak by using context manager - Free raw bytes early after chunking to reduce memory pressure - Add file attachments to media_paths (was text-only, inconsistent with image) - Use robust _sanitize_filename() instead of os.path.basename() for path safety - Remove re-raise in send() for consistency with QQ channel - Fix truncated media_id logging for short IDs	2026-04-11 21:47:19 +08:00
gem12	48f6bbd256	feat(channels): Add full media support for QQ and WeCom channels QQ channel improvements (on top of nightly): - Add top-level try/except in _on_message and send() for resilience - Use defensive getattr() for attachment attributes (botpy version compat) - Skip file_name for image uploads to avoid QQ rendering as file attachment - Extract only file_info from upload response to avoid extra fields - Handle protocol-relative URLs (//...) in attachment downloads WeCom channel improvements: - Add _upload_media_ws() for WebSocket 3-step media upload protocol - Send media files (image/video/voice/file) via WeCom rich media API - Support progress messages (plain reply) vs final response (streaming) - Support proactive send when no frame available (cron push) - Pass media_paths to message bus for downstream processing	2026-04-11 21:47:19 +08:00
Xubin Ren	cf8381f517	feat(agent): enhance message injection handling and content merging	2026-04-11 21:43:23 +08:00
Xubin Ren	f6c39ec946	feat(agent): enhance session key handling for follow-up messages	2026-04-11 21:43:23 +08:00
chengyongru	36d2a11e73	feat(agent): mid-turn message injection for responsive follow-ups (#2985) * feat(agent): add mid-turn message injection for responsive follow-ups Allow user messages sent during an active agent turn to be injected into the running LLM context instead of being queued behind a per-session lock. Inspired by Claude Code's mid-turn queue drain mechanism (query.ts:1547-1643). Key design decisions: - Messages are injected as natural user messages between iterations, no tool cancellation or special system prompt needed - Two drain checkpoints: after tool execution and after final LLM response ("last-mile" to prevent dropping late arrivals) - Bounded by MAX_INJECTION_CYCLES (5) to prevent consuming the iteration budget on rapid follow-ups - had_injections flag bypasses _sent_in_turn suppression so follow-up responses are always delivered Closes #1609 * fix(agent): harden mid-turn injection with streaming fix, bounded queue, and message safety - Fix streaming protocol violation: Checkpoint 2 now checks for injections BEFORE calling on_stream_end, passing resuming=True when injections found so streaming channels (Feishu) don't prematurely finalize the card - Bound pending queue to maxsize=20 with QueueFull handling - Add warning log when injection batch exceeds _MAX_INJECTIONS_PER_TURN - Re-publish leftover queue messages to bus in _dispatch finally block to prevent silent message loss on early exit (max_iterations, tool_error, cancel) - Fix PEP 8 blank line before dataclass and logger.info indentation - Add 12 new tests covering drain, checkpoints, cycle cap, queue routing, cleanup, and leftover re-publish	2026-04-11 21:43:23 +08:00
Jiajun Xie	f5640d69fe	fix(feishu): improve voice message download with detailed logging - Add explicit error logging for missing file_key and message_id - Add logging for download failures - Change audio extension from .opus to .ogg for better Whisper compatibility - Feishu voice messages are opus in OGG container; .ogg is more widely recognized	2026-04-11 20:48:35 +08:00
Xubin Ren	e0b9edf985	Merge PR #3017: feat(tool): improve file editing and add notebook tool feat(tool): improve file editing and add notebook tool	2026-04-11 18:02:25 +08:00
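
Note on the retry-amplification fixes (7e91aecd7d, fa98524944): the arithmetic quoted in those messages (3 retries × 2 fallback attempts = 6 calls per message instead of 3) can be reproduced with a small sketch. This is not nanobot's code. BadRequest and NetworkError below are local stand-ins rather than the Telegram library's classes, and send_with_retry, send_text_broad, send_text_narrow, and count_calls are hypothetical names chosen for illustration only.

```python
class BadRequest(Exception):
    """Stand-in for a formatting/parse error (hypothetical, not a library class)."""


class NetworkError(Exception):
    """Stand-in for a transport error (hypothetical, not a library class)."""


def send_text_broad(send, text):
    # Old pattern: any failure, including a network error, triggers the
    # plain-text fallback, so each retry attempt costs two calls.
    try:
        send(text, html=True)
    except Exception:
        send(text, html=False)


def send_text_narrow(send, text):
    # Fixed pattern: only a formatting error falls back to plain text;
    # network errors propagate straight to the retry layer.
    try:
        send(text, html=True)
    except BadRequest:
        send(text, html=False)


def send_with_retry(send_text, send, text, attempts=3):
    # Hypothetical retry layer: retries only on network errors.
    for _ in range(attempts):
        try:
            send_text(send, text)
            return True
        except NetworkError:
            continue
    return False


def count_calls(send_text):
    # Count how many times the underlying send is hit while the network is down.
    calls = 0

    def failing_send(text, html):
        nonlocal calls
        calls += 1
        raise NetworkError("pool exhausted")

    send_with_retry(send_text, failing_send, "hi")
    return calls


if __name__ == "__main__":
    print("broad catch:", count_calls(send_text_broad))
    print("narrow catch:", count_calls(send_text_narrow))
```

With the broad handler every attempt makes one HTML call and one plain-text call before the retry layer sees the failure; with the narrow handler each attempt makes a single call.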


1898 Commits