cmd_restart only persisted channel + chat_id across the os.execv boundary, so
when the new process announced "Restart completed" the OutboundMessage had
no Slack thread_ts and the reply fell back to the channel root.
Serialize msg.metadata into NANOBOT_RESTART_NOTIFY_METADATA, restore it on the
RestartNotice, and forward it to OutboundMessage so the completion message
follows the same routing as the original /restart invocation.
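A minimal sketch of the round trip (helper names here are illustrative; the
env var, RestartNotice, and OutboundMessage are from this change):

```python
import json
import os

def stash_restart_metadata(msg) -> None:
    # Before os.execv: persist the full metadata dict, not just
    # channel/chat_id, so Slack's thread_ts survives the process boundary.
    os.environ["NANOBOT_RESTART_NOTIFY_METADATA"] = json.dumps(msg.metadata or {})

def restore_restart_metadata() -> dict:
    # In the new process: rebuild metadata for the RestartNotice so the
    # "Restart completed" OutboundMessage routes like the original /restart.
    raw = os.environ.pop("NANOBOT_RESTART_NOTIFY_METADATA", "")
    return json.loads(raw) if raw else {}
```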
Made-with: Cursor
The old prompt framed cron firing as a "task triggered" status report,
which led the agent to reply with things like "Done ✅ 已提醒
U0AV8BJPV8D 喝水" ("Done ✅ reminded U0AV8BJPV8D to drink water") —
exposing the user ID and reading like a system log
instead of a friendly reminder. Reword it to instruct the agent to
speak directly to the user and forbid status-style language.
Made-with: Cursor
Without writing thread_ts and the original session_key into jobs.json,
cron jobs created in a Slack thread lost them after a service reload,
so reminders fired into the channel root.
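A sketch of the persisted record (field layout is assumed; thread_ts and
session_key are the fields this change writes):

```python
import json

def save_jobs(path: str, jobs: list[dict]) -> None:
    # Persist delivery routing alongside the schedule so a reload can
    # still deliver reminders into the originating Slack thread.
    records = [
        {
            "id": job["id"],
            "schedule": job["schedule"],
            "message": job["message"],
            "thread_ts": job.get("thread_ts"),      # Slack thread anchor
            "session_key": job.get("session_key"),  # original conversation
        }
        for job in jobs
    ]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```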
Made-with: Cursor
MCP resource/prompt/tool names containing spaces or special characters
(e.g. "PostgreSQL System Information") were forwarded verbatim to model
provider APIs, causing validation errors from both Anthropic and OpenAI
which require names matching ^[a-zA-Z0-9_-]{1,128}$.
Add _sanitize_name(), which replaces invalid characters with underscores
and collapses consecutive underscores. It is applied in the MCPToolWrapper,
MCPResourceWrapper, and MCPPromptWrapper constructors and in the
enabled_tools filtering logic.
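A sketch of the sanitizer (the 128-character truncation is an assumption
drawn from the providers' pattern, not stated above):

```python
import re

def _sanitize_name(name: str) -> str:
    # "PostgreSQL System Information" -> "PostgreSQL_System_Information"
    sanitized = re.sub(r"[^a-zA-Z0-9_-]", "_", name)  # replace invalid chars
    sanitized = re.sub(r"_+", "_", sanitized)         # collapse runs of "_"
    return sanitized[:128]  # assumed cap to satisfy ^[a-zA-Z0-9_-]{1,128}$
```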
Closes #3468
Capture Slack thread metadata for cron and message-tool deliveries so replies stay in the originating thread, and hydrate first thread mentions with recent Slack context.
Made-with: Cursor
Include persisted turn timestamps when assembling LLM prompts so relative-date references like yesterday and today have concrete anchors.
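A minimal sketch of the idea, with assumed turn fields:

```python
from datetime import datetime, timezone

def render_turn(turn: dict) -> str:
    # Prefix each persisted turn with its stored timestamp so the model can
    # resolve "yesterday" and "today" against concrete dates.
    ts = datetime.fromtimestamp(turn["timestamp"], tz=timezone.utc)
    return f"[{ts:%Y-%m-%d %H:%M} UTC] {turn['role']}: {turn['content']}"
```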
Made-with: Cursor
The non-streaming parse path unconditionally promoted the `reasoning`
response field to `content` when content was empty. This was intended
for StepFun (whose API returns the actual answer in `reasoning`), but
it applied to every OpenAI-compatible provider — causing internal
thinking chains from models like Xiaomi MIMO to be leaked as formal
replies.
Add `reasoning_as_content: bool` to ProviderSpec (default False) and
set it only for StepFun. The fallback now requires this flag rather
than running globally.
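A sketch of the gate (the flag name is from this change; the surrounding
parse helper is illustrative):

```python
from dataclasses import dataclass

@dataclass
class ProviderSpec:
    name: str
    reasoning_as_content: bool = False  # opt-in; set only for StepFun

def extract_content(spec: ProviderSpec, message: dict) -> str:
    content = message.get("content") or ""
    # The fallback now requires the provider flag instead of running
    # globally, so other models' thinking chains are never promoted.
    if not content and spec.reasoning_as_content:
        content = message.get("reasoning") or ""
    return content
```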
Fixes #3443
Only mark message-tool deliveries for channel-session recording while cron jobs are running, avoiding duplicate session writes during normal user turns.
Made-with: Cursor
Route heartbeat, cron, and message-tool deliveries through one gateway helper so user-visible proactive messages are available as conversational context when the user replies in the channel.
Made-with: Cursor
When heartbeat delivers output to a channel (e.g. Telegram), the message
is a raw OutboundMessage that bypasses the channel's session. If the user
replies, their reply enters a different session with no context about the
heartbeat message, so the agent cannot follow through.
This change injects the delivered heartbeat message as an assistant turn
into the target channel's session before publishing the outbound. When
the user replies, the channel session has conversational context.
Handles unified_session mode by resolving to UNIFIED_SESSION_KEY when
enabled, matching the agent loop's own session routing.
No changes to agent/loop.py, session/manager.py, channels, providers,
or config schema — uses existing add_message() and save() APIs.
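A minimal sketch of the injection, assuming session-manager call shapes
around the existing add_message()/save() APIs (UNIFIED_SESSION_KEY is from
this change; the constant below is a placeholder):

```python
UNIFIED_SESSION_KEY = "unified"  # placeholder; the real constant lives in the session module

def inject_heartbeat_turn(sessions, channel: str, chat_id: str,
                          text: str, unified_session: bool) -> None:
    # Resolve to the same key the agent loop would use for a user reply.
    key = UNIFIED_SESSION_KEY if unified_session else f"{channel}:{chat_id}"
    session = sessions.get(key)
    # Record the delivered heartbeat as an assistant turn before publishing
    # the outbound, so the user's reply lands in a primed session.
    session.add_message(role="assistant", content=text)
    session.save()
```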
Parse the endpoint host before disabling keepalive so public hostnames that merely contain private-network substrings keep the default connection pool behavior.
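A sketch of the host-based check (function name assumed; ipaddress covers
the private ranges this codebase matches):

```python
import ipaddress
from urllib.parse import urlparse

def is_private_endpoint(api_base: str) -> bool:
    # Decide on the parsed hostname, not the raw URL string, so a public
    # name that merely contains a private-looking substring (for example
    # "10." inside "v10.models.example") keeps the default pool.
    host = urlparse(api_base).hostname or ""
    if host in ("localhost", "host.docker.internal"):
        return True
    try:
        # Covers 127.x, 10.x, 192.168.x, 172.16-31.x, and ::1.
        return ipaddress.ip_address(host).is_private
    except ValueError:
        return False  # ordinary DNS name
```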
Made-with: Cursor
Local model servers (Ollama, llama.cpp, vLLM) often close idle HTTP
connections before the client-side keepalive timer expires. When two
LLM calls happen seconds apart — for example the heartbeat _decide()
phase followed immediately by process_direct() — the second call grabs
a now-dead pooled connection, causing a transient APIConnectionError
on every first attempt.
The fix detects local endpoints via:
- ProviderSpec.is_local (Ollama, LM Studio, vLLM, OVMS)
- Private-network URL patterns (localhost, 127.x, 192.168.x, 10.x,
172.16-31.x, host.docker.internal, [::1])
For these endpoints, the AsyncOpenAI client is created with a custom
httpx.AsyncClient that sets keepalive_expiry=0, forcing a fresh TCP
connection for each request. This is cheap on LAN (sub-5ms connect)
and eliminates the stale-connection retry tax entirely.
Cloud providers (OpenAI, Anthropic, OpenRouter, etc.) keep the default
5-second keepalive, which is fine for high-frequency API usage.
The private-network heuristic also covers the common case where users
configure provider='openai' but point apiBase at a LAN IP running
llama.cpp — the spec says is_local=False, but the URL clearly is.
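A sketch of the client construction (timeout handling omitted;
keepalive_expiry=0 and the http_client override are the mechanism described
above, and is_private_endpoint is the host check sketched earlier):

```python
import httpx
from openai import AsyncOpenAI

def make_llm_client(api_base: str, api_key: str, is_local: bool) -> AsyncOpenAI:
    if not (is_local or is_private_endpoint(api_base)):
        # Cloud providers keep httpx's default ~5s keepalive.
        return AsyncOpenAI(base_url=api_base, api_key=api_key)
    # Local servers drop idle sockets early; expiring pooled connections
    # immediately forces a fresh TCP connect (sub-5ms on LAN) per request.
    http_client = httpx.AsyncClient(limits=httpx.Limits(keepalive_expiry=0))
    return AsyncOpenAI(base_url=api_base, api_key=api_key, http_client=http_client)
```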
Align with deer-flow: group top-level messages (no root_id) now get
their own session keyed by message_id instead of sharing a single
group-wide session. Topic replies continue to share a session via
root_id.
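A sketch of the resulting key routing (function shape assumed; the
feishu:{chat_id}:{root_id} format appears in the thread-session change
below):

```python
def session_key_override(chat_id: str, message_id: str,
                         root_id: str | None, is_group: bool) -> str | None:
    if not is_group:
        return None  # private chats keep the channel default key
    if root_id and root_id != message_id:
        return f"feishu:{chat_id}:{root_id}"   # topic replies share a session
    return f"feishu:{chat_id}:{message_id}"    # fresh session per top-level message
```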
The stream-end reaction cleanup now reads from _reaction_ids instead
of metadata, so pre-populate the dict in the test instead of passing
reaction_id via metadata.
Align reply targeting with deer-flow: always reply to the inbound
message_id (not root_id). The Feishu Reply API keeps responses in
the same topic automatically when the target message is inside a topic.
Also fix run_in_executor calls that passed reply_in_thread as a
positional arg to a keyword-only parameter, and route standalone
tool hints through the reply API for group chats.
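A sketch of the executor fix (client.reply is a stand-in for the Feishu SDK
call):

```python
import asyncio
import functools

async def send_reply(client, message_id: str, content: str) -> None:
    loop = asyncio.get_running_loop()
    # run_in_executor forwards only positional args; bind the keyword-only
    # reply_in_thread via functools.partial instead of passing it positionally.
    await loop.run_in_executor(
        None,
        functools.partial(client.reply, message_id, content, reply_in_thread=True),
    )
```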
When reply_to_message config is enabled, the bot's first reply now
uses reply_in_thread=True to create a visual topic/thread in the
Feishu client. Subsequent chunks fall back to regular create.
The reply_to_message default remains False for backward compatibility.
Failed replies still fall back to regular send — messages are never
silently dropped.
Thread replies (messages with root_id != message_id) in group chats
now get their own session key: feishu:{chat_id}:{root_id}. This
means each Feishu thread has an independent conversation context.
Top-level group messages and all private chat messages keep the
default session key (no override), consistent with Telegram and
Slack channel behavior.
Co-authored-by: shenchengtsi <228445050+shenchengtsi@users.noreply.github.com>
Add signed media URLs to live WebSocket replies and teach the WebUI to classify and render video attachments, so bot-sent videos can play inline in both live chats and session history.
Made-with: Cursor
Telegram previously sent all video files as documents via send_document,
so users saw a file icon instead of an inline player. WebSocket only
accepted image MIME types, rejecting video uploads entirely.
Telegram:
- Recognize video extensions (mp4/mov/avi/mkv/webm/3gp) in _get_media_type
- Route videos through send_video with supports_streaming=True
- Add VIDEO/VIDEO_NOTE/ANIMATION to inbound message filters
- Add video MIME mappings to _get_extension
- Fix: local file sends now use _call_with_retry (previously no retry)
WebSocket:
- Expand upload MIME whitelist with video/mp4, video/webm, video/quicktime
- Add per-type size limits (_MAX_VIDEO_BYTES=20MB, _MAX_VIDEOS_PER_MESSAGE=1)
- Expand media serving endpoint to serve video with correct Content-Type
Agent:
- Add "video" to message tool media parameter description
- Add .mp4 example to identity.md system prompt
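A sketch of the Telegram routing (send-method signatures follow
python-telegram-bot; _get_media_type and _call_with_retry are names from
this change, with the retry helper simplified here):

```python
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".3gp"}

def _get_media_type(path: str) -> str:
    suffix = "." + path.rsplit(".", 1)[-1].lower()
    return "video" if suffix in VIDEO_EXTENSIONS else "document"

async def _call_with_retry(fn, *args, retries: int = 3, **kwargs):
    # Simplified stand-in for the channel's retry helper.
    for attempt in range(retries):
        try:
            return await fn(*args, **kwargs)
        except Exception:
            if attempt == retries - 1:
                raise

async def send_media(bot, chat_id: int, path: str) -> None:
    with open(path, "rb") as fh:
        if _get_media_type(path) == "video":
            # send_video gives an inline player; supports_streaming lets
            # playback start before the download finishes.
            await _call_with_retry(bot.send_video, chat_id=chat_id,
                                   video=fh, supports_streaming=True)
        else:
            await _call_with_retry(bot.send_document, chat_id=chat_id,
                                   document=fh)
```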
Made-with: Cursor
The existing test only verified the adaptive path. Add two more cases:
- enabled thinking (high): temperature must also be omitted
- no thinking (None): temperature must still be omitted
Made-with: Cursor
Two issues with DeepSeek V4 thinking mode support:
1. Missing thinking parameter injection.
DeepSeek V4 requires `extra_body: {"thinking": {"type": "enabled/disabled"}}`
— identical to VolcEngine/BytePlus. The code had this for volcengine,
byteplus, dashscope, minimax, and kimi but not DeepSeek. This means
`reasoning_effort=minimal` (thinking off) silently has no effect.
Root cause: the thinking-style→wire-format mapping was an if/elif chain
on provider *names*. DeepSeek was forgotten.
Fix: make the mapping declarative via `ProviderSpec.thinking_style`:
- "thinking_type" → {"thinking": {"type": "..."}} (DeepSeek, Volc, BytePlus)
- "enable_thinking" → {"enable_thinking": bool} (DashScope)
- "reasoning_split" → {"reasoning_split": bool} (MiniMax)
`_build_kwargs` now does a single dict lookup. Adding a new provider
with an existing wire format requires zero changes to the function.
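A sketch of the declarative mapping (helper name assumed; the three wire
formats are the ones listed above):

```python
from dataclasses import dataclass

@dataclass
class ProviderSpec:
    name: str
    thinking_style: str | None = None  # "thinking_type" | "enable_thinking" | "reasoning_split"

def thinking_extra_body(spec: ProviderSpec, enabled: bool) -> dict:
    # One dict lookup replaces the if/elif chain over provider names; a new
    # provider with an existing wire format needs only a spec entry.
    wire = {
        "thinking_type": {"thinking": {"type": "enabled" if enabled else "disabled"}},
        "enable_thinking": {"enable_thinking": enabled},
        "reasoning_split": {"reasoning_split": enabled},
    }
    return wire.get(spec.thinking_style, {})
```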
2. Legacy session messages crash thinking-mode requests.
When a session was started without thinking mode (or with a different
model), assistant messages lack reasoning_content. DeepSeek V4 in
thinking mode rejects these with 400:
"The reasoning_content in the thinking mode must be passed back to the API."
This affects ALL assistant messages, not just those with tool_calls
(despite the docs only mentioning the tool_calls case).
Fix: `_build_kwargs` backfills `reasoning_content: ""` on every
assistant message missing it, but only when thinking mode is active.
This is semantically neutral — the model treats empty reasoning_content
as "no thinking happened on that turn". The backfill only touches the
in-memory request copy; session files on disk are untouched.
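A sketch of the backfill (helper name assumed):

```python
def backfill_reasoning(messages: list[dict], thinking_enabled: bool) -> list[dict]:
    # Patch only the in-memory request copy; session files stay untouched.
    if not thinking_enabled:
        return messages
    patched = []
    for msg in messages:
        if msg.get("role") == "assistant" and "reasoning_content" not in msg:
            # Empty string reads as "no thinking happened on that turn",
            # which DeepSeek V4 accepts instead of returning a 400.
            msg = {**msg, "reasoning_content": ""}
        patched.append(msg)
    return patched
```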
Tests: +5 (3 thinking toggle, 2 backfill). Full suite: 2377 passed.
Made-with: Cursor
#3412 stopped the headline raw_archive bloat but left four adjacent leaks
on the same pollution chain:
- archive() success path appended uncapped LLM summaries to history.jsonl,
so a misbehaving LLM could re-open the #3412 bug from the happy path.
- maybe_consolidate_by_tokens did not advance last_consolidated when
archive() fell back to raw_archive, causing duplicate [RAW] dumps of
the same chunk on every subsequent call.
- Dream's Phase 1/2 prompt injected MEMORY.md / SOUL.md / USER.md and
each history entry without caps, so any legacy oversized record (or an
unbounded user edit) would blow past the context window every dream.
- append_history itself had no default cap, leaving any future caller
one forgotten cap away from the same vector.
Changes:
- Cap LLM-produced summaries at 8K chars (_ARCHIVE_SUMMARY_MAX_CHARS)
before writing to history.jsonl.
- Advance session.last_consolidated after archive() regardless of whether
it summarized or raw-archived — both outcomes materialize the chunk;
still break the round loop on fallback so a degraded LLM isn't hammered.
- Truncate MEMORY.md / SOUL.md / USER.md and each history entry in Dream's
Phase 1 prompt preview (Phase 2 still reaches full files via read_file).
- Add _HISTORY_ENTRY_HARD_CAP (64K) as a belt-and-suspenders default in
append_history with a once-per-store warning, so any new caller that
forgets its own tighter cap is caught and made observable.
Layer the caps by scope: raw_archive=16K, archive summary=8K,
append_history default=64K. Tight per-caller values cover expected
payloads; the wide default only catches regressions.
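A sketch of the default cap (the write path and warning flag are assumed;
the constants and once-per-store behavior are from this change):

```python
import logging

logger = logging.getLogger(__name__)

_ARCHIVE_SUMMARY_MAX_CHARS = 8_000   # tight: expected LLM summary size
_RAW_ARCHIVE_MAX_CHARS = 16_000      # tight: raw fallback dumps
_HISTORY_ENTRY_HARD_CAP = 64_000     # wide: regression catcher only

def append_history(store, entry: str, max_chars: int = _HISTORY_ENTRY_HARD_CAP) -> None:
    if len(entry) > max_chars:
        # Warn once per store: some caller forgot its own tighter cap.
        if not getattr(store, "_cap_warned", False):
            logger.warning("history entry truncated from %d chars", len(entry))
            store._cap_warned = True
        entry = entry[:max_chars]
    store.write_line(entry)
```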
Tests: +9 regression tests covering each fix. Full suite: 2372 passed.
Made-with: Cursor
Cover two untested boundaries from #3412:
- _truncate_to_token_budget with positive budget exercises tiktoken
- _MAX_HISTORY_CHARS caps Recent History section in system prompt
Made-with: Cursor
Root cause: when the consolidation LLM failed, raw_archive() dumped full message
content (~1MB) into history.jsonl with no size limit. Since build_system_prompt()
injects history.jsonl into every system prompt, all subsequent LLM calls exceeded
the 200K context window with error 1261.
Additionally, _cap_consolidation_boundary's 60-message cap caused consolidation
to get stuck on sessions with long tool chains (200+ iterations), triggering
the raw_archive fallback in the first place.
Three-layer fix:
- Remove _cap_consolidation_boundary: let pick_consolidation_boundary drive
chunk sizing based solely on token budget
- Truncate archive() input: use tiktoken to cap formatted text to the model's
input token budget before sending to consolidation LLM
- Truncate raw_archive() output: cap history.jsonl entries at 16K chars
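A sketch of the input-budget truncation (the function name matches
_truncate_to_token_budget referenced in the test commit above; the model
argument and encoding fallback are assumptions):

```python
import tiktoken

def _truncate_to_token_budget(text: str, budget: int, model: str = "gpt-4o") -> str:
    # Cap the formatted chunk to the consolidation model's input budget
    # before the LLM call, so long tool chains cannot blow the context window.
    if budget <= 0:
        return ""
    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        enc = tiktoken.get_encoding("cl100k_base")  # unknown model name
    tokens = enc.encode(text)
    return text if len(tokens) <= budget else enc.decode(tokens[:budget])
```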