nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-20 16:42:25 +00:00

Author	SHA1	Message	Date
chengyongru	5f5f3d5d97	fix(long-task): address code review findings - Declare _scopes = {"core"} explicitly to prevent recursive nesting in subagent scope - Document fragile coupling in _extract_file_changes: path extraction depends on write_file/edit_file detail format; add debug log for unexpected formats - Align final-template threshold (max_steps - 2) with budget switch threshold - Eliminate hasattr(self, "_state") in _reset_state by initializing in __init__	2026-05-14 00:14:11 +08:00
chengyongru	5acae58a13	test(long-task): add boundary tests and fix race conditions - Add 7 edge-case tests: validation crash resilience, hook exception safety, mid-run correction injection, FIFO correction ordering, explicit file changes overriding auto-detection, final budget for max_steps=1, and dynamic budget switching boundaries - Fix assertion in test_long_task_completes_after_multiple_handoffs to match exact prompt format - Remove asyncio timing hack from test_state_exposure - Add asyncio.sleep(0) yield in test_inject_correction_during_execution to prevent race between signal injection and step continuation - All 34 tests passing	2026-05-13 01:26:01 +08:00
chengyongru	78ecb2a99a	feat(long-task): major overhaul with structured handoffs, validation, and observability - Structured HandoffState: HandoffTool now accepts files_created, files_modified, next_step_hint, and verification fields instead of a plain string. Progress is passed between steps as structured data. - Completion validation round: After complete() is called, a dedicated validator step runs to verify the claim against the original goal. If validation fails, the task continues rather than returning a false completion. - Dynamic prompt system: 3 Jinja2 templates (step_start, step_middle, step_final) selected based on step number. Final steps get tighter budget and stronger "wrap up" guidance. - Automatic file change tracking: Extracts write_file/edit_file events from tool_events and injects them into the next step's context if the subagent forgot to report them explicitly. - Budget tracking & adaptive strategy: Cumulative token usage is tracked across steps. Per-step tool budget drops from 8 to 4 in the last two steps to force handoff/completion. - Crash retry with graceful degradation: A step that crashes is retried once. Persistent crashes terminate the task and return partial progress. - Full observability hooks for future WebUI integration: - set_hooks() with on_step_start, on_step_complete, on_handoff, on_validation_started, on_validation_passed, on_validation_failed, on_task_complete, on_task_error, and catch-all on_event. - Readable state properties: current_step, total_steps, status, last_handoff, cumulative_usage, goal. - inject_correction() allows external code to send user corrections that are injected into the next step's prompt. - run_step() accepts optional max_iterations for dynamic budget control. All 27 long-task tests and 11 subagent tests pass.	2026-05-13 00:55:52 +08:00
chengyongru	e7214d96ed	fix(long-task): add debug logging for step-level observability	2026-05-12 23:37:00 +08:00
chengyongru	bf5762a3d4	feat(long-task): add LongTaskTool for multi-step agent tasks Implements a meta-ReAct loop where long-running tasks are broken into sequential subagent steps, each starting fresh with the original goal and progress from the previous step. This prevents context drift when agents work on complex, multi-step tasks. - Extract build_tool_registry() from SubagentManager for reuse - Add run_step() for synchronous subagent execution (no bus announcement) - Add HandoffTool and CompleteTool as signal mechanisms via shared dict - Add LongTaskTool orchestrator with simplified prompt (8 iterations/step) - Register LongTaskTool in main agent loop - Add _extract_handoff_from_messages fallback for robustness	2026-05-12 23:37:00 +08:00
chengyongru	ef268f47d2	chore: remove dead code identified by vulture + coverage cross-validation Remove unused code confirmed dead via vulture scan, grep verification, and coverage analysis: - _get_bridge_dir (cli/commands.py): 82-line function with zero callers - add_assistant_message (agent/context.py): method body never executed, also removed now-unused build_assistant_message import - _tool_parameters_schema (agent/tools/base.py): redundant copy of schema already exposed via the `parameters` property - MSTEAMS_REF_TTL_S (channels/msteams.py): unused constant (production uses config.ref_ttl_days directly); inlined in test - MESSAGE_TYPE_USER (channels/weixin.py): unused constant	2026-05-12 20:52:48 +08:00
Xubin Ren	35f64cd828	docs(config): document model presets Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	079b37aac5	test(config): cover legacy model defaults without presets Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	13eede5803	refactor(agent): inject runtime model publisher Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	6554c1f832	refactor(agent): move preset helpers out of loop Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	e6103d9312	fix(agent): separate preset snapshots from config reload Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	8fcb24bb7c	refactor(agent): trim model preset runtime wiring Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	70b8daaee6	fix(command): show default as current model preset Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	c9b84c7b11	fix(config): reserve implicit default model preset Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	1d14c2ba40	fix(config): accept modelPresets root alias Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	bcc4b97183	fix(webui): broadcast runtime model updates Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	c92345bbb1	fix(webui): sync model badge after preset switch Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	b61c6304c3	fix(config): reconcile presets with settings reload Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
Xubin Ren	c450d6fd3f	fix(config): make model preset switching atomic Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 20:06:22 +08:00
chengyongru	6f78267c82	feat(config): add ModelPresetConfig and runtime preset switching - Add `ModelPresetConfig` schema for named model presets - Add `model_presets` dict to `Config` and `model_preset` field to `AgentDefaults` - Add `resolve_preset()` to return effective model params from preset or defaults - Add `@model_validator` to reject unknown preset names - Update `_match_provider()` to use resolved preset model/provider - Update `make_provider()` and `provider_signature()` to use `resolve_preset()` - Add `model_preset` property to `AgentLoop` for atomic runtime switching - Update `AgentLoop.from_config()` to inject a runtime `default` preset - Wire self-tool to inspect/clear preset state - Update CLI display strings to show active preset	2026-05-12 20:06:22 +08:00
Xubin Ren	1175420339	test(feishu): cover topic isolation alias Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 11:51:25 +08:00
yorkhellen	a32be99ddc	test(feishu): add config and helper tests for topic_isolation	2026-05-12 11:51:25 +08:00
yorkhellen	03b357b12d	feat(feishu): add topic_isolation config switch	2026-05-12 11:51:25 +08:00
Xubin Ren	fd6887c274	test(providers): cover VolcEngine token parameter Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 11:35:52 +08:00
Albert Wang	dd4def25fa	fix(providers): set supports_max_completion_tokens for VolcEngine providers VolcEngine's OpenAI-compatible gateway rejects requests when both max_tokens and max_completion_tokens are present (the latter added by openai-python SDK v2.x serialization). Set the flag so nanobot sends max_completion_tokens instead of max_tokens for volcengine, volcengine_coding_plan, and by extension byteplus variants. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-12 11:35:52 +08:00
Xubin Ren	23312d683e	fix(tools): isolate plugin runtime state Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-12 11:28:20 +08:00
chengyongru	043f0e67f7	feat(tools): introduce plugin-based tool discovery and runtime context protocol This commit implements a progressive refactoring of the tool system to support plugin discovery, scoped loading, and protocol-driven runtime context injection. Key changes: - Add Tool ABC metadata (tool_name, _scopes) and ToolContext dataclass for dependency injection. - Introduce ToolLoader with pkgutil-based builtin discovery and entry_points-based third-party plugin loading. - Add scope filtering (core/subagent/memory) so different contexts load appropriate tool sets. - Introduce ContextAware protocol and RequestContext dataclass to replace hardcoded per-tool context injection in AgentLoop. - Add RuntimeState / MutableRuntimeState protocols to decouple MyTool from AgentLoop. - Migrate all built-in tools to declare scopes and implement create()/enabled() hooks. - Migrate MessageTool, SpawnTool, CronTool, and MyTool to ContextAware. - Refactor AgentLoop to use ToolLoader and protocol-driven context injection. - Refactor SubagentManager to use ToolLoader(scope="subagent") with per-run FileStates isolation. - Register all built-in tools via pyproject.toml entry_points. - Add comprehensive tests for loader scopes, entry_points, ContextAware, subagent tools, and runtime state sync.	2026-05-12 11:28:20 +08:00
04cb	bd0ba745dd	fix(wecom): preserve real filename from SDK when payload omits name (#3737)	2026-05-12 10:27:32 +08:00
Xubin Ren	6d07aa6059	test(webui): cover randomUUID entry shim fallback Add a focused regression test for the non-secure-context WebUI entry shim so missing crypto.randomUUID no longer depends on manual verification. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-11 15:39:05 +08:00
NearlCrews	5ea2c37325	fix(webui): shim crypto.randomUUID for non-secure contexts `crypto.randomUUID` only exists in secure contexts (HTTPS or localhost). Over LAN HTTP it is undefined, so `ChatPane`'s welcome-message flush and streaming-message handlers crash mid-render with `TypeError`, unmounting the React tree and leaving the user a blank page. Install a Math.random-backed v4-ish fallback at app entry, gated on the feature being missing. This mirrors the shim already used in the test setup and covers all six call sites (`ChatPane.tsx`, `useNanobotStream.ts`) without touching them. These IDs are client-side message keys with no security role, so non-cryptographic randomness is fine. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 15:39:05 +08:00
chengyongru	49f85f5c23	docs(schema,config): clarify reasoning_effort semantics for MiMo thinking mode - Update AgentDefaults.reasoning_effort comment to document "none" (disable) and None (preserve provider default). - Add configuration.md tip explaining MiMo thinking mode behavior.	2026-05-11 14:38:28 +08:00
Alfredo Arenas	c6b7a9524c	fix(providers): wire MiMo to thinking_type to allow disabling reasoning (#3585) The hosted Xiaomi MiMo API accepts {"thinking": {"type": "enabled"|"disabled"}} to toggle reasoning, which is exactly the shape produced by the existing thinking_type style. The xiaomi_mimo ProviderSpec just needed to opt in. Before this fix, setting reasoning_effort="none" had no effect on MiMo because no thinking_style was configured, so the disable signal never reached the server. Default-on models (mimo-v2.5-pro and friends) kept reasoning regardless of user configuration. Source: https://platform.xiaomimimo.com/docs/en-US/api/chat/openai-api Co-authored with Claude Opus 4.7. Strategy and review via Claude Desktop, implementation via Claude Code.	2026-05-11 14:38:28 +08:00
Alfredo Arenas	271b674bf1	feat(cli): pass bot_name/bot_icon from config to StreamRenderer (#3650) Both StreamRenderer instantiations in the agent command (single-message mode and interactive mode) now read bot_name and bot_icon from config.agents.defaults and forward them to the renderer. This is the wiring step that makes the schema fields actually take effect at runtime. With safe defaults of "nanobot" and "🐈", existing users see no change.	2026-05-11 11:50:18 +08:00
Alfredo Arenas	86693f5422	feat(cli): make stream renderer use bot_name and bot_icon (#3650) Threads bot_name/bot_icon through ThinkingSpinner and StreamRenderer with safe defaults that preserve current behavior. - ThinkingSpinner uses bot_name in its status text - StreamRenderer header is "<icon> <name>" when icon is set, or just "<name>" when icon is empty - Removes the now-unused __logo__ import (the cat emoji is the default value of bot_icon, not a hardcoded constant)	2026-05-11 11:50:18 +08:00
Alfredo Arenas	fcf9d110dd	feat(schema): add bot_name and bot_icon to AgentDefaults (#3650) Two new fields with safe defaults that preserve current branding: - bot_name: str = "nanobot" - bot_icon: str = "🐈" Empty string for bot_icon is allowed and lets users opt out of the leading icon. camelCase keys (botName, botIcon) bind via the existing to_camel alias generator.	2026-05-11 11:50:18 +08:00
Alfredo Arenas	dfb013659a	test(cli): add tests for configurable bot identity (#3650) Six tests covering: - AgentDefaults preserves 'nanobot' and the cat icon by default - camelCase config keys (botName/botIcon) bind to the new fields - Empty bot_icon is accepted (opt-out of the leading icon) - ThinkingSpinner uses bot_name in its status text - StreamRenderer header combines icon and name when icon is set - StreamRenderer header is just the name when icon is empty	2026-05-11 11:50:18 +08:00
barreler126	046d0831ef	feat: add NVIDIA NIM provider support	2026-05-11 01:25:44 +08:00
chengyongru	a6e993df25	fix(agent): move archived summary into system prompt for KV cache stability - Append [Archived Context Summary] to system prompt instead of injecting it into the user message runtime context, improving KV cache reuse across turns and avoiding consecutive same-role messages. - _last_summary persists in metadata (no pop) for restart survival; summary is re-injected every turn via the stable system prompt. - Remove dynamic "Inactive for X minutes" from _format_summary — use static last_active timestamp instead to preserve KV cache stability. - Pass session_summary through build_messages() so both normal and ask_user paths receive the archived summary in the system prompt. - estimate_session_prompt_tokens now reads _last_summary from metadata to include the summary in token budget estimation. - Remove obsolete session_summary parameter from maybe_consolidate_by_tokens and estimate_session_prompt_tokens call sites in loop.py (summary flows through build_messages instead). - Ensure /new (session.clear()) clears _last_summary from metadata.	2026-05-11 01:25:15 +08:00
chengyongru	73a8d8a875	fix(utils): remove unreachable dead code in find_legal_message_start The for loop at line 168 never executes because start is assigned i + 1 immediately before slicing messages[start : i + 1], which is always an empty list. Remove the dead code. Fixes #3716	2026-05-09 18:53:13 +08:00
chengyongru	de13e72e15	refactor(loop): log turn completion with state count	2026-05-09 17:15:23 +08:00
chengyongru	728d837e4e	refactor(loop): add turn_id for trace correlation - TurnContext now carries a turn_id (session_key:time_ns) - All state transition debug logs include [turn_id] prefix - RuntimeError messages also include turn_id for observability	2026-05-09 17:15:23 +08:00
chengyongru	5327f5e1a0	refactor(loop): event-driven state transitions + trace logging - State handlers now return event strings ('ok', 'dispatch', 'shortcut') - Driver loop uses _TRANSITIONS lookup table: (state, event) -> next_state - State graph is centralized and visible at a glance - Added StateTraceEntry to record per-state timing and events - Driver loop logs state duration + event at debug level - Exception paths are traced with error field for observability	2026-05-09 17:15:23 +08:00
chengyongru	6ef1b2c842	refactor(loop): address code review nits - Fix _assemble_outbound on_stream type annotation (Callable[[str], Awaitable[None]] | None) - Use last_msg consistently in _state_save instead of re-indexing - Remove dead fallback in _state_respond (guaranteed non-None by _state_save) - Change pending_summary type from Any to str | None - Make session optional in TurnContext to avoid redundant fetch - Add defensive dispatch with RuntimeError for missing handlers	2026-05-09 17:15:23 +08:00
chengyongru	8a6b769219	refactor(loop): fix line length in state handlers	2026-05-09 17:15:23 +08:00
chengyongru	02443ca208	refactor(loop): convert _process_message to functional state machine - Extract TurnState enum and TurnContext dataclass - Extract state handlers: _state_restore, _state_compact, _state_command, _state_build, _state_run, _state_save, _state_respond - Extract _process_system_message for system message short-circuit - Driver loop uses getattr dispatch over explicit state transitions - Preserve all existing behavior (794 tests passing)	2026-05-09 17:15:23 +08:00
chengyongru	9fb9f53147	refactor(loop): add TurnState and TurnContext	2026-05-09 17:15:23 +08:00
chengyongru	88cf8db164	refactor(loop): extract _assemble_outbound	2026-05-09 17:15:23 +08:00
chengyongru	0124c94d19	refactor(loop): extract _build_initial_messages	2026-05-09 17:15:23 +08:00
chengyongru	ce52070fcf	refactor(loop): extract _persist_user_message_early	2026-05-09 17:15:23 +08:00
chengyongru	d2cb8ac17f	refactor(loop): extract _build_retry_wait_callback	2026-05-09 17:15:23 +08:00
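
The event-driven driver loop described in commit 5327f5e1a0 (handlers return event strings, and a `_TRANSITIONS` table maps `(state, event)` to the next state) can be sketched roughly as below. This is an illustration only, not the repository's code: the state names are borrowed from the handler names listed in 02443ca208 (with `_state_compact` omitted for brevity), while the transition edges, the handler bodies, the fields of `TurnContext` and `StateTraceEntry`, and the `run_turn` helper are all assumptions made for the example.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from enum import Enum, auto


class TurnState(Enum):
    RESTORE = auto()
    COMMAND = auto()
    BUILD = auto()
    RUN = auto()
    SAVE = auto()
    RESPOND = auto()
    DONE = auto()


@dataclass
class StateTraceEntry:
    # Hypothetical fields: which state ran, what it returned, how long it took.
    state: str
    event: str
    duration_s: float


@dataclass
class TurnContext:
    # Hypothetical fields; the real TurnContext carries more (session, turn_id, ...).
    text: str
    reply: str | None = None
    trace: list[StateTraceEntry] = field(default_factory=list)


# Each handler mutates the context and returns an event string.
def _state_restore(ctx: TurnContext) -> str:
    return "ok"


def _state_command(ctx: TurnContext) -> str:
    if ctx.text.startswith("/"):
        ctx.reply = f"handled command {ctx.text}"
        return "shortcut"  # skip the model run entirely
    return "dispatch"


def _state_build(ctx: TurnContext) -> str:
    return "ok"


def _state_run(ctx: TurnContext) -> str:
    ctx.reply = f"echo: {ctx.text}"  # stand-in for the model call
    return "ok"


def _state_save(ctx: TurnContext) -> str:
    return "ok"


def _state_respond(ctx: TurnContext) -> str:
    return "ok"


_HANDLERS = {
    TurnState.RESTORE: _state_restore,
    TurnState.COMMAND: _state_command,
    TurnState.BUILD: _state_build,
    TurnState.RUN: _state_run,
    TurnState.SAVE: _state_save,
    TurnState.RESPOND: _state_respond,
}

# The whole state graph in one place: (state, event) -> next state.
_TRANSITIONS = {
    (TurnState.RESTORE, "ok"): TurnState.COMMAND,
    (TurnState.COMMAND, "shortcut"): TurnState.RESPOND,
    (TurnState.COMMAND, "dispatch"): TurnState.BUILD,
    (TurnState.BUILD, "ok"): TurnState.RUN,
    (TurnState.RUN, "ok"): TurnState.SAVE,
    (TurnState.SAVE, "ok"): TurnState.RESPOND,
    (TurnState.RESPOND, "ok"): TurnState.DONE,
}


def run_turn(text: str) -> TurnContext:
    """Drive one turn through the table, recording a trace entry per state."""
    ctx = TurnContext(text=text)
    state = TurnState.RESTORE
    while state is not TurnState.DONE:
        handler = _HANDLERS.get(state)
        if handler is None:
            raise RuntimeError(f"no handler for state {state.name}")
        started = time.perf_counter()
        event = handler(ctx)
        elapsed = time.perf_counter() - started
        ctx.trace.append(StateTraceEntry(state.name, event, elapsed))
        next_state = _TRANSITIONS.get((state, event))
        if next_state is None:
            raise RuntimeError(f"no transition for ({state.name}, {event!r})")
        state = next_state
    return ctx
```

Calling `run_turn("hello")` walks every state in order, whereas `run_turn("/new")` takes the `shortcut` edge from `COMMAND` straight to `RESPOND`; in both cases `ctx.trace` records the path taken, which is the property the commit message highlights.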


2410 Commits