nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-07 02:05:51 +00:00

Author	SHA1	Message	Date
Xubin Ren	614b21368f	fix(agent): tighten safety guard edge cases
Keep the /dev workspace guard exception scoped to the known benign device paths already handled by ExecTool, and add coverage that non-benign /dev targets still get blocked.
Also add a streaming regression for tool_error responses so fatal tool failures are delivered by channels instead of being marked as already streamed.
Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-04 01:25:52 +08:00
chengyongru	d3689d143c	fix(agent): prevent safety guard false positives and streamed message drop
Three independent fixes for issues exposed by PR #3493:
1. shell.py: allow /dev/* paths in workspace guard
   Commands like `rm file.txt 2>/dev/null` were blocked because _extract_absolute_paths captured /dev/null as a path outside the workspace. Allow /dev the same way media_path is already allowed.
2. shell.py: remove \| from home_paths regex prefix
   Loki query operator `\|~` was misinterpreted as pipe + home directory, causing false workspace violation errors.
3. loop.py: change _streamed from blacklist to whitelist
   stop_reason "tool_error" was not in the exclusion set {"ask_user", "error"}, so _streamed=True was set on fatal errors. The channel manager then skipped channel.send() because it assumed the content was already streamed, but it never was. Whitelist to only {"stop", "end_turn", "max_tokens"}.
Also fixes a pre-existing Windows bug in _spawn where create_subprocess_exec + list2cmdline breaks commands with paths containing spaces (e.g. D:\Program Files\python.exe).
Closes: #3599, #3605	2026-05-04 01:25:52 +08:00
Xubin Ren	b8406be215	fix(runner): soft workspace boundary + per-target throttle (#3493 #3599 #3605)
Replaces PR #3493's blanket fatal abort with a "tell the model + throttle the bypass loop" policy. Workspace-bound rejections are now ordinary recoverable tool errors enriched with a structured "this is a hard policy boundary" instruction; SSRF stays the only marker that aborts the turn.
Why the fatal-abort approach broke
----------------------------------
PR #3493 promoted every shell `_guard_command` and filesystem path-resolution rejection to a turn-fatal RuntimeError. Two of those messages (`path outside working dir` and `path traversal detected`) are heuristic substring scans on the raw command, so legitimate commands like `rm <ws>/x.txt 2>/dev/null` or `find . -type f` killed the user's turn (#3599). On channels with outbound dedupe (Telegram) the user just saw silence (#3605), and the noise polluted the LLM's context until it started hallucinating guard rejections on plain relative paths (#3597).
Why we still need some throttle
-------------------------------
The original #3493 pain point was real: the LLM, refused once, would swap tools and try again -- read_file -> exec cat -> exec cp -> bash -c -> ln -sf -> python -c open(...). Just removing the fatal escape lets that loop run wild until max_iterations.
What this commit does
---------------------
- `nanobot/utils/runtime.py`: add `workspace_violation_signature` and `repeated_workspace_violation_error`. The signature normalizes filesystem `path` arguments and the first absolute path inside an exec command, so swapping tools against the same outside target hits the same throttle bucket. Two soft attempts are allowed; the third attempt's tool result is replaced with a hard "stop trying to bypass" message that quotes the target path and tells the model to ask the user for help.
- `nanobot/agent/runner.py`: split classification into `_is_ssrf_violation` (still fatal) and `_is_workspace_violation` (now soft). All three failure branches in `_run_tool` (prep_error / exception / Error result) route through a shared `_classify_violation` that bumps the per-turn workspace_violation_counts dict and either keeps the tool's own message or substitutes the throttle escalation. `_execute_tools` now threads that dict alongside the existing external_lookup_counts.
- `nanobot/agent/tools/shell.py`: append a structured boundary note to every workspace-bound guard rejection (`working_dir could not be resolved`, `working_dir is outside`, `path outside working dir`, `path traversal detected`). SSRF errors stay short and direct so the model doesn't try to "phrase around" them. Existing `2>/dev/null` allow-list and benign device passthrough from the previous commit remain.
- `nanobot/agent/tools/filesystem.py`: append the same boundary note to the `outside allowed directory` PermissionError so read_file / write_file / list_dir errors give the LLM the same explicit hint.
Tests
-----
- `tests/utils/test_workspace_violation_throttle.py` (new): signature collapses across read_file/exec/python -c against the same path, different paths get independent budgets, escalation only fires after the third attempt.
- `tests/agent/test_runner.py`:
  - `test_runner_does_not_abort_on_workspace_violation_anymore` -- v2 contract: filesystem PermissionError is now soft, runner moves to the next iteration and finalizes cleanly.
  - `test_is_ssrf_violation_remains_fatal` + the existing `test_runner_aborts_on_ssrf_violation` -- SSRF still aborts on the first attempt.
  - `test_runner_lets_llm_recover_from_shell_guard_path_outside` -- end to end recovery from `path outside working dir`.
  - `test_runner_throttles_repeated_workspace_bypass_attempts` -- four bypass attempts against the same outside target produce at least one `workspace_violation_escalated` event and the run completes naturally without aborting the turn.
  - The two `_execute_tools` direct-call tests now pass the new workspace_violation_counts dict.
- `tests/tools/test_tool_validation.py`: relax three `==` assertions to `startswith` + "hard policy boundary" substring check to match the new structured error messages.
- `tests/tools/test_exec_security.py` keeps the prior `2>/dev/null` regression and the `> /etc/issue` negative case from the previous commit on this branch -- they still pass under the new policy.
Coverage status: full pytest 2648 passed / 2 skipped (was 2638 / 2 on origin/main). Ruff is clean for every file touched in this commit.
Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-04 01:18:39 +08:00
Xubin Ren	5badb75f6c	review: tighten scope and add regression tests
Follow-ups from review of #3194:
- ci.yml: drop unconditional --ignore=tests/channels/test_matrix_channel.py. That test file already calls pytest.importorskip("nio") at module top, so it self-skips on Windows (where nio isn't installed) without also hiding 62 tests from Linux CI.
- filesystem.py: hoist `import os` to the module top and drop the duplicate inline import in ReadFileTool.execute. Document the CRLF->LF normalization as intentional (primarily a Windows UX fix so downstream StrReplace/Grep match consistently regardless of where the file was written).
- test_read_enhancements.py: lock down two new behaviors
  * TestFileStateHashFallback: check_read warns when content changes but mtime is unchanged (coarse-mtime filesystems on Windows).
  * TestReadFileLineEndingNormalization: ReadFileTool strips CRLF and preserves LF-only files untouched.
- test_tool_validation.py: restore list2cmdline/shlex.quote in test_exec_head_tail_truncation. The temp_path-based form was correct, but dropping the quoting broke on any Windows path containing spaces (e.g. C:\Users\John Doe\...). CI runners happen not to have spaces so this slipped through.
Tests: 1993 passed locally.
Made-with: Cursor	2026-04-17 16:11:37 +08:00
Jiajun Xie	3db2eb66e4	ci: add Windows and Python 3.14 support	2026-04-17 16:11:37 +08:00
Jack Lu	bcb8352235	refactor(agent): streamline hook method calls and enhance error logging
- Introduced a helper method `_for_each_hook_safe` to reduce code duplication in hook method implementations.
- Updated error logging to include the method name for better traceability.
- Improved the `SkillsLoader` class by adding a new method `_skill_entries_from_dir` to simplify skill listing logic.
- Enhanced skill loading and filtering logic, ensuring workspace skills take precedence over built-in ones.
- Added comprehensive tests for `SkillsLoader` to validate functionality and edge cases.	2026-04-06 02:51:10 +08:00
Xubin Ren	05fe7d4fb1	fix(tools): isolate decorated tool schemas and add regression tests	2026-04-04 19:58:44 +08:00
Jack Lu	e7798a28ee	refactor(tools): streamline Tool class and add JSON Schema for parameters
Refactor Tool methods and type handling; introduce JSON Schema support for tool parameters (schema module, validation tests).
Made-with: Cursor	2026-04-04 19:58:44 +08:00
Xubin Ren	9840270f7f	test(tools): cover media dir access under workspace restriction Made-with: Cursor	2026-04-04 03:03:58 +08:00
Xubin Ren	485c75e065	test(exec): verify windows drive-root workspace guard	2026-04-02 04:00:03 +08:00
zhangxiaoyu.york	bc2e474079	Fix ExecTool to block root directory paths when restrict_to_workspace is enabled	2026-04-02 04:00:03 +08:00
chengyongru	72acba5d27	refactor(tests): optimize unit test structure	2026-03-24 15:12:22 +08:00

12 Commits
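
The per-target throttle described in commit b8406be215 can be sketched roughly as follows. This is an illustrative reconstruction from the commit message only, not the code in `nanobot/utils/runtime.py`: the two function names come from the message, but their signatures, the `_ABS_PATH` regex, the `SOFT_ATTEMPTS` constant, and the wording of the escalation text are all assumptions made for this example.

```python
import posixpath
import re
from typing import Dict, Optional

# Assumption: two soft attempts per target, escalation on the third.
SOFT_ATTEMPTS = 2

# Hypothetical heuristic: the first token in a command that looks like an
# absolute POSIX path (a "/" not preceded by a word character, ".", "~" or "-").
_ABS_PATH = re.compile(r"(?<![\w.~-])/[^\s'\";|&<>()]+")


def workspace_violation_signature(tool_name: str, arguments: dict) -> Optional[str]:
    """Return a normalized target path for a rejected tool call, or None.

    tool_name is deliberately ignored, so that swapping tools against the
    same outside target lands in the same throttle bucket.
    """
    path = arguments.get("path")
    if isinstance(path, str) and path:
        return posixpath.normpath(path)
    command = arguments.get("command")
    if isinstance(command, str):
        match = _ABS_PATH.search(command)
        if match:
            return posixpath.normpath(match.group(0))
    return None


def repeated_workspace_violation_error(
    tool_name: str, arguments: dict, counts: Dict[str, int]
) -> Optional[str]:
    """Count one violation; return an escalation message past the soft budget.

    Intended to be called only for calls already classified as workspace
    violations. Returns None while the tool's own error should be kept.
    """
    signature = workspace_violation_signature(tool_name, arguments)
    if signature is None:
        return None
    counts[signature] = counts.get(signature, 0) + 1
    if counts[signature] <= SOFT_ATTEMPTS:
        return None
    return (
        f"Stop trying to bypass the workspace boundary for {signature!r}. "
        "Ask the user for help instead."
    )


# Per-turn counter, analogous to the workspace_violation_counts dict
# mentioned in the commit message.
counts: Dict[str, int] = {}
first = repeated_workspace_violation_error("read_file", {"path": "/etc/passwd"}, counts)
second = repeated_workspace_violation_error("exec", {"command": "cat /etc/passwd"}, counts)
third = repeated_workspace_violation_error(
    "exec", {"command": "python -c \"open('/etc/passwd').read()\""}, counts
)
other = repeated_workspace_violation_error("read_file", {"path": "/var/log/syslog"}, counts)
```

In this sketch `first` and `second` are `None` (soft attempts), `third` carries the escalation text quoting `/etc/passwd`, and `other` is `None` because a different target gets its own budget. Keying the counter on the normalized target rather than on the tool name is what lets the `read_file` -> `exec cat` -> `python -c` sequence share one bucket.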