nanobot/tests/utils/test_workspace_violation_throttle.py
Xubin Ren b8406be215 fix(runner): soft workspace boundary + per-target throttle (#3493 #3599 #3605)
Replaces PR #3493's blanket fatal abort with a "tell the model + throttle
the bypass loop" policy.  Workspace-bound rejections are now ordinary
recoverable tool errors enriched with a structured "this is a hard policy
boundary" instruction; SSRF stays the only marker that aborts the turn.

Why the fatal-abort approach broke
----------------------------------
PR #3493 promoted every shell `_guard_command` and filesystem path-resolution
rejection to a turn-fatal RuntimeError.  Two of those messages (`path
outside working dir` and `path traversal detected`) are heuristic substring
scans on the raw command, so legitimate commands like `rm <ws>/x.txt
2>/dev/null` or `find . -type f` killed the user's turn (#3599).  On
channels with outbound dedupe (Telegram) the user just saw silence (#3605),
and the noise polluted the LLM's context until it started hallucinating
guard rejections on plain relative paths (#3597).

Why we still need *some* throttle
---------------------------------
The original #3493 pain point was real: the LLM, refused once, would
swap tools and try again -- read_file -> exec cat -> exec cp -> bash -c
-> ln -sf -> python -c open(...).  Just removing the fatal escape lets
that loop run wild until max_iterations.

What this commit does
---------------------
- `nanobot/utils/runtime.py`: add `workspace_violation_signature` and
  `repeated_workspace_violation_error`.  The signature normalizes
  filesystem `path` arguments and the first absolute path inside an
  exec command, so swapping tools against the same outside target hits
  the same throttle bucket.  Two soft attempts are allowed; the third
  attempt's tool result is replaced with a hard "stop trying to bypass"
  message that quotes the target path and tells the model to ask the
  user for help.

- `nanobot/agent/runner.py`: split classification into `_is_ssrf_violation`
  (still fatal) and `_is_workspace_violation` (now soft).  All three
  failure branches in `_run_tool` (prep_error / exception / Error
  result) route through a shared `_classify_violation` that bumps the
  per-turn workspace_violation_counts dict and either keeps the tool's
  own message or substitutes the throttle escalation.  `_execute_tools`
  now threads that dict alongside the existing external_lookup_counts.

- `nanobot/agent/tools/shell.py`: append a structured boundary note to
  every workspace-bound guard rejection (`working_dir could not be
  resolved`, `working_dir is outside`, `path outside working dir`,
  `path traversal detected`).  SSRF errors stay short and direct so the
  model doesn't try to "phrase around" them.  Existing `2>/dev/null`
  allow-list and benign device passthrough from the previous commit
  remain.

- `nanobot/agent/tools/filesystem.py`: append the same boundary note to
  the `outside allowed directory` PermissionError so read_file / write_file
  / list_dir errors give the LLM the same explicit hint.

Tests
-----
- `tests/utils/test_workspace_violation_throttle.py` (new): signature
  collapses across read_file/exec/python -c against the same path,
  different paths get independent budgets, escalation only fires after
  the third attempt.

- `tests/agent/test_runner.py`:
  - `test_runner_does_not_abort_on_workspace_violation_anymore` -- v2
    contract: filesystem PermissionError is now soft, runner moves to
    the next iteration and finalizes cleanly.
  - `test_is_ssrf_violation_remains_fatal` + the existing
    `test_runner_aborts_on_ssrf_violation` -- SSRF still aborts on the
    first attempt.
  - `test_runner_lets_llm_recover_from_shell_guard_path_outside` -- end
    to end recovery from `path outside working dir`.
  - `test_runner_throttles_repeated_workspace_bypass_attempts` -- four
    bypass attempts against the same outside target produce at least
    one `workspace_violation_escalated` event and the run completes
    naturally without aborting the turn.
  - The two `_execute_tools` direct-call tests now pass the new
    workspace_violation_counts dict.

- `tests/tools/test_tool_validation.py`: relax three `==` assertions
  to `startswith` + "hard policy boundary" substring check to match
  the new structured error messages.

- `tests/tools/test_exec_security.py` keeps the prior `2>/dev/null`
  regression and the `> /etc/issue` negative case from the previous
  commit on this branch -- they still pass under the new policy.

Coverage status: full pytest 2648 passed / 2 skipped (was 2638 / 2
on origin/main).  Ruff is clean for every file touched in this commit.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-04 01:18:39 +08:00

121 lines
4.2 KiB
Python

"""Tests for repeated_workspace_violation throttle and signature."""
from __future__ import annotations
from nanobot.utils.runtime import (
repeated_workspace_violation_error,
workspace_violation_signature,
)
def test_signature_for_filesystem_tools_uses_path_argument():
sig_a = workspace_violation_signature(
"read_file", {"path": "/Users/x/Downloads/01.md"}
)
sig_b = workspace_violation_signature(
"write_file", {"path": "/Users/x/Downloads/01.md"}
)
sig_c = workspace_violation_signature(
"edit_file", {"file_path": "/Users/x/Downloads/01.md"}
)
assert sig_a is not None
assert sig_a == sig_b == sig_c, (
"the throttle must collapse equivalent paths across different tools "
"so the LLM cannot bypass it by switching tool"
)
assert "/users/x/downloads/01.md" in sig_a
def test_signature_for_exec_extracts_first_absolute_path_in_command():
sig = workspace_violation_signature(
"exec",
{"command": "cat /Users/x/Downloads/01.md && echo done"},
)
assert sig is not None
assert "/users/x/downloads/01.md" in sig
def test_signature_collides_across_filesystem_and_exec_for_same_target():
"""LLM bypass loops jump tools (read_file -> exec cat). Throttle must
treat both attempts as targeting the same outside resource."""
fs_sig = workspace_violation_signature(
"read_file", {"path": "/Users/x/Downloads/01.md"}
)
exec_sig = workspace_violation_signature(
"exec", {"command": "cat /Users/x/Downloads/01.md"}
)
assert fs_sig == exec_sig
def test_signature_falls_back_to_working_dir_when_no_absolute_in_command():
sig = workspace_violation_signature(
"exec",
{"command": "ls -la", "working_dir": "/etc"},
)
assert sig is not None
assert "/etc" in sig
def test_signature_is_none_for_unknown_tool_with_no_path():
assert workspace_violation_signature("web_search", {"query": "anything"}) is None
assert workspace_violation_signature("exec", {"command": "echo hello"}) is None
def test_repeated_workspace_violation_returns_none_within_budget():
counts: dict[str, int] = {}
arguments = {"path": "/Users/x/Downloads/01.md"}
assert repeated_workspace_violation_error("read_file", arguments, counts) is None
assert repeated_workspace_violation_error("read_file", arguments, counts) is None
def test_repeated_workspace_violation_escalates_after_third_attempt():
counts: dict[str, int] = {}
arguments = {"path": "/Users/x/Downloads/01.md"}
repeated_workspace_violation_error("read_file", arguments, counts)
repeated_workspace_violation_error("read_file", arguments, counts)
third = repeated_workspace_violation_error("read_file", arguments, counts)
assert third is not None
assert "refusing repeated workspace-bypass" in third
assert "/users/x/downloads/01.md" in third
assert "ask how they want to proceed" in third
def test_repeated_workspace_violation_independent_per_target():
"""Different outside paths must each get their own retry budget."""
counts: dict[str, int] = {}
repeated_workspace_violation_error(
"read_file", {"path": "/Users/x/Downloads/01.md"}, counts,
)
repeated_workspace_violation_error(
"read_file", {"path": "/Users/x/Downloads/01.md"}, counts,
)
# Different target, fresh budget.
assert repeated_workspace_violation_error(
"read_file", {"path": "/Users/x/Documents/notes.md"}, counts,
) is None
def test_repeated_workspace_violation_collapses_tool_switching():
"""LLM switches from read_file to exec cat then to python -c open(...)
against the same path; the throttle must escalate on the third attempt."""
counts: dict[str, int] = {}
repeated_workspace_violation_error(
"read_file", {"path": "/Users/x/Downloads/01.md"}, counts,
)
repeated_workspace_violation_error(
"exec", {"command": "cat /Users/x/Downloads/01.md"}, counts,
)
third = repeated_workspace_violation_error(
"exec",
{"command": "python3 -c \"open('/Users/x/Downloads/01.md').read()\""},
counts,
)
assert third is not None
assert "refusing repeated workspace-bypass" in third