2301 Commits

Author SHA1 Message Date
Xubin Ren
1813fc5021 test(telegram): cover silent allowlist rejection
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-05 23:16:36 +08:00
DG Multica
5aa61e08d3 fix(telegram): ignore unauthorized users silently 2026-05-05 23:16:36 +08:00
futurist
358997554c fix-feishu-media-path 2026-05-05 22:28:44 +08:00
Jiajun Xie
9fa90b1034 fix: only advance dream_cursor on completed batches to prevent silent loss 2026-05-05 22:22:40 +08:00
chengyongru
c30e4d86f3 refactor(agent): simplify subagent concurrency with rejection over semaphore
Replace the asyncio.Semaphore queueing approach with a simple count
check in SpawnTool.execute(). When the concurrency limit is reached,
the tool returns an error string so the agent can perceive the reason
and adjust its behavior instead of silently queueing.

- Remove max_concurrent_subagents parameter threading through
  AgentLoop, commands.py, and nanobot.py
- SubagentManager reads the limit directly from AgentDefaults
- SpawnTool checks get_running_count() before calling spawn()
- Simplify tests to verify rejection behavior
2026-05-05 22:22:04 +08:00
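A minimal sketch of the rejection-over-queueing idea described in c30e4d86f3. `SpawnTool.execute()`, `get_running_count()`, and `spawn()` are named in the commit; the class bodies, the `MAX_CONCURRENT_SUBAGENTS` constant, and the error wording are assumptions made for illustration only.

```python
import asyncio

MAX_CONCURRENT_SUBAGENTS = 2  # hypothetical value; the commit reads the limit from AgentDefaults


class SubagentManager:
    """Toy stand-in that only tracks running background tasks."""

    def __init__(self) -> None:
        self._running: set = set()

    def get_running_count(self) -> int:
        return len(self._running)

    async def spawn(self, task: str) -> str:
        job = asyncio.create_task(asyncio.sleep(0.05))  # placeholder for real subagent work
        self._running.add(job)
        job.add_done_callback(self._running.discard)
        return f"started: {task}"


class SpawnTool:
    def __init__(self, manager: SubagentManager) -> None:
        self._manager = manager

    async def execute(self, task: str) -> str:
        # Reject instead of queueing, so the agent can see why nothing started.
        if self._manager.get_running_count() >= MAX_CONCURRENT_SUBAGENTS:
            return f"Error: subagent limit ({MAX_CONCURRENT_SUBAGENTS}) reached; wait for one to finish."
        return await self._manager.spawn(task)


async def demo() -> list:
    tool = SpawnTool(SubagentManager())
    return [await tool.execute(f"job-{i}") for i in range(3)]


results = asyncio.run(demo())
```

With a limit of two, the third call in `demo()` gets the error string back immediately rather than waiting on a semaphore.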
04cb
9d6afd86b5 fix(provider): backfill DeepSeek reasoning_content instead of dropping history (#3554, #3584) 2026-05-04 12:14:38 +08:00
chengyongru
3ceabdecd5 feat(cli): support github-copilot in provider logout
Logout previously claimed to support github-copilot in --help text but had
no registered handler, so `provider logout github-copilot` failed with
"Logout not implemented". Add the handler, sharing token deletion with the
codex flow via `_delete_oauth_files`. Tighten handler-table types, fix the
codex test fixture filename, and cover github-copilot plus the unknown
provider path.
2026-05-04 12:10:06 +08:00
mikaku9944
807b8188e3 style(cli): use English for docstrings in oauth commands 2026-05-04 12:10:06 +08:00
mikaku9944
387988b8e9 feat(cli): add provider logout command
- Implement `nanobot provider logout <provider>` to clear OAuth credentials.
- Add `_LOGOUT_HANDLERS` registration mechanism mirroring login.
- Implement logout for `openai-codex` by deleting local `oauth-cli-kit` token and lock files.
- Fall back gracefully when attempting to log out of providers lacking local credentials or implementations.
- Fixes #2665
2026-05-04 12:10:06 +08:00
yorkhellen
0f32c0451e fix: support WhatsApp voice message download 2026-05-04 11:44:25 +08:00
Xubin Ren
614b21368f fix(agent): tighten safety guard edge cases
Keep the /dev workspace guard exception scoped to the known benign device paths already handled by ExecTool, and add coverage that non-benign /dev targets still get blocked. Also add a streaming regression for tool_error responses so fatal tool failures are delivered by channels instead of being marked as already streamed.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-04 01:25:52 +08:00
chengyongru
d3689d143c fix(agent): prevent safety guard false positives and streamed message drop
Three independent fixes for issues exposed by PR #3493:

1. shell.py: allow /dev/* paths in workspace guard
   Commands like `rm file.txt 2>/dev/null` were blocked because
   _extract_absolute_paths captured /dev/null as a path outside
   the workspace. Allow /dev paths, as media_path already is.

2. shell.py: remove | from home_paths regex prefix
   Loki query operator `|~` was misinterpreted as pipe + home
   directory, causing false workspace violation errors.

3. loop.py: change _streamed from blacklist to whitelist
   stop_reason "tool_error" was not in the exclusion set
   {"ask_user", "error"}, so _streamed=True was set on fatal
   errors. The channel manager then skipped channel.send() because
   it assumed the content was already streamed, but it never
   was. Whitelist only {"stop", "end_turn", "max_tokens"}.

Also fixes a pre-existing Windows bug in _spawn where
create_subprocess_exec + list2cmdline breaks commands with
paths containing spaces (e.g. D:\Program Files\python.exe).

Closes: #3599, #3605
2026-05-04 01:25:52 +08:00
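The third fix in d3689d143c (switching `_streamed` from a blacklist to a whitelist) can be sketched roughly as below. The three stop reasons come from the commit message; the helper names `was_streamed` and `should_send` are hypothetical and do not reflect the actual code in loop.py.

```python
# Stop reasons that mean the model finished normally (from the commit message).
STREAMED_STOP_REASONS = frozenset({"stop", "end_turn", "max_tokens"})


def was_streamed(stop_reason: str, streaming_enabled: bool) -> bool:
    """Hypothetical helper: True only if the content already reached the channel."""
    return streaming_enabled and stop_reason in STREAMED_STOP_REASONS


def should_send(stop_reason: str, streaming_enabled: bool) -> bool:
    """Hypothetical channel-manager check: send whatever was not streamed."""
    return not was_streamed(stop_reason, streaming_enabled)
```

With a whitelist, a stop reason nobody anticipated (such as `tool_error`) defaults to "not streamed", so the message is sent rather than dropped.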
Xubin Ren
2a7433b7ec chore(runner): tighten workspace guard comments and Windows tests
Keep the workspace-boundary changes easier to review by trimming long explanatory comments down to short local notes. Also make the #3599 POSIX command regression skip on Windows and normalize workspace violation signatures to POSIX separators so the throttle tests are platform-stable.

Tests:
- uv run pytest tests/tools/test_exec_security.py tests/utils/test_workspace_violation_throttle.py -q
- uv run pytest -q

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-04 01:18:39 +08:00
Xubin Ren
b8406be215 fix(runner): soft workspace boundary + per-target throttle (#3493 #3599 #3605)
Replaces PR #3493's blanket fatal abort with a "tell the model + throttle
the bypass loop" policy.  Workspace-bound rejections are now ordinary
recoverable tool errors enriched with a structured "this is a hard policy
boundary" instruction; SSRF stays the only marker that aborts the turn.

Why the fatal-abort approach broke
----------------------------------
PR #3493 promoted every shell `_guard_command` and filesystem path-resolution
rejection to a turn-fatal RuntimeError.  Two of those messages (`path
outside working dir` and `path traversal detected`) are heuristic substring
scans on the raw command, so legitimate commands like `rm <ws>/x.txt
2>/dev/null` or `find . -type f` killed the user's turn (#3599).  On
channels with outbound dedupe (Telegram) the user just saw silence (#3605),
and the noise polluted the LLM's context until it started hallucinating
guard rejections on plain relative paths (#3597).

Why we still need *some* throttle
---------------------------------
The original #3493 pain point was real: the LLM, refused once, would
swap tools and try again -- read_file -> exec cat -> exec cp -> bash -c
-> ln -sf -> python -c open(...).  Just removing the fatal escape lets
that loop run wild until max_iterations.

What this commit does
---------------------
- `nanobot/utils/runtime.py`: add `workspace_violation_signature` and
  `repeated_workspace_violation_error`.  The signature normalizes
  filesystem `path` arguments and the first absolute path inside an
  exec command, so swapping tools against the same outside target hits
  the same throttle bucket.  Two soft attempts are allowed; the third
  attempt's tool result is replaced with a hard "stop trying to bypass"
  message that quotes the target path and tells the model to ask the
  user for help.

- `nanobot/agent/runner.py`: split classification into `_is_ssrf_violation`
  (still fatal) and `_is_workspace_violation` (now soft).  All three
  failure branches in `_run_tool` (prep_error / exception / Error
  result) route through a shared `_classify_violation` that bumps the
  per-turn workspace_violation_counts dict and either keeps the tool's
  own message or substitutes the throttle escalation.  `_execute_tools`
  now threads that dict alongside the existing external_lookup_counts.

- `nanobot/agent/tools/shell.py`: append a structured boundary note to
  every workspace-bound guard rejection (`working_dir could not be
  resolved`, `working_dir is outside`, `path outside working dir`,
  `path traversal detected`).  SSRF errors stay short and direct so the
  model doesn't try to "phrase around" them.  Existing `2>/dev/null`
  allow-list and benign device passthrough from the previous commit
  remain.

- `nanobot/agent/tools/filesystem.py`: append the same boundary note to
  the `outside allowed directory` PermissionError so read_file / write_file
  / list_dir errors give the LLM the same explicit hint.

Tests
-----
- `tests/utils/test_workspace_violation_throttle.py` (new): signature
  collapses across read_file/exec/python -c against the same path,
  different paths get independent budgets, escalation only fires after
  the third attempt.

- `tests/agent/test_runner.py`:
  - `test_runner_does_not_abort_on_workspace_violation_anymore` -- v2
    contract: filesystem PermissionError is now soft, runner moves to
    the next iteration and finalizes cleanly.
  - `test_is_ssrf_violation_remains_fatal` + the existing
    `test_runner_aborts_on_ssrf_violation` -- SSRF still aborts on the
    first attempt.
  - `test_runner_lets_llm_recover_from_shell_guard_path_outside` -- end
    to end recovery from `path outside working dir`.
  - `test_runner_throttles_repeated_workspace_bypass_attempts` -- four
    bypass attempts against the same outside target produce at least
    one `workspace_violation_escalated` event and the run completes
    naturally without aborting the turn.
  - The two `_execute_tools` direct-call tests now pass the new
    workspace_violation_counts dict.

- `tests/tools/test_tool_validation.py`: relax three `==` assertions
  to `startswith` + "hard policy boundary" substring check to match
  the new structured error messages.

- `tests/tools/test_exec_security.py` keeps the prior `2>/dev/null`
  regression and the `> /etc/issue` negative case from the previous
  commit on this branch -- they still pass under the new policy.

Coverage status: full pytest 2648 passed / 2 skipped (was 2638 / 2
on origin/main).  Ruff is clean for every file touched in this commit.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-04 01:18:39 +08:00
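The per-target throttle from b8406be215 might look something like the following sketch. The name `workspace_violation_signature` and the "two soft attempts, third escalates" budget come from the commit message; the regex, the `classify` helper, and the escalation wording are assumptions and are much simpler than whatever runtime.py actually does.

```python
import posixpath
import re
from collections import Counter
from typing import Optional

SOFT_ATTEMPTS = 2  # per the commit message: two soft attempts, the third escalates
_ABS_PATH = re.compile(r"(?<![\w.])/[^\s'\"|;&<>()]+")  # illustrative, not the real pattern


def workspace_violation_signature(tool: str, args: dict) -> Optional[str]:
    """Return the normalized outside target, ignoring which tool was used."""
    if tool == "exec":
        match = _ABS_PATH.search(args.get("command", ""))
        raw = match.group(0) if match else None
    else:
        raw = args.get("path")
    return posixpath.normpath(raw) if raw else None


def classify(counts: Counter, tool: str, args: dict, error: str) -> str:
    """Keep the tool's own error for soft attempts, then substitute the escalation."""
    signature = workspace_violation_signature(tool, args)
    if signature is None:
        return error
    counts[signature] += 1
    if counts[signature] > SOFT_ATTEMPTS:
        return f"Stop trying to bypass the workspace boundary for {signature}; ask the user for help."
    return error
```

Because the signature is keyed on the target path rather than the tool name, `read_file`, `exec cat`, and `python -c open(...)` against the same file all draw from one budget.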
Xubin Ren
7742f8fbdc fix(runner): narrow workspace_violation fatal classification (#3599, helps #3605 #3597)
PR #3493 promoted every shell `_guard_command` rejection to a turn-fatal
RuntimeError. The two heuristic outputs in that list -- `path outside
working dir` and `path traversal detected` -- routinely false-positive on
benign constructs (e.g. `2>/dev/null`, quoted `..` arguments to sed/find,
absolute paths inside inline scripts), so legitimate workspace commands
silently kill the user's turn (#3599) and the agent never gets a chance
to retry with a different approach (#3605).

Two changes, both narrowly scoped:

- `ExecTool._guard_command` now skips a small allow-list of kernel device
  files (`/dev/null`, the standard streams, `/dev/random`, `/dev/fd/N`,
  ...) before the workspace path check, matched against the pre-resolve
  string so symlinks like `/dev/stderr -> /proc/self/fd/2` still hit the
  allow-list. Real outside writes such as `> /etc/issue` remain blocked.
- `AgentRunner._WORKSPACE_BLOCK_MARKERS` keeps only the four hard
  path-resolution errors from filesystem.py / shell.py and the SSRF
  marker. The two heuristic substrings move out of the fatal list, so
  the LLM sees them as ordinary tool errors and can self-correct in the
  next iteration. SSRF stays fatal because retrying an internal URL
  with a different phrasing would defeat the safety boundary.

Tests:
- `tests/tools/test_exec_security.py`: parametrized regression for the
  exact #3599 command sample plus other stdio redirects and device
  reads; explicit negative case asserts `> /etc/issue` is still blocked.
- `tests/agent/test_runner.py`: `_is_workspace_violation` no longer
  fatals on the two heuristic markers, plus an end-to-end case proving
  the runner hands the guard error back to the LLM and finalizes the
  next turn cleanly.
2026-05-04 01:18:39 +08:00
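A rough sketch of the device allow-list described in 7742f8fbdc, matching on the raw string before any symlink resolution. Only the device names spelled out in the commit are listed; the regex and both helper functions are hypothetical and are not the code in `ExecTool._guard_command`.

```python
import re

# Illustrative allow-list: the commit names /dev/null, the standard streams,
# /dev/random and /dev/fd/N, and elides the rest.
_BENIGN_DEVICES = re.compile(r"^/dev/(null|stdin|stdout|stderr|random|fd/\d+)$")


def is_benign_device(raw_path: str) -> bool:
    """Match the literal string from the command, before any symlink resolution."""
    return bool(_BENIGN_DEVICES.match(raw_path))


def outside_workspace_targets(paths: list, workspace: str) -> list:
    """Hypothetical guard step: return the absolute paths that should be rejected."""
    root = workspace.rstrip("/")
    return [
        p
        for p in paths
        if not is_benign_device(p) and p != root and not p.startswith(root + "/")
    ]
```

Checking the unresolved string matters because a path such as `/dev/stderr` may resolve to something under `/proc`, which would no longer look like a device file.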
Xubin Ren
9a9e446f3f fix(cron): clean persistence lint issues
Keep the cron persistence hardening clean under ruff without changing behavior.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-04 00:16:39 +08:00
hussein1362
75c2506c07 fix(cron): atomic write for jobs.json + don't silently overwrite corrupt store
Two related bugs that together caused scheduled jobs to disappear after
a container restart:

1. `_save_store()` used `Path.write_text(...)`, which truncates the
   destination in place.  A SIGKILL or shutdown mid-write left
   `jobs.json` either truncated or corrupt.

2. `_load_jobs()` caught any parse error, logged at WARNING, and
   returned an empty list.  `start()` then called `_save_store()`
   immediately, overwriting the corrupt-but-recoverable file with an
   empty job array.  Every scheduled job was silently lost with only a
   single warning line in the log.

Reproduction in production: container restart at 18:08, after which a
job that had fired correctly for two consecutive days never fired
again.  jobs.json on disk was missing the job entirely.

Fix:
- `_save_store()` now writes via temp file + `os.replace` + `fsync`
  (matches the session manager pattern from 512bf59,
  "fix(session): fsync sessions on graceful shutdown to prevent data
  loss").  An interrupted write cannot corrupt the live file.
- `_load_jobs()` now moves a corrupt store aside as
  `jobs.json.corrupt-<ts>` and returns `None` instead of `[]`.
- `start()` aborts with a `RuntimeError` when the on-disk store is
  corrupt, instead of starting empty and overwriting.
- `_load_store()` falls back to the previous in-memory snapshot when
  a hot reload encounters a corrupt file, so a transient corruption
  after start does not drop live jobs.

Tests cover the atomic-write path, the corrupt-file preservation,
the start-time refusal, the in-memory fallback, and a basic save/load
round trip across two service instances.  Existing 79 cron tests and
full suite (2553 tests) still pass.
2026-05-04 00:16:39 +08:00
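The write and load behavior described in 75c2506c07 can be approximated with the sketch below. The temp file + `fsync` + `os.replace` sequence and the `jobs.json.corrupt-<ts>` naming follow the commit message; the function names, the `{"jobs": [...]}` layout, and the explicit `timestamp` parameter are assumptions for illustration.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_store_atomically(path: Path, jobs: list) -> None:
    """Write jobs to a temp file in the same directory, fsync, then swap it in."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump({"jobs": jobs}, handle)
            handle.flush()
            os.fsync(handle.fileno())  # force the bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic swap when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_jobs(path: Path, timestamp: str) -> Optional[list]:
    """Return [] for a missing store, None (and move the file aside) for a corrupt one."""
    if not path.exists():
        return []
    try:
        return json.loads(path.read_text(encoding="utf-8"))["jobs"]
    except (json.JSONDecodeError, KeyError, TypeError):
        path.rename(path.with_name(f"{path.name}.corrupt-{timestamp}"))
        return None
```

Returning `None` instead of `[]` lets the caller tell "no jobs yet" apart from "the store is damaged", which is what allows startup to refuse to overwrite it.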
Xubin Ren
66682eb46f test(cli): cover retry-wait interactive routing
Keep provider retry wait messages on the interactive progress path so they do not fall through as assistant responses.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-03 22:59:08 +08:00
04cb
c15d816d9c fix(cli): intercept _retry_wait so provider retry messages don't garble interactive output (#3600) 2026-05-03 22:59:08 +08:00
Xubin Ren
7faa339902 fix(webui): keep existing package lockfile
Restore the npm lockfile that is already present on main so this PR only carries the WebUI turn-completion changes.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-03 22:28:40 +08:00
Xubin Ren
96da6d8190 fix(webui): tighten turn completion handling
Keep the new turn-end signal scoped to WebSocket clients, preserve pending tool-call state across trailing tool result rows, and drop the accidental npm lockfile from the Bun-based WebUI.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-03 22:28:40 +08:00
ramonpaolo
be83525f99 test(webui): cover turn-end streaming regressions 2026-05-03 22:28:40 +08:00
ramonpaolo
08744ce408 fix(webui): isolate thread cache during chat switches 2026-05-03 22:28:40 +08:00
ramonpaolo
76e3f74df7 feat(webui): improve beta turn completion and streaming UX 2026-05-03 22:28:40 +08:00
chengyongru
5853d5dfda fix: allow_patterns take priority over deny_patterns in ExecTool (#3594)
* fix: allow_patterns take priority over deny_patterns in ExecTool

Previously deny_patterns were checked first with no bypass, meaning
allow_patterns could never exempt commands from the built-in deny list.
This made it impossible to whitelist destructive commands for specific
directories (e.g. build/cleanup tasks).

Changes:
- shell.py: check allow_patterns first; if matched, skip deny check
- shell.py: deny_patterns now appends to built-in list (not replaces)
- schema.py: add allow_patterns/deny_patterns to ExecToolConfig
- loop.py/subagent.py: pass allow_patterns/deny_patterns to ExecTool
- Add test_exec_allow_patterns.py covering priority semantics

* fix: separate deny pattern errors from workspace violation detection

The deny pattern error message "Command blocked by safety guard" was
included in _WORKSPACE_BLOCK_MARKERS, causing deny_pattern blocks to be
misclassified as fatal workspace violations. This meant LLMs had no
chance to retry with a different command — the turn was aborted
immediately.

Changes:
- shell.py: deny/allowlist error messages now use distinct phrasing
  ("blocked by deny pattern filter" / "blocked by allowlist filter")
- runner.py: remove "blocked by safety guard" from
  _WORKSPACE_BLOCK_MARKERS so deny_pattern errors are treated as normal
  tool errors (LLM can retry) instead of fatal violations
- workspace path errors still use "blocked by safety guard" and remain
  fatal as intended

* fix: update test assertions to match new deny pattern error message

* fix: indentation error in test file

* fix: restore SSRF fatal classification and tidy exec pattern plumbing

Address review feedback on the deny/allow_patterns rework:

- runner.py: re-add "internal/private url detected" to
  _WORKSPACE_BLOCK_MARKERS. The earlier marker removal also stripped
  fatal classification from SSRF / internal-URL rejections (whose
  message still says "blocked by safety guard"), turning a hard
  security boundary into something the LLM could retry.
- loop.py / subagent.py: drop `or None` between ExecToolConfig and
  ExecTool. The schema default is an empty list and ExecTool already
  normalizes None back to [], so the indirection was a no-op.
- shell.py: extract `explicitly_allowed` flag in _guard_command so
  allow_patterns are scanned once instead of twice and the control
  flow no longer relies on a no-op `pass + else` branch.
- tests/agent/test_runner.py: add a regression test asserting that
  the SSRF block message is treated as fatal, while deny/allowlist
  filter messages are deliberately non-fatal.

* fix: remove unused exec allow-pattern test import

Keep the new ExecTool allow-pattern coverage clean under ruff.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Xubin Ren <xubinrencs@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-03 00:27:17 +08:00
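The priority rule from 5853d5dfda, sketched under assumptions: the `explicitly_allowed` flag, the "allow first, then deny" order, and the deny error phrasing come from the commit message, while `BUILTIN_DENY` and the function signature are invented here. How a command is treated when `allow_patterns` is non-empty but nothing matches is not covered by this sketch.

```python
import re
from typing import Optional

BUILTIN_DENY = [r"\brm\s+-[rf]{1,2}\b"]  # illustrative; not the project's real deny list


def guard_command(command: str, allow_patterns: list, deny_patterns: list) -> Optional[str]:
    """Return an error string, or None when the command may run."""
    # Scan the allow list once; a match skips the deny check entirely.
    explicitly_allowed = any(re.search(p, command) for p in allow_patterns)
    if explicitly_allowed:
        return None
    # User-supplied deny patterns extend the built-in list instead of replacing it.
    for pattern in BUILTIN_DENY + deny_patterns:
        if re.search(pattern, command):
            return "Error: Command blocked by deny pattern filter"
    return None
```

For example, an allow pattern such as `^rm -rf build/` would exempt a cleanup command from the built-in deny rule while `rm -rf /` stays blocked.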
Xubin Ren
2fa15ccf1b fix: improve media failure diagnostics and token fallback coverage 2026-05-02 11:37:07 +00:00
Xubin Ren
fde530de01 refactor(setup): enhance SKILL.md for upgrade process clarity 2026-05-02 07:40:29 +00:00
Xubin Ren
861fbb0dde fix(provider): correct LongCat OpenAI base URL
Use the SDK-ready /v1 base so LongCat chat completions hit the documented endpoint.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-02 01:52:04 +08:00
moranfong
051037ff08 feat(provider): add LongCat via OpenAI-compatible backend 2026-05-02 01:52:04 +08:00
yorkhellen
ee364c6ac1 fix(helpers): restore tiktoken fallback in estimate_prompt_tokens_chain 2026-05-02 00:07:45 +08:00
Xubin Ren
fd1a5a6267 test(provider): tidy Anthropic fallback imports
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-01 23:59:24 +08:00
coldxiangyu
4c54a2b153 fix(anthropic): auto-fallback to stream on long-request error
The Anthropic SDK raises a client-side ValueError when a non-streaming
`messages.create` call could exceed the 10-minute server timeout (e.g.
high `max_tokens` combined with extended thinking budget). The error
text "Streaming is required for operations that may take longer than
10 minutes" was bubbling up to the user as an opaque LLM error in
channels that use the non-stream path (e.g. wecom in #2709).

Detect this specific ValueError in `chat()` and transparently retry
through `chat_stream()` (without `on_content_delta` so behavior matches
the non-stream contract). Other ValueErrors continue to flow through
`_handle_error` unchanged.

Closes #2709

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 23:59:24 +08:00
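The fallback in 4c54a2b153 could be sketched as follows. `chat()`, `chat_stream()`, `on_content_delta`, and the error text come from the commit message; the `Provider` class and its `_create` method are hypothetical stand-ins, with no real Anthropic SDK calls involved. Here other `ValueError`s are simply re-raised, whereas the commit routes them through `_handle_error`.

```python
import asyncio

_STREAM_REQUIRED = "Streaming is required for operations that may take longer than 10 minutes"


class Provider:
    """Hypothetical provider; _create stands in for the SDK's non-streaming call."""

    async def _create(self, **kwargs) -> str:
        # Simulate the client-side error raised for potentially long requests.
        raise ValueError(_STREAM_REQUIRED + ".")

    async def chat_stream(self, on_content_delta=None, **kwargs) -> str:
        return "streamed response"

    async def chat(self, **kwargs) -> str:
        try:
            return await self._create(**kwargs)
        except ValueError as exc:
            if _STREAM_REQUIRED not in str(exc):
                raise  # unrelated ValueErrors keep their normal handling
            # Retry over the streaming path without a delta callback, so the
            # caller still receives one complete response.
            return await self.chat_stream(**kwargs)


reply = asyncio.run(Provider().chat(max_tokens=64000))
```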
coldxiangyu
4860a9a6c9 fix(matrix): stop sync loop on irrecoverable auth errors
When the Matrix homeserver returns M_UNKNOWN_TOKEN / M_FORBIDDEN /
M_UNAUTHORIZED (or soft_logout), the previous _sync_loop kept retrying
sync_forever every 2 seconds indefinitely, spamming the homeserver and
filling logs (#1851). The auth state cannot recover by retrying, so
this is pure noise and a soft DoS on the homeserver.

- Extract `_is_fatal_auth_response()` helper
- In `_on_sync_error`, on fatal auth: set `_running=False` and call
  `stop_sync_forever()` so the loop exits cleanly
- Add exponential backoff (2s → 60s cap) to the generic exception path
  in `_sync_loop` so transient network blips also stop hammering

Closes #1851

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 23:59:09 +08:00
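A sketch of the two pieces in 4860a9a6c9: classifying fatal auth errors and backing off on generic failures. The error codes and the 2s start / 60s cap are from the commit message; the function signatures are hypothetical (the real helper presumably inspects a response object), and doubling the delay each step is an assumption about what "exponential" means here.

```python
import itertools
from typing import Iterator, Optional

FATAL_ERRCODES = frozenset({"M_UNKNOWN_TOKEN", "M_FORBIDDEN", "M_UNAUTHORIZED"})


def is_fatal_auth_error(errcode: Optional[str], soft_logout: bool = False) -> bool:
    """True when retrying cannot help, so the sync loop should stop."""
    return soft_logout or errcode in FATAL_ERRCODES


def backoff_delays(start: float = 2.0, cap: float = 60.0) -> Iterator[float]:
    """Yield retry delays that double each time until they reach the cap."""
    delay = start
    while True:
        yield delay
        delay = min(delay * 2, cap)


first_delays = list(itertools.islice(backoff_delays(), 7))
```

A caller would reset the generator after a successful sync so that a later blip starts again from the short delay.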
Xubin Ren
539d82eadc test(tools): accept spawn origin message context
Made-with: Cursor
2026-05-01 20:09:59 +08:00
Xubin Ren
188e6df757 fix(utils): cover complete trailing think markers
Made-with: Cursor
2026-05-01 20:09:59 +08:00
bravel
2c397ad442 fix: strip partial think tags in streaming output 2026-05-01 20:09:59 +08:00
Xubin Ren
aea5948b11 fix(tools): tighten web fetch URL cleaning
Made-with: Cursor
2026-05-01 19:58:19 +08:00
彭星杰
5dc96505e8 fix(web_fetch): sanitize URL to strip markdown backticks and quotes before validation
LLM-generated tool calls may wrap URLs in markdown backticks or quotes
(e.g. `https://example.com`), causing urlparse to produce empty scheme
and netloc, which leads to all fetch attempts failing silently.

Add URL cleaning at the top of WebFetchTool.execute to strip whitespace,
backticks, double quotes, and single quotes, plus an early rejection guard
for non-http(s) URLs after cleaning.
2026-05-01 19:58:19 +08:00
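The cleaning step from 5dc96505e8 might be sketched like this. The set of stripped characters and the http(s)-only guard follow the commit message; `clean_url` and `validate_url` are hypothetical names, not the code inside `WebFetchTool.execute`.

```python
from urllib.parse import urlparse


def clean_url(raw: str) -> str:
    """Strip whitespace, backticks, and quotes that an LLM may wrap around a URL."""
    return raw.strip(" \t\r\n`\"'")


def validate_url(raw: str) -> str:
    """Clean the URL, then reject anything that is not a usable http(s) URL."""
    url = clean_url(raw)
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Only http(s) URLs are supported: {url!r}")
    return url
```

Cleaning before validation is the point: a URL wrapped in backticks parses with an empty scheme and netloc, so it has to be unwrapped before the scheme check.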
Xubin Ren
43a58335f6 fix(provider): narrow DeepSeek reasoning history cleanup
Made-with: Cursor
2026-05-01 19:52:38 +08:00
Jiajun Xie
8ca575bdeb fix: adjust DeepSeek reasoning mode check condition
- Modified _drop_deepseek_incomplete_reasoning_history to properly handle reasoning mode detection
- Fixes issue #3554
2026-05-01 19:52:38 +08:00
Xubin Ren
e16fa7c6b1 Merge PR #3561: fix: origin_message_id support and outbound deduplication
fix: origin_message_id support and outbound deduplication
2026-05-01 19:52:10 +08:00
Xubin Ren
e157392250 fix(agent): scope subagent reply dedupe to origin message
Made-with: Cursor
2026-05-01 11:47:24 +00:00
yorkhellen
08f326ec55 test: Add tests for sender_id runtime context injection 2026-05-01 19:43:38 +08:00
yorkhellen
c4170fa9ba feat: Add sender_id to LLM runtime context 2026-05-01 19:43:38 +08:00
hanyuanling
1040124ede Fix API stream lifecycle for tool-backed requests 2026-05-01 19:42:52 +08:00
liuZhou
73840b0af6 fix(matrix): remove tuple default from allow_room_mentions 2026-05-01 19:41:58 +08:00
hinotoi-agent
ad952e0da2 fix(dingtalk): block SSRF in outbound media fetches 2026-05-01 19:31:45 +08:00
copilot-swe-agent[bot]
0284174df9 fix: prevent empty Matrix messages when progress callback sends empty content
Agent-Logs-Url: https://github.com/halldorjanetzko/nanobot/sessions/df528c59-8214-41a0-9b79-9d1d41857107

Co-authored-by: halldorjanetzko <158819146+halldorjanetzko@users.noreply.github.com>
2026-05-01 19:31:04 +08:00
coldxiangyu
15007afd4a fix(matrix): skip events received before bot startup
Matrix sync replays the room timeline on each startup or `/restart`,
causing already-handled messages to be reprocessed (#3553). Even with
`store_sync_tokens=True`, the sync token isn't reliably re-injected
when restoring a session via access_token + load_store(), so the
client re-reads recent timeline entries.

Filter `event.server_timestamp` against the process start time so old
events are dropped at the `_on_message` / `_on_media_message` entry
points. Trade-off: messages received during downtime won't be
processed, which matches the issue reporter's expectation.

Closes #3553

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 19:30:33 +08:00
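The startup filter from 15007afd4a, as a minimal sketch. Comparing `event.server_timestamp` against the process start time is from the commit message; the `StartupFilter` class is hypothetical, and treating the timestamp as milliseconds since the epoch is an assumption about the Matrix client library.

```python
import time
from typing import Optional


class StartupFilter:
    """Drop events whose server timestamp predates process start."""

    def __init__(self, now_ms: Optional[int] = None) -> None:
        # Assumption: timestamps are milliseconds since the Unix epoch.
        self.started_ms = int(time.time() * 1000) if now_ms is None else now_ms

    def is_stale(self, server_timestamp_ms: int) -> bool:
        return server_timestamp_ms < self.started_ms
```

A message handler would call `is_stale(event.server_timestamp)` first and return early for replayed timeline entries, accepting that messages sent during downtime are skipped.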
Jack Lu
d9800ecdd2 refactor: replace try-except blocks with contextlib.suppress for cleaner error handling across multiple files 2026-05-01 19:30:11 +08:00