- Check both jwt and cryptography in MSTEAMS_AVAILABLE guard so
partial installs fail early with a clear message instead of at runtime
- Add aclose() to test FakeHttpClient so stop() won't crash
- Move MSTEAMS.md into README.md following the same details/summary
pattern used by every other channel
- Note in README that validateInboundAuth defaults to false
Warn when validate_inbound_auth is disabled (default) so operators are
aware the webhook accepts unverified requests. Restore pymupdf to the
dev optional-dependencies group — its removal in the original PR was
unrelated to the Teams channel feature.
PyJWT and cryptography are optional msteams deps; they should not be
bundled into the generic dev install. Tests now skip the entire file
when the deps are missing, following the dingtalk pattern.
Add a built-in tool that lets the agent inspect and modify its own
runtime state (model, iterations, context window, etc.).
Key features:
- inspect: view current config, usage stats, and subagent status
- modify: adjust parameters at runtime (protected by type/range validation)
- Subagent observability: inspect running subagent tasks (phase,
iteration, tool events, errors) — subagents are no longer a black box
- Watchdog corrects out-of-bounds values on each iteration
- Enabled by default in read-only mode (self_modify: false)
- All changes are in-memory only; restart restores defaults
- Comprehensive test suite (90 tests)
Includes a self-awareness skill (always-on) with progressive disclosure:
SKILL.md for core rules, references/examples.md for detailed scenarios.
Feishu streaming cards auto-close after 10 minutes from creation,
regardless of update activity. With resuming enabled, a single card
lives across multiple tool-call rounds and can exceed this limit,
causing the final response to be silently lost.
Remove the _resuming logic from send_delta so each tool-call round
gets its own short-lived streaming card (well under 10 min). Add a
fallback that sends a regular interactive card when the final
streaming update fails.
Keep late follow-up injections observable when they are drained during max-iteration shutdown so loop-level response suppression still makes the right decision.
Made-with: Cursor
- Migrate "after tools" inline drain to use _try_drain_injections,
completing the refactoring (all 6 drain sites now use the helper).
- Move checkpoint emission into _try_drain_injections via optional
iteration parameter, eliminating the leaky split between helper
and caller for the final-response path.
- Extract _make_injection_callback() test helper to replace 7
identical inject_cb function bodies.
- Add test_injection_cycle_cap_on_error_path to verify the cycle
cap is enforced on error exit paths.
When the agent runner exits due to LLM error, tool error, empty response,
or max_iterations, it breaks out of the iteration loop without draining
the pending injection queue. This causes leftover messages to be
re-published as independent inbound messages, resulting in duplicate or
confusing replies to the user.
Extract the injection drain logic into a `_try_drain_injections` helper
and call it before each break in the error/edge-case paths. If injections
are found, continue the loop instead of breaking. For max_iterations
(where the loop is exhausted), drain injections to prevent re-publish
without continuing.
Remove two debug log lines that fire on every idle channel check:
- "scheduling archival" (logged before knowing if there's work)
- "skipping, no un-consolidated messages" (the common no-op path)
The meaningful "archived" info log (only on real work) is preserved.
Remove two debug log lines that fire on every idle channel check:
- "scheduling archival" (logged before knowing if there's work)
- "skipping, no un-consolidated messages" (the common no-op path)
The meaningful "archived" info log (only on real work) is preserved.
When a subagent result is injected with current_role="assistant",
_enforce_role_alternation drops the trailing assistant message, leaving
only the system prompt. Providers like Zhipu/GLM reject such requests
with error 1214 ("messages parameter invalid"). Now the last popped
assistant message is recovered as a user message when no user/tool
messages remain.
Prevent proactive compaction from archiving sessions that have an
in-flight agent task, avoiding mid-turn context truncation when a
task runs longer than the idle TTL.
When a subagent result is injected with current_role="assistant",
_enforce_role_alternation drops the trailing assistant message, leaving
only the system prompt. Providers like Zhipu/GLM reject such requests
with error 1214 ("messages parameter invalid"). Now the last popped
assistant message is recovered as a user message when no user/tool
messages remain.
Prevent proactive compaction from archiving sessions that have an
in-flight agent task, avoiding mid-turn context truncation when a
task runs longer than the idle TTL.
Track text-only user messages that were flushed before the turn loop completes, then materialize an interrupted assistant placeholder on the next request so session history stays legal and later turns do not skip their own assistant reply.
Made-with: Cursor
Use session.add_message for the pre-turn user-message flush and add focused regression tests for crash-time persistence and duplicate-free successful saves.
Made-with: Cursor
The existing runtime_checkpoint mechanism preserves the in-flight
assistant/tool state if the process dies mid-turn, but the triggering
user message is only written to session history at the end of the turn
via _save_turn(). If the worker is killed (OOM, SIGKILL, a self-
triggered systemctl restart, container eviction, etc.) before the turn
completes, the user's message is silently lost: on restart, the session
log only shows the interrupted assistant turn without any record of
what the user asked. Any recovery tooling built on top of session logs
cannot reply because it has no prompt to reply to.
This patch appends the incoming user message to the session and flushes
it to disk immediately after the session is loaded and before the agent
loop runs, then adjusts the _save_turn skip offset so the final
persistence step does not duplicate it.
Limited to textual content (isinstance(msg.content, str)); list-shaped
content (media blocks) still flows through _save_turn's sanitization at
end of turn, preserving existing behavior for those cases.
Add focused registry coverage so the new read_file/read_write parameter guard stays actionable without changing generic validation behavior for other tools.
Made-with: Cursor
- Add type validation in registry.prepare_call() to catch list/other invalid params
- Add logger.warning() in provider layer when non-dict args detected
- Works for OpenAI-compatible and Anthropic providers
- Registry returns clear error hint for model to self-correct
The catch-all except Exception in QQ send() was swallowing
aiohttp.ClientError and OSError that _send_media correctly
re-raises. Add explicit catch for network errors before the
generic handler.
Audited all channel implementations for overly broad exception handling
that causes retry amplification or silent message loss during network
errors. This is the same class of bug as #3050 (Telegram _send_text).
Fixes by channel:
Telegram (send_delta):
- _stream_end path used except Exception for HTML edit fallback
- Network errors (TimedOut, NetworkError) triggered redundant plain
text edit, doubling connection demand during pool exhaustion
- Changed to except BadRequest, matching the _send_text fix
Discord:
- send() caught all exceptions without re-raising
- ChannelManager._send_with_retry() saw successful return, never retried
- Messages silently dropped on any send failure
- Added raise after error logging
DingTalk:
- _send_batch_message() returned False on all exceptions including
network errors — no retry, fallback text sent unnecessarily
- _read_media_bytes() and _upload_media() swallowed transport errors,
causing _send_media_ref() to cascade through doomed fallback attempts
- Added except httpx.TransportError handlers that re-raise immediately
WeChat:
- Media send failure triggered text fallback even for network errors
- During network issues: 3×(media + text) = 6 API calls per message
- Added specific catches: TimeoutException/TransportError re-raise,
5xx HTTPStatusError re-raises, 4xx falls back to text
QQ:
- _send_media() returned False on all exceptions
- Network errors triggered fallback text instead of retry
- Added except (aiohttp.ClientError, OSError) that re-raises
Tests: 331 passed (283 existing + 48 new across 5 channel test files)
Fixes: #3054
Related: #3050, #3053
Previously _send_text() caught all exceptions (except Exception) when
sending HTML-formatted messages, falling back to plain text even for
network errors like TimedOut and NetworkError. This caused connection
demand to double during pool exhaustion scenarios (3 retries × 2
fallback attempts = 6 calls per message instead of 3).
Now only catches BadRequest (HTML parse errors), letting network errors
propagate immediately to the retry layer where they belong.
Fixes: HKUDS/nanobot#3050
Add a focused regression test for the successful no-image retry path so the original message history stays stripped after fallback and the repeated retry loop cannot silently return.
Made-with: Cursor
When a non-transient LLM error occurs with image content, the retry
mechanism strips images from a copy but never updates the original
conversation history. Subsequent iterations rebuild context from the
unmodified history, causing the same error-retry cycle to repeat
every iteration until max_iterations is reached.
Add _strip_image_content_inplace() that mutates the original message
content lists in-place after a successful no-image retry, so callers
sharing those references (e.g. the runner's conversation history)
also see the stripped version.
Point Dream skill creation at a readable builtin skill-creator template, keep skill writes rooted at the workspace, and document the new skill discovery behavior in README.
Made-with: Cursor
Instead of a separate skill discovery system, extend Dream's two-phase
pipeline to also detect reusable behavioral patterns from conversation
history and generate SKILL.md files.
Phase 1 gains a [SKILL] output type for pattern detection.
Phase 2 gains write_file (scoped to skills/) and read access to builtin
skills, enabling it to check for duplicates and follow skill-creator's
format conventions before creating new skills.
Inspired by PR #3039 by @wanghesong2019.
Co-authored-by: wanghesong2019 <wanghesong2019@users.noreply.github.com>
Keep the new exec guard focused on writes to history.jsonl and .dream_cursor while still allowing read-only copy operations out of those files.
Made-with: Cursor
Explain the new agents.defaults.disabledSkills option so users can discover and configure skill exclusion from the main agent and subagents.
Made-with: Cursor
Add 'domain' field to FeishuConfig (Literal['feishu', 'lark'], default 'feishu').
Pass domain to lark.Client.builder() and lark.ws.Client to support Lark global
(open.larksuite.com) in addition to Feishu China (open.feishu.cn).
Existing configs default to 'feishu' for backward compatibility.
Also add documentation for domain field in README.md and add tests for
domain config.