64 Commits

Author SHA1 Message Date
chengyongru
8a646d9aec fix(agent): cap recent history section in system prompt
Truncate the "Recent History" section injected by build_system_prompt()
to 32K chars. Without this, many accumulated history.jsonl entries could
still bloat the system prompt even with per-entry truncation in place.
2026-04-24 01:53:31 +08:00
chengyongru
6fbada5363 refactor(context): deduplicate system prompt — markdown skills index, skip template MEMORY.md
- Convert skills summary from verbose XML (4-5 lines/skill) to compact
  markdown list (1 line/skill) with inline path for read_file lookup
- Exclude always-loaded skills (e.g. memory) from the skills index to
  avoid duplicating content already in the Active Skills section
- Skip injecting the Memory section when MEMORY.md still matches the
  bundled template (i.e. Dream hasn't populated it yet)
2026-04-15 15:49:30 +08:00
Xubin Ren
47f5795708 refactor: move document extraction from ContextBuilder to API layer
ContextBuilder._build_user_content now only handles images (its original
responsibility).  Document text extraction (PDF, DOCX, XLSX, PPTX) is
performed by the new _extract_documents() helper in server.py, called
before process_direct().  This keeps the core context builder free of
format-specific dependencies and makes the API boundary the single place
where uploaded files are pre-processed.

Tests updated to reflect the new responsibility boundary.

Made-with: Cursor
2026-04-14 13:00:59 +00:00
Xubin Ren
2502fc616b Merge origin/main into feat/api-file-upload
Keep the API file upload branch current with main, enforce the documented JSON base64 per-file limit, and avoid leaking document extraction error strings into user prompts.

Made-with: Cursor
2026-04-14 12:29:43 +00:00
Xubin Ren
09c238ca0f Merge origin/main into pr-2959
Resolve the config plumbing conflicts and keep disabled skill filtering consistent for subagent prompts after syncing with main.

Made-with: Cursor
2026-04-12 02:02:39 +00:00
chengyongru
fb6dd111e1 feat(agent): auto compact — proactive session compression to reduce token cost and latency (#2982)
When a user is idle for longer than a configured TTL, nanobot **proactively** compresses the session context into a summary. This reduces token cost and first-token latency when the user returns — instead of re-processing a long stale context with an expired KV cache, the model receives a compact summary and fresh input.
2026-04-11 15:56:41 +08:00
chenyahui
e9c4fe6824 feat(skills): add disabled_skills config to exclude skills from loading
Introduce a disabled_skills option in the config schema that allows
users to specify a list of skill names to be excluded. The setting is
threaded from config through Nanobot -> AgentLoop -> ContextBuilder ->
SkillsLoader. Disabled skills are filtered out from list_skills,
get_always_skills, and build_skills_summary. Four new test cases cover
the filtering behavior.
2026-04-09 14:11:47 +08:00
dengjingren
a068df5a79 feat(api): support file uploads via JSON base64 and multipart/form-data 2026-04-08 15:58:52 +08:00
Xubin Ren
edb821e10d feat(agent): prompt behavior directives, tool descriptions, and loop robustness 2026-04-08 02:22:25 +08:00
Xubin Ren
ce7986e492 fix(memory): add timestamp and cap to recent history injection 2026-04-08 00:03:11 +08:00
Lingao Meng
31c154a7b8 fix(memory): prevent potential loss of compressed session history
When the Consolidator compresses old session messages into history.jsonl,
those messages are immediately removed from the LLM's context. Dream
processes history.jsonl into long-term memory (memory.md) on a cron
schedule (default every 2h), creating a window where compressed content
is invisible to the LLM.

This change closes the gap by injecting unprocessed history entries
(history.jsonl entries not yet consumed by Dream) directly into the
system prompt as "# Recent History".

Key design notes:
- Uses read_unprocessed_history(since_cursor=last_dream_cursor) so only
  entries not yet reflected in long-term memory are included, avoiding
  duplication with memory.md
- No overlap with session messages: Consolidator advances
  last_consolidated before returning, so archived messages are already
  removed from get_history() output
- Token-safe: Consolidator's estimate_session_prompt_tokens calls
  build_system_prompt via the same build_messages function, so the
  injected entries are included in token budget calculations and will
  trigger further consolidation if needed

Signed-off-by: Lingao Meng <menglingao@xiaomi.com>
2026-04-07 23:41:05 +08:00
Jack Lu
d436a1d678 feat: integrate Jinja2 templating for agent responses and memory consolidation
- Added Jinja2 template support for various agent responses, including identity, skills, and memory consolidation.
- Introduced new templates for evaluating notifications, handling subagent announcements, and managing platform policies.
- Updated the agent context and memory modules to utilize the new templating system for improved readability and maintainability.
- Added a new dependency on Jinja2 in pyproject.toml.
2026-04-04 14:18:22 +08:00
Xubin Ren
fbedf7ad77 feat: harden agent runtime for long-running tasks 2026-04-01 19:12:49 +00:00
Xubin Ren
13d6c0ae52 feat(config): add configurable timezone for runtime context
Add agent-level timezone configuration with a UTC default, propagate it into runtime context and heartbeat prompts, and document valid IANA timezone usage in the README.
2026-03-25 22:07:14 +08:00
ZhangYuanhan-AI
8abbe8a6df fix(agent): instruct LLM to use message tool for file delivery
During testing, we discovered that when a user requests the agent to
send a file (e.g., "send me IMG_1115.png"), the agent would call
read_file to view the content and then reply with text claiming
"file sent" — but never actually deliver the file to the user.

Root cause: The system prompt stated "Reply directly with text for
conversations. Only use the 'message' tool to send to a specific
chat channel", which led the LLM to believe text replies were
sufficient for all responses, including file delivery.

Fix: Add an explicit IMPORTANT instruction in the system prompt
telling the LLM it MUST use the 'message' tool with the 'media'
parameter to send files, and that read_file only reads content
for its own analysis.

Co-Authored-By: qulllee <qullkui@tencent.com>
2026-03-24 01:11:33 +08:00
Xubin Ren
445a96ab55 fix(agent): harden multimodal tool result flow
Keep multimodal tool outputs on the native content-block path while
restoring redirect SSRF checks for web_fetch image responses. Also share
image block construction, simplify persisted history sanitization, and
add regression tests for image reads and blocked private redirects.

Made-with: Cursor
2026-03-21 05:34:56 +00:00
vandazia
71a88da186 feat: implement native multimodal autonomous sensory capabilities 2026-03-20 22:00:38 +08:00
zhangxiaoyu.york
f72ceb7a3c fix:set subagent result message role = assistant 2026-03-18 00:43:46 +08:00
Xubin Ren
8cf11a0291 fix: preserve image paths in fallback and session history 2026-03-17 22:37:09 +08:00
Xubin Ren
6e2b6396a4 security: add SSRF protection, untrusted content marking, and internal URL blocking 2026-03-16 15:05:26 +08:00
Xubin Ren
5d1528a5f3 fix(heartbeat): inject shared current time context into phase 1 2026-03-16 10:52:26 +08:00
Xubin Ren
d684fec27a Replace load_skill tool with read_file extra_allowed_dirs for builtin skills access
Instead of adding a separate load_skill tool to bypass workspace restrictions,
extend ReadFileTool with extra_allowed_dirs so it can read builtin skill paths
while keeping write/edit tools locked to the workspace. Fixes the original issue
for both main agent and subagents.

Made-with: Cursor
2026-03-15 23:21:02 +08:00
Ben
45832ea499 Add load_skill tool to bypass workspace restriction for builtin skills
When restrictToWorkspace is enabled, the agent cannot read builtin skill
files via read_file since they live outside the workspace. This adds a
dedicated load_skill tool that reads skills by name through the SkillsLoader,
which accesses files directly via Python without the workspace restriction.

- Add LoadSkillTool to filesystem tools
- Register it in the agent loop
- Update system prompt to instruct agent to use load_skill instead of read_file
- Remove raw filesystem paths from skills summary
2026-03-15 23:21:02 +08:00
Re-bin
ddccf25bb1 fix(subagent): preserve reasoning fields across tool turns
Share assistant message construction between the main agent and subagents, and add a regression test to keep reasoning_content and thinking_blocks in follow-up tool rounds.
2026-03-11 03:47:24 +00:00
Re-bin
4715321319 Merge branch 'main' into pr-1579 and tighten platform guidance 2026-03-08 16:39:37 +00:00
Re-bin
ce9b516b11 Merge branch 'main' into pr-1579 2026-03-08 16:29:54 +00:00
Re-bin
7cbb254a8e fix: remove stale IDENTITY bootstrap entry 2026-03-08 15:39:40 +00:00
Re-bin
3a01fe536a refactor: move detect_image_mime to utils/helpers for reuse 2026-03-06 06:49:09 +00:00
VITOHJL
958c23fb01 chore: refine platform policy and memory SKILL docs 2026-03-05 23:57:43 +08:00
coldxiangyu
46192fbd2a fix(context): detect image MIME type from magic bytes instead of file extension
Feishu downloads images with incorrect extensions (e.g. .jpg for PNG files).
mimetypes.guess_type() relies on the file extension, causing a MIME mismatch
that Anthropic rejects with 'image was specified using image/jpeg but appears
to be image/png'.

Fix: read the first bytes of the image data and detect the real MIME type via
magic bytes (PNG: 0x89PNG, JPEG: 0xFFD8FF, GIF: GIF87a/GIF89a, WEBP: RIFF+WEBP).
Fall back to mimetypes.guess_type() only when magic bytes are inconclusive.
2026-03-05 20:29:10 +08:00
Nikolas de Hor
ad99d5aaa0 fix: merge consecutive user messages into single message
Some LLM providers (Minimax, Dashscope) strictly reject consecutive
messages with the same role. build_messages() was emitting two separate
user messages back-to-back: the runtime context and the actual user
content.

Merge them into a single user message, handling both plain text and
multimodal (image) content. Update _save_turn() to strip the runtime
context prefix from the merged message when persisting to session
history.

Fixes #1414
Fixes #1344
2026-03-03 00:59:58 -03:00
Jack Lu
3ee061b879
Merge branch 'main' into main 2026-03-01 13:35:24 +08:00
Re-bin
5ca386ebf5 fix: preserve reasoning_content and thinking_blocks in session history 2026-02-28 17:37:12 +00:00
JK_Lu
977ca725f2 style: unify code formatting and import order
- Remove trailing whitespace and normalize blank lines
- Unify string quotes and line breaks for long lines
- Sort imports alphabetically across modules
2026-02-28 20:55:43 +08:00
aiguozhi123456
db4185c8b7 Add timestamp format hint for HISTORY.md grep searching 2026-02-27 11:11:42 +00:00
Re-bin
cdbede2fa8 refactor: simplify /stop dispatch, inline commands, trim verbose docstrings 2026-02-25 17:04:08 +00:00
Re-bin
d55a850357 refactor: simplify runtime context injection — drop JSON/dedup, keep untrusted tag 2026-02-25 16:13:48 +00:00
rickthemad4
87a2084ee2 feat: add untrusted runtime context layer for stable prompt prefix 2026-02-24 16:38:29 +00:00
Re-bin
f294e9d065 refactor: merge runtime context helpers and move imports to top 2026-02-24 16:15:21 +00:00
rickthemad4
56b9b33c6d fix: stabilize system prompt for better cache reuse 2026-02-24 14:18:50 +00:00
Re-bin
d9462284e1 improve agent reliability: behavioral constraints, full tool history, error hints 2026-02-23 09:13:08 +00:00
Re-bin
13d768cd93 Merge branch 'main' into pr-939 2026-02-21 17:06:05 +00:00
Xubin Ren
6a9152f0c4
Merge PR #947 to Fix 'Missing reasoning_content field' error for deepseek provider.
fix(context): Fix 'Missing `reasoning_content` field' error for deepseek provider.
2026-02-22 00:47:58 +08:00
nanobot-bot
01c835aac2 fix(context): Fix 'Missing reasoning_content field' error for deepseek provider. 2026-02-21 23:11:30 +08:00
vincentchen
b3acd19c7b Remove redundant tools description (because tools information is passed in with each self.provider.chat() call) 2026-02-21 20:28:42 +08:00
Re-bin
aeb07d3450 refactor(loop): remove interim text retry, use system prompt constraint instead 2026-02-21 07:32:58 +00:00
Re-bin
0c2fea6d33 Merge branch 'main' into pr-795 2026-02-20 11:25:51 +00:00
Re-bin
715b2db24b feat: stream intermediate progress to user during tool execution 2026-02-18 14:23:51 +00:00
Ivan
e44f14379a fix: sanitize messages and ensure 'content' for strict LLM providers
- Strip non-standard keys like 'reasoning_content' before sending to LLM
- Always include 'content' key in assistant messages (required by StepFun)
- Add _sanitize_messages to LiteLLMProvider to prevent 400 BadRequest errors
2026-02-18 11:57:58 +03:00
Re-bin
1db05c881d fix: omit empty content in assistant messages 2026-02-17 08:59:05 +00:00