nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-21 09:02:32 +00:00

Author	SHA1	Message	Date
hussein1362	e72c415473	fix(heartbeat): prevent internal reasoning leaks and finalization fallback in delivery Three failure modes addressed: 1. Model reflects HEARTBEAT.md instructions back as output instead of executing them ("HEARTBEAT.md has active tasks listed...") 2. Model narrates decision logic ("Best judgment call: stay quiet") 3. Model produces empty output for silence, runner treats it as failure, finalization retry generates "couldn't produce a final answer" which gets delivered to the user Changes: - Add _is_deliverable() pre-filter in HeartbeatService._tick() that catches finalization fallback messages and leaked reasoning patterns before they reach the evaluator - Wrap Phase 2 task input with a delivery-awareness preamble telling the model its output goes directly to the user's messaging app - Add meta-reasoning suppression criterion to evaluator template No changes to agent/loop.py, runner.py, providers, or config schema.	2026-04-27 18:14:13 +08:00
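
The pre-filter described in e72c415473 could look roughly like the sketch below. Only the name `_is_deliverable()` and the three quoted failure strings come from the commit message; the pattern lists, the function signature, and the matching rules are assumptions made for illustration, not the actual nanobot implementation.

```python
import re
from typing import Optional

# Hypothetical markers, taken from the example strings quoted in the commit message.
_FALLBACK_MARKERS = ("couldn't produce a final answer",)
_LEAK_PATTERNS = (
    re.compile(r"^\s*HEARTBEAT\.md has\b", re.IGNORECASE),   # instructions reflected back
    re.compile(r"\bbest judgment call\b", re.IGNORECASE),    # narrated decision logic
)


def is_deliverable(text: Optional[str]) -> bool:
    """Return False for empty output, finalization fallback text, or leaked reasoning."""
    if not text or not text.strip():
        return False  # silence is a valid outcome, so nothing is delivered
    lowered = text.lower()
    if any(marker in lowered for marker in _FALLBACK_MARKERS):
        return False
    return not any(pattern.search(text) for pattern in _LEAK_PATTERNS)
```

A caller would run this check before handing the text to the evaluator and drop the message when it returns `False`.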
Xubin Ren	311a7fe36e	fix(session): stop training the model to parrot [Message Time: ...] Past assistant turns in history were prefixed with "[Message Time: ...]" just like user turns. The model treated these as in-context demos and started prefixing its own replies with the same marker, leaking metadata to the user. Prompt-level warnings could not beat dozens of prior assistant samples. Annotate only user turns and proactive deliveries (_channel_delivery=True, i.e. cron / heartbeat pushes whose timing is the whole point and which are too infrequent to act as demos). Adjacent user-side timestamps still pin every normal assistant reply for relative-time reasoning. The now-redundant identity.md warning is removed along with the demonstration source.	2026-04-27 07:11:20 +00:00
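
A minimal sketch of the annotation rule from 311a7fe36e, assuming history is a list of dicts. The `[Message Time: ...]` marker and the `_channel_delivery` flag are named in the commit message; the `timestamp` key, the function name, and the exact prefix layout are hypothetical.

```python
from typing import Any


def annotate_history(history: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Prefix timestamps on user turns and proactive deliveries only."""
    annotated = []
    for msg in history:
        is_user = msg.get("role") == "user"
        # Cron / heartbeat pushes are assistant turns flagged with _channel_delivery=True.
        is_push = msg.get("role") == "assistant" and msg.get("_channel_delivery") is True
        content = msg["content"]
        ts = msg.get("timestamp")  # hypothetical field name
        if ts and (is_user or is_push):
            content = f"[Message Time: {ts}]\n{content}"
        annotated.append({"role": msg["role"], "content": content})
    return annotated
```

Ordinary assistant replies pass through unmarked, so the model never sees its own past turns carrying the prefix.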
Xubin Ren	eeaec1f951	fix(agent): prevent message time metadata from leaking into replies	2026-04-27 06:23:43 +00:00
Xubin Ren	be05189f39	feat(channels): add video support for Telegram and WebSocket Telegram previously sent all video files as documents via send_document, so users saw a file icon instead of an inline player. WebSocket only accepted image MIME types, rejecting video uploads entirely. Telegram: - Recognize video extensions (mp4/mov/avi/mkv/webm/3gp) in _get_media_type - Route videos through send_video with supports_streaming=True - Add VIDEO/VIDEO_NOTE/ANIMATION to inbound message filters - Add video MIME mappings to _get_extension - Fix: local file sends now use _call_with_retry (previously no retry) WebSocket: - Expand upload MIME whitelist with video/mp4, video/webm, video/quicktime - Add per-type size limits (_MAX_VIDEO_BYTES=20MB, _MAX_VIDEOS_PER_MESSAGE=1) - Expand media serving endpoint to serve video with correct Content-Type Agent: - Add "video" to message tool media parameter description - Add .mp4 example to identity.md system prompt Made-with: Cursor	2026-04-25 02:20:13 +08:00
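
The extension-based routing in be05189f39 can be sketched as follows. The video extensions are the ones listed in the commit message; the image set, the `"document"` fallback label, and the function name are assumptions, and the real `_get_media_type` may differ.

```python
from pathlib import Path

_VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".webm", ".3gp"}  # from the commit message
_IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}          # assumed for illustration


def get_media_type(path: str) -> str:
    """Classify a file by extension so the sender can pick the right upload call."""
    ext = Path(path).suffix.lower()
    if ext in _VIDEO_EXTS:
        return "video"
    if ext in _IMAGE_EXTS:
        return "image"
    return "document"  # anything unrecognized is sent as a plain file
```

With a classifier like this, files typed `"video"` would go through `send_video` rather than `send_document`, which is what gives users an inline player.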
chengyongru	58110afb88	fix(templates): keep Search & Discovery heading in identity.md No reason to rename it to "Tools" — the section still covers the same grep/glob search tips as before.	2026-04-18 21:55:56 +08:00
chengyongru	34e8f97b1f	refactor(templates): separate identity and SOUL responsibilities Move all behavioral instructions out of identity.md into SOUL.md so that each file has a single clear purpose: - identity.md: capability facts only (runtime, workspace, format hints, tool guidance, untrusted content warning) - SOUL.md: behavioral rules (name, personality, execution rules) The "Act, don't narrate" rule is refined into layered behavior: act immediately on single-step tasks, plan first for multi-step tasks. This eliminates the contradiction where identity said "never end with a plan" but user SOUL.md said "always plan first".	2026-04-18 21:55:56 +08:00
Xubin Ren	cc5a666d5d	review(dream): harden line-age annotation per review feedback Follow-up to #3212, fully backward compatible: - Extract the 14-day staleness threshold as `_STALE_THRESHOLD_DAYS` module constant and pass it into the Phase 1 prompt template as `{{ stale_threshold_days }}`. The number lived in three places before (code threshold, prompt instruction, docstring); now there is one. - Add `DreamConfig.annotate_line_ages` (default True = current behavior) and propagate it through `Dream.__init__` and the gateway wiring in cli/commands.py. Gives users a knob to disable the feature without a code patch if an LLM reacts poorly to the `← Nd` suffix. - Harden `_annotate_with_ages` against dirty working trees: when HEAD blob line count disagrees with the working-tree content length, skip annotation entirely instead of assigning ages to the wrong lines. The previous `i >= len(ages)` guard only handled one direction of the mismatch. - Inline-comment the `max_iterations` 10→15 bump with a pointer to exp002 so future blame has context. - Add regression tests: end-to-end `← 30d` reaches prompt, 14/15 threshold boundary, `annotate_line_ages=False` bypasses git entirely (verified via `assert_not_called`), length-mismatch defense, and template-var rendering. Made-with: Cursor	2026-04-17 13:45:38 +08:00
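
The length-mismatch defense from cc5a666d5d might be sketched as below. The 14-day threshold and the `← Nd` suffix come from the commit messages; the function signature, the use of `None` for unknown ages, and the spacing before the arrow are assumptions, and the blame lookup itself (done via dulwich in the real code) is left out.

```python
from typing import Optional

STALE_THRESHOLD_DAYS = 14  # mirrors the `_STALE_THRESHOLD_DAYS` constant named in the commit


def annotate_with_ages(lines: list[str], ages: list[Optional[int]]) -> list[str]:
    """Append an age suffix to lines older than the threshold.

    `ages` holds one per-line age in days, as computed from the HEAD blob.
    If the working tree has diverged (different line count), skip annotation
    entirely rather than attach ages to the wrong lines.
    """
    if len(ages) != len(lines):
        return list(lines)
    annotated = []
    for line, age in zip(lines, ages):
        if age is not None and age > STALE_THRESHOLD_DAYS:
            annotated.append(f"{line} ← {age}d")
        else:
            annotated.append(line)
    return annotated
```

Comparing the two lengths up front covers both directions of the mismatch, which a per-index `i >= len(ages)` check cannot do.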
chengyongru	35f3084c03	feat(dream): per-line age annotations + dedup-aware prompt + max_iter=15 Four improvements to Dream's memory consolidation: 1. Per-line git-blame age annotations: MEMORY.md lines get `← Nd` suffixes (N>14) from dulwich annotate. SOUL.md/USER.md excluded as permanent. LLM uses content judgment, not just age, to decide what to prune. 2. Dedup-aware Phase 1 prompt: reframed as dual-task (extract facts + deduplicate existing files) with explicit redundancy patterns to scan for. Validated through 20 experiments (exp-002 prompt + max_iter=15 was best, averaging -1643 chars/5.4% compression per run). 3. Phase 1 analysis as commit body: dream git commits now include the full Phase 1 analysis for transparency via /dream-log. 4. max_iterations raised from 10 to 15: 30% improvement over 10 with no risk; 20 showed diminishing returns (exp-020: -701 vs exp-017: -1643).	2026-04-17 13:45:38 +08:00
chengyongru	6fbada5363	refactor(context): deduplicate system prompt — markdown skills index, skip template MEMORY.md - Convert skills summary from verbose XML (4-5 lines/skill) to compact markdown list (1 line/skill) with inline path for read_file lookup - Exclude always-loaded skills (e.g. memory) from the skills index to avoid duplicating content already in the Active Skills section - Skip injecting the Memory section when MEMORY.md still matches the bundled template (i.e. Dream hasn't populated it yet)	2026-04-15 15:49:30 +08:00
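
As an illustration of the compact skills index described in 6fbada5363, the sketch below renders one markdown line per skill with its path inline and skips always-loaded skills. The dict keys, the function name, and the exact line format are hypothetical; the commit message only specifies "1 line/skill" with an inline path for `read_file` lookup.

```python
def render_skills_index(skills: list[dict[str, str]], always_loaded: set[str]) -> str:
    """Render a one-line-per-skill markdown list, omitting always-loaded skills."""
    lines = []
    for skill in skills:
        if skill["name"] in always_loaded:
            continue  # already present in the Active Skills section
        lines.append(f"- **{skill['name']}**: {skill['description']} (`{skill['path']}`)")
    return "\n".join(lines)
```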
Xubin Ren	7a7f5c9689	fix(dream): use valid builtin skill template paths Point Dream skill creation at a readable builtin skill-creator template, keep skill writes rooted at the workspace, and document the new skill discovery behavior in README. Made-with: Cursor	2026-04-12 16:49:55 +08:00
chengyongru	2a243bfe4f	feat(agent): integrate skill discovery into Dream consolidation Instead of a separate skill discovery system, extend Dream's two-phase pipeline to also detect reusable behavioral patterns from conversation history and generate SKILL.md files. Phase 1 gains a [SKILL] output type for pattern detection. Phase 2 gains write_file (scoped to skills/) and read access to builtin skills, enabling it to check for duplicates and follow skill-creator's format conventions before creating new skills. Inspired by PR #3039 by @wanghesong2019. Co-authored-by: wanghesong2019 <wanghesong2019@users.noreply.github.com>	2026-04-12 16:49:55 +08:00
Xubin Ren	c7d10de253	feat(soul): restore friendly and curious tone to SOUL.md Made-with: Cursor	2026-04-08 02:22:25 +08:00
Xubin Ren	edb821e10d	feat(agent): prompt behavior directives, tool descriptions, and loop robustness	2026-04-08 02:22:25 +08:00
chengyongru	b4f985f3dc	feat(memory): dream enhancement (#2887) * feat(dream): enhance memory cleanup with staleness detection - Phase 1: add [FILE-REMOVE] directive and staleness patterns (14-day threshold, completed tasks, superseded info, resolved tracking) - Phase 2: add explicit cleanup rules, file paths section, and deletion guidance to prevent LLM path confusion - Inject current date and file sizes into Phase 1 context for age-aware analysis - Add _dream_debug() helper for observability (dream-debug.log in workspace) - Log Phase 1 analysis output and Phase 2 tool events for debugging Tested with glm-5-turbo: MEMORY.md reduced from 149 to 108-129 lines across two rounds, correctly identifying and removing weather data, detailed incident info, completed research, and stale discussions. * refactor(dream): replace _dream_debug file logger with loguru Remove the custom _dream_debug() helper that wrote to dream-debug.log and use the existing loguru logger instead. Phase 1 analysis is logged at debug level, tool events at info level — consistent with the rest of the codebase and no extra log file to manage. * fix(dream): make stale scan independent of conversation history Reframe Phase 1 from a single comparison task to two independent tasks: history diff AND proactive stale scan. The LLM was skipping stale content that wasn't referenced in conversation history (e.g. old triage snapshots). Now explicitly requires scanning memory files for staleness patterns on every run. * fix(dream): correct old_text param name and truncate debug log - Phase 2 prompt: old_string -> old_text to match EditFileTool interface - Phase 1 debug log: truncate analysis to 500 chars to avoid oversized lines * refactor(dream): streamline prompts by separating concerns Phase 1 owns all staleness judgment logic; Phase 2 is pure execution guidance. Remove duplicated cleanup rules from Phase 2 since Phase 1 already determines what to add/remove. Fix remaining old_string -> old_text. Total prompt size reduced ~45% (870 -> 480 tokens). * fix(dream): add FILE-REMOVE execution guidance to Phase 2 prompt Phase 2 was only processing [FILE] additions and ignoring [FILE-REMOVE] deletions after the cleanup rules were removed. Add explicit mapping: [FILE] → add content, [FILE-REMOVE] → delete content.	2026-04-07 22:39:47 +08:00
Xubin Ren	c9d4b7b905	Merge remote-tracking branch 'origin/main' into pr-2449 Made-with: Cursor # Conflicts: # nanobot/utils/evaluator.py	2026-04-06 06:30:11 +00:00
Xubin Ren	33bef8d508	Merge remote-tracking branch 'origin/main' into feat/search-tools Made-with: Cursor	2026-04-04 14:37:59 +00:00
Xubin Ren	c3b4ebae53	refactor(agent): move internal prompts into packaged templates	2026-04-04 11:09:37 +00:00
Jack Lu	d436a1d678	feat: integrate Jinja2 templating for agent responses and memory consolidation - Added Jinja2 template support for various agent responses, including identity, skills, and memory consolidation. - Introduced new templates for evaluating notifications, handling subagent announcements, and managing platform policies. - Updated the agent context and memory modules to utilize the new templating system for improved readability and maintainability. - Added a new dependency on Jinja2 in pyproject.toml.	2026-04-04 14:18:22 +08:00
Xubin Ren	15cc9b23b4	feat(agent): add built-in grep and glob search tools	2026-04-02 15:37:57 +00:00
Re-bin	c05cb2ef64	refactor(cron): remove CLI cron commands and unify scheduling via cron tool	2026-03-03 05:51:24 +00:00
Re-bin	cdbede2fa8	refactor: simplify /stop dispatch, inline commands, trim verbose docstrings	2026-02-25 17:04:08 +00:00
Re-bin	30361c9307	refactor: replace cron usage docs in TOOLS.md with reference to cron skill	2026-02-23 18:28:09 +00:00
Re-bin	35e3f7ed26	fix(templates): tighten AGENTS.md tool call guidelines to reduce hallucinations	2026-02-23 14:10:43 +00:00
Re-bin	577b3d104a	refactor: move workspace/ to nanobot/templates/ for packaging	2026-02-23 08:08:01 +00:00

24 Commits