nanobot/dream_phase1.md at 80bfcf44736e954cb7e85ebab0177264d9c078a8

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-25 11:02:34 +00:00

Xubin Ren cc5a666d5d review(dream): harden line-age annotation per review feedback

Follow-up to #3212, fully backward compatible:

- Extract the 14-day staleness threshold as `_STALE_THRESHOLD_DAYS` module
  constant and pass it into the Phase 1 prompt template as
  `{{ stale_threshold_days }}`. The number lived in three places before
  (code threshold, prompt instruction, docstring); now there is one.
- Add `DreamConfig.annotate_line_ages` (default True = current behavior)
  and propagate it through `Dream.__init__` and the gateway wiring in
  cli/commands.py. Gives users a knob to disable the feature without a
  code patch if an LLM reacts poorly to the `← Nd` suffix.
- Harden `_annotate_with_ages` against dirty working trees: when HEAD
  blob line count disagrees with the working-tree content length, skip
  annotation entirely instead of assigning ages to the wrong lines. The
  previous `i >= len(ages)` guard only handled one direction of the
  mismatch.
- Inline-comment the `max_iterations` 10→15 bump with a pointer to
  exp002 so future blame has context.
- Add 4 regression tests: end-to-end `← 30d` reaches prompt, 14/15
  threshold boundary, `annotate_line_ages=False` bypasses git entirely
  (verified via `assert_not_called`), length-mismatch defense, and
  template-var rendering.
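
The length-mismatch guard and the `annotate_line_ages` switch described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code. The function name, its signature, the decision to annotate only lines older than the threshold, and the exact suffix spacing are all assumptions; only `_STALE_THRESHOLD_DAYS` and the `← Nd` suffix come from the commit message.

```python
from typing import List, Optional

# Name taken from the commit message; the value 14 is the threshold it cites.
_STALE_THRESHOLD_DAYS = 14


def annotate_with_ages(
    content: str,
    ages: Optional[List[int]],
    enabled: bool = True,
    threshold_days: int = _STALE_THRESHOLD_DAYS,
) -> str:
    """Append a "← Nd" suffix to non-blank lines older than threshold_days.

    `ages` is assumed to hold one age in days per line of the HEAD blob,
    or None when git information is unavailable.
    """
    if not enabled or ages is None:
        return content
    lines = content.splitlines()
    # Dirty working tree: the counts disagree in either direction, so any
    # line-to-age mapping would be unreliable. Skip annotation entirely.
    if len(lines) != len(ages):
        return content
    annotated = []
    for line, age in zip(lines, ages):
        if line.strip() and age > threshold_days:
            annotated.append(f"{line}  ← {age}d")
        else:
            annotated.append(line)
    return "\n".join(annotated)
```

Comparing the two lengths for equality covers both directions of the mismatch, which an index guard such as `i >= len(ages)` alone does not.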

Made-with: Cursor

2026-04-17 13:45:38 +08:00

2.4 KiB


You have TWO equally important tasks:

Extract new facts from conversation history
Deduplicate existing memory files — find and flag redundant, overlapping, or stale content even if NOT mentioned in history

Output one line per finding:

[FILE] atomic fact (not already in memory)
[FILE-REMOVE] reason for removal
[SKILL] kebab-case-name: one-line description of the reusable pattern
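
For readers wiring this prompt into a pipeline, a downstream consumer might split the model's output into tagged findings along these lines. This is a hypothetical sketch: `Finding` and `parse_findings` are illustrative names, and it assumes `[FILE]` stands for one of `USER`, `SOUL`, or `MEMORY`, with `[FILE-REMOVE]` expanding to tags such as `MEMORY-REMOVE`.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    tag: str   # e.g. "USER", "MEMORY-REMOVE", "SKILL"
    text: str  # everything after the bracketed tag


# One finding per line: a bracketed upper-case tag followed by free text.
_LINE_RE = re.compile(r"^\[([A-Z]+(?:-REMOVE)?)\]\s*(.*)$")


def parse_findings(output: str) -> List[Finding]:
    """Collect tagged lines, ignoring [SKIP] and anything untagged."""
    findings = []
    for raw in output.splitlines():
        match = _LINE_RE.match(raw.strip())
        if match is None:
            continue  # untagged prose or blank line
        tag, text = match.group(1), match.group(2).strip()
        if tag == "SKIP":
            continue  # the model reported nothing to update
        findings.append(Finding(tag, text))
    return findings
```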

Files: USER (identity, preferences), SOUL (bot behavior, tone), MEMORY (knowledge, project context)

Rules:

Atomic facts: "has a cat named Luna" not "discussed pet care"
Corrections: [USER] location is Tokyo, not Osaka
Capture confirmed approaches the user validated

Deduplication — scan ALL memory files for these redundancy patterns:

Same fact stated in multiple places (e.g., "communicates in Chinese" in both USER.md and multiple MEMORY.md entries)
Overlapping or nested sections covering the same topic
Information in MEMORY.md that is already captured in USER.md or SOUL.md (MEMORY.md should not duplicate permanent-file content)
Verbose entries that can be condensed without losing information

For each duplicate found, output [FILE-REMOVE] for the less authoritative copy (prefer keeping facts in their canonical location)

Staleness — MEMORY.md lines may have a ← Nd suffix showing days since last modification:

SOUL.md and USER.md have no age annotations: they are permanent, so update them only with corrections
Age only indicates when content was last touched, not whether it should be removed
Use content judgment: user habits/preferences/personality traits are permanent regardless of age
Only prune content that is objectively outdated: past events, resolved tracking, superseded approaches
Lines with ← Nd (N>{{ stale_threshold_days }}) deserve closer review but are NOT automatically removable
When removing: prefer deleting individual items over entire sections
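
To make the "closer review, not automatic removal" rule concrete, the sketch below reads a `← Nd` suffix back off a line and flags it only as a review candidate. The helper names and the regular expression are assumptions made for illustration; the default of 14 mirrors the threshold cited in the commit message.

```python
import re
from typing import Optional, Tuple

# Matches a trailing "← Nd" suffix plus any whitespace around it.
_AGE_SUFFIX_RE = re.compile(r"\s*← (\d+)d\s*$")


def split_age(line: str) -> Tuple[str, Optional[int]]:
    """Return (line without suffix, age in days), or (line, None) if unannotated."""
    match = _AGE_SUFFIX_RE.search(line)
    if match is None:
        return line, None
    return line[: match.start()], int(match.group(1))


def needs_closer_review(line: str, stale_threshold_days: int = 14) -> bool:
    """True when the line is older than the threshold.

    This marks a candidate only. Whether to prune still depends on the
    content, for example a past event versus a lasting preference.
    """
    _, age = split_age(line)
    return age is not None and age > stale_threshold_days
```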

Skill discovery — flag [SKILL] when ALL of these are true:

A specific, repeatable workflow appeared 2+ times in the conversation history
It involves clear steps (not vague preferences like "likes concise answers")
It is substantial enough to warrant its own instruction set (not trivial like "read a file")
Do not worry about duplicates — the next phase will check against existing skills

Do not add: current weather, transient status, temporary errors, conversational filler.

[SKIP] if nothing needs updating.
