nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-19 16:12:30 +00:00

Author	SHA1	Message	Date
chengyongru	ac9a2d0c25	test(pairing): cover _PENDING_USER_TURN_KEY cleanup and None allow_from - Assert pending_user_turn is cleared from session metadata after shortcut commands (e.g. /help) in test_auto_compact.py. - Add test for None allow_from / allowFrom values in test_base_channel.py to prevent TypeError regressions.	2026-05-15 15:46:44 +08:00
chengyongru	26665823e3	fix(agent): persist shortcut commands without polluting LLM context Shortcut commands (e.g. /help, /pairing) skip BUILD and SAVE states, so their turns were never persisted to the session. This caused WebUI chats to appear empty after _turn_end because history hydration reads from the session file. Fix by persisting the user message and assistant response inside _state_command, but tag them with _command=True so Session.get_history filters them out of LLM context. /new is excluded because it intentionally clears the session. - AgentLoop._persist_user_message_early now accepts **kwargs so _state_command can pass _command=True for the user turn. - Session.get_history skips messages with _command=True.	2026-05-14 23:51:58 +08:00
chengyongru	a6e993df25	fix(agent): move archived summary into system prompt for KV cache stability - Append [Archived Context Summary] to system prompt instead of injecting it into the user message runtime context, improving KV cache reuse across turns and avoiding consecutive same-role messages. - _last_summary persists in metadata (no pop) for restart survival; summary is re-injected every turn via the stable system prompt. - Remove dynamic "Inactive for X minutes" from _format_summary — use static last_active timestamp instead to preserve KV cache stability. - Pass session_summary through build_messages() so both normal and ask_user paths receive the archived summary in the system prompt. - estimate_session_prompt_tokens now reads _last_summary from metadata to include the summary in token budget estimation. - Remove obsolete session_summary parameter from maybe_consolidate_by_tokens and estimate_session_prompt_tokens call sites in loop.py (summary flows through build_messages instead). - Ensure /new (session.clear()) clears _last_summary from metadata.	2026-05-11 01:25:15 +08:00
Xubin Ren	9252f4d826	Revert "fix(agent): persist _last_summary across restarts with used sentinel" This reverts commit e5a1416a37b423de95b0fa279e9473110a678112.	2026-05-09 15:00:54 +08:00
chengyongru	e5a1416a37	fix(agent): persist _last_summary across restarts with used sentinel The previous implementation popped _last_summary from session.metadata after injecting it into the prompt, then saved the session. This caused the summary to be permanently lost after a process restart, making the AI forget archived context and appear to ignore memory or reference non-existent previous messages. Replace the destructive pop with a _last_summary_used sentinel: - _last_summary stays in metadata for restart survival - _last_summary_used prevents duplicate injection within the same turn - Clear the sentinel whenever a new summary is generated Updates tests to match the new persistence behavior.	2026-05-09 14:58:38 +08:00
Xubin Ren	ad4802600e	refactor(config): make max messages default explicit Use 120 as the config-level default and normalize zero back to that limit so session replay always receives an explicit message cap. Made-with: Cursor	2026-04-28 14:54:32 +08:00
Xubin Ren	eb4b3d9e26	refactor(session): internalize history/file-cap knobs as constants Move sessionHistoryMaxMessages, sessionHistoryMaxTokens, and sessionFileMaxMessages out of user-facing config into internal constants (HISTORY_MAX_MESSAGES=120, FILE_MAX_MESSAGES=2000). - Remove 3 fields from AgentDefaults and config pipeline - Sink enforce_file_cap into Session (was AgentLoop) - Auto-derive token budget from context window (was configurable) - Net -113 lines across 7 files; 723 tests green Made-with: Cursor	2026-04-27 08:06:50 +00:00
Xubin Ren	29ebc2d355	Merge origin/main into feat/session-replay-file-cap-invariants Preserve main's timestamp/tool-context replay semantics while keeping the PR's session history and file-cap budgets. Made-with: Cursor	2026-04-27 07:32:00 +00:00
hanyuanling	59dfd74842	feat(session): enforce replay/file-cap invariants for history lifecycle	2026-04-27 00:53:32 +08:00
chengyongru	becaff3e9d	fix(agent): skip auto-compact for sessions with active agent tasks Prevent proactive compaction from archiving sessions that have an in-flight agent task, avoiding mid-turn context truncation when a task runs longer than the idle TTL.	2026-04-13 12:51:37 +08:00
Xubin Ren	84e840659a	refactor(config): rename auto compact config key Prefer the more user-friendly idleCompactAfterMinutes name for auto compact while keeping sessionTtlMinutes as a backward-compatible alias. Update tests and README to document the retained recent-context behavior and the new preferred key.	2026-04-11 15:56:41 +08:00
Xubin Ren	1cb28b39a3	feat(agent): retain recent context during auto compact Keep a legal recent suffix in idle auto-compacted sessions so resumed chats preserve their freshest live context while older messages are summarized. Recover persisted summaries even when retained messages remain, and document the new behavior.	2026-04-11 15:56:41 +08:00
chengyongru	d03458f034	fix(agent): eliminate race condition in auto compact summary retrieval Make Consolidator.archive() return the summary string directly instead of writing to history.jsonl then reading back via get_last_history_entry(). This eliminates a race condition where concurrent _archive calls for different sessions could read each other's summaries from the shared history file (cross-user context leak in multi-user deployments). Also removes Consolidator.get_last_history_entry() — no longer needed.	2026-04-11 15:56:41 +08:00
chengyongru	fb6dd111e1	feat(agent): auto compact — proactive session compression to reduce token cost and latency (#2982) When a user is idle for longer than a configured TTL, nanobot proactively compresses the session context into a summary. This reduces token cost and first-token latency when the user returns — instead of re-processing a long stale context with an expired KV cache, the model receives a compact summary and fresh input.	2026-04-11 15:56:41 +08:00
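
Commit 26665823e3 in the table above describes persisting shortcut-command turns while tagging them with _command=True so that Session.get_history leaves them out of the LLM context. The following is a minimal sketch of that idea, not the repository's actual code: the Session class here is a toy stand-in, and the add_message helper and the message dictionary layout are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Session:
    """Toy stand-in for a chat session; not nanobot's real Session class."""

    messages: list[dict[str, Any]] = field(default_factory=list)

    def add_message(self, role: str, content: str, **extra: Any) -> None:
        # Hypothetical helper: extra keys such as _command=True are stored
        # next to the message so they would survive being written to disk.
        self.messages.append({"role": role, "content": content, **extra})

    def get_history(self) -> list[dict[str, str]]:
        # Skip command-tagged turns and drop private keys, so only plain
        # role/content pairs are handed to the model.
        return [
            {"role": m["role"], "content": m["content"]}
            for m in self.messages
            if not m.get("_command")
        ]


session = Session()
session.add_message("user", "/help", _command=True)
session.add_message("assistant", "Available commands: ...", _command=True)
session.add_message("user", "What is auto compact?")

# All three turns stay in the session (so a UI can replay them),
# but only the untagged turn is returned for the LLM context.
llm_context = session.get_history()
```

Under this sketch the stored session keeps every turn for history hydration, while the filtered view is what a prompt builder would consume.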

14 Commits
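
Two of the config commits listed above (ad4802600e and 84e840659a) describe small normalization rules: a zero message cap falls back to 120, and idleCompactAfterMinutes is preferred while sessionTtlMinutes remains a backward-compatible alias. The sketch below illustrates both under stated assumptions. The function names are hypothetical, treating a missing value like zero is an assumption, and so is the choice that the preferred key wins when both keys are present.

```python
from typing import Any, Optional

DEFAULT_MAX_MESSAGES = 120  # default value named in commit ad4802600e


def normalize_max_messages(value: Optional[int]) -> int:
    # Zero is normalized back to the default, as the commit message says.
    # Treating None the same way is an assumption of this sketch.
    if not value:
        return DEFAULT_MAX_MESSAGES
    return value


def idle_compact_minutes(raw: dict[str, Any]) -> Optional[int]:
    # Prefer the new key; fall back to the legacy alias.
    # Precedence when both keys are set is assumed, not confirmed by the log.
    if "idleCompactAfterMinutes" in raw:
        return raw["idleCompactAfterMinutes"]
    return raw.get("sessionTtlMinutes")


cap_from_zero = normalize_max_messages(0)
cap_explicit = normalize_max_messages(50)
legacy_only = idle_compact_minutes({"sessionTtlMinutes": 30})
both_keys = idle_compact_minutes(
    {"idleCompactAfterMinutes": 15, "sessionTtlMinutes": 30}
)
```

Resolving the alias in one place keeps the rest of the code reading a single value, which is one plausible reason to retain the old key only at the config boundary.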