Merge branch 'main' of https://github.com/HKUDS/nanobot into codex/webui-performance

Xubin Ren 2026-05-18 01:18:28 +08:00
commit 8708ccea86
8 changed files with 635 additions and 363 deletions


@ -26,7 +26,52 @@ Instead of storing secrets directly in `config.json`, you can use `${VAR_NAME}`
} }
``` ```
For **systemd** deployments, use `EnvironmentFile=` in the service unit to load variables from a file that only the deploying user can read: Any string value in `config.json` can use `${VAR_NAME}`. Resolution runs once at startup, in memory only — resolved values are never written back to disk, so editing config through `nanobot onboard` or the WebUI preserves the placeholder.
If a referenced variable is unset, nanobot fails fast at startup with `ValueError: Environment variable 'NAME' referenced in config is not set`.
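
The resolution step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not nanobot's actual implementation: the function name `resolve_env_placeholders`, the regular expression, and the recursive walk are all hypothetical. Only the `${VAR_NAME}` syntax and the error message come from the text above.

```python
from __future__ import annotations

import os
import re
from collections.abc import Mapping
from typing import Any

# Assumed placeholder grammar: ${NAME} with a conventional identifier inside.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_placeholders(value: Any, env: Mapping[str, str] | None = None) -> Any:
    """Return a copy of ``value`` with ${VAR} expanded in every string (hypothetical helper)."""
    env = os.environ if env is None else env

    def _substitute(match: re.Match[str]) -> str:
        name = match.group(1)
        if name not in env:
            # Fail fast, using the message quoted in the docs above.
            raise ValueError(
                f"Environment variable '{name}' referenced in config is not set"
            )
        return env[name]

    if isinstance(value, str):
        return _PLACEHOLDER.sub(_substitute, value)
    if isinstance(value, dict):
        return {k: resolve_env_placeholders(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env_placeholders(v, env) for v in value]
    return value  # numbers, booleans, None pass through unchanged
```

Because the sketch builds new containers instead of mutating its input, the parsed config that would be written back to disk still holds the placeholder, which matches the "in memory only" behaviour described above.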
### More examples
**MCP servers** — both stdio `env` and HTTP `headers`:
```json
{
"tools": {
"mcpServers": {
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}" }
},
"remote": {
"url": "https://example.com/mcp/",
"headers": { "Authorization": "Bearer ${REMOTE_MCP_TOKEN}" }
}
}
}
}
```
**Web search providers:**
```json
{
"tools": {
"web": {
"search": {
"provider": "brave",
"apiKey": "${BRAVE_API_KEY}"
}
}
}
}
```
### Loading variables at startup
Pick whatever fits your deployment — nanobot only reads `os.environ` at startup, so any mechanism that populates the process environment works.
**systemd** — use `EnvironmentFile=` in the service unit to load variables from a file that only the deploying user can read:
```ini ```ini
# /etc/systemd/system/nanobot.service (excerpt) # /etc/systemd/system/nanobot.service (excerpt)
@ -42,6 +87,35 @@ TELEGRAM_TOKEN=your-token-here
IMAP_PASSWORD=your-password-here IMAP_PASSWORD=your-password-here
``` ```
**Docker** — pass an env file to the locally built image (one `KEY=VALUE` per line), or use `-e KEY=value`:
```bash
docker run --rm --env-file=./nanobot.env \
-v ~/.nanobot:/home/nanobot/.nanobot \
nanobot agent -m "Hello"
```
**direnv** — drop a `.envrc` in your working directory and run `direnv allow`:
```bash
# .envrc (auto-loaded by direnv)
export TELEGRAM_TOKEN=your-token-here
export ANTHROPIC_API_KEY=...
```
**Secret managers (1Password, Bitwarden, pass)** — wrap the process so secrets only exist as env vars for the lifetime of the run, never on disk:
```bash
# 1Password — references in .env.tpl look like `op://Vault/Item/field`
op run --env-file=.env.tpl -- nanobot agent
# pass (passwordstore.org)
ANTHROPIC_API_KEY="$(pass show api/anthropic)" nanobot agent
# Bitwarden
ANTHROPIC_API_KEY="$(bw get password api/anthropic)" nanobot agent
```
## Providers ## Providers
> [!TIP] > [!TIP]
@ -917,7 +991,7 @@ By default, web search uses `duckduckgo`, and it works out of the box without an
"web": { "web": {
"search": { "search": {
"provider": "brave", "provider": "brave",
"apiKey": "BSA..." "apiKey": "${BRAVE_API_KEY}"
} }
} }
} }
@ -931,7 +1005,7 @@ By default, web search uses `duckduckgo`, and it works out of the box without an
"web": { "web": {
"search": { "search": {
"provider": "tavily", "provider": "tavily",
"apiKey": "tvly-..." "apiKey": "${TAVILY_API_KEY}"
} }
} }
} }
@ -945,7 +1019,7 @@ By default, web search uses `duckduckgo`, and it works out of the box without an
"web": { "web": {
"search": { "search": {
"provider": "jina", "provider": "jina",
"apiKey": "jina_..." "apiKey": "${JINA_API_KEY}"
} }
} }
} }
@ -959,7 +1033,7 @@ By default, web search uses `duckduckgo`, and it works out of the box without an
"web": { "web": {
"search": { "search": {
"provider": "kagi", "provider": "kagi",
"apiKey": "your-kagi-api-key" "apiKey": "${KAGI_API_KEY}"
} }
} }
} }
@ -973,7 +1047,7 @@ By default, web search uses `duckduckgo`, and it works out of the box without an
"web": { "web": {
"search": { "search": {
"provider": "olostep", "provider": "olostep",
"apiKey": "YOUR_OLOSTEP_API_KEY" "apiKey": "${OLOSTEP_API_KEY}"
} }
} }
} }
@ -1136,6 +1210,8 @@ MCP tools are automatically discovered and registered on startup. The LLM can us
> [!TIP] > [!TIP]
> For production deployments, set `"restrictToWorkspace": true` and `"tools.exec.sandbox": "bwrap"` in your config to sandbox the agent. > For production deployments, set `"restrictToWorkspace": true` and `"tools.exec.sandbox": "bwrap"` in your config to sandbox the agent.
For API keys, tokens, and other secrets, see [Environment Variables for Secrets](#environment-variables-for-secrets) — avoid storing them directly in `config.json`.
| Option | Default | Description | | Option | Default | Description |
|--------|---------|-------------| |--------|---------|-------------|
| `tools.restrictToWorkspace` | `false` | When `true`, restricts **all** agent tools (shell, file read/write/edit, list) to the workspace directory. Prevents path traversal and out-of-scope access. | | `tools.restrictToWorkspace` | `false` | When `true`, restricts **all** agent tools (shell, file read/write/edit, list) to the workspace directory. Prevents path traversal and out-of-scope access. |


@ -10,6 +10,18 @@
> [!IMPORTANT] > [!IMPORTANT]
> Official Docker usage currently means building from this repository with the included `Dockerfile`. Docker Hub images under third-party namespaces are not maintained or verified by HKUDS/nanobot; do not mount API keys or bot tokens into them unless you trust the publisher. > Official Docker usage currently means building from this repository with the included `Dockerfile`. Docker Hub images under third-party namespaces are not maintained or verified by HKUDS/nanobot; do not mount API keys or bot tokens into them unless you trust the publisher.
> [!IMPORTANT]
> The gateway and WebSocket channel default to `host: "127.0.0.1"` in `config.json` (set in `nanobot/config/schema.py`). Docker `-p` port forwarding cannot reach a container's loopback interface, so for the host or LAN to reach the exposed ports you must set both binds to `0.0.0.0` in `~/.nanobot/config.json` before starting the container:
>
> ```json
> {
> "gateway": { "host": "0.0.0.0" },
> "channels": { "websocket": { "host": "0.0.0.0" } }
> }
> ```
>
> When `host` is `0.0.0.0`, the gateway refuses to start unless `token` or `tokenIssueSecret` is also configured on the WebSocket channel — see [`webui/README.md`](../webui/README.md) for details.
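
The startup rule in the note above can be checked before launching the container with a few lines of Python. This is a minimal sketch, assuming the check applies to the WebSocket channel's `host` and that `token` and `tokenIssueSecret` live under `channels.websocket`. The function name `websocket_bind_problems` is hypothetical and is not part of nanobot.

```python
from __future__ import annotations

from typing import Any


def websocket_bind_problems(config: dict[str, Any]) -> list[str]:
    """Return human-readable problems with the WebSocket bind settings (hypothetical helper)."""
    ws = config.get("channels", {}).get("websocket", {})
    # Assumed default, taken from the note above.
    host = ws.get("host", "127.0.0.1")
    has_auth = bool(ws.get("token") or ws.get("tokenIssueSecret"))

    problems: list[str] = []
    if host == "0.0.0.0" and not has_auth:
        problems.append(
            "channels.websocket.host is 0.0.0.0 but neither token nor "
            "tokenIssueSecret is set"
        )
    return problems
```

Running it against the parsed `~/.nanobot/config.json` before `docker run` surfaces the misconfiguration on the host instead of as a failed container start.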
### Docker Compose ### Docker Compose
```bash ```bash
@ -36,8 +48,20 @@ docker run -v ~/.nanobot:/home/nanobot/.nanobot --rm nanobot onboard
# Edit config on host to add API keys # Edit config on host to add API keys
vim ~/.nanobot/config.json vim ~/.nanobot/config.json
# Run gateway (connects to enabled channels, e.g. Telegram/Discord/Mochat) # Run gateway (connects to enabled channels, e.g. Telegram/Discord/Mochat).
docker run -v ~/.nanobot:/home/nanobot/.nanobot -p 18790:18790 nanobot gateway # Mirrors the security caps and port mappings declared in docker-compose.yml:
# - `--cap-drop ALL --cap-add SYS_ADMIN` + unconfined apparmor/seccomp are required
# when `tools.exec.sandbox: "bwrap"` is enabled (bwrap needs CAP_SYS_ADMIN for
# user namespaces). Without them, `bwrap` exits with `clone3: Operation not permitted`.
# - `-p 8765:8765` exposes the WebSocket channel / WebUI alongside the gateway health
# endpoint on 18790.
docker run \
--cap-drop ALL --cap-add SYS_ADMIN \
--security-opt apparmor=unconfined \
--security-opt seccomp=unconfined \
-v ~/.nanobot:/home/nanobot/.nanobot \
-p 18790:18790 -p 8765:8765 \
nanobot gateway
# Or run a single command # Or run a single command
docker run -v ~/.nanobot:/home/nanobot/.nanobot --rm nanobot agent -m "Hello!" docker run -v ~/.nanobot:/home/nanobot/.nanobot --rm nanobot agent -m "Hello!"


@ -4,7 +4,7 @@ from __future__ import annotations
from collections.abc import Collection from collections.abc import Collection
from datetime import datetime from datetime import datetime
from typing import TYPE_CHECKING, Any, Callable, Coroutine from typing import TYPE_CHECKING, Callable, Coroutine
from loguru import logger from loguru import logger
@ -37,27 +37,6 @@ class AutoCompact:
def _format_summary(text: str, last_active: datetime) -> str: def _format_summary(text: str, last_active: datetime) -> str:
return f"Previous conversation summary (last active {last_active.isoformat()}):\n{text}" return f"Previous conversation summary (last active {last_active.isoformat()}):\n{text}"
def _split_unconsolidated(
self, session: Session,
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
"""Split live session tail into archiveable prefix and retained recent suffix."""
tail = list(session.messages[session.last_consolidated:])
if not tail:
return [], []
probe = Session(
key=session.key,
messages=tail.copy(),
created_at=session.created_at,
updated_at=session.updated_at,
metadata={},
last_consolidated=0,
)
probe.retain_recent_legal_suffix(self._RECENT_SUFFIX_MESSAGES)
kept = probe.messages
cut = len(tail) - len(kept)
return tail[:cut], kept
def check_expired(self, schedule_background: Callable[[Coroutine], None], def check_expired(self, schedule_background: Callable[[Coroutine], None],
active_session_keys: Collection[str] = ()) -> None: active_session_keys: Collection[str] = ()) -> None:
"""Schedule archival for idle sessions, skipping those with in-flight agent tasks.""" """Schedule archival for idle sessions, skipping those with in-flight agent tasks."""
@ -74,33 +53,17 @@ class AutoCompact:
async def _archive(self, key: str) -> None: async def _archive(self, key: str) -> None:
try: try:
self.sessions.invalidate(key) summary = await self.consolidator.compact_idle_session(
session = self.sessions.get_or_create(key) key, self._RECENT_SUFFIX_MESSAGES,
archive_msgs, kept_msgs = self._split_unconsolidated(session) )
if not archive_msgs and not kept_msgs:
session.updated_at = datetime.now()
self.sessions.save(session)
return
last_active = session.updated_at
summary = ""
if archive_msgs:
summary = await self.consolidator.archive(archive_msgs) or ""
if summary and summary != "(nothing)": if summary and summary != "(nothing)":
self._summaries[key] = (summary, last_active) session = self.sessions.get_or_create(key)
session.metadata["_last_summary"] = {"text": summary, "last_active": last_active.isoformat()} meta = session.metadata.get("_last_summary")
session.messages = kept_msgs if isinstance(meta, dict):
session.last_consolidated = 0 self._summaries[key] = (
session.updated_at = datetime.now() meta["text"],
self.sessions.save(session) datetime.fromisoformat(meta["last_active"]),
if archive_msgs: )
logger.info(
"Auto-compact: archived {} (archived={}, kept={}, summary={})",
key,
len(archive_msgs),
len(kept_msgs),
bool(summary),
)
except Exception: except Exception:
logger.exception("Auto-compact: failed for {}", key) logger.exception("Auto-compact: failed for {}", key)
finally: finally:


@ -678,11 +678,18 @@ class Consolidator:
The budget reserves space for completion tokens and a safety buffer The budget reserves space for completion tokens and a safety buffer
so the LLM request never exceeds the context window. so the LLM request never exceeds the context window.
""" """
if not session.messages or self.context_window_tokens <= 0: if self.context_window_tokens <= 0:
return return
lock = self.get_lock(session.key) lock = self.get_lock(session.key)
async with lock: async with lock:
# Refresh session reference: AutoCompact may have replaced it.
fresh = self.sessions.get_or_create(session.key)
if fresh is not session:
session = fresh
if not session.messages:
return
budget = self._input_token_budget budget = self._input_token_budget
target = int(budget * self.consolidation_ratio) target = int(budget * self.consolidation_ratio)
last_summary = await self._consolidate_replay_overflow( last_summary = await self._consolidate_replay_overflow(
@ -769,6 +776,74 @@ class Consolidator:
# the summary injection strategy with AutoCompact._archive(). # the summary injection strategy with AutoCompact._archive().
self._persist_last_summary(session, last_summary) self._persist_last_summary(session, last_summary)
async def compact_idle_session(
self,
session_key: str,
max_suffix: int = 8,
) -> str | None:
"""Hard-truncate an idle session under the consolidation lock.
Used by AutoCompact so all session mutation goes through a single
lock-protected path. Returns the summary text on success, ``None``
if the LLM failed (raw_archive fallback), or ``""`` if there was
nothing to archive.
"""
lock = self.get_lock(session_key)
async with lock:
self.sessions.invalidate(session_key)
session = self.sessions.get_or_create(session_key)
tail = list(session.messages[session.last_consolidated:])
if not tail:
session.updated_at = datetime.now()
self.sessions.save(session)
return ""
probe = Session(
key=session.key,
messages=tail.copy(),
created_at=session.created_at,
updated_at=session.updated_at,
metadata={},
last_consolidated=0,
)
probe.retain_recent_legal_suffix(max_suffix)
kept = probe.messages
cut = len(tail) - len(kept)
archive_msgs = tail[:cut]
if not archive_msgs and not kept:
session.updated_at = datetime.now()
self.sessions.save(session)
return ""
last_active = session.updated_at
summary: str | None = ""
if archive_msgs:
summary = await self.archive(archive_msgs)
if summary and summary != "(nothing)":
session.metadata["_last_summary"] = {
"text": summary,
"last_active": last_active.isoformat(),
}
session.messages = kept
session.last_consolidated = 0
session.updated_at = datetime.now()
self.sessions.save(session)
if archive_msgs:
logger.info(
"Idle-session compact for {}: archived={}, kept={}, summary={}",
session_key,
len(archive_msgs),
len(kept),
bool(summary),
)
return summary
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# Dream — heavyweight cron-scheduled memory consolidation # Dream — heavyweight cron-scheduled memory consolidation
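
The locking change in this hunk can be illustrated with a toy model. The sketch below is not nanobot code: `ToySession`, `ToyStore`, and `ToyConsolidator` are hypothetical stand-ins that only mimic the `invalidate` / `get_or_create` / `save` calls seen above. It shows why a caller holding an older session object should re-fetch it after acquiring the per-key lock.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field


@dataclass
class ToySession:
    key: str
    messages: list[str] = field(default_factory=list)


class ToyStore:
    """In-memory store; invalidate() drops the cached object for a key."""

    def __init__(self) -> None:
        self._saved: dict[str, list[str]] = {}
        self._cache: dict[str, ToySession] = {}

    def get_or_create(self, key: str) -> ToySession:
        if key not in self._cache:
            self._cache[key] = ToySession(key, list(self._saved.get(key, [])))
        return self._cache[key]

    def save(self, session: ToySession) -> None:
        self._saved[session.key] = list(session.messages)
        self._cache[session.key] = session

    def invalidate(self, key: str) -> None:
        self._cache.pop(key, None)


class ToyConsolidator:
    def __init__(self, store: ToyStore) -> None:
        self.store = store
        self._locks: dict[str, asyncio.Lock] = {}

    def get_lock(self, key: str) -> asyncio.Lock:
        if key not in self._locks:
            self._locks[key] = asyncio.Lock()
        return self._locks[key]

    async def compact_idle(self, key: str, max_suffix: int) -> None:
        async with self.get_lock(key):
            self.store.invalidate(key)
            session = self.store.get_or_create(key)  # a new object replaces the cached one
            await asyncio.sleep(0)  # stand-in for the awaited summary call
            session.messages = session.messages[-max_suffix:]
            self.store.save(session)

    async def consolidate(self, session: ToySession) -> int:
        async with self.get_lock(session.key):
            # Refresh the reference: the compaction path may have replaced the object.
            fresh = self.store.get_or_create(session.key)
            if fresh is not session:
                session = fresh
            return len(session.messages)


async def demo() -> int:
    store = ToyStore()
    stale = store.get_or_create("cli:test")
    stale.messages = [f"m{i}" for i in range(20)]
    store.save(stale)

    consolidator = ToyConsolidator(store)
    compact = asyncio.create_task(consolidator.compact_idle("cli:test", 8))
    await asyncio.sleep(0)  # let the compaction task take the lock first
    seen = await consolidator.consolidate(stale)
    await compact
    return seen
```

In this toy run, `asyncio.run(demo())` returns the compacted length of 8 instead of the 20 messages still attached to the stale object, because both paths serialize on the same lock and the second one re-reads the session after acquiring it.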


@ -45,6 +45,73 @@ def _add_turns(session, turns: int, *, prefix: str = "msg") -> None:
session.add_message("assistant", f"{prefix} assistant {i}") session.add_message("assistant", f"{prefix} assistant {i}")
def _make_fake_compact(
loop: AgentLoop,
*,
summary: str = "Summary.",
on_archive=None,
track_archived: list | None = None,
track_count: bool = False,
):
"""Return a fake compact_idle_session that mirrors the real method's session mutation."""
from nanobot.session.manager import Session as _Session
state = {"count": 0}
async def _fake_compact(key: str, max_suffix: int = 8) -> str:
state["count"] += 1
session = loop.sessions.get_or_create(key)
tail = list(session.messages[session.last_consolidated:])
if not tail:
session.updated_at = datetime.now()
loop.sessions.save(session)
return ""
probe = _Session(
key=session.key,
messages=tail.copy(),
created_at=session.created_at,
updated_at=session.updated_at,
metadata={},
last_consolidated=0,
)
probe.retain_recent_legal_suffix(max_suffix)
kept = probe.messages
cut = len(tail) - len(kept)
archive_msgs = tail[:cut]
if not archive_msgs and not kept:
session.updated_at = datetime.now()
loop.sessions.save(session)
return ""
last_active = session.updated_at
s = summary
if archive_msgs:
if on_archive:
result = on_archive(archive_msgs)
s = result if isinstance(result, str) else summary
if track_archived is not None:
track_archived.extend(archive_msgs)
if s and s != "(nothing)":
session.metadata["_last_summary"] = {
"text": s,
"last_active": last_active.isoformat(),
}
session.messages = kept
session.last_consolidated = 0
session.updated_at = datetime.now()
loop.sessions.save(session)
return s
# Attach state for count access
_fake_compact.state = state # type: ignore[attr-defined]
return _fake_compact
class TestSessionTTLConfig: class TestSessionTTLConfig:
"""Test session TTL configuration.""" """Test session TTL configuration."""
@ -201,10 +268,7 @@ class TestAutoCompact:
s2.add_message("user", "recent") s2.add_message("user", "recent")
loop.sessions.save(s2) loop.sessions.save(s2)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(loop)
return "Summary."
loop.consolidator.archive = _fake_archive
loop.auto_compact.check_expired(loop._schedule_background) loop.auto_compact.check_expired(loop._schedule_background)
await asyncio.sleep(0.1) await asyncio.sleep(0.1)
@ -222,12 +286,9 @@ class TestAutoCompact:
loop.sessions.save(session) loop.sessions.save(session)
archived_messages = [] archived_messages = []
loop.consolidator.compact_idle_session = _make_fake_compact(
async def _fake_archive(messages): loop, track_archived=archived_messages,
archived_messages.extend(messages) )
return "Summary."
loop.consolidator.archive = _fake_archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -246,10 +307,9 @@ class TestAutoCompact:
_add_turns(session, 6, prefix="hello") _add_turns(session, 6, prefix="hello")
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(
return "User said hello." loop, summary="User said hello.",
)
loop.consolidator.archive = _fake_archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -262,23 +322,16 @@ class TestAutoCompact:
@pytest.mark.asyncio @pytest.mark.asyncio
async def test_auto_compact_empty_session(self, tmp_path): async def test_auto_compact_empty_session(self, tmp_path):
"""_archive on empty session should not archive.""" """_archive on empty session should not store a summary."""
loop = _make_loop(tmp_path, session_ttl_minutes=15) loop = _make_loop(tmp_path, session_ttl_minutes=15)
archive_called = False loop.consolidator.compact_idle_session = _make_fake_compact(loop)
async def _fake_archive(messages):
nonlocal archive_called
archive_called = True
return "Summary."
loop.consolidator.archive = _fake_archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
assert not archive_called
session_after = loop.sessions.get_or_create("cli:test") session_after = loop.sessions.get_or_create("cli:test")
assert len(session_after.messages) == 0 assert len(session_after.messages) == 0
assert "cli:test" not in loop.auto_compact._summaries
await loop.close_mcp() await loop.close_mcp()
@pytest.mark.asyncio @pytest.mark.asyncio
@ -290,18 +343,14 @@ class TestAutoCompact:
session.last_consolidated = 18 session.last_consolidated = 18
loop.sessions.save(session) loop.sessions.save(session)
archived_count = 0 archived_messages = []
loop.consolidator.compact_idle_session = _make_fake_compact(
async def _fake_archive(messages): loop, track_archived=archived_messages,
nonlocal archived_count )
archived_count = len(messages)
return "Summary."
loop.consolidator.archive = _fake_archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
assert archived_count == 2 assert len(archived_messages) == 2
await loop.close_mcp() await loop.close_mcp()
@ -334,12 +383,9 @@ class TestAutoCompactIdleDetection:
loop.sessions.save(session) loop.sessions.save(session)
archived_messages = [] archived_messages = []
loop.consolidator.compact_idle_session = _make_fake_compact(
async def _fake_archive(messages): loop, track_archived=archived_messages,
archived_messages.extend(messages) )
return "Summary."
loop.consolidator.archive = _fake_archive
# Simulate proactive archive completing before message arrives # Simulate proactive archive completing before message arrives
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -402,10 +448,7 @@ class TestAutoCompactIdleDetection:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(loop)
return "Summary."
loop.consolidator.archive = _fake_archive
msg = InboundMessage(channel="cli", sender_id="user", chat_id="test", content="/new") msg = InboundMessage(channel="cli", sender_id="user", chat_id="test", content="/new")
response = await loop._process_message(msg) response = await loop._process_message(msg)
@ -466,10 +509,7 @@ class TestAutoCompactSystemMessages:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(loop)
return "Summary."
loop.consolidator.archive = _fake_archive
# Simulate proactive archive completing before system message arrives # Simulate proactive archive completing before system message arrives
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -547,12 +587,9 @@ class TestAutoCompactEdgeCases:
loop.sessions.save(session) loop.sessions.save(session)
archived_messages = [] archived_messages = []
loop.consolidator.compact_idle_session = _make_fake_compact(
async def _fake_archive(messages): loop, track_archived=archived_messages,
archived_messages.extend(messages) )
return "Summary."
loop.consolidator.archive = _fake_archive
# Simulate proactive archive completing before message arrives # Simulate proactive archive completing before message arrives
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -644,10 +681,7 @@ class TestAutoCompactIntegration:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(loop)
return "Summary."
loop.consolidator.archive = _fake_archive
# Simulate proactive archive completing before message arrives # Simulate proactive archive completing before message arrives
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -704,12 +738,9 @@ class TestProactiveAutoCompact:
loop.sessions.save(session) loop.sessions.save(session)
archived_messages = [] archived_messages = []
loop.consolidator.compact_idle_session = _make_fake_compact(
async def _fake_archive(messages): loop, summary="User chatted about old things.", track_archived=archived_messages,
archived_messages.extend(messages) )
return "User chatted about old things."
loop.consolidator.archive = _fake_archive
await self._run_check_expired(loop) await self._run_check_expired(loop)
@ -748,14 +779,14 @@ class TestProactiveAutoCompact:
started = asyncio.Event() started = asyncio.Event()
block_forever = asyncio.Event() block_forever = asyncio.Event()
async def _slow_archive(messages): async def _slow_compact(key, max_suffix=8):
nonlocal archive_count nonlocal archive_count
archive_count += 1 archive_count += 1
started.set() started.set()
await block_forever.wait() await block_forever.wait()
return "Summary." return "Summary."
loop.consolidator.archive = _slow_archive loop.consolidator.compact_idle_session = _slow_compact
# First call starts archiving via callback # First call starts archiving via callback
loop.auto_compact.check_expired(loop._schedule_background) loop.auto_compact.check_expired(loop._schedule_background)
@ -781,10 +812,10 @@ class TestProactiveAutoCompact:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _failing_archive(messages): async def _failing_compact(key, max_suffix=8):
raise RuntimeError("LLM down") raise RuntimeError("LLM down")
loop.consolidator.archive = _failing_archive loop.consolidator.compact_idle_session = _failing_compact
# Should not raise # Should not raise
await self._run_check_expired(loop) await self._run_check_expired(loop)
@ -795,24 +826,18 @@ class TestProactiveAutoCompact:
@pytest.mark.asyncio @pytest.mark.asyncio
async def test_proactive_archive_skips_empty_sessions(self, tmp_path): async def test_proactive_archive_skips_empty_sessions(self, tmp_path):
"""Proactive archive should not call LLM for sessions with no un-consolidated messages.""" """Proactive archive should not produce a summary for sessions with no messages."""
loop = _make_loop(tmp_path, session_ttl_minutes=15) loop = _make_loop(tmp_path, session_ttl_minutes=15)
session = loop.sessions.get_or_create("cli:test") session = loop.sessions.get_or_create("cli:test")
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
archive_called = False loop.consolidator.compact_idle_session = _make_fake_compact(loop)
async def _fake_archive(messages):
nonlocal archive_called
archive_called = True
return "Summary."
loop.consolidator.archive = _fake_archive
await self._run_check_expired(loop) await self._run_check_expired(loop)
assert not archive_called # Empty session should not produce a summary
assert "cli:test" not in loop.auto_compact._summaries
await loop.close_mcp() await loop.close_mcp()
@pytest.mark.asyncio @pytest.mark.asyncio
@ -824,18 +849,12 @@ class TestProactiveAutoCompact:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
archive_count = 0 _fake_compact = _make_fake_compact(loop)
loop.consolidator.compact_idle_session = _fake_compact
async def _fake_archive(messages):
nonlocal archive_count
archive_count += 1
return "Summary."
loop.consolidator.archive = _fake_archive
# Simulate an active agent task for this session # Simulate an active agent task for this session
await self._run_check_expired(loop, active_session_keys={"cli:test"}) await self._run_check_expired(loop, active_session_keys={"cli:test"})
assert archive_count == 0 assert _fake_compact.state["count"] == 0
session_after = loop.sessions.get_or_create("cli:test") session_after = loop.sessions.get_or_create("cli:test")
assert len(session_after.messages) == 12 # All messages preserved assert len(session_after.messages) == 12 # All messages preserved
@ -851,22 +870,16 @@ class TestProactiveAutoCompact:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
archive_count = 0 _fake_compact = _make_fake_compact(loop)
loop.consolidator.compact_idle_session = _fake_compact
async def _fake_archive(messages):
nonlocal archive_count
archive_count += 1
return "Summary."
loop.consolidator.archive = _fake_archive
# First tick: active task, skip # First tick: active task, skip
await self._run_check_expired(loop, active_session_keys={"cli:test"}) await self._run_check_expired(loop, active_session_keys={"cli:test"})
assert archive_count == 0 assert _fake_compact.state["count"] == 0
# Second tick: task completed, should archive # Second tick: task completed, should archive
await self._run_check_expired(loop) await self._run_check_expired(loop)
assert archive_count == 1 assert _fake_compact.state["count"] == 1
await loop.close_mcp() await loop.close_mcp()
@pytest.mark.asyncio @pytest.mark.asyncio
@ -888,18 +901,12 @@ class TestProactiveAutoCompact:
s3.add_message("user", "recent") s3.add_message("user", "recent")
loop.sessions.save(s3) loop.sessions.save(s3)
archive_count = 0 _fake_compact = _make_fake_compact(loop)
loop.consolidator.compact_idle_session = _fake_compact
async def _fake_archive(messages):
nonlocal archive_count
archive_count += 1
return "Summary."
loop.consolidator.archive = _fake_archive
await self._run_check_expired(loop, active_session_keys={"cli:expired_active"}) await self._run_check_expired(loop, active_session_keys={"cli:expired_active"})
assert archive_count == 1 assert _fake_compact.state["count"] == 1
s1_after = loop.sessions.get_or_create("cli:expired_idle") s1_after = loop.sessions.get_or_create("cli:expired_idle")
assert len(s1_after.messages) == loop.auto_compact._RECENT_SUFFIX_MESSAGES assert len(s1_after.messages) == loop.auto_compact._RECENT_SUFFIX_MESSAGES
s2_after = loop.sessions.get_or_create("cli:expired_active") s2_after = loop.sessions.get_or_create("cli:expired_active")
@ -917,22 +924,16 @@ class TestProactiveAutoCompact:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
archive_count = 0 _fake_compact = _make_fake_compact(loop)
loop.consolidator.compact_idle_session = _fake_compact
async def _fake_archive(messages):
nonlocal archive_count
archive_count += 1
return "Summary."
loop.consolidator.archive = _fake_archive
# First tick: archives the session # First tick: archives the session
await self._run_check_expired(loop) await self._run_check_expired(loop)
assert archive_count == 1 assert _fake_compact.state["count"] == 1
# Second tick: should NOT re-schedule (updated_at is fresh after clear) # Second tick: should NOT re-schedule (updated_at is fresh after clear)
await self._run_check_expired(loop) await self._run_check_expired(loop)
assert archive_count == 1 # Still 1, not re-scheduled assert _fake_compact.state["count"] == 1 # Still 1, not re-scheduled
await loop.close_mcp() await loop.close_mcp()
@pytest.mark.asyncio @pytest.mark.asyncio
@ -943,22 +944,15 @@ class TestProactiveAutoCompact:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
archive_count = 0 loop.consolidator.compact_idle_session = _make_fake_compact(loop)
async def _fake_archive(messages):
nonlocal archive_count
archive_count += 1
return "Summary."
loop.consolidator.archive = _fake_archive
# First tick: skips (no messages), refreshes updated_at # First tick: skips (no messages), refreshes updated_at
await self._run_check_expired(loop) await self._run_check_expired(loop)
assert archive_count == 0 assert "cli:test" not in loop.auto_compact._summaries
# Second tick: should NOT re-schedule because updated_at is fresh # Second tick: should NOT re-schedule because updated_at is fresh
await self._run_check_expired(loop) await self._run_check_expired(loop)
assert archive_count == 0 assert "cli:test" not in loop.auto_compact._summaries
await loop.close_mcp() await loop.close_mcp()
@pytest.mark.asyncio @pytest.mark.asyncio
@ -970,18 +964,12 @@ class TestProactiveAutoCompact:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
archive_count = 0 _fake_compact = _make_fake_compact(loop)
loop.consolidator.compact_idle_session = _fake_compact
async def _fake_archive(messages):
nonlocal archive_count
archive_count += 1
return "Summary."
loop.consolidator.archive = _fake_archive
# First compact cycle # First compact cycle
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
assert archive_count == 1 assert _fake_compact.state["count"] == 1
# User returns, sends new messages # User returns, sends new messages
msg = InboundMessage(channel="cli", sender_id="user", chat_id="test", content="second topic") msg = InboundMessage(channel="cli", sender_id="user", chat_id="test", content="second topic")
@ -995,7 +983,7 @@ class TestProactiveAutoCompact:
# Second compact cycle should succeed # Second compact cycle should succeed
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
assert archive_count == 2 assert _fake_compact.state["count"] == 2
await loop.close_mcp() await loop.close_mcp()
@ -1011,10 +999,9 @@ class TestSummaryPersistence:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(
return "User said hello." loop, summary="User said hello.",
)
loop.consolidator.archive = _fake_archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -1036,10 +1023,9 @@ class TestSummaryPersistence:
session.updated_at = last_active session.updated_at = last_active
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(
return "User said hello." loop, summary="User said hello.",
)
loop.consolidator.archive = _fake_archive
# Archive # Archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -1069,10 +1055,7 @@ class TestSummaryPersistence:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(loop)
return "Summary."
loop.consolidator.archive = _fake_archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -1100,10 +1083,7 @@ class TestSummaryPersistence:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(loop)
return "Summary."
loop.consolidator.archive = _fake_archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
@ -1129,10 +1109,9 @@ class TestSummaryPersistence:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(
return "First summary." loop, summary="First summary.",
)
loop.consolidator.archive = _fake_archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
# Consume the first summary via hot path # Consume the first summary via hot path
@ -1148,10 +1127,9 @@ class TestSummaryPersistence:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive2(messages): loop.consolidator.compact_idle_session = _make_fake_compact(
return "Second summary." loop, summary="Second summary.",
)
loop.consolidator.archive = _fake_archive2
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
# The second archive writes a new summary # The second archive writes a new summary
@ -1173,10 +1151,9 @@ class TestSummaryPersistence:
session.updated_at = datetime.now() - timedelta(minutes=20) session.updated_at = datetime.now() - timedelta(minutes=20)
loop.sessions.save(session) loop.sessions.save(session)
async def _fake_archive(messages): loop.consolidator.compact_idle_session = _make_fake_compact(
return "Old summary." loop, summary="Old summary.",
)
loop.consolidator.archive = _fake_archive
await loop.auto_compact._archive("cli:test") await loop.auto_compact._archive("cli:test")
# Verify summary exists before /new # Verify summary exists before /new
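
The hunks above replace the per-test `_fake_archive` closures with a shared `_make_fake_compact(loop, summary=...)` helper, whose definition is not shown in this part of the diff. A minimal sketch of what such a helper could look like, assuming it only needs to count calls and return a fixed summary (the name `make_fake_compact_sketch` and its internals are hypothetical, and the real helper may also mutate the session through `loop`):

```python
import asyncio


def make_fake_compact_sketch(loop=None, summary="Summary."):
    """Hypothetical stand-in for the test helper; not the project's code."""
    state = {"count": 0}  # shared call counter, readable from the test

    async def _fake_compact(key, max_suffix=8):
        # Record the call and return the canned summary without touching storage.
        state["count"] += 1
        return summary

    # Expose the counter the way the tests read it: _fake_compact.state["count"].
    _fake_compact.state = state
    return _fake_compact


async def demo():
    fake = make_fake_compact_sketch(summary="User said hello.")
    first = await fake("cli:test", 8)
    await fake("cli:test", 8)
    return first, fake.state["count"]
```

Keeping the counter on the returned callable lets a test assert on the call count without a `nonlocal` variable in every test body.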

@ -38,7 +38,7 @@ def _make_autocompact(
sessions = MagicMock(spec=SessionManager) sessions = MagicMock(spec=SessionManager)
if consolidator is None: if consolidator is None:
consolidator = MagicMock() consolidator = MagicMock()
consolidator.archive = AsyncMock(return_value="Summary.") consolidator.compact_idle_session = AsyncMock(return_value="Summary.")
return AutoCompact( return AutoCompact(
sessions=sessions, sessions=sessions,
consolidator=consolidator, consolidator=consolidator,
@ -178,62 +178,6 @@ class TestFormatSummary:
assert result.startswith("Previous conversation summary (last active ") assert result.startswith("Previous conversation summary (last active ")
# ---------------------------------------------------------------------------
# _split_unconsolidated
# ---------------------------------------------------------------------------
class TestSplitUnconsolidated:
"""Test AutoCompact._split_unconsolidated splitting logic."""
def test_empty_session_returns_both_empty(self):
"""Empty session should return ([], [])."""
ac = _make_autocompact()
session = _make_session(messages=[])
archive, kept = ac._split_unconsolidated(session)
assert archive == []
assert kept == []
def test_all_messages_archivable_when_more_than_suffix(self):
"""Session with many messages should archive a prefix and keep suffix."""
ac = _make_autocompact()
msgs = [{"role": "user", "content": f"u{i}"} for i in range(20)]
session = _make_session(messages=msgs)
archive, kept = ac._split_unconsolidated(session)
assert len(archive) > 0
assert len(kept) <= AutoCompact._RECENT_SUFFIX_MESSAGES
def test_fewer_messages_than_suffix_returns_empty_archive(self):
"""Session with fewer messages than suffix should have empty archive."""
ac = _make_autocompact()
msgs = [{"role": "user", "content": f"u{i}"} for i in range(3)]
session = _make_session(messages=msgs)
archive, kept = ac._split_unconsolidated(session)
assert archive == []
assert len(kept) == len(msgs)
def test_respects_last_consolidated_offset(self):
"""Only messages after last_consolidated should be considered."""
ac = _make_autocompact()
msgs = [{"role": "user", "content": f"u{i}"} for i in range(20)]
# First 10 are already consolidated
session = _make_session(messages=msgs, last_consolidated=10)
archive, kept = ac._split_unconsolidated(session)
# Only the tail of 10 messages is considered for splitting
assert all(m["content"] in [f"u{i}" for i in range(10, 20)] for m in kept)
assert all(m["content"] in [f"u{i}" for i in range(10, 20)] for m in archive)
def test_retain_recent_legal_suffix_keeps_last_n(self):
"""The kept suffix should be at most _RECENT_SUFFIX_MESSAGES long."""
ac = _make_autocompact()
# 20 user messages = 20 messages total, all after last_consolidated=0
msgs = [{"role": "user", "content": f"u{i}"} for i in range(20)]
session = _make_session(messages=msgs)
archive, kept = ac._split_unconsolidated(session)
assert len(kept) <= AutoCompact._RECENT_SUFFIX_MESSAGES
assert len(archive) == len(msgs) - len(kept)
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# check_expired # check_expired
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@ -313,126 +257,71 @@ class TestCheckExpired:
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
class TestArchive: class TestArchiveDelegates:
"""Test AutoCompact._archive async method.""" """_archive should delegate all session mutation to Consolidator."""
@pytest.mark.asyncio @pytest.mark.asyncio
async def test_empty_session_updates_timestamp_no_archive_call(self): async def test_calls_compact_idle_session(self):
"""Empty session should refresh updated_at and not call consolidator.archive."""
ac = _make_autocompact() ac = _make_autocompact()
mock_sm = MagicMock(spec=SessionManager) mock_sm = MagicMock(spec=SessionManager)
empty_session = _make_session(messages=[])
mock_sm.get_or_create.return_value = empty_session
ac.sessions = mock_sm ac.sessions = mock_sm
ac.consolidator.archive = AsyncMock(return_value="Summary.") ac.consolidator.compact_idle_session = AsyncMock(return_value="Summary.")
await ac._archive("cli:test") await ac._archive("cli:test")
ac.consolidator.archive.assert_not_called() ac.consolidator.compact_idle_session.assert_awaited_once_with(
mock_sm.save.assert_called_once_with(empty_session) "cli:test", ac._RECENT_SUFFIX_MESSAGES,
# updated_at was refreshed )
assert empty_session.updated_at > datetime.now() - timedelta(seconds=5)
@pytest.mark.asyncio @pytest.mark.asyncio
async def test_archive_returns_empty_string_no_summary_stored(self): async def test_populates_summaries_from_metadata(self):
"""If archive returns empty string, no summary should be stored."""
ac = _make_autocompact() ac = _make_autocompact()
mock_sm = MagicMock(spec=SessionManager) mock_sm = MagicMock(spec=SessionManager)
msgs = [{"role": "user", "content": f"u{i}"} for i in range(20)] session = _make_session(
session = _make_session(messages=msgs) metadata={"_last_summary": {"text": "Hello.", "last_active": "2026-05-13T10:00:00"}}
)
mock_sm.get_or_create.return_value = session mock_sm.get_or_create.return_value = session
ac.sessions = mock_sm ac.sessions = mock_sm
ac.consolidator.archive = AsyncMock(return_value="") ac.consolidator.compact_idle_session = AsyncMock(return_value="Hello.")
await ac._archive("cli:test") await ac._archive("cli:test")
assert "cli:test" not in ac._summaries
@pytest.mark.asyncio
async def test_archive_returns_nothing_no_summary_stored(self):
"""If archive returns '(nothing)', no summary should be stored."""
ac = _make_autocompact()
mock_sm = MagicMock(spec=SessionManager)
msgs = [{"role": "user", "content": f"u{i}"} for i in range(20)]
session = _make_session(messages=msgs)
mock_sm.get_or_create.return_value = session
ac.sessions = mock_sm
ac.consolidator.archive = AsyncMock(return_value="(nothing)")
await ac._archive("cli:test")
assert "cli:test" not in ac._summaries
@pytest.mark.asyncio
async def test_archive_exception_caught_key_removed_from_archiving(self):
"""If archive raises, exception is caught and key removed from _archiving."""
ac = _make_autocompact()
mock_sm = MagicMock(spec=SessionManager)
msgs = [{"role": "user", "content": f"u{i}"} for i in range(20)]
session = _make_session(messages=msgs)
mock_sm.get_or_create.return_value = session
ac.sessions = mock_sm
ac.consolidator.archive = AsyncMock(side_effect=RuntimeError("LLM down"))
# Should not raise
await ac._archive("cli:test")
assert "cli:test" not in ac._archiving
@pytest.mark.asyncio
async def test_successful_archive_stores_summary_in_summaries_and_metadata(self):
"""Successful archive should store summary in _summaries dict and metadata."""
ac = _make_autocompact()
mock_sm = MagicMock(spec=SessionManager)
msgs = [{"role": "user", "content": f"u{i}"} for i in range(20)]
last_active = datetime(2026, 5, 13, 10, 0, 0)
session = _make_session(messages=msgs, updated_at=last_active)
mock_sm.get_or_create.return_value = session
ac.sessions = mock_sm
ac.consolidator.archive = AsyncMock(return_value="User discussed AI.")
await ac._archive("cli:test")
# _summaries
entry = ac._summaries.get("cli:test") entry = ac._summaries.get("cli:test")
assert entry is not None assert entry is not None
assert entry[0] == "User discussed AI." assert entry[0] == "Hello."
assert entry[1] == last_active
# metadata
meta = session.metadata.get("_last_summary")
assert meta is not None
assert meta["text"] == "User discussed AI."
assert "last_active" in meta
@pytest.mark.asyncio @pytest.mark.asyncio
async def test_finally_block_always_removes_from_archiving(self): async def test_no_summary_when_compact_returns_empty(self):
"""Finally block should always remove key from _archiving, even on error."""
ac = _make_autocompact() ac = _make_autocompact()
mock_sm = MagicMock(spec=SessionManager) mock_sm = MagicMock(spec=SessionManager)
msgs = [{"role": "user", "content": f"u{i}"} for i in range(20)]
session = _make_session(messages=msgs)
mock_sm.get_or_create.return_value = session
ac.sessions = mock_sm ac.sessions = mock_sm
ac.consolidator.archive = AsyncMock(side_effect=RuntimeError("fail")) ac.consolidator.compact_idle_session = AsyncMock(return_value="")
# Pre-add key to archiving to verify it gets removed
ac._archiving.add("cli:test")
await ac._archive("cli:test") await ac._archive("cli:test")
assert "cli:test" not in ac._archiving
assert "cli:test" not in ac._summaries
@pytest.mark.asyncio @pytest.mark.asyncio
async def test_finally_removes_from_archiving_on_success(self): async def test_no_summary_when_compact_returns_nothing(self):
"""Finally block should remove key from _archiving on success too."""
ac = _make_autocompact() ac = _make_autocompact()
mock_sm = MagicMock(spec=SessionManager) mock_sm = MagicMock(spec=SessionManager)
msgs = [{"role": "user", "content": f"u{i}"} for i in range(20)]
session = _make_session(messages=msgs)
mock_sm.get_or_create.return_value = session
ac.sessions = mock_sm ac.sessions = mock_sm
ac.consolidator.archive = AsyncMock(return_value="Summary.") ac.consolidator.compact_idle_session = AsyncMock(return_value="(nothing)")
await ac._archive("cli:test")
assert "cli:test" not in ac._summaries
@pytest.mark.asyncio
async def test_exception_still_removes_from_archiving(self):
ac = _make_autocompact()
mock_sm = MagicMock(spec=SessionManager)
ac.sessions = mock_sm
ac.consolidator.compact_idle_session = AsyncMock(side_effect=RuntimeError("fail"))
ac._archiving.add("cli:test") ac._archiving.add("cli:test")
await ac._archive("cli:test") await ac._archive("cli:test")
assert "cli:test" not in ac._archiving assert "cli:test" not in ac._archiving
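
The `TestArchiveDelegates` cases above suggest that `_archive` now hands all session mutation to `compact_idle_session` and only mirrors the stored `_last_summary` into memory. A rough sketch of that delegation, assuming a suffix size of 8 and a `(text, last_active)` tuple layout for `_summaries` (the class `AutoCompactSketch` and the `build` helper are hypothetical, not the project's implementation):

```python
import asyncio
from datetime import datetime
from unittest.mock import AsyncMock, MagicMock


class AutoCompactSketch:
    """Hypothetical stand-in for AutoCompact, for illustration only."""

    _RECENT_SUFFIX_MESSAGES = 8  # assumed value, matching max_suffix=8 in the tests

    def __init__(self, sessions, consolidator):
        self.sessions = sessions
        self.consolidator = consolidator
        self._archiving = set()
        self._summaries = {}

    async def _archive(self, key):
        self._archiving.add(key)
        try:
            # All truncation and persistence happens inside the consolidator.
            summary = await self.consolidator.compact_idle_session(
                key, self._RECENT_SUFFIX_MESSAGES
            )
            if not summary or summary == "(nothing)":
                return  # nothing worth caching
            meta = self.sessions.get_or_create(key).metadata.get("_last_summary")
            if meta:
                self._summaries[key] = (
                    meta["text"],
                    datetime.fromisoformat(meta["last_active"]),
                )
        except Exception:
            pass  # a failed compaction must not crash the caller
        finally:
            # Runs on success, early return, and failure alike.
            self._archiving.discard(key)


def build(compact_mock):
    """Wire the sketch to mocks shaped like the ones in the tests above."""
    session = MagicMock()
    session.metadata = {
        "_last_summary": {"text": "Hello.", "last_active": "2026-05-13T10:00:00"}
    }
    sessions = MagicMock()
    sessions.get_or_create.return_value = session
    consolidator = MagicMock()
    consolidator.compact_idle_session = compact_mock
    return AutoCompactSketch(sessions, consolidator)
```

Putting the cleanup in `finally` is what lets a single test cover the exception path: the key leaves `_archiving` whether the awaited call returns or raises.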

@ -28,6 +28,12 @@ def mock_provider():
def consolidator(store, mock_provider): def consolidator(store, mock_provider):
sessions = MagicMock() sessions = MagicMock()
sessions.save = MagicMock() sessions.save = MagicMock()
# When maybe_consolidate_by_tokens refreshes the session reference via
# get_or_create(session.key), it should get back the same object the test
# passed in. Store sessions by key so the lookup is transparent.
_session_cache: dict[str, MagicMock] = {}
sessions.get_or_create = MagicMock(side_effect=lambda key: _session_cache.get(key, MagicMock()))
sessions._session_cache = _session_cache
return Consolidator( return Consolidator(
store=store, store=store,
provider=mock_provider, provider=mock_provider,
@ -117,6 +123,7 @@ class TestConsolidatorTokenBudget:
session.last_consolidated = 0 session.last_consolidated = 0
session.messages = [{"role": "user", "content": "hi"}] session.messages = [{"role": "user", "content": "hi"}]
session.key = "test:key" session.key = "test:key"
consolidator.sessions._session_cache[session.key] = session
consolidator.estimate_session_prompt_tokens = MagicMock(return_value=(100, "tiktoken")) consolidator.estimate_session_prompt_tokens = MagicMock(return_value=(100, "tiktoken"))
consolidator.archive = AsyncMock(return_value=True) consolidator.archive = AsyncMock(return_value=True)
await consolidator.maybe_consolidate_by_tokens(session) await consolidator.maybe_consolidate_by_tokens(session)
@ -152,6 +159,7 @@ class TestConsolidatorTokenBudget:
session.add_message("user", f"u{i}") session.add_message("user", f"u{i}")
session.add_message("assistant", f"a{i}") session.add_message("assistant", f"a{i}")
consolidator.sessions._session_cache[session.key] = session
consolidator.estimate_session_prompt_tokens = MagicMock(return_value=(100, "tiktoken")) consolidator.estimate_session_prompt_tokens = MagicMock(return_value=(100, "tiktoken"))
consolidator.archive = AsyncMock(return_value="old conversation summary") consolidator.archive = AsyncMock(return_value="old conversation summary")
@ -184,6 +192,7 @@ class TestConsolidatorTokenBudget:
session.add_message("tool", "tool result", tool_call_id="call-1", name="x") session.add_message("tool", "tool result", tool_call_id="call-1", name="x")
session.add_message("assistant", "final answer") session.add_message("assistant", "final answer")
consolidator.sessions._session_cache[session.key] = session
consolidator.estimate_session_prompt_tokens = MagicMock(return_value=(100, "tiktoken")) consolidator.estimate_session_prompt_tokens = MagicMock(return_value=(100, "tiktoken"))
consolidator.archive = AsyncMock(return_value="tool turn summary") consolidator.archive = AsyncMock(return_value="tool turn summary")
@ -210,6 +219,7 @@ class TestConsolidatorTokenBudget:
} }
for i in range(70) for i in range(70)
] ]
consolidator.sessions._session_cache[session.key] = session
consolidator.estimate_session_prompt_tokens = MagicMock( consolidator.estimate_session_prompt_tokens = MagicMock(
side_effect=[(1200, "tiktoken"), (400, "tiktoken")] side_effect=[(1200, "tiktoken"), (400, "tiktoken")]
) )
@ -238,6 +248,7 @@ class TestConsolidatorTokenBudget:
for i in range(70) for i in range(70)
] ]
session.metadata = {} session.metadata = {}
consolidator.sessions._session_cache[session.key] = session
consolidator.estimate_session_prompt_tokens = MagicMock( consolidator.estimate_session_prompt_tokens = MagicMock(
side_effect=[(1200, "tiktoken"), (400, "tiktoken")] side_effect=[(1200, "tiktoken"), (400, "tiktoken")]
) )
@ -263,6 +274,7 @@ class TestConsolidatorTokenBudget:
for i in range(70) for i in range(70)
] ]
session.metadata = {} session.metadata = {}
consolidator.sessions._session_cache[session.key] = session
# Keep estimates high so the loop would otherwise run multiple rounds. # Keep estimates high so the loop would otherwise run multiple rounds.
consolidator.estimate_session_prompt_tokens = MagicMock( consolidator.estimate_session_prompt_tokens = MagicMock(
return_value=(1200, "tiktoken") return_value=(1200, "tiktoken")
@ -287,6 +299,7 @@ class TestConsolidatorTokenBudget:
} }
for i in range(70) for i in range(70)
] ]
consolidator.sessions._session_cache[session.key] = session
consolidator.estimate_session_prompt_tokens = MagicMock( consolidator.estimate_session_prompt_tokens = MagicMock(
side_effect=[(1200, "tiktoken"), (400, "tiktoken")] side_effect=[(1200, "tiktoken"), (400, "tiktoken")]
) )
@ -299,6 +312,260 @@ class TestConsolidatorTokenBudget:
assert session.last_consolidated == 61 assert session.last_consolidated == 61
class TestCompactIdleSession:
"""Tests for Consolidator.compact_idle_session — lock-protected idle truncation."""
@pytest.fixture
def real_consolidator(self, store, mock_provider):
"""Create a Consolidator with a real SessionManager (not a mock)."""
from nanobot.session.manager import SessionManager
sessions = SessionManager(store.workspace)
return Consolidator(
store=store,
provider=mock_provider,
model="test-model",
sessions=sessions,
context_window_tokens=1000,
build_messages=MagicMock(return_value=[]),
get_tool_definitions=MagicMock(return_value=[]),
max_completion_tokens=100,
)
@pytest.mark.asyncio
async def test_archives_prefix_keeps_suffix(self, real_consolidator, mock_provider):
"""20 user/assistant turns → compact with max_suffix=8 → messages ≤ 8,
last_consolidated=0, _last_summary stored."""
mock_provider.chat_with_retry.return_value = MagicMock(
content="Summary of old conversation.", finish_reason="stop"
)
sessions = real_consolidator.sessions
session = sessions.get_or_create("cli:test")
for i in range(20):
session.add_message("user", f"user msg {i}")
session.add_message("assistant", f"assistant msg {i}")
sessions.save(session)
result = await real_consolidator.compact_idle_session("cli:test", max_suffix=8)
assert result == "Summary of old conversation."
reloaded = sessions.get_or_create("cli:test")
assert len(reloaded.messages) <= 8
assert reloaded.last_consolidated == 0
meta = reloaded.metadata.get("_last_summary")
assert meta is not None
assert meta["text"] == "Summary of old conversation."
assert "last_active" in meta
@pytest.mark.asyncio
async def test_empty_session_refreshes_timestamp(self, real_consolidator):
"""Empty session with old updated_at → refreshed after call, returns ''."""
from datetime import datetime, timedelta
sessions = real_consolidator.sessions
session = sessions.get_or_create("cli:empty")
old_ts = datetime.now() - timedelta(hours=2)
session.updated_at = old_ts
sessions.save(session)
result = await real_consolidator.compact_idle_session("cli:empty")
assert result == ""
reloaded = sessions.get_or_create("cli:empty")
assert reloaded.updated_at > old_ts
@pytest.mark.asyncio
async def test_nothing_summary_not_stored(self, real_consolidator, mock_provider):
"""LLM returns '(nothing)' → _last_summary NOT in metadata."""
mock_provider.chat_with_retry.return_value = MagicMock(
content="(nothing)", finish_reason="stop"
)
sessions = real_consolidator.sessions
session = sessions.get_or_create("cli:nothing")
for i in range(10):
session.add_message("user", f"u{i}")
session.add_message("assistant", f"a{i}")
sessions.save(session)
result = await real_consolidator.compact_idle_session("cli:nothing", max_suffix=4)
assert result == "(nothing)"
reloaded = sessions.get_or_create("cli:nothing")
assert "_last_summary" not in reloaded.metadata
@pytest.mark.asyncio
async def test_llm_failure_still_truncates(self, real_consolidator, mock_provider, store):
"""LLM raises RuntimeError → raw_archive fires, session still truncated, returns None."""
mock_provider.chat_with_retry.side_effect = RuntimeError("LLM unavailable")
sessions = real_consolidator.sessions
session = sessions.get_or_create("cli:fail")
for i in range(10):
session.add_message("user", f"u{i}")
session.add_message("assistant", f"a{i}")
sessions.save(session)
result = await real_consolidator.compact_idle_session("cli:fail", max_suffix=4)
assert result is None
# raw_archive should have been called (history.jsonl gets an entry)
entries = store.read_unprocessed_history(since_cursor=0)
assert any("[RAW]" in e["content"] for e in entries)
# Session should still be truncated
reloaded = sessions.get_or_create("cli:fail")
assert len(reloaded.messages) <= 4
@pytest.mark.asyncio
async def test_respects_last_consolidated(self, real_consolidator, mock_provider):
"""30 turns with last_consolidated=50 → only unconsolidated tail considered."""
mock_provider.chat_with_retry.return_value = MagicMock(
content="Tail summary.", finish_reason="stop"
)
sessions = real_consolidator.sessions
session = sessions.get_or_create("cli:offset")
for i in range(30):
session.add_message("user", f"u{i}")
session.add_message("assistant", f"a{i}")
session.last_consolidated = 50 # Only 10 messages unconsolidated
sessions.save(session)
result = await real_consolidator.compact_idle_session("cli:offset", max_suffix=4)
assert result == "Tail summary."
# Verify only the unconsolidated tail was processed:
# 10 unconsolidated messages (50-59), keep suffix of 4 → archive 6
archived_call = mock_provider.chat_with_retry.call_args
user_content = archived_call.kwargs["messages"][1]["content"]
# Should contain only tail messages, not early ones
assert "u0" not in user_content
assert "u25" in user_content or "a25" in user_content
@pytest.mark.asyncio
async def test_acquires_consolidation_lock(self, real_consolidator, mock_provider):
"""Verify lock is held during execution."""
import asyncio
# Use a slow LLM response to ensure the lock is held while we check
started = asyncio.Event()
async def slow_chat(**kwargs):
started.set()
await asyncio.sleep(0.1)
return MagicMock(content="Summary.", finish_reason="stop")
mock_provider.chat_with_retry = slow_chat
sessions = real_consolidator.sessions
session = sessions.get_or_create("cli:lock")
for i in range(10):
session.add_message("user", f"u{i}")
session.add_message("assistant", f"a{i}")
sessions.save(session)
lock = real_consolidator.get_lock("cli:lock")
assert not lock.locked()
task = asyncio.ensure_future(
real_consolidator.compact_idle_session("cli:lock", max_suffix=4)
)
await started.wait()
assert lock.locked()
await task
assert not lock.locked()
class TestConsolidatorSessionRefresh:
"""Background consolidation must detect stale session references."""
@pytest.mark.asyncio
async def test_reloads_before_empty_session_guard(self, tmp_path):
"""A stale empty reference must not skip a non-empty cached session."""
from nanobot.agent.memory import Consolidator, MemoryStore
from nanobot.session.manager import Session, SessionManager
store = MemoryStore(tmp_path)
provider = MagicMock()
provider.chat_with_retry = AsyncMock(
return_value=MagicMock(content="summary", finish_reason="stop")
)
provider.generation.max_tokens = 4096
provider.estimate_prompt_tokens = MagicMock(return_value=(10, "test"))
sessions = SessionManager(tmp_path)
consolidator = Consolidator(
store=store,
provider=provider,
model="test-model",
sessions=sessions,
context_window_tokens=128_000,
build_messages=MagicMock(return_value=[]),
get_tool_definitions=MagicMock(return_value=[]),
)
fresh = sessions.get_or_create("cli:test")
fresh.add_message("user", "fresh message")
sessions.save(fresh)
stale_empty = Session(key="cli:test")
seen: dict[str, Session] = {}
def estimate(session: Session):
seen["session"] = session
return 10, "test"
consolidator.estimate_session_prompt_tokens = MagicMock(side_effect=estimate)
await consolidator.maybe_consolidate_by_tokens(stale_empty)
assert seen["session"] is fresh
@pytest.mark.asyncio
async def test_reloads_stale_session_after_compact(self, tmp_path):
"""After compact_idle_session replaces the session, a concurrent
maybe_consolidate_by_tokens with the old reference should use the
fresh session from cache instead of overwriting."""
from nanobot.agent.memory import Consolidator, MemoryStore
from nanobot.session.manager import SessionManager
store = MemoryStore(tmp_path)
provider = MagicMock()
provider.chat_with_retry = AsyncMock(
return_value=MagicMock(content="summary", finish_reason="stop")
)
provider.generation.max_tokens = 4096
provider.estimate_prompt_tokens = MagicMock(return_value=(10, "test"))
sessions = SessionManager(tmp_path)
consolidator = Consolidator(
store=store,
provider=provider,
model="test-model",
sessions=sessions,
context_window_tokens=128_000,
build_messages=MagicMock(return_value=[]),
get_tool_definitions=MagicMock(return_value=[]),
)
# Populate session with many messages
session = sessions.get_or_create("cli:test")
for i in range(20):
session.add_message("user", f"u{i}")
session.add_message("assistant", f"a{i}")
sessions.save(session)
# Simulate: background consolidation captures old reference
old_ref = session
# AutoCompact runs first and truncates to 8
await consolidator.compact_idle_session("cli:test", max_suffix=8)
# Background consolidation runs with stale reference —
# should detect the session was replaced and not undo the compact.
await consolidator.maybe_consolidate_by_tokens(old_ref)
session_after = sessions.get_or_create("cli:test")
# Messages should still be truncated (not restored to 40)
assert len(session_after.messages) <= 8
class TestRawArchiveTruncation: class TestRawArchiveTruncation:
"""raw_archive() must cap entry size to avoid bloating history.jsonl.""" """raw_archive() must cap entry size to avoid bloating history.jsonl."""
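
The lock test in `TestCompactIdleSession` above checks that the per-session lock is free before the call, held while the LLM call is in flight, and free again afterwards. A minimal sketch of that pattern, assuming one `asyncio.Lock` per session key (the class `LockingCompactorSketch` and the `work` callback are hypothetical and stand in for the real summarisation step):

```python
import asyncio
from collections import defaultdict


class LockingCompactorSketch:
    """Hypothetical illustration of per-key locking; not the project's code."""

    def __init__(self):
        # One lock per session key, created lazily on first use.
        self._locks = defaultdict(asyncio.Lock)

    def get_lock(self, key):
        return self._locks[key]

    async def compact_idle_session(self, key, work):
        # Hold the key's lock for the whole compaction, including the slow call.
        async with self.get_lock(key):
            return await work()


async def demo():
    compactor = LockingCompactorSketch()
    started = asyncio.Event()

    async def slow_work():
        started.set()              # signal that we are inside the locked region
        await asyncio.sleep(0.01)  # simulate a slow LLM response
        return "Summary."

    lock = compactor.get_lock("cli:lock")
    before = lock.locked()
    task = asyncio.ensure_future(
        compactor.compact_idle_session("cli:lock", slow_work)
    )
    await started.wait()
    during = lock.locked()
    result = await task
    return before, during, lock.locked(), result
```

Waiting on the `started` event rather than sleeping for a fixed time keeps the check deterministic: the assertion on `locked()` only runs once the worker is known to be inside the critical section.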

@ -387,6 +387,7 @@ class TestConsolidationUnaffectedByUnifiedSession:
session = Session(key="unified:default") session = Session(key="unified:default")
session.messages = [{"role": "user", "content": "msg"}] session.messages = [{"role": "user", "content": "msg"}]
sessions.get_or_create.return_value = session
# Simulate over-budget: estimated > budget # Simulate over-budget: estimated > budget
consolidator.estimate_session_prompt_tokens = MagicMock(return_value=(950, "tiktoken")) consolidator.estimate_session_prompt_tokens = MagicMock(return_value=(950, "tiktoken"))