mirror of
https://github.com/HKUDS/nanobot.git
synced 2026-06-15 07:14:08 +00:00
# Cron / Session / Memory Design Decisions

This note records the agreed design direction for fixing the mismatch between
scheduled cron jobs and chat session memory.

## Problem

User-created cron jobs currently run their agent turn under an internal key such
as `cron:{job.id}` and only deliver the final response back to the user channel.
That splits the turn's working memory from the session where the user sees and
continues the conversation.

The visible failure mode is awkward: a cron job reports something into a chat,
the user discusses it in that chat, and the next cron run behaves as if that
discussion never happened.

The fix is not to make cron a separate delivery system. A user cron job should
be a scheduled input into a session.

## Core Model

For new user-created cron jobs, `payload.session_key` is the canonical anchor.

- The cron job belongs to that session.
- The cron job reads that session's memory/history.
- The cron job produces a normal session turn.
- There is no separate delivery target concept for new jobs.

Legacy fields remain in the store only for compatibility:

- `payload.channel`
- `payload.to`
- `payload.channel_meta`
- `payload.deliver`

These fields are legacy-only. New cron creation should not depend on them.

## Job Categories

Use explicit branching:

- **Bound user cron job**: `payload.kind == "agent_turn"`,
  `payload.session_key` is present, and no legacy delivery fields
  (`deliver`, `channel`, `to`, or `channel_meta`) are set. This uses the new
  session-turn model.
- **Legacy unbound cron job**: user job with no `payload.session_key`. Keep the
  existing behavior. Do not migrate, infer, bind, or add UI for these jobs in
  this change.
- **System job**: `payload.kind == "system_event"` or known internal jobs such
  as `dream` / `heartbeat`. Keep their specialized paths.

The project should not grow a compatibility subsystem for legacy jobs. Missing
`session_key` means old behavior.

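The branching above could be expressed as one small classifier. This is a minimal sketch, assuming the payload is available as a plain dict; `JobCategory`, `classify_job`, the `name` parameter, and treating only truthy legacy fields as "set" are all assumptions for illustration, not existing nanobot APIs.

```python
from enum import Enum

LEGACY_DELIVERY_FIELDS = ("deliver", "channel", "to", "channel_meta")
SYSTEM_JOB_NAMES = {"dream", "heartbeat"}  # assumed identifiers for internal jobs


class JobCategory(Enum):
    BOUND = "bound"
    LEGACY_UNBOUND = "legacy_unbound"
    SYSTEM = "system"


def classify_job(payload: dict, name: str = "") -> JobCategory:
    """Return the execution category for a cron job payload (hypothetical helper)."""
    if payload.get("kind") == "system_event" or name in SYSTEM_JOB_NAMES:
        return JobCategory.SYSTEM
    # Assumption: a legacy field counts as "set" only when its value is truthy.
    has_legacy = any(payload.get(f) for f in LEGACY_DELIVERY_FIELDS)
    if (
        payload.get("kind") == "agent_turn"
        and payload.get("session_key")
        and not has_legacy
    ):
        return JobCategory.BOUND
    # Everything else keeps the old behavior; no inference or migration.
    return JobCategory.LEGACY_UNBOUND
```

Keeping the decision in one place means the scheduler, the delete check, and the WebUI can share the same definition of "bound".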
## New Job Creation

`CronTool` must create user cron jobs with a `session_key`.

- If no request/session context exists, `cron action=add` should fail.
- Do not create new unbound jobs.
- Do not infer `session_key` from `channel/to` for new jobs.
- Remove `deliver` from the advertised tool schema. It can remain as a Python
  compatibility argument, but it must not affect new bound jobs.
- New bound jobs should persist `message` and `session_key`; legacy delivery
  fields should not be populated as part of the new path.

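The creation rules above might look roughly like this in the tool's add path. This is a sketch only: `build_bound_payload` and `CronBindingError` are hypothetical names, and the real `CronTool` signature may differ.

```python
from typing import Optional


class CronBindingError(ValueError):
    """Raised when a cron job would be created without a session (hypothetical)."""


def build_bound_payload(
    message: str,
    session_key: Optional[str],
    deliver: Optional[bool] = None,
) -> dict:
    """Build the payload for a new bound job, or fail if there is no session."""
    # `deliver` is accepted for Python compatibility only and is ignored here.
    if not session_key:
        raise CronBindingError("cron action=add requires a session context")
    # Only `message` and `session_key` are persisted; no legacy delivery fields.
    return {"kind": "agent_turn", "message": message, "session_key": session_key}
```

Failing loudly when the session context is missing avoids silently creating a new unbound job.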
## Execution Path

Bound user cron jobs should execute through `AgentLoop` as internal inbound
session events, not as an out-of-band `agent.process_direct()` call.

The intended flow is:

```text
cron due -> create cron inbound -> AgentLoop dispatches session turn
```

The inbound event should carry metadata identifying the cron run, such as:

- job id
- job name
- run id
- prompt reference
- persisted trigger content

This keeps locking, runtime status, session persistence, and WebUI behavior on
the same path as normal chat turns.

`session_key` is the ownership anchor, but an `InboundMessage` still needs an
execution context. Bound cron must resolve `channel`, `chat_id`, and any
channel metadata from the target session/session metadata. It must not fall back
to legacy `payload.channel`, `payload.to`, or `payload.channel_meta` for bound
jobs. Those fields are only for the legacy unbound path.

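As an illustration of resolving the execution context from the session rather than from legacy payload fields, consider the sketch below. `CronInbound` is a hypothetical stand-in for the real `InboundMessage`, `build_cron_inbound` is an invented helper, and the shape of `session_meta` (a dict with `channel` and `chat_id`) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class CronInbound:
    """Hypothetical stand-in for the real InboundMessage type."""
    session_key: str
    channel: str
    chat_id: str
    content: str
    metadata: dict = field(default_factory=dict)


def build_cron_inbound(job_id, job_name, payload, session_meta, run_id):
    """Create the inbound event for a bound cron run."""
    # Resolve the context from the session only; payload.channel / payload.to
    # are deliberately never consulted for bound jobs.
    channel = session_meta.get("channel")
    chat_id = session_meta.get("chat_id")
    if not channel or not chat_id:
        raise LookupError(
            f"cannot resolve execution context for {payload['session_key']}"
        )
    return CronInbound(
        session_key=payload["session_key"],
        channel=channel,
        chat_id=chat_id,
        content=payload["message"],
        metadata={
            "_cron_trigger": {
                "job_id": job_id,
                "job_name": job_name,
                "run_id": run_id,
            }
        },
    )
```

If the session metadata cannot supply a context, the sketch raises instead of falling back, which matches the rule that legacy fields belong only to the unbound path.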
The scheduler must not mark a bound job run as complete just because the inbound
event was queued. It should either wait for the cron turn to complete and
record the real outcome, or explicitly model the run as separate states such as
`queued` and `turn_completed`. A failed cron turn must be reflected in the
cron run record/job state, not hidden behind a successful enqueue.

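One way to model the separate run states is a small record that starts as `queued` and is only finalized when the turn ends. This is a sketch under assumptions: `CronRunRecord` and its `finish` method are hypothetical, and the `failed` state name is not taken from this note.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CronRunRecord:
    """Hypothetical run record; enqueueing alone never marks it complete."""
    run_id: str
    status: str = "queued"
    error: Optional[str] = None

    def finish(self, error: Optional[str] = None) -> None:
        # Called when the cron turn actually ends, not when it is enqueued.
        self.status = "failed" if error else "turn_completed"
        self.error = error
```

With this shape, a successful enqueue leaves the record in `queued`, so a later failure cannot be hidden behind it.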
## Active Session Behavior

Cron must not interrupt an active session turn.

- If the target session is idle, run the cron turn immediately.
- If the target session is running, defer the cron turn until the current turn
  completes.
- Do not inject the cron turn into the active turn's runtime context.
- Do not route cron messages into the existing mid-turn pending injection
  queue.
- UI/runtime status may show that a cron run is queued, but the current LLM
  call should not see the queued cron turn.

Cron inbound events need explicit metadata, for example
`_cron_trigger` plus `_cron_defer_until_session_idle`. `AgentLoop.run()` must
recognize that metadata before the existing `_pending_queues` mid-turn injection
branch. If the session is active, the event goes to a deferred cron queue,
not the pending injection queue.

The user experience goal is: cron can run after the current answer, but it
should not take over an answer already in progress.

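The deferral rule can be sketched as a per-session queue that is checked before any mid-turn injection logic. `CronDeferral` and its methods are hypothetical names, events are modeled as plain dicts, and this is not the actual `AgentLoop` code.

```python
from collections import defaultdict, deque


class CronDeferral:
    """Hypothetical helper that keeps cron turns out of active turns."""

    def __init__(self):
        self.active = set()                 # session keys with a running turn
        self.deferred = defaultdict(deque)  # session key -> waiting cron events

    def turn_started(self, session_key):
        self.active.add(session_key)

    def submit(self, session_key, event) -> bool:
        """Return False if the event was deferred, True if normal handling continues."""
        metadata = event.get("metadata", {})
        is_cron = bool(metadata.get("_cron_defer_until_session_idle"))
        if is_cron and session_key in self.active:
            # Goes to the deferred cron queue, never the pending injection queue.
            self.deferred[session_key].append(event)
            return False
        return True

    def turn_finished(self, session_key):
        """Mark the session idle and return the next deferred cron event, if any."""
        self.active.discard(session_key)
        queue = self.deferred.get(session_key)
        return queue.popleft() if queue else None
```

A caller would run `submit` first and only reach the existing `_pending_queues` branch when it returns `True` for a non-cron event.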
## Session History

Do not persist the raw internal execution prompt as a normal user message.

Instead, persist a readable cron trigger event, for example:

```json
{
  "role": "user",
  "content": "Scheduled cron job triggered: daily monitor\n\nCheck ...",
  "_cron_turn": true,
  "cron_job_id": "abc123",
  "cron_job_name": "daily monitor",
  "cron_run_id": "abc123:1770000000000",
  "cron_prompt_ref": {
    "id": "cron.agent_turn.reminder",
    "version": 1,
    "sha256": "..."
  }
}
```

The assistant result should be saved as the normal assistant response for that
turn, with source metadata suitable for WebUI rendering.

This gives future turns useful context without leaking internal instruction text
into the transcript.

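A helper that builds this readable trigger event might look like the following sketch. `make_run_id` and `cron_history_entry` are hypothetical names, and deriving the run id as `<job id>:<epoch milliseconds>` is an assumption read off the example value above.

```python
def make_run_id(job_id: str, now_ms: int) -> str:
    # Assumption: run ids are "<job id>:<epoch milliseconds>".
    return f"{job_id}:{now_ms}"


def cron_history_entry(job_id, job_name, message, run_id, prompt_ref):
    """Build the user-visible trigger event; the rendered prompt is not included."""
    return {
        "role": "user",
        "content": f"Scheduled cron job triggered: {job_name}\n\n{message}",
        "_cron_turn": True,
        "cron_job_id": job_id,
        "cron_job_name": job_name,
        "cron_run_id": run_id,
        "cron_prompt_ref": prompt_ref,
    }
```

Only the job's own `message` and a prompt reference reach the transcript, so internal instruction text stays out of session history.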
## Prompt Traceability

The rendered execution prompt should remain traceable, but it should not be part
of normal session history.

Use a named/versioned prompt reference in session history and save the full
rendered prompt in an internal run record.

Preferred direction:

- Move the cron execution prompt out of `commands.py` into a named template.
- Use a stable prompt id such as `cron.agent_turn.reminder`.
- Store `prompt_ref` and `cron_run_id` in session history.
- Store the full rendered prompt, prompt variables, and errors in an internal
  run record.

Avoid putting full prompt text into `jobs.json`; run records should not make the
cron store grow without bound.

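The split between a compact prompt reference and a full run record could be sketched as below. This is illustrative only: `render_cron_prompt` is a hypothetical helper, `string.Template` is used merely as a convenient standard-library templating tool, and hashing the template text (rather than the rendered prompt) is an assumption this note does not settle.

```python
import hashlib
from string import Template


def render_cron_prompt(template_text, variables,
                       prompt_id="cron.agent_turn.reminder", version=1):
    """Return (prompt_ref, run_record) for one cron run."""
    rendered = Template(template_text).substitute(variables)
    prompt_ref = {
        "id": prompt_id,
        "version": version,
        # Assumption: the digest identifies the template, not the rendered text.
        "sha256": hashlib.sha256(template_text.encode("utf-8")).hexdigest(),
    }
    # The full rendered prompt lives only in the internal run record.
    run_record = {
        "prompt_ref": prompt_ref,
        "rendered_prompt": rendered,
        "variables": dict(variables),
        "error": None,
    }
    return prompt_ref, run_record
```

Session history would store only `prompt_ref`, while the run record holds the rendered text in a separate store so that `jobs.json` does not grow with each run.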
## Visibility and Evaluation

A bound user cron job is a real session turn.

- If it succeeds, save and publish the assistant response.
- Do not pass bound cron responses through `evaluate_response()`.
- Keep `evaluate_response()` only for system/legacy paths where the old behavior
  still applies.
- Avoid states where session history contains a response the user never saw.

If a bound cron job starts executing, it must leave a visible closure in the
session:

- success response
- short failure message
- or an empty-result status message

Full exceptions and diagnostic details belong in the internal run record, not in
the user-facing transcript.

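The three closure cases can be sketched as a single function that separates what the user sees from what the run record keeps. `cron_closure` is a hypothetical helper and the message wording is invented for illustration.

```python
from typing import Optional, Tuple


def cron_closure(result: Optional[str],
                 error: Optional[BaseException] = None) -> Tuple[str, Optional[str]]:
    """Return (visible_text, diagnostic_detail) for a finished bound cron turn."""
    if error is not None:
        # Short message for the transcript; full detail only for the run record.
        return "Scheduled job failed.", repr(error)
    if not (result or "").strip():
        return "Scheduled job finished with no output.", None
    return result, None
```

Every branch returns some visible text, so a started cron turn always leaves a closure in the session.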
## Deleting Sessions

Deleting a session with bound cron jobs should be a two-step operation.

Default delete behavior should block and return the associated cron jobs.
Existing WebUI/API response field names are kept for compatibility:

```json
{
  "deleted": false,
  "blocked_by_automations": true,
  "automations": [
    {"id": "abc123", "name": "daily monitor", "enabled": true}
  ]
}
```

After explicit confirmation, the API may delete the bound user cron jobs and
then delete the session/thread.

Rules:

- Only block on user-created bound jobs whose `payload.session_key` equals the
  session being deleted.
- Do not block on system jobs.
- Do not block on legacy unbound jobs.
- In unified-session mode, WebUI-created cron jobs still belong to the concrete
  `websocket:*` chat that created them, so deleting that chat should block on or
  delete those jobs.
- If the user manually deletes files outside the WebUI/API, do not try to
  compensate.

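The two-step delete can be sketched as a pure planning function over the job list. `plan_session_delete`, the `confirm` flag, the job dict shape, and the `removed_job_ids` field in the success response are assumptions for illustration; only the blocked response mirrors the field names shown above.

```python
def plan_session_delete(session_key, jobs, confirm=False):
    """Decide whether deleting a session is blocked by bound cron jobs."""
    # Only user-created bound jobs owned by this exact session can block.
    # System jobs and legacy unbound jobs never match this filter.
    blocking = [
        job for job in jobs
        if job["payload"].get("kind") == "agent_turn"
        and job["payload"].get("session_key") == session_key
    ]
    if blocking and not confirm:
        return {
            "deleted": False,
            "blocked_by_automations": True,
            "automations": [
                {"id": j["id"], "name": j["name"], "enabled": j["enabled"]}
                for j in blocking
            ],
        }
    # After confirmation (or with nothing bound), remove the jobs, then the session.
    return {"deleted": True, "removed_job_ids": [j["id"] for j in blocking]}
```

Because ownership is compared against the stored `payload.session_key`, a concrete `websocket:*` chat blocks on its own jobs even when unified-session mode is enabled.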
## Unified Session Mode

When `unified_session` is enabled, WebUI-created cron jobs should still bind to
the concrete WebUI chat that created them, for example `websocket:<chat_id>`.
The cron trigger is delivered through that original chat. `AgentLoop` then
applies `unified_session` normally, so the turn's memory/session context may be
`unified:default` even though the cron job's ownership key is concrete.

- Each WebUI chat should display cron jobs owned by that concrete chat.
- Individual WebUI thread deletion should block on cron jobs owned by that
  concrete `websocket:*` thread.
- Toggling `unified_session` does not migrate existing cron jobs. Existing jobs
  keep their stored `payload.session_key` and continue to execute against that
  owner until explicitly removed or recreated.

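The distinction between the ownership key and the memory context can be shown in a few lines. Both helpers are hypothetical, and mapping every session to `unified:default` when the flag is on is a simplification of whatever `AgentLoop` actually does.

```python
UNIFIED_KEY = "unified:default"


def memory_session_key(owner_key: str, unified_session: bool) -> str:
    """Key used for the turn's memory/history; ownership is unaffected."""
    return UNIFIED_KEY if unified_session else owner_key


def jobs_owned_by(chat_key: str, jobs: list) -> list:
    """Jobs a WebUI chat should list and block deletion on."""
    # Always compare against the stored, concrete payload.session_key.
    return [j for j in jobs if j["payload"].get("session_key") == chat_key]
```

Toggling the flag changes only the result of `memory_session_key`; `jobs_owned_by` keeps returning the same jobs because stored keys are never rewritten.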
## WebUI Scope

This change should not grow into a global scheduler/task manager.

Keep the scope focused:

- Fix cron/session/memory semantics for new bound jobs.
- Preserve legacy job behavior.
- Add deletion protection for sessions with bound cron jobs.
- Update the existing WebUI panel that lists scheduled jobs only as needed for
  the new bound-job status.

Do not add deterministic legacy migration, legacy binding UI, or a global
calendar/task manager in this change.

## Manual Run

Do not add a user-visible "run now" feature as part of this design.

`CronService.run_job()` may remain an internal/test helper. It should not become
a product surface, and the implementation should avoid creating a separate
execution path that behaves differently from scheduled runs.

## Non-Goals

- No legacy migration.
- No automatic binding of legacy jobs.
- No runtime-context prompt asking the model to bind jobs.
- No new global scheduler/task manager.
- No new delivery-target abstraction.
- No user-visible manual cron run.