The old prompt framed cron firing as a "task triggered" status report,
which led the agent to reply with things like "Done ✅ 已提醒
U0AV8BJPV8D 喝水" ("reminded U0AV8BJPV8D to drink water") — exposing
the user id and reading like a system log
instead of a friendly reminder. Reword it to instruct the agent to
speak directly to the user and forbid status-style language.
Made-with: Cursor
Route heartbeat, cron, and message-tool deliveries through one gateway helper so user-visible proactive messages are available when the channel replies.
Made-with: Cursor
Calling GitHub Copilot with `gpt-5.*` / `o*` models (e.g.
`github_copilot/gpt-5.4`, `github_copilot/gpt-5.4-mini`) failed with a
chain of misleading errors:
1. `Unsupported parameter: 'max_tokens' is not supported with this
model. Use 'max_completion_tokens' instead.`
2. `model "gpt-5.4-mini" is not accessible via the /chat/completions
endpoint` (`unsupported_api_for_model`).
3. `The requested model is not supported.` (`model_not_supported`)
even after routing to /responses.
Root causes (each one masked the next):
* The `github_copilot` ProviderSpec did not opt into
`supports_max_completion_tokens`, so `_build_kwargs` always sent the
legacy `max_tokens` parameter that GPT-5/o-series reject.
* `_should_use_responses_api` was hard-gated to
`spec.name == "openai"` plus a direct-OpenAI base URL, so the
GitHub Copilot backend always went through /chat/completions even
for models the Copilot gateway exposes only via /responses
(e.g. `gpt-5.4-mini`).
* When /responses did fail on github_copilot, the existing
"compatibility marker" heuristic silently fell back to
/chat/completions — which can never succeed for these models — so
the real upstream error was hidden.
* `_build_responses_body` did not honour `spec.strip_model_prefix`,
so the request body sent `model="github_copilot/gpt-5.4-mini"`
(with the routing prefix), which the Copilot gateway rejects with
`model_not_supported`. (`_build_kwargs` already stripped it; this
branch was missed.)
Fix:
* registry.py: set `supports_max_completion_tokens=True` on the
`github_copilot` spec so requests use `max_completion_tokens`.
* openai_compat_provider.py:
- `_should_use_responses_api` now also allows the
`github_copilot` spec, and skips the direct-OpenAI base check
for it (the Copilot gateway is its own base URL).
- `_build_responses_body` now strips the model routing prefix
when `spec.strip_model_prefix` is set, matching `_build_kwargs`.
- `chat` / `chat_stream` no longer fall back from /responses to
/chat/completions on the `github_copilot` spec: the fallback
cannot succeed for GPT-5/o-series and would mask the real
gateway error.
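The gating logic described above can be sketched roughly like this (a sketch only: `ProviderSpec` fields, helper names, and the base-URL check are assumptions standing in for the real `_should_use_responses_api`):

```python
from dataclasses import dataclass

@dataclass
class ProviderSpec:
    name: str
    strip_model_prefix: bool = False

# Assumption: GPT-5 and o-series model names mark /responses-only models.
RESPONSES_ONLY_PREFIXES = ("gpt-5", "o1", "o3", "o4")

def strip_prefix(spec: ProviderSpec, model: str) -> str:
    # Mirror _build_kwargs: drop the "github_copilot/" routing prefix
    # before the model name is inspected or sent upstream.
    if spec.strip_model_prefix and "/" in model:
        return model.split("/", 1)[1]
    return model

def should_use_responses_api(spec: ProviderSpec, model: str, base_url: str) -> bool:
    bare = strip_prefix(spec, model)
    needs_responses = bare.startswith(RESPONSES_ONLY_PREFIXES)
    if spec.name == "github_copilot":
        # The Copilot gateway is its own base URL, so skip the
        # direct-OpenAI check that gated this path before the fix.
        return needs_responses
    return (
        spec.name == "openai"
        and base_url.startswith("https://api.openai.com")
        and needs_responses
    )
```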
Tests:
* tests/cli/test_commands.py: switched the
`test_github_copilot_provider_refreshes_client_api_key_before_chat`
fixture model from `gpt-5.1` to `gpt-4` so it continues to exercise
the /chat/completions code path it was designed for (gpt-5.1 now
correctly routes to /responses on github_copilot).
* `pytest tests/providers/ tests/cli/test_commands.py` — 314 passed.
* Verified end-to-end against the live Copilot gateway with both
`github_copilot/gpt-5.4` and `github_copilot/gpt-5.4-mini`.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On filesystems with write-back caching (rclone VFS, NFS, FUSE mounts)
the OS page cache may buffer recent session writes. If the process is
killed before the cache flushes, the most recent conversation turns are
silently lost — causing the agent to "forget" recent context and
respond to stale history on the next startup.
Changes:
- session/manager.py: add fsync=True option to save() that flushes the
file and its parent directory to durable storage. Add flush_all() that
re-saves every cached session with fsync. Default save() behavior is
unchanged (no fsync) to avoid performance regression in normal
operation.
- cli/commands.py: call agent.sessions.flush_all() in the gateway
shutdown finally block, after stopping heartbeat/cron/channels.
- tests/session/test_session_fsync.py: 8 tests covering fsync flag
behavior, flush_all with empty/multiple/errored sessions, and
data survival across simulated process restart.
- tests/cli/test_commands.py: add sessions attribute to _FakeAgentLoop
so the gateway health endpoint test passes with the new shutdown
flush.
Cron jobs now pass on_progress=_silent to process_direct, matching
the heartbeat pattern. Previously, tool hints and streaming deltas
were published to the user channel via bus during execution, but the
final response could be rejected by evaluate_response — leaving users
with confusing partial output and no conclusion.
Closes #3319
Replace fixed sleep-based waits with condition polling in cron tests and mock the restart delay in CLI restart tests to reduce suite runtime without changing behavior.
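A generic condition-polling helper of the kind that replaces fixed sleeps (illustrative, not the suite's actual helper):

```python
import time

def wait_for(condition, timeout: float = 2.0, interval: float = 0.01) -> bool:
    # Poll until condition() is truthy or the timeout expires. Returns
    # immediately on success instead of always sleeping the worst case.
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```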
The old test `test_make_console_uses_force_terminal` hardcoded
`force_terminal is True`, which contradicts the fix: we now defer
to sys.stdout.isatty() so piped / non-TTY output gets plain text
instead of ANSI escape codes.
Split into two tests covering both branches:
- test_make_console_force_terminal_when_stdout_is_tty: TTY path
(force_terminal=True, rich output)
- test_make_console_force_terminal_false_when_stdout_is_not_tty:
non-TTY path (force_terminal=False, plain text) — regression
guard for the bug reported in #3265
Co-authored with Claude Opus 4.7
- Pass the resolved self.context_window_tokens to Consolidator instead
  of the raw parameter, which could be None, preventing consolidation
  failures
- Calculate percentage against input budget (ctx - max_completion - 1024)
instead of raw context window, consistent with Consolidator/snip formulas
- Pass actual max_completion_tokens from provider to build_status_content
- Cap percentage display at 999 to prevent runaway values
- Add tests for budget-based percentage and cap behavior
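The percentage formula described above, as a sketch (the 1024-token reserve comes from the commit; names and exact code are assumptions):

```python
def context_percent(used_tokens: int, context_window: int,
                    max_completion_tokens: int, cap: int = 999) -> int:
    # Input budget = context window minus the completion budget minus a
    # 1024-token reserve, matching the Consolidator/snip formulas, then
    # capped so runaway values never render as e.g. "12000%".
    budget = context_window - max_completion_tokens - 1024
    pct = round(100 * used_tokens / budget)
    return min(pct, cap)
```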
Lock the /status task counter to the actual stop scope by asserting
that it sums unfinished dispatch tasks together with the running
subagents for the current session.
Made-with: Cursor
Bind the gateway health listener to localhost by default and reduce the probe response to a minimal status payload so accidental public exposure leaks less information.
Made-with: Cursor
Keep the gateway health endpoint patch current with the latest gateway runtime changes, and lock the new HTTP routes in with CLI regression coverage and README guidance.
Made-with: Cursor
Allow config.json to reference environment variables via ${VAR_NAME}
syntax. Variables are resolved at runtime by resolve_config_env_vars(),
keeping the raw templates in the Pydantic model so save_config()
preserves them. This lets secrets live in a separate env file
(e.g. loaded by systemd EnvironmentFile=) instead of plain text
in config.json.
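A minimal sketch of the substitution step (the real resolve_config_env_vars operates on the Pydantic model; this shows the core idea on plain dicts):

```python
import os
import re

_ENV_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def resolve_env_vars(value):
    # Recursively substitute ${VAR_NAME} in strings; unknown variables
    # are left untouched. The raw templates stay in the stored config,
    # so saving the config preserves them.
    if isinstance(value, dict):
        return {k: resolve_env_vars(v) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env_vars(v) for v in value]
    if isinstance(value, str):
        return _ENV_RE.sub(lambda m: os.environ.get(m.group(1), m.group(0)), value)
    return value
```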
Fixes #2591
The "nanobot is thinking..." spinner was printing ANSI escape codes
literally in some terminals, causing garbled output like:
?[2K?[32m⠧?[0m ?[2mnanobot is thinking...?[0m
Root causes:
1. Console created without force_terminal=True, so Rich couldn't
reliably detect terminal capabilities
2. Spinner continued running during user input prompt, conflicting
with prompt_toolkit
Changes:
- Set force_terminal=True in _make_console() for proper ANSI handling
- Add stop_for_input() method to StreamRenderer
- Call stop_for_input() before reading user input in interactive mode
- Add tests for the new functionality
Replace single-stage MemoryConsolidator with a two-stage architecture:
- Consolidator: lightweight token-budget triggered summarization,
appends to HISTORY.md with cursor-based tracking
- Dream: cron-scheduled two-phase processor that analyzes HISTORY.md
and updates SOUL.md, USER.md, MEMORY.md via AgentRunner with
edit_file tools for surgical, fault-tolerant updates
New files: MemoryStore (pure file I/O), Dream class, DreamConfig,
/dream and /dream-log commands. 89 tests covering all components.
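The Consolidator's cursor-based tracking can be sketched as follows (all names hypothetical; a string join stands in for the real summarization call, and HISTORY.md is modeled as a list):

```python
def consolidate(turns: list[str], history: list[str], cursor: int) -> int:
    # Summarize only the turns past the cursor, append a single entry
    # to the history log, and return the new cursor so the next run
    # picks up exactly where this one stopped.
    fresh = turns[cursor:]
    if fresh:
        history.append("SUMMARY: " + " | ".join(fresh))
    return len(turns)
```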
Address PR review feedback by avoiding an async method reference as the OpenAI client api_key.
Initialize the client with a placeholder key, refresh the Copilot token before each chat/chat_stream call, and update the runtime client api_key before dispatch.
Add a regression test that verifies the client api_key is refreshed to a real string before chat requests.
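The refresh-before-dispatch pattern, sketched with stand-in names (FakeClient and fetch_copilot_token are assumptions, not the real provider API):

```python
import asyncio

class FakeClient:
    def __init__(self):
        # Initialized with a placeholder string, never an async method
        # reference, so the key attribute is always a real str.
        self.api_key = "placeholder"

async def fetch_copilot_token() -> str:
    # Stands in for the real Copilot token exchange.
    return "real-token"

async def chat(client: FakeClient, prompt: str) -> str:
    # Refresh the token and update the client's key before dispatch.
    client.api_key = await fetch_copilot_token()
    assert isinstance(client.api_key, str)
    return f"ok: {prompt}"
```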
Generated with GitHub Copilot, GPT-5.4.
Implement the real GitHub device flow and Copilot token exchange for the GitHub Copilot provider.
Also route github-copilot models through a dedicated backend and strip the provider prefix before API requests.
Add focused regression coverage for provider wiring and model normalization.
Generated with GitHub Copilot, GPT-5.4.
Read serve host, port, and timeout from config by default, keep CLI flags higher priority, and bind the API to localhost by default for safer local usage.
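The precedence reads as CLI flag > config value > localhost-first default, roughly (key names and default values here are illustrative assumptions):

```python
def resolve_serve_options(cli_flags: dict, config: dict) -> dict:
    # None means "not provided" at each layer; later layers win.
    # Defaulting host to 127.0.0.1 keeps the API local unless the
    # user explicitly binds elsewhere.
    defaults = {"host": "127.0.0.1", "port": 8080, "timeout": 30}
    resolved = dict(defaults)
    resolved.update({k: v for k, v in config.items() if v is not None})
    resolved.update({k: v for k, v in cli_flags.items() if v is not None})
    return resolved
```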