Adds GeminiImageGenerationClient covering both Imagen 4 (:predict) and
Gemini Flash (:generateContent), wires the gemini ProviderConfig through
the SDK, API server, and gateway entry points, and updates the
image-generation docs and skill. Errors from the Gemini endpoints are
logged and surface with the HTTP status and parsed message instead of an
empty string.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add MiniMaxImageGenerationClient with support for:
- Text-to-image generation via MiniMax image-01 model
- Reference image support (subject_reference)
- Aspect ratio selection
- Proper error handling aligned with existing providers
Wire up MiniMax provider config in ImageGenerationTool, gateway,
serve, and Nanobot class.
Extract the [M] Model Presets interactive CRUD screen from PR #3696
and adapt it to the current main branch schema (fallback_models
instead of fallback_presets). Adds preset cache, field handlers for
model_preset/provider/fallback_models, and 9 new tests.
Add remark-breaks plugin so that single newlines in assistant messages
(such as /help output) render as line breaks instead of being collapsed
into a single paragraph by standard markdown behavior.
Replace AutoCompact._archive() direct session mutation with delegation
to Consolidator.compact_idle_session(). Remove _split_unconsolidated()
method since that logic now lives inside compact_idle_session.
All session mutation for idle compaction now goes through the
Consolidator's lock, eliminating the race condition between
background token consolidation and idle TTL compaction.
Changes:
- autocompact.py: rewrite _archive() to call compact_idle_session,
remove _split_unconsolidated(), clean up unused imports
- test_autocompact_unit.py: replace TestArchive/TestSplitUnconsolidated
with TestArchiveDelegates that verifies delegation behavior
- test_auto_compact.py: convert all consolidator.archive mocks to
consolidator.compact_idle_session mocks via _make_fake_compact helper
When background consolidation runs with a stale session reference (captured
before AutoCompact replaced the session via compact_idle_session), it could
operate on outdated data. Now, after acquiring the per-session lock, the
method refreshes its session reference from SessionManager.get_or_create().
If the session was replaced, it swaps in the fresh reference before doing
any consolidation work.
This prevents a race where AutoCompact truncates an idle session while a
background maybe_consolidate_by_tokens call is in flight with the old
session object.
Add Consolidator.compact_idle_session(session_key, max_suffix=8) that
performs hard-truncation of idle sessions under the per-session
consolidation lock. This is the single lock-protected path for AutoCompact
to use instead of modifying session state directly, fixing the race
condition between AutoCompact and Consolidator.
Behavior:
- Acquires per-session consolidation lock
- Invalidates cache and reloads fresh from disk
- Splits unconsolidated tail into archive prefix and retained suffix
- Archives prefix via LLM (with raw_archive fallback on failure)
- Persists _last_summary in session metadata on success
- Returns summary text, None on LLM failure, or '' if nothing to archive
Tests: 6 new tests covering prefix archival, empty session timestamp
refresh, (nothing) summary exclusion, LLM failure fallback,
last_consolidated offset, and lock acquisition verification.
The `docker run` example for `gateway` in `docs/deployment.md` had drifted from
the canonical configuration in `docker-compose.yml`:
- It omitted the security flags that `docker-compose.yml` already declares
(`cap_drop: ALL` + `cap_add: SYS_ADMIN` + unconfined apparmor/seccomp).
These are required whenever `tools.exec.sandbox: "bwrap"` is enabled, because
bwrap needs CAP_SYS_ADMIN for user namespaces; without them bwrap exits with
`clone3: Operation not permitted` and exec tools silently fail.
- It omitted `-p 8765:8765`, even though both the bundled `docker-compose.yml`
and `Dockerfile` (`EXPOSE 18790 8765`) already expose the WebSocket channel
/ WebUI port; users following the docs would get a reachable gateway health
endpoint but an unreachable WebUI.
This change keeps the two paths in sync so anyone reading deployment.md and
using `docker run` directly gets the same security posture and port surface
as the Compose path.
Also adds a short `!IMPORTANT` note documenting that `gateway.host` and
`channels.websocket.host` default to `127.0.0.1` (set in
`nanobot/config/schema.py:GatewayConfig`). Docker `-p` cannot forward to the
container's loopback interface, so the user must set both binds to `0.0.0.0`
in `config.json` for the published ports to actually be reachable. This is
the symptom reported as items 2 + 3 of #3873; items 1 + 4 of that issue are
already resolved on `main` (`Dockerfile` line 49 already exposes both ports,
and README.md lines 218-220 already reflect that the WebUI ships in the wheel).
Docs only, no code changes.
Signed-off-by: voidborne-d <258577966+voidborne-d@users.noreply.github.com>
- Note that any string field supports ${VAR_NAME} and resolved values are
never written back to disk.
- Document the failure mode for unset variables.
- Add MCP (stdio env + HTTP headers) and web-search examples.
- Add Docker, direnv, and secret-manager (1Password / pass / Bitwarden)
delivery patterns alongside the existing systemd example.
- Replace plaintext apiKey values in tools.web.search examples (Brave,
Tavily, Jina, Kagi, Olostep) with ${PROVIDER_API_KEY} placeholders so
the docs stop modelling the anti-pattern.
- Cross-link from the Security section.
Refs: HKUDS/nanobot#2172
Batch stream deltas, window long transcripts, lazy-load syntax highlighting, and refine activity/composer interactions.
Add title refresh retries plus tests for streaming, windowing, code blocks, and live activity behavior.
The xiaomi_mimo ProviderSpec carries thinking_style="thinking_type", but
gateway providers (OpenRouter etc.) route MiMo under their own spec
which has no thinking_style. As a result, `reasoning_effort="none"` was
silently ignored: `{"thinking": {"type": "disabled"}}` was never
injected and responses still contained reasoning_content.
Mirror the Kimi pattern that already handles the same problem: add an
explicit _MIMO_THINKING_MODELS allowlist (mimo-v2.5-pro, mimo-v2.5,
mimo-v2-pro, mimo-v2-omni — per Xiaomi docs), an _is_mimo_thinking_model
helper that strips publisher prefixes ("xiaomi/mimo-v2.5-pro" matches),
and a sibling branch in _build_kwargs that injects the thinking payload
by model name. mimo-v2-flash is intentionally excluded — it has no
thinking mode.
Also include MiMo in the explicit_thinking predicate so the
reasoning_content backfill (#3554, #3584) covers the gateway path
consistently with the direct path.
Tests cover the gateway disable/enable signals, bare-slug fallback,
flash exclusion, and a non-MiMo sanity check.
_drain_pending injected a full runtime context block (including goal
state) into every injected user message, but the initial message already
carries runtime context via build_messages(). This caused goal state to
appear multiple times in the LLM context window within a single turn,
wasting tokens (up to 4000 chars per duplicate).
Now _drain_pending only passes the raw user content without runtime
context. The initial turn message remains the sole carrier.
The previous regex r"(?:^|[;&|]\s*)format\b" incorrectly blocked
commands containing URL parameters like &format=json. Added negative
lookahead (?!=) so format= (URL param key=value) is allowed while
standalone format commands (e.g. ;format, &format, |format) remain
blocked. Added test cases for both blocking and allowing scenarios.
Centralize runner_wall_llm_timeout_s in session goal_state metadata helpers so
spawned subagents inherit the same policy as AgentLoop without coupling to
long_task. Pass optional resolver into SubagentManager and add tests.
Co-authored-by: Cursor <cursoragent@cursor.com>
The Development Setup block instructs new contributors to run
`ruff format nanobot/`, but the tree predates the formatter and many
lines exceed the configured 100-char limit (E501 is ignored). Running
the command as documented produces an ~80-file unrelated diff that
buries real changes. Document this and recommend formatting only the
files actually touched.
Startup no longer triggers preloadMarkdownText (#3746). Restore the named
export so MessageBubble can still warm the lazy markdown chunk when the
reasoning panel opens (compatible with current main).
Co-authored-by: Cursor <cursoragent@cursor.com>