2535 Commits

Author SHA1 Message Date
chengyongru
d7a73093a8 refactor: remove dead image media attachment code
- Remove generated_image_paths_from_messages() and _extract_text_payload() from artifacts.py (no runtime callers)
- Remove session_attachments.py entirely (merge_turn_media_into_last_assistant and stage_media_paths_for_session_replay had no runtime callers)
- Remove test_session_media_persist.py and the orphaned test in test_artifacts.py
2026-05-19 15:35:19 +08:00
chengyongru
59548b0a04 docs(image-generation): collapse redundant Quick Setup examples
Keep one minimal OpenRouter example and link to Provider Notes
for AIHubMix, MiniMax, and Gemini configuration.
2026-05-19 15:35:19 +08:00
chengyongru
fc1c8ea770 fix(image-generation): let LLM deliver images via message tool instead of runtime media attachment
The runtime media-attachment mechanism was broken for streaming channels
(e.g. WebSocket): the _streamed flag caused _send_once to skip the final
OutboundMessage that carried generated media, so images were never delivered.

Rather than adding complex coordination between streaming and media delivery,
delegate image delivery to the LLM: after generate_image returns artifact
paths, the next_step prompt now instructs the LLM to call the message tool
with the paths in the media parameter. This works uniformly across all
channels, streaming or not.

Remove generated_media from TurnContext, _assemble_outbound, and _state_save.
Update prompts in identity.md, SKILL.md, message tool description, and
artifacts.py to reflect the new flow.
2026-05-19 15:35:19 +08:00
chengyongru
99e4d25d4c docs(image-generation): add MiniMax to docs and skill
Updates docs/image-generation.md and skills/image-generation/SKILL.md to
include MiniMax configuration examples, supported aspect ratios, and
troubleshooting references. Also updates the supported provider list to
include minimax alongside openrouter, aihubmix, and gemini.
2026-05-19 15:35:19 +08:00
chengyongru
c588d56a77 refactor(image-generation): introduce provider registry to eliminate manual wiring
Adds ImageGenerationProvider ABC with shared __init__, _http_post(), and
_require_images(). Introduces _IMAGE_GEN_PROVIDERS registry with
register/get/image_gen_provider_configs() helpers.

Four existing providers (OpenRouter, AIHubMix, Gemini, MiniMax) now inherit
from the base class and self-register. Adding a new provider only requires
writing one class + one registration line.

Eliminates if/else chains in the tool dispatch and hardcoded provider config
dicts in commands.py (3 sites) and nanobot.py (1 site). Fixes the agent CLI
command missing image_generation_provider_configs entirely.

Also simplifies test monkeypatch targets to patch the registry lookup.
2026-05-19 15:35:19 +08:00
Kaloyan Tenchov
7367741ac1 feat(image-generation): add Gemini provider support
Adds GeminiImageGenerationClient covering both Imagen 4 (:predict) and
Gemini Flash (:generateContent), wires the gemini ProviderConfig through
the SDK, API server, and gateway entry points, and updates the
image-generation docs and skill. Errors from the Gemini endpoints are
logged and surface with the HTTP status and parsed message instead of an
empty string.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 15:35:19 +08:00
yaotutu
4e0d872588 feat: add MiniMax image generation provider support
Add MiniMaxImageGenerationClient with support for:
- Text-to-image generation via MiniMax image-01 model
- Reference image support (subject_reference)
- Aspect ratio selection
- Proper error handling aligned with existing providers

Wire up MiniMax provider config in ImageGenerationTool, gateway,
serve, and Nanobot class.
2026-05-19 15:35:19 +08:00
Xubin Ren
7411afa0e7 fix(webui): sync remark-breaks lockfile 2026-05-18 22:47:33 +08:00
Xubin Ren
c4293a7835 feat(providers): add Ant Ling support 2026-05-18 22:13:52 +08:00
Xubin Ren
40c1d83b32 fix(ci): update live file edit test expectations 2026-05-18 22:01:33 +08:00
Xubin Ren
0537cc1682 feat(webui): render live file edit activity 2026-05-18 22:01:33 +08:00
Xubin Ren
7e2dbdef7d feat(webui): stream live file edit events 2026-05-18 22:01:33 +08:00
chengyongru
d4ade8f680 feat(cli): add Model Preset wizard to onboard
Extract the [M] Model Presets interactive CRUD screen from PR #3696
and adapt it to the current main branch schema (fallback_models
instead of fallback_presets). Adds preset cache, field handlers for
model_preset/provider/fallback_models, and 9 new tests.
2026-05-18 15:13:41 +08:00
chengyongru
28d0f8560e fix(webui): preserve single newlines in markdown rendering
Add remark-breaks plugin so that single newlines in assistant messages
(such as /help output) render as line breaks instead of being collapsed
into a single paragraph by standard markdown behavior.
2026-05-18 15:12:27 +08:00
Xubin Ren
ba38f90832
Merge PR #3877: feat(webui+agent): optimize streaming, activity rendering, and runtime sync
feat(webui+agent): optimize streaming, activity rendering, and runtime sync
2026-05-18 02:04:36 +08:00
Xubin Ren
eb3aed359f Refine file edit progress gating 2026-05-18 01:59:55 +08:00
Xubin Ren
4445fcc8b9 refactor(cli): localize reasoning buffer state 2026-05-18 01:34:08 +08:00
liyazhou
b67205f5aa fix(cli): buffer reasoning tokens to avoid one-token-per-line display 2026-05-18 01:34:08 +08:00
Xubin Ren
de8761f25a fix(test): add gateway llm runtime fake 2026-05-18 01:19:45 +08:00
Xubin Ren
8708ccea86 Merge branch 'main' of https://github.com/HKUDS/nanobot into codex/webui-performance 2026-05-18 01:18:28 +08:00
Xubin Ren
eb0ff3ad1d fix(memory): refresh session before empty guard 2026-05-18 01:16:47 +08:00
chengyongru
c58a360b25 fix(test): seed get_or_create mock for session-refresh guard compatibility 2026-05-18 01:16:47 +08:00
chengyongru
5bb94edc99 refactor(autocompact): delegate _archive to Consolidator.compact_idle_session
Replace AutoCompact._archive() direct session mutation with delegation
to Consolidator.compact_idle_session(). Remove _split_unconsolidated()
method since that logic now lives inside compact_idle_session.

All session mutation for idle compaction now goes through the
Consolidator's lock, eliminating the race condition between
background token consolidation and idle TTL compaction.

Changes:
- autocompact.py: rewrite _archive() to call compact_idle_session,
  remove _split_unconsolidated(), clean up unused imports
- test_autocompact_unit.py: replace TestArchive/TestSplitUnconsolidated
  with TestArchiveDelegates that verifies delegation behavior
- test_auto_compact.py: convert all consolidator.archive mocks to
  consolidator.compact_idle_session mocks via _make_fake_compact helper
2026-05-18 01:16:47 +08:00
chengyongru
888d54790d fix(memory): add session-refresh guard to maybe_consolidate_by_tokens
When background consolidation runs with a stale session reference (captured
before AutoCompact replaced the session via compact_idle_session), it could
operate on outdated data. Now, after acquiring the per-session lock, the
method refreshes its session reference from SessionManager.get_or_create().
If the session was replaced, it swaps in the fresh reference before doing
any consolidation work.

This prevents a race where AutoCompact truncates an idle session while a
background maybe_consolidate_by_tokens call is in flight with the old
session object.
2026-05-18 01:16:47 +08:00
chengyongru
48d35bd2d9 feat(consolidator): add compact_idle_session method with lock-protected truncation
Add Consolidator.compact_idle_session(session_key, max_suffix=8) that
performs hard-truncation of idle sessions under the per-session
consolidation lock. This is the single lock-protected path for AutoCompact
to use instead of modifying session state directly, fixing the race
condition between AutoCompact and Consolidator.

Behavior:
- Acquires per-session consolidation lock
- Invalidates cache and reloads fresh from disk
- Splits unconsolidated tail into archive prefix and retained suffix
- Archives prefix via LLM (with raw_archive fallback on failure)
- Persists _last_summary in session metadata on success
- Returns summary text, None on LLM failure, or '' if nothing to archive

Tests: 6 new tests covering prefix archival, empty session timestamp
refresh, (nothing) summary exclusion, LLM failure fallback,
last_consolidated offset, and lock acquisition verification.
2026-05-18 01:16:47 +08:00
Xubin Ren
fce1550814 fix(webui): refresh bootstrap token before expiry 2026-05-18 00:53:36 +08:00
voidborne-d
bf8a6e35fd docs(deployment): match docker run gateway example to docker-compose.yml (refs #3873)
The `docker run` example for `gateway` in `docs/deployment.md` had drifted from
the canonical configuration in `docker-compose.yml`:

- It omitted the security flags that `docker-compose.yml` already declares
  (`cap_drop: ALL` + `cap_add: SYS_ADMIN` + unconfined apparmor/seccomp).
  These are required whenever `tools.exec.sandbox: "bwrap"` is enabled, because
  bwrap needs CAP_SYS_ADMIN for user namespaces; without them bwrap exits with
  `clone3: Operation not permitted` and exec tools silently fail.
- It omitted `-p 8765:8765`, even though both the bundled `docker-compose.yml`
  and `Dockerfile` (`EXPOSE 18790 8765`) already expose the WebSocket channel
  / WebUI port; users following the docs would get a reachable gateway health
  endpoint but an unreachable WebUI.

This change keeps the two paths in sync so anyone reading deployment.md and
using `docker run` directly gets the same security posture and port surface
as the Compose path.

Also adds a short `!IMPORTANT` note documenting that `gateway.host` and
`channels.websocket.host` default to `127.0.0.1` (set in
`nanobot/config/schema.py:GatewayConfig`). Docker `-p` cannot forward to the
container's loopback interface, so the user must set both binds to `0.0.0.0`
in `config.json` for the published ports to actually be reachable. This is
the symptom reported as items 2 + 3 of #3873; items 1 + 4 of that issue are
already resolved on `main` (`Dockerfile` line 49 already exposes both ports,
and README.md lines 218-220 already reflect that the WebUI ships in the wheel).

Docs only, no code changes.

Signed-off-by: voidborne-d <258577966+voidborne-d@users.noreply.github.com>
2026-05-18 00:45:49 +08:00
Xubin Ren
f017e209da docs(configuration): align Docker env-file example 2026-05-18 00:45:34 +08:00
olgagaga
5a34504b76 docs(configuration): expand "Environment Variables for Secrets" section
- Note that any string field supports ${VAR_NAME} and resolved values are
  never written back to disk.
- Document the failure mode for unset variables.
- Add MCP (stdio env + HTTP headers) and web-search examples.
- Add Docker, direnv, and secret-manager (1Password / pass / Bitwarden)
  delivery patterns alongside the existing systemd example.
- Replace plaintext apiKey values in tools.web.search examples (Brave,
  Tavily, Jina, Kagi, Olostep) with ${PROVIDER_API_KEY} placeholders so
  the docs stop modelling the anti-pattern.
- Cross-link from the Security section.

Refs: HKUDS/nanobot#2172
2026-05-18 00:45:34 +08:00
Xubin Ren
af26ed0041 fix(heartbeat): remove unused runtime import 2026-05-18 00:40:31 +08:00
Xubin Ren
112f40ad67 fix(agent): refresh llm runtime for background tasks 2026-05-18 00:35:12 +08:00
Xubin Ren
2f323e24c1 fix(webui): polish session titles and status 2026-05-17 23:52:50 +08:00
Xubin Ren
361f31c0e4 fix(webui): use portal file reference tooltips 2026-05-17 23:52:29 +08:00
Xubin Ren
945f208d38 feat(webui): render file edit activity 2026-05-17 23:52:14 +08:00
Xubin Ren
c8bb04a8fe feat(webui): persist agent activity events 2026-05-17 23:51:52 +08:00
Xubin Ren
4b5de66c58 Polish WebUI streaming and provider settings 2026-05-17 17:41:33 +08:00
Xubin Ren
9340567f2d Fix duplicate reasoning display 2026-05-17 17:11:38 +08:00
Xubin Ren
e5be4dac7a Optimize WebUI streaming and long history rendering
Batch stream deltas, window long transcripts, lazy-load syntax highlighting, and refine activity/composer interactions.

Add title refresh retries plus tests for streaming, windowing, code blocks, and live activity behavior.
2026-05-17 17:04:57 +08:00
Xubin Ren
175b58e259 fix(docker): document bundled webui port 2026-05-17 15:51:04 +08:00
huanglei.214
3bf8de047a fix docker build 2026-05-17 15:51:04 +08:00
chengyongru
400f822601 fix(providers): recognize Chinese rate-limit marker '访问量过大' as transient error 2026-05-17 14:25:20 +08:00
Xubin Ren
9fb9d7afcb docs: update README with v0.2.0 release details, including new features and improvements 2026-05-16 15:22:32 +00:00
Xubin Ren
c018c3fb6a chore(release): bundle webui into wheel and prep 0.2.0 v0.2.0 2026-05-16 13:38:11 +00:00
olgagaga
0ca0fe2221 fix(providers): wire MiMo thinking control on gateway providers (#3845)
The xiaomi_mimo ProviderSpec carries thinking_style="thinking_type", but
gateway providers (OpenRouter etc.) route MiMo under their own spec
which has no thinking_style. As a result, `reasoning_effort="none"` was
silently ignored: `{"thinking": {"type": "disabled"}}` was never
injected and responses still contained reasoning_content.

Mirror the Kimi pattern that already handles the same problem: add an
explicit _MIMO_THINKING_MODELS allowlist (mimo-v2.5-pro, mimo-v2.5,
mimo-v2-pro, mimo-v2-omni — per Xiaomi docs), an _is_mimo_thinking_model
helper that strips publisher prefixes ("xiaomi/mimo-v2.5-pro" matches),
and a sibling branch in _build_kwargs that injects the thinking payload
by model name. mimo-v2-flash is intentionally excluded — it has no
thinking mode.

Also include MiMo in the explicit_thinking predicate so the
reasoning_content backfill (#3554, #3584) covers the gateway path
consistently with the direct path.

Tests cover the gateway disable/enable signals, bare-slug fallback,
flash exclusion, and a non-MiMo sanity check.
2026-05-16 20:46:34 +08:00
chengyongru
8a819dda1e fix(agent): remove duplicate runtime context injection in mid-turn drain
_drain_pending injected a full runtime context block (including goal
state) into every injected user message, but the initial message already
carries runtime context via build_messages(). This caused goal state to
appear multiple times in the LLM context window within a single turn,
wasting tokens (up to 4000 chars per duplicate).

Now _drain_pending only passes the raw user content without runtime
context. The initial turn message remains the sole carrier.
2026-05-16 20:46:08 +08:00
chengyongru
45eacc3a98 docs: update CLAUDE.md to reflect current codebase state
- Update channels list: add WeCom, DingTalk, Email, MoChat, MS Teams
- Update providers: add Bedrock, Codex, Responses API, image generation, transcription
- Update tools: add long_task/sustained goals, image generation, sandbox backends
- Update session: add goal_state.py for sustained goal tracking
- Add missing subsystems: API Server, Command Router, Heartbeat, Pairing, Skills, Security
2026-05-16 20:45:52 +08:00
Xubin Ren
387724c355 test(agent): add tests to ensure goal state does not leak across sessions 2026-05-16 11:14:56 +00:00
ykstart
f97b960433 fix(exec): refine format command deny pattern to allow URL parameters
The previous regex r"(?:^|[;&|]\s*)format\b" incorrectly blocked
commands containing URL parameters like &format=json. Added negative
lookahead (?!=) so format= (URL param key=value) is allowed while
standalone format commands (e.g. ;format, &format, |format) remain
blocked. Added test cases for both blocking and allowing scenarios.
2026-05-16 18:52:42 +08:00
Xubin Ren
e87c07c368 fix(agent): prevent outer wall-clock timeout for streaming requests 2026-05-16 10:12:57 +00:00
Xubin Ren
06a1bef9fe fix(goal): reduce pre-long_task overthinking 2026-05-16 09:57:44 +00:00