861 Commits

Author SHA1 Message Date
chengyongru
e02615c93d Merge branch 'main' into nightly 2026-05-18 18:05:29 +08:00
Kaloyan Tenchov
e9259e680e feat(image-generation): add Gemini provider support
Adds GeminiImageGenerationClient covering both Imagen 4 (:predict) and
Gemini Flash (:generateContent), wires the gemini ProviderConfig through
the SDK, API server, and gateway entry points, and updates the
image-generation docs and skill. Errors from the Gemini endpoints are
logged and surface with the HTTP status and parsed message instead of an
empty string.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 15:25:03 +08:00
chengyongru
d4ade8f680 feat(cli): add Model Preset wizard to onboard
Extract the [M] Model Presets interactive CRUD screen from PR #3696
and adapt it to the current main branch schema (fallback_models
instead of fallback_presets). Adds preset cache, field handlers for
model_preset/provider/fallback_models, and 9 new tests.
2026-05-18 15:13:41 +08:00
Xubin Ren
ba38f90832
Merge PR #3877: feat(webui+agent): optimize streaming, activity rendering, and runtime sync
feat(webui+agent): optimize streaming, activity rendering, and runtime sync
2026-05-18 02:04:36 +08:00
Xubin Ren
eb3aed359f Refine file edit progress gating 2026-05-18 01:59:55 +08:00
Xubin Ren
4445fcc8b9 refactor(cli): localize reasoning buffer state 2026-05-18 01:34:08 +08:00
Xubin Ren
de8761f25a fix(test): add gateway llm runtime fake 2026-05-18 01:19:45 +08:00
Xubin Ren
8708ccea86 Merge branch 'main' of https://github.com/HKUDS/nanobot into codex/webui-performance 2026-05-18 01:18:28 +08:00
Xubin Ren
eb0ff3ad1d fix(memory): refresh session before empty guard 2026-05-18 01:16:47 +08:00
chengyongru
c58a360b25 fix(test): seed get_or_create mock for session-refresh guard compatibility 2026-05-18 01:16:47 +08:00
chengyongru
5bb94edc99 refactor(autocompact): delegate _archive to Consolidator.compact_idle_session
Replace AutoCompact._archive() direct session mutation with delegation
to Consolidator.compact_idle_session(). Remove _split_unconsolidated()
method since that logic now lives inside compact_idle_session.

All session mutation for idle compaction now goes through the
Consolidator's lock, eliminating the race condition between
background token consolidation and idle TTL compaction.

Changes:
- autocompact.py: rewrite _archive() to call compact_idle_session,
  remove _split_unconsolidated(), clean up unused imports
- test_autocompact_unit.py: replace TestArchive/TestSplitUnconsolidated
  with TestArchiveDelegates that verifies delegation behavior
- test_auto_compact.py: convert all consolidator.archive mocks to
  consolidator.compact_idle_session mocks via _make_fake_compact helper
2026-05-18 01:16:47 +08:00
chengyongru
888d54790d fix(memory): add session-refresh guard to maybe_consolidate_by_tokens
When background consolidation runs with a stale session reference (captured
before AutoCompact replaced the session via compact_idle_session), it could
operate on outdated data. Now, after acquiring the per-session lock, the
method refreshes its session reference from SessionManager.get_or_create().
If the session was replaced, it swaps in the fresh reference before doing
any consolidation work.

This prevents a race where AutoCompact truncates an idle session while a
background maybe_consolidate_by_tokens call is in flight with the old
session object.
2026-05-18 01:16:47 +08:00
chengyongru
48d35bd2d9 feat(consolidator): add compact_idle_session method with lock-protected truncation
Add Consolidator.compact_idle_session(session_key, max_suffix=8) that
performs hard-truncation of idle sessions under the per-session
consolidation lock. This is the single lock-protected path for AutoCompact
to use instead of modifying session state directly, fixing the race
condition between AutoCompact and Consolidator.

Behavior:
- Acquires per-session consolidation lock
- Invalidates cache and reloads fresh from disk
- Splits unconsolidated tail into archive prefix and retained suffix
- Archives prefix via LLM (with raw_archive fallback on failure)
- Persists _last_summary in session metadata on success
- Returns summary text, None on LLM failure, or '' if nothing to archive

Tests: 6 new tests covering prefix archival, empty session timestamp
refresh, (nothing) summary exclusion, LLM failure fallback,
last_consolidated offset, and lock acquisition verification.
2026-05-18 01:16:47 +08:00
Xubin Ren
112f40ad67 fix(agent): refresh llm runtime for background tasks 2026-05-18 00:35:12 +08:00
Xubin Ren
c8bb04a8fe feat(webui): persist agent activity events 2026-05-17 23:51:52 +08:00
Xubin Ren
4b5de66c58 Polish WebUI streaming and provider settings 2026-05-17 17:41:33 +08:00
olgagaga
0ca0fe2221 fix(providers): wire MiMo thinking control on gateway providers (#3845)
The xiaomi_mimo ProviderSpec carries thinking_style="thinking_type", but
gateway providers (OpenRouter etc.) route MiMo under their own spec
which has no thinking_style. As a result, `reasoning_effort="none"` was
silently ignored: `{"thinking": {"type": "disabled"}}` was never
injected and responses still contained reasoning_content.

Mirror the Kimi pattern that already handles the same problem: add an
explicit _MIMO_THINKING_MODELS allowlist (mimo-v2.5-pro, mimo-v2.5,
mimo-v2-pro, mimo-v2-omni — per Xiaomi docs), an _is_mimo_thinking_model
helper that strips publisher prefixes ("xiaomi/mimo-v2.5-pro" matches),
and a sibling branch in _build_kwargs that injects the thinking payload
by model name. mimo-v2-flash is intentionally excluded — it has no
thinking mode.

Also include MiMo in the explicit_thinking predicate so the
reasoning_content backfill (#3554, #3584) covers the gateway path
consistently with the direct path.

Tests cover the gateway disable/enable signals, bare-slug fallback,
flash exclusion, and a non-MiMo sanity check.
2026-05-16 20:46:34 +08:00
Xubin Ren
387724c355 test(agent): add tests to ensure goal state does not leak across sessions 2026-05-16 11:14:56 +00:00
ykstart
f97b960433 fix(exec): refine format command deny pattern to allow URL parameters
The previous regex r"(?:^|[;&|]\s*)format\b" incorrectly blocked
commands containing URL parameters like &format=json. Added negative
lookahead (?!=) so format= (URL param key=value) is allowed while
standalone format commands (e.g. ;format, &format, |format) remain
blocked. Added test cases for both blocking and allowing scenarios.
2026-05-16 18:52:42 +08:00
Xubin Ren
e87c07c368 fix(agent): prevent outer wall-clock timeout for streaming requests 2026-05-16 10:12:57 +00:00
Xubin Ren
e804f2fddb fix(agent): align LLM wall timeout with sustained goals for main + subagents
Centralize runner_wall_llm_timeout_s in session goal_state metadata helpers so
spawned subagents inherit the same policy as AgentLoop without coupling to
long_task. Pass optional resolver into SubagentManager and add tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-16 16:33:49 +08:00
Xubin Ren
2144af7cd0 fix(agent): disable LLM wall-clock timeout during sustained goals 2026-05-16 05:27:40 +00:00
yanalialiuk
18072856ec feat: add Atomic Chat as OpenAI-compatible local provider
Register atomic_chat in the provider registry with default base URL
http://localhost:1337/v1, schema field, docs, and config tests.
2026-05-16 12:14:33 +08:00
Xubin Ren
9ccef018c2 feat(telegram): add new slash commands and update regex for command handling 2026-05-15 17:55:52 +00:00
Xubin Ren
858b6610c3 fix(config): reduce max_tokens and context_window_tokens in schema 2026-05-15 17:19:47 +00:00
Xubin Ren
1c2ea1aad2
feat(goal): /goal command & long-running tasks (long_task)
* feat(long-task): add LongTaskTool for multi-step agent tasks

Implements a meta-ReAct loop where long-running tasks are broken into
sequential subagent steps, each starting fresh with the original goal
and progress from the previous step. This prevents context drift when
agents work on complex, multi-step tasks.

- Extract build_tool_registry() from SubagentManager for reuse
- Add run_step() for synchronous subagent execution (no bus announcement)
- Add HandoffTool and CompleteTool as signal mechanisms via shared dict
- Add LongTaskTool orchestrator with simplified prompt (8 iterations/step)
- Register LongTaskTool in main agent loop
- Add _extract_handoff_from_messages fallback for robustness

* fix(long-task): add debug logging for step-level observability

* feat(long-task): major overhaul with structured handoffs, validation, and observability

- Structured HandoffState: HandoffTool now accepts files_created,
  files_modified, next_step_hint, and verification fields instead of
  a plain string. Progress is passed between steps as structured data.

- Completion validation round: After complete() is called, a dedicated
  validator step runs to verify the claim against the original goal.
  If validation fails, the task continues rather than returning
  a false completion.

- Dynamic prompt system: 3 Jinja2 templates (step_start, step_middle,
  step_final) selected based on step number. Final steps get tighter
  budget and stronger "wrap up" guidance.

- Automatic file change tracking: Extracts write_file/edit_file events
  from tool_events and injects them into the next step's context if
  the subagent forgot to report them explicitly.

- Budget tracking & adaptive strategy: Cumulative token usage is tracked
  across steps. Per-step tool budget drops from 8 to 4 in the last
  two steps to force handoff/completion.

- Crash retry with graceful degradation: A step that crashes is retried
  once. Persistent crashes terminate the task and return partial progress.

- Full observability hooks for future WebUI integration:
  - set_hooks() with on_step_start, on_step_complete, on_handoff,
    on_validation_started, on_validation_passed, on_validation_failed,
    on_task_complete, on_task_error, and catch-all on_event.
  - Readable state properties: current_step, total_steps, status,
    last_handoff, cumulative_usage, goal.
  - inject_correction() allows external code to send user corrections
    that are injected into the next step's prompt.

- run_step() accepts optional max_iterations for dynamic budget control.

All 27 long-task tests and 11 subagent tests pass.

* test(long-task): add boundary tests and fix race conditions

- Add 7 edge-case tests: validation crash resilience, hook exception safety, mid-run correction injection, FIFO correction ordering, explicit file changes overriding auto-detection, final budget for max_steps=1, and dynamic budget switching boundaries

- Fix assertion in test_long_task_completes_after_multiple_handoffs to match exact prompt format

- Remove asyncio timing hack from test_state_exposure

- Add asyncio.sleep(0) yield in test_inject_correction_during_execution to prevent race between signal injection and step continuation

- All 34 tests passing

* fix(long-task): address code review findings

- Declare _scopes = {"core"} explicitly to prevent recursive nesting in subagent scope
- Document fragile coupling in _extract_file_changes: path extraction depends on
  write_file/edit_file detail format; add debug log for unexpected formats
- Align final-template threshold (max_steps - 2) with budget switch threshold
- Eliminate hasattr(self, "_state") in _reset_state by initializing in __init__

* fix(long-task): honor final signal and file tracking

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(long-task): improve prompt structure and agent contract

- Expand LongTaskTool.description to instruct parent agent on goal
  construction, return value semantics, and how to handle results.
- Expand CompleteTool.description to emphasize that the summary IS the
  final answer returned to the parent agent.
- Prefix validated return value with an explicit "final answer" directive
  to stop parent agent from re-running work.
- Redesign step_start.md: Step 1 is now explicitly for exploration,
  planning, and skeleton-building. complete() is discouraged.
- Remove bulky payload debug logging from _emit(); add targeted
  info/warning/error logs at key state transitions instead.
- Add signal_type to HandoffState for cleaner signal detection.

* test(long-task): expect wrapped completion message after validation

Align assertions with LongTaskTool final return shape on main.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): turn timing strip, latency, and session-switch restore

- Agent loop: publish goal_status run/idle for WebSocket turns; attach
  wall-clock latency_ms on turn_end and persisted assistant metadata.
- WebSocket channel: forward goal_status and latency fields to clients.
- NanobotClient: track goal_status started_at per chat without requiring
  onChat; useNanobotStream restores run strip when returning to a chat.
- Thread UI: composer/shell viewport hooks for run duration and latency;
  format helpers and i18n strings.
- MessageBubble: drop trailing StreamCursor (layout artifact vs block markdown).
- Builtin / tests: model command coverage, websocket and loop tests.

Covers multi-session UX and round-trip timing visibility for the WebUI.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: keep message-tool file attachments after canonical history hydrate

- MessageTool records per-turn media paths delivered to the active chat.
- nanobot.utils.session_attachments stages out-of-media-root files and
  merges into the last assistant message before save (loop stays a thin call).
- WebUI MediaCell: use a signed URL as a real download link when present.

Fixes attachments flashing then vanishing on turn_end when paths lived
outside get_media_dir (e.g. workspace files).

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): agent activity cluster, stable keys, LTR sheen labels

- Group reasoning and tool traces in AgentActivityCluster with i18n summaries
- Stabilize React list keys for activity clusters (first message id anchor)
- Replace background-clip shimmer with overlay sheen for streaming labels
- ThreadMessages/MessageList integration and locale strings

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): render assistant reasoning with Markdown + deferred stream

- Use MarkdownText for ReasoningBubble body (same GFM/KaTeX path as replies)
- Apply muted/italic prose tokens so thinking stays visually subordinate
- useDeferredValue while reasoningStreaming to ease parser work during deltas
- Preload markdown chunk when trace opens; add regression test with preloaded renderer

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): default-collapse agent activity cluster while Working

Outer fold no longer auto-expands during isTurnStreaming; user opens to see traces.
Header sheen and live summary unchanged.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(long_task): cumulative run history, file union, and prompt tuning

Inject cross-step summaries and merged file paths into middle/final step
templates so chains do not lose early context. Strip the last run-history
block when it duplicates Previous Progress to save tokens. Add optional
cumulative_prompt_max_chars and cumulative_step_body_max_chars parameters
with clamped defaults.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): session switch keeps in-flight thread and replays buffered WS

Save the prior chat message list to the per-chat cache in a layout effect
when chatId changes (before stale writes could corrupt another chat).
Skip one post-switch layout cache tick so we do not snapshot the wrong tab.

Buffer inbound events per chat_id when no onChat subscriber is registered
(e.g. user focused another session) and drain on resubscribe up to a cap,
so streaming deltas are not lost while off-tab.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): snap thread scroll to bottom on session open (no smooth glide)

Use scroll-behavior auto on the viewport, instant programmatic scroll when
following new messages and on scrollToBottomSignal. Keep smooth only for
the explicit scroll-to-bottom button.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): respect manual scroll-up after opening a session

Track when the user leaves the bottom with a ref and skip ResizeObserver
and deferred bottom snaps until they return or the conversation is reset.
Remove the time-based force-bottom window that overrode atBottom.

Multi-frame scrollToBottom honours the same guard unless force (scroll button).

Co-authored-by: Cursor <cursoragent@cursor.com>

* Publish long_task UI snapshots on outbound metadata

- Add OUTBOUND_META_AGENT_UI (_agent_ui) for channel-agnostic structured state
- LongTaskTool publishes {kind: long_task, data: snapshot} on the bus with _progress
- WebSocket send forwards metadata as agent_ui for WebUI clients
- Tests for bus payload, WS frame, and progress assertions
- Fix loop progress tests: ignore _goal_status in streaming final filter and
  avoid brittle outbound[-1] ordering after goal status idle messages

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat: WebUI long_task activity card and resilient history merge

Add optional ui_summary to the long_task tool for one-line UI labels. Stream
long_task agent_ui into a dedicated message row with timeline, markdown peek,
and a right sheet for details. Merge canonical history after turn_end while
re-inserting long_task rows before the final assistant reply. Collapse
duplicate task_start/step_start steps in the timeline and extend i18n.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor: align long_task with thread_goal and drop orchestrator UI

- Persist sustained objectives via session metadata (long_task / complete_goal); no subagent wiring or tool-driven agent_ui payloads.\n- Remove WebUI long-task activity UI, types, and translations; history merge preserves trace replay only, with legacy long_task rows normalized to traces.\n- Drop long_task prompt templates and get_long_task_run_dir; add webui thread disk helper for gateway persistence tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(agent): thread goal runtime context, tools, and skill

- Add thread_goal_state helper and mirror active objectives into Runtime Context
- Wire loop/context/memory/events as needed for goal metadata in turns
- Expand long_task / complete_goal semantics (pivot/cancel/honest recap)
- Add always-on thread-goal SKILL.md; align /goal command prompt
- Tests for context builder and thread goal state
- Remove unused webui ChatPane component

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(thread-goal): add websocket snapshot helper and publish goal updates from long_task

Introduce thread_goal_ws_blob for bounded JSON snapshots, attach snapshots to
websocket turn_end metadata in AgentLoop, and let long_task fan-out dedicated
thread_goal frames on the websocket channel after persisting session metadata.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(channels): websocket thread_goal frames, turn_end replay, and session API scrub for subagent inject

Emit thread_goal events and optional thread_goal on turn_end; scrub persisted
subagent announce blobs on GET /api/sessions/.../messages and shorten session
list previews so WebUI does not surface full Task/Summarize scaffolding.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): merge ephemeral traces per user turn when reconciling canonical history

Preserve disk/live trace rows inside the matching user–assistant segment instead
of stacking every trace before the final assistant reply (fixes inflated tool
counts after refresh or session switch).

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): show assistant reply copy only on the last slice before the next user turn

Avoid duplicate copy affordances on intermediate assistant bubbles that precede
more agent activity in the same turn (tools or further assistant text).

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): thread_goal stream plumbing, composer goal strip, sky glow, and client-side subagent scrub projection

Track thread_goal and turn_goal snapshots in NanobotClient, hydrate React state
from thread_goal frames and turn_end, surface objective/elapsed in the composer,
add breathing sky halo CSS while goals are active, mirror server scrub logic on
history hydration and webui_thread snapshots, and extend tests/client mocks.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(channels): add Slack Socket Mode connect timeout with actionable timeout errors

Abort hung websockets.connect handshakes after a bounded wait, log REST-vs-WSS
guidance, surface RuntimeError to channel startup, and log successful WSS setup.

Co-authored-by: Cursor <cursoragent@cursor.com>

* webui: expand thread goal in composer bottom sheet

Add ChevronUp control on the run/goal strip that opens a bottom Sheet
with full ui_summary and objective. Inline preview logic in RunElapsedStrip,
add i18n strings across locales, and a composer unit test.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): widen dedupeToolCallsForUi input for session API typing

fetchSessionMessages types tool_calls as unknown; accept unknown so tsc
build passes when passing message.tool_calls through.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(agent): extract WebSocket turn run status to webui_turn_helpers

* refactor(skills): rename thread-goal to long-task and document idempotent goals

* feat(skills): rename sustained-goal skill to long-goal and tighten long_task guidance

* chore: remove unused subagent/context/router helpers

* feat(session): rename sustained goal to goal_state and align WS/WebUI

- Move helpers from agent/thread_goal_state to session/goal_state:
  GOAL_STATE_KEY, goal_state_runtime_lines, goal_state_ws_blob, parse_goal_state.
- Session metadata now uses "goal_state"; still read legacy "thread_goal";
  long_task writes drop the legacy key after save.
- WebSocket: event/field goal_state, _goal_state_sync; turn_end carries goal_state;
  accept legacy _thread_goal_sync/thread_goal inbound metadata for dispatch.
- WebUI: GoalStateWsPayload, goalState hook/client props, i18n keys goalState*.
- Runtime Context copy uses "Goal (active):" instead of "Thread goal".

* feat(agent): stream Anthropic thinking deltas and fix stream idle timeout

* refactor(webui): transcript jsonl as sole timeline source

* fix(agent): reject mismatched WS message chat_id and stream reasoning deltas

* feat(webui): hydrate sustained goal and run timer after websocket subscribe

* chore(webui,websocket): remove unused fetch helpers and legacy thread_goal WS paths

* Raise default max_tokens and context window in agent schema.

Align AgentDefaults and ModelPresetConfig with typical Claude-scale usage
(32k completion budget, 256k context window) and update migration tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(gateway): bootstrap prefers in-memory model; clarify websocket naming

* fix(websocket): websocket _handle_message passes is_dm; refresh /status test expectations

---------

Co-authored-by: chengyongru <2755839590@qq.com>
Co-authored-by: chengyongru <chengyongru.ai@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-16 01:14:11 +08:00
hanyuanling
2d17a095dc fix(codex): stabilize prompt cache key 2026-05-16 00:13:10 +08:00
hanyuanling
b2ac609bb5 fix(web): back off Brave search rate limits 2026-05-16 00:12:50 +08:00
chengyongru
0f3677c0d8 perf(agent): append runtime context after user content for cache stability
Runtime context (time, channel, sender) changes every turn, so placing
it before user content invalidated the prompt-cache prefix. Appending it
after user content keeps the prefix stable and improves KV cache hit
rates. The stripping logic in _save_turn was simplified from 16 lines
to 6 as a side benefit.
2026-05-15 23:06:37 +08:00
hinotoi-agent
164614ccf2 fix(message): share workspace path resolver 2026-05-15 17:19:20 +08:00
hinotoi-agent
57d7847dc8 fix(message): confine local media attachments 2026-05-15 17:19:20 +08:00
chengyongru
fe90edd71f refactor(tools): remove GlobTool
GlobTool is redundant — GrepTool already supports glob-based file
filtering via its `glob` parameter, making a standalone glob-only
tool unnecessary. Removing it simplifies the tool surface and reduces
LLM confusion between glob and grep.
2026-05-15 17:19:00 +08:00
Jiajun Xie
6a25d8042d fix(shell): support UNC paths in Windows path extraction
- Update regex in _extract_absolute_paths to match both drive paths (C:\...) and UNC paths (\server\share)
- Add comprehensive test cases for UNC paths, mixed paths, and edge cases
2026-05-15 15:47:15 +08:00
chengyongru
88ff64be48 feat(pairing): allow omitted allowFrom — pairing-only mode by default
Previously _validate_allow_from raised SystemExit when allowFrom was
missing, forcing every channel to declare an explicit allowlist.
With the pairing feature this is no longer necessary: a channel with
no allowFrom simply operates in pairing-only mode, letting users
approve senders via /pairing approve <code> from the WebUI or CLI.

- Replace SystemExit with an info log in _validate_allow_from
- Add test_validate_allow_from_allows_missing_allow_from
2026-05-15 15:46:44 +08:00
chengyongru
199a1bb8fa docs(pairing): address reviewer comments — comments, error msg, __all__ test
- Clarify SystemExit message for missing/null allowFrom (manager.py)
- Document why Feishu passes content="" for unauthorized DMs
- Document exact-match semantics in BaseChannel.is_allowed()
- Document negligible collision probability in generate_code()
- Add test_all_exports_are_importable for nanobot.pairing.__all__
2026-05-15 15:46:44 +08:00
chengyongru
ac9a2d0c25 test(pairing): cover _PENDING_USER_TURN_KEY cleanup and None allow_from
- Assert pending_user_turn is cleared from session metadata after
  shortcut commands (e.g. /help) in test_auto_compact.py.
- Add test for None allow_from / allowFrom values in
  test_base_channel.py to prevent TypeError regressions.
2026-05-15 15:46:44 +08:00
chengyongru
b68e9fa21e fix(pairing): persist shortcut commands and avoid Feishu side effects
- AgentLoop._state_command now persists user message and assistant
  response for shortcut commands (e.g. /pairing) so WebUI history
  hydration after _turn_end no longer shows an empty chat.  /new is
  excluded because it intentionally clears the session.

- Feishu _on_message sends pairing codes for unauthorized DMs before
  any media side effects (reactions, downloads, transcription).
  Group chat unauthorized senders are still silently ignored early.

- Update test_feishu_reply to assert the new DM pairing behavior.
2026-05-15 15:46:44 +08:00
chengyongru
f9d404618b refactor(pairing): move /pairing from BaseChannel to CommandRouter
/pairing is now a first-class built-in command dispatched through
CommandRouter, just like /status, /model, /dream, etc.

Benefits:
- WebUI automatically shows /pairing in the slash command palette
  (because builtin_command_palette() feeds /api/commands).
- All channels (Telegram, Discord, WebSocket, etc.) use the same
  dispatch path for /pairing; no more channel-level interception.
- The command still only works for already-authorised users because
  is_allowed() gates message ingestion before the bus.

Changes:
- Add handle_pairing_command() to nanobot.pairing.store — pure
  function callable from CLI, CommandRouter, and tests.
- Add cmd_pairing to nanobot.command.builtin and register in
  BUILTIN_COMMAND_SPECS + register_builtin_commands().
- Remove BaseChannel._handle_pairing_command() and the /pairing
  interception logic from _handle_message().
- Clean up unused pairing imports from base.py.
- Add unit tests for handle_pairing_command and cmd_pairing dispatch.
2026-05-15 15:46:44 +08:00
chengyongru
9bc86ee825 refactor(pairing): apply simplify review fixes
- Extract format_pairing_reply() and format_expiry() to eliminate
duplication between BaseChannel and SlackChannel.
- Use _write_text_atomic() from helpers.py instead of hand-rolled
fsync logic in pairing store.
- Convert approved lists to in-memory sets for O(1) lookup.
- Remove collision retry loop (8-char entropy is sufficient).
- Fix /pairing command parsing to split prefix exactly.
- Remove unused import time from base.py.
- Fix tests to pass subcommand_text, not full /pairing string.
2026-05-15 15:46:44 +08:00
chengyongru
f8e7e50759 code-review fixes: fsync, entropy, is_dm propagation, tests
- Add os.fsync with Windows-compatible directory flush in pairing store
- Increase pairing code length from 6 -> 8 characters for higher entropy
- Remove SystemExit on empty allowFrom; empty list now defers to pairing
- Update is_allowed docstring to document pairing fallback semantics
- Propagate is_dm to Matrix (direct rooms) and Slack (im channels)
- Slack _is_allowed now checks pairing store for DM allowlist mode
- Fix /pairing revoke to accept optional channel argument
- Move inline import time to module top-level
- Add WebSocket comment explaining is_dm=True assumption
- Add comprehensive tests for store and BaseChannel pairing integration
- Fix existing tests that expected empty allowFrom to hard-exit

Refs #3774
2026-05-15 15:46:44 +08:00
hinotoi-agent
39db5c4846 fix(feishu): confine downloaded media filenames 2026-05-15 15:44:52 +08:00
chengyongru
26665823e3 fix(agent): persist shortcut commands without polluting LLM context
Shortcut commands (e.g. /help, /pairing) skip BUILD and SAVE states,
so their turns were never persisted to the session.  This caused WebUI
chats to appear empty after _turn_end because history hydration reads
from the session file.

Fix by persisting the user message and assistant response inside
_state_command, but tag them with _command=True so Session.get_history
filters them out of LLM context.  /new is excluded because it
intentionally clears the session.

- AgentLoop._persist_user_message_early now accepts **kwargs so
  _state_command can pass _command=True for the user turn.
- Session.get_history skips messages with _command=True.
2026-05-14 23:51:58 +08:00
Xubin Ren
5d7f3f2751 fix(webui): stabilize live thread rendering and navigation 2026-05-13 16:39:07 +00:00
chengyongru
6a4ed255de fix(mcp): probe HTTP port before connecting to prevent event-loop crash
When an MCP server configured as streamableHttp or SSE is unreachable,
streamable_http_client's anyio task group cleanup raises RuntimeError /
ExceptionGroup that escapes the caller's try/except and crashes the
event loop with "Unhandled exception in event loop".

Fix: add a lightweight TCP probe (_probe_http_url) before entering the
MCP SDK transport. If the port is closed, the server is skipped with a
warning instead of crashing. stdio transport is not probed (local
process).

Closes #3739
2026-05-13 23:39:07 +08:00
Xubin Ren
5efd67919b feat(runner): support fallback candidates
Resolve fallbackModels as preset references or explicit inline provider configs so failover uses complete model settings without exposing fallback logic to the agent loop.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 15:34:03 +00:00
Xubin Ren
43db848db0 Revert "feat(runner): support structured fallback models"
This reverts commit 02b059a616dc6dc82ad15282102c7b27a5a34e40.
2026-05-13 14:11:08 +00:00
Xubin Ren
02b059a616 feat(runner): support structured fallback models
Bind fallback model chains to the active model configuration so defaults and presets do not inherit or merge fallback behavior implicitly. Require explicit fallback providers while preserving per-fallback generation overrides and context-window safety.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 13:57:30 +00:00
Xubin Ren
eaa8ebd5d3 Merge remote-tracking branch 'origin/main' into pr-3756 2026-05-13 13:12:56 +00:00
Xubin Ren
fb508a302a feat(webui): refresh session titles from live updates 2026-05-13 13:10:21 +00:00
chengyongru
913b0774d8 feat(runner): add model failover with fallback_models
When the primary model returns a non-transient error and no content
has been streamed yet, the runner now tries each model listed in the
active preset's fallback_models in order.  Each fallback model may
reside on a different provider — a temporary provider instance is
created on-the-fly via make_provider(config, model=...).

Key design:
- Failover is request-scoped (does not affect subagents/dream/consolidator)
- Provider is restored via try/finally after each fallback attempt
- Skipped when content was already streamed to avoid duplicate output
- Recursive failover prevented by clearing fallback_models on fallback spec
- Circuit breaker trips open after 3 consecutive primary failures (60s cooldown)
- Cross-provider routing: fallback model prefix (e.g. groq/) determines provider

Fixes: cross-provider fallback was broken because the factory passed the
original preset (with provider forced to primary's provider) when creating
fallback providers.  Now uses provider="auto" so the model string prefix
correctly routes to the right provider.

Also fixes: log messages now distinguish between primary-failed,
previous-fallback-failed, and circuit-open scenarios.

closes: https://github.com/HKUDS/nanobot/issues/3376
2026-05-13 17:30:49 +08:00