nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-21 00:52:34 +00:00

Author	SHA1	Message	Date
Xubin Ren	458b4ba235	feat(reasoning): stream reasoning content as a first-class channel Reasoning now flows as its own stream — symmetric to the answer's ``delta`` / ``stream_end`` pair — instead of being shipped as one oversized progress message. This lets WebUI render a live "Thinking…" bubble that updates in place, then auto-collapses when the stream closes. Other channels remain plugin no-ops by default. ## Protocol New metadata: ``_reasoning_delta`` (chunk) and ``_reasoning_end`` (close marker). ChannelManager routes both to the dedicated plugin hooks below; the legacy one-shot ``_reasoning`` is kept for back-compat and BaseChannel expands it into a single delta + end pair so plugins only ever implement the streaming primitives. WebSocket emits two new events: - ``reasoning_delta`` (event, chat_id, text, optional stream_id) - ``reasoning_end`` (event, chat_id, optional stream_id) ## BaseChannel surface - ``send_reasoning_delta(chat_id, delta, metadata)`` — no-op default - ``send_reasoning_end(chat_id, metadata)`` — no-op default - ``send_reasoning(msg)`` — back-compat wrapper, base impl forwards to the streaming primitives A channel adds reasoning support by overriding the two streaming primitives. Telegram / Slack / Discord / Feishu / WeChat / Matrix keep the base no-ops until their bubble UIs are adapted; reasoning silently drops at dispatch, never as a stray text message. ## AgentHook Adds ``emit_reasoning_end`` to the hook lifecycle. ``_LoopHook`` tracks whether a reasoning segment is open and closes it on: - the first answer delta arriving (so the UI locks the bubble before the answer renders below), - ``on_stream_end``, - one-shot ``reasoning_content`` / ``thinking_blocks`` after a single non-streaming response. ## WebUI - ``UIMessage.reasoning`` is now a single accumulated string with a companion ``reasoningStreaming`` flag. - ``useNanobotStream`` consumes ``reasoning_delta`` / ``reasoning_end``; legacy ``kind: "reasoning"`` is auto-translated to a delta + end. - New ``ReasoningBubble``: shimmer header + auto-expanded while streaming, collapses to a clickable "Thinking" pill once closed, respects ``prefers-reduced-motion``. - Answer deltas adopt the reasoning placeholder so the bubble and the answer share one assistant row. ## Tests - ``tests/channels/test_channel_manager_reasoning.py`` — manager routes delta + end, drops on channel opt-out, expands one-shot back-compat. - ``tests/channels/test_websocket_channel.py`` — new ``reasoning_delta`` / ``reasoning_end`` frames, empty-chunk safety, no-subscriber safety, back-compat expansion. - ``tests/agent/test_runner_reasoning.py`` — runner closes the segment on streaming answer start and after one-shot reasoning. - WebUI ``useNanobotStream`` + ``message-bubble`` cover the new protocol and the shimmer styling. ## Docs ``docs/configuration.md`` and ``docs/websocket.md`` document the new events and the plugin contract. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 07:13:43 +00:00
Xubin Ren	a6b059d379	refactor(reasoning): make channel plugins own reasoning rendering Reasoning was being shipped to every channel as a generic progress message with a `_reasoning: true` flag. Two problems with that: 1. Channels without a low-emphasis UI primitive (Telegram, Slack, Discord, Feishu...) would dump raw model thoughts as ordinary replies, polluting the conversation. 2. The agent loop double-gated by inspecting `channels_config`, which coupled the loop to display policy. Treat reasoning as its own plugin action — `BaseChannel.send_reasoning` defaults to a documented no-op; channels that have a fitting affordance override. ChannelManager routes `_reasoning` outbounds to that method only when the channel opts in via `show_reasoning` (camelCase alias `showReasoning` mirrors `sendProgress`). Plugins that don't override silently drop reasoning — "no fit, no leak" is the contract. Reference implementation lands for WebSocket / WebUI: a new `kind: "reasoning"` frame, parked on the active assistant bubble as a collapsible `Thinking` group above the answer. CLI keeps its existing direct path (it doesn't go through the bus). `ChannelsConfig.show_reasoning` flips to `true` by default — only adapted channels surface anything, others stay quiet. Loop net diff is -3 lines: the `channels_config.show_reasoning` check moves out, leaving emit_reasoning a one-liner that publishes and trusts the channel to decide. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 06:27:53 +00:00
chengyongru	05e0106592	refactor(logging): preserve tracebacks and add channel context - Preserve tracebacks: logger.error in except blocks → logger.exception - Channel context: BaseChannel injects self.logger = logger.bind(channel=name) - Third-party bridge: redirect_lib_logging() replaces ad-hoc stdlib-to-loguru bridges - Log levels: network timeouts downgraded from ERROR → WARNING - Fix --verbose flag to actually work with loguru (set handler to DEBUG)	2026-05-06 21:17:45 +08:00
chengyongru	74270bb8a8	refactor(channels): resolve progress overrides at init-time like transcription	2026-04-29 16:43:09 +08:00
k	123d69bfb7	fix: allow specifying transcription language	2026-04-22 12:41:32 +08:00
flobo3	1826ab44fa	feat(transcription): add language parameter for Groq Whisper STT	2026-04-22 12:41:32 +08:00
Mohamed Elkholy	ce5272c153	fix(transcription): honor api_base for OpenAI transcription provider Complete the symmetry left by #3214: ChannelManager._resolve_transcription_base already resolves providers.openai.api_base, but BaseChannel.transcribe_audio instantiated OpenAITranscriptionProvider without forwarding it, and the provider __init__ did not accept the parameter. Self-hosted OpenAI-compatible Whisper endpoints (LiteLLM, vLLM, etc.) configured via config.json were therefore ignored for the OpenAI backend. - OpenAITranscriptionProvider.__init__ now accepts api_base with env fallback (OPENAI_TRANSCRIPTION_BASE_URL) matching the Groq pattern. - BaseChannel.transcribe_audio forwards self.transcription_api_base to OpenAI. - Tests mirror the existing Groq coverage: manager propagation for provider "openai", BaseChannel-to-provider argument passing, and provider default vs override for api_url. Fully backward-compatible: when api_base is None and the env var is unset, the default https://api.openai.com/v1/audio/transcriptions is used. Refs #3213, follow-up to #3214.	2026-04-17 13:46:51 +08:00
flobo3	0401ca9dbc	fix: pass apiBase from config to GroqTranscriptionProvider	2026-04-17 13:46:51 +08:00
Xubin Ren	1f33df1ea6	fix: preserve empty dict allow_from handling Keep dict-backed channel configs compatible with both allow_from and allowFrom without losing empty-list semantics, and add focused regression coverage for the allow-list boundary. Made-with: Cursor	2026-04-15 01:26:51 +08:00
samy	73cf9a220b	fix: handle dict config in is_allowed() and _validate_allow_from() getattr() on a dict never finds custom keys — it only searches object attributes, not dict keys. When channel config is loaded as a Pydantic extra field (which is a plain dict), getattr(config, 'allow_from', []) always returns the default [], causing all access to be denied regardless of the allowFrom configuration. Fix both is_allowed() and _validate_allow_from() to use isinstance checks, falling back to dict.get() for dict configs while preserving getattr() for object-style configs.	2026-04-15 01:26:51 +08:00
Xubin Ren	019eaff225	simplify: remove transcription fallback, respect explicit config Configured provider is the only one used — no silent fallback. Made-with: Cursor	2026-04-06 06:13:43 +00:00
Xubin Ren	3bf1fa5225	feat: auto-fallback to other transcription provider on failure When the primary transcription provider fails (bad key, API error, etc.), automatically try the other provider if its API key is available. Made-with: Cursor	2026-04-06 06:10:08 +00:00
Xubin Ren	35dde8a30e	refactor: unify voice transcription config across all channels - Move transcriptionProvider to global channels config (not per-channel) - ChannelManager auto-resolves API key from matching provider config - BaseChannel gets transcription_provider attribute, no more getattr hack - Remove redundant transcription fields from WhatsAppConfig - Update README: document transcriptionProvider, update provider table Made-with: Cursor	2026-04-06 06:07:30 +00:00
comadreja	db50dd8a77	feat(whatsapp): add voice message transcription via OpenAI/Groq Whisper Automatically transcribe WhatsApp voice messages using OpenAI Whisper or Groq. Configurable via transcriptionProvider and transcriptionApiKey. Config: "whatsapp": { "transcriptionProvider": "openai", "transcriptionApiKey": "sk-..." }	2026-03-26 21:46:31 -05:00
Xubin Ren	33abe915e7	fix telegram streaming message boundaries	2026-03-26 02:35:12 +00:00
Xubin Ren	f0f0bf02d7	refactor(channel): centralize retry around explicit send failures Make channel delivery failures raise consistently so retry policy lives in ChannelManager rather than being split across individual channels. Tighten Telegram stream finalization, clarify sendMaxRetries semantics, and align the docs with the behavior the system actually guarantees.	2026-03-25 22:37:11 +08:00
chengyongru	556b21d011	refactor(channels): abstract login() into BaseChannel, unify CLI commands Move channel-specific login logic from CLI into each channel class via a new `login(force=False)` method on BaseChannel. The `channels login <name>` command now dynamically loads the channel and calls its login() method. - WeixinChannel.login(): calls existing _qr_login(), with force to clear saved token - WhatsAppChannel.login(): sets up bridge and spawns npm process for QR login - CLI no longer contains duplicate login logic per channel - Update CHANNEL_PLUGIN_GUIDE to document the login() hook Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-24 01:11:33 +08:00
Xubin Ren	bd621df57f	feat: add streaming channel support with automatic fallback Provider layer: add chat_stream / chat_stream_with_retry to all providers (base fallback, litellm, custom, azure, codex). Refactor shared kwargs building in each provider. Channel layer: BaseChannel gains send_delta (no-op) and supports_streaming (checks config + method override). ChannelManager routes _stream_delta / _stream_end to send_delta, skips _streamed final messages. AgentLoop._dispatch builds bus-backed on_stream/on_stream_end callbacks when _wants_stream metadata is set. Non-streaming path unchanged. CLI: clean up spinner ANSI workarounds, simplify commands.py flow. Made-with: Cursor	2026-03-23 10:20:41 +08:00
Xubin Ren	dbdb43faff	feat: channel plugin architecture with decoupled configs - Add plugin discovery via Python entry_points (group: nanobot.channels) - Move 11 channel Config classes from schema.py into their own channel modules - ChannelsConfig now only keeps send_progress + send_tool_hints (extra=allow) - Each built-in channel parses dict->Pydantic in __init__, zero internal changes - All channels implement default_config() for onboard auto-population - nanobot onboard injects defaults for all discovered channels (built-in + plugins) - Add nanobot plugins list CLI command - Add Channel Plugin Guide (docs/CHANNEL_PLUGIN_GUIDE.md) - Fully backward compatible: existing config.json and sessions work as-is - 340 tests pass, zero regressions	2026-03-14 16:13:38 +08:00
Re-bin	254cfd48ba	refactor: auto-discover channels via pkgutil, eliminate hardcoded registry	2026-03-11 14:23:19 +00:00
Re-bin	057927cd24	fix(auth): prevent allowlist bypass via sender_id token splitting	2026-03-07 16:36:12 +00:00
Re-bin	bbfc1b40c1	security: deny-by-default allowFrom with wildcard support and startup validation	2026-03-02 06:13:37 +00:00
chengyongru	d447be5ca2	security: deny by default in is_allowed for all channels When allow_from is not configured, block all access by default instead of allowing everyone. This prevents unauthorized access when channels are enabled without explicit allow lists.	2026-03-02 13:18:43 +08:00
JK_Lu	977ca725f2	style: unify code formatting and import order - Remove trailing whitespace and normalize blank lines - Unify string quotes and line breaks for long lines - Sort imports alphabetically across modules	2026-02-28 20:55:43 +08:00
Re-bin	2b983c708d	refactor: pass session_key as explicit param instead of via metadata	2026-02-23 13:10:47 +00:00
Paul	1f7a81e5ee	feat(slack): isolate session context per thread Each Slack thread now gets its own conversation session instead of sharing one session per channel. DM sessions are unchanged. Added as a generic feature to also support if Feishu threads support is added in the future.	2026-02-23 10:23:55 +00:00
Nikolas de Hor	f19baa8fc4	fix: convert remaining f-string logger calls to loguru native format Follow-up to #864. Three f-string logger calls in base.py and dingtalk.py were missed in the original sweep. These can cause KeyError if interpolated values contain curly braces, since loguru interprets them as format placeholders.	2026-02-20 10:01:38 -03:00
Re-bin	c5191eed1a	refactor: unify workspace restriction for file tools, remove redundant checks, fix SECURITY.md	2026-02-06 09:16:20 +00:00
copilot-swe-agent[bot]	8b4e0a8868	Security audit: Fix critical dependency vulnerabilities and add security controls Co-authored-by: kingassune <6126851+kingassune@users.noreply.github.com>	2026-02-03 22:08:33 +00:00
codeLzq	1663acd1a1	feat: enhance sender ID handling in Telegram channel - Update sender ID construction to prioritize user ID while maintaining username for allowlist compatibility. - Improve allowlist checking in BaseChannel to support sender IDs with multiple parts separated by '|'.	2026-02-02 13:07:35 +00:00
Re-bin	d4cc48afd5	🐈nanobot: hello world!	2026-02-01 07:36:42 +00:00
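
Commit 73cf9a220b explains that `getattr()` on a dict-backed channel config never sees the `allow_from` key, so every sender was denied. The sketch below reproduces that behaviour and shows one isinstance-based way to read the list that also keeps an explicitly empty list intact, as 1f33df1ea6 requires. The helper name `read_allow_from` and the sample configs are hypothetical and are not taken from the repository.

```python
from types import SimpleNamespace


def read_allow_from(config):
    """Hypothetical helper: read the allow list from a dict- or object-style config."""
    if isinstance(config, dict):
        # Test for key presence rather than truthiness, so an explicit
        # empty list under allow_from is returned as-is.
        if "allow_from" in config:
            return config["allow_from"]
        return config.get("allowFrom", [])
    # Object-style configs keep using attribute lookup.
    return getattr(config, "allow_from", [])


dict_config = {"allowFrom": ["alice"]}
object_config = SimpleNamespace(allow_from=["bob"])

# The bug: getattr() searches object attributes, never dict keys,
# so the default is always returned for a dict config.
buggy = getattr(dict_config, "allow_from", [])
fixed = read_allow_from(dict_config)
```

Checking `"allow_from" in config` instead of chaining with `or` is the design choice that matters here: an `or` chain would treat `[]` as missing and fall through to the other spelling.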

31 Commits
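
Commit 458b4ba235 describes a `BaseChannel` surface where the two streaming primitives default to no-ops and `send_reasoning(msg)` forwards a one-shot message to them as a single delta + end pair. A minimal sketch of that contract follows. The method names come from the commit message; the `OutboundMessage` shape, the use of `async`, and `RecordingChannel` are assumptions made for illustration and may not match the real classes.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class OutboundMessage:
    # Assumed message shape; the repository's class may differ.
    chat_id: str
    content: str
    metadata: dict = field(default_factory=dict)


class BaseChannel:
    async def send_reasoning_delta(self, chat_id, delta, metadata):
        """No-op default: a channel without a fitting UI drops the chunk."""

    async def send_reasoning_end(self, chat_id, metadata):
        """No-op default: nothing to close."""

    async def send_reasoning(self, msg):
        # Back-compat wrapper: expand a one-shot message into delta + end.
        await self.send_reasoning_delta(msg.chat_id, msg.content, msg.metadata)
        await self.send_reasoning_end(msg.chat_id, msg.metadata)


class RecordingChannel(BaseChannel):
    """Hypothetical channel that records events instead of rendering them."""

    def __init__(self):
        self.events = []

    async def send_reasoning_delta(self, chat_id, delta, metadata):
        self.events.append(("reasoning_delta", chat_id, delta))

    async def send_reasoning_end(self, chat_id, metadata):
        self.events.append(("reasoning_end", chat_id))


channel = RecordingChannel()
asyncio.run(channel.send_reasoning(OutboundMessage("chat-1", "weighing options")))

# A channel that overrides nothing silently drops the same message.
silent = BaseChannel()
asyncio.run(silent.send_reasoning(OutboundMessage("chat-1", "dropped")))
```

Because the wrapper lives in the base class, a channel in this sketch only ever overrides the two streaming primitives and still handles legacy one-shot messages.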