31 Commits

Author SHA1 Message Date
Xubin Ren
458b4ba235 feat(reasoning): stream reasoning content as a first-class channel
Reasoning now flows as its own stream — symmetric to the answer's
``delta`` / ``stream_end`` pair — instead of being shipped as one
oversized progress message. This lets WebUI render a live "Thinking…"
bubble that updates in place, then auto-collapses when the stream
closes. Other channels remain plugin no-ops by default.

## Protocol

New metadata: ``_reasoning_delta`` (chunk) and ``_reasoning_end``
(close marker). ChannelManager routes both to the dedicated plugin
hooks below; the legacy one-shot ``_reasoning`` is kept for back-compat
and BaseChannel expands it into a single delta + end pair so plugins
only ever implement the streaming primitives.

WebSocket emits two new events:

- ``reasoning_delta`` (event, chat_id, text, optional stream_id)
- ``reasoning_end`` (event, chat_id, optional stream_id)

## BaseChannel surface

- ``send_reasoning_delta(chat_id, delta, metadata)`` — no-op default
- ``send_reasoning_end(chat_id, metadata)`` — no-op default
- ``send_reasoning(msg)`` — back-compat wrapper, base impl forwards
  to the streaming primitives

A channel adds reasoning support by overriding the two streaming
primitives. Telegram / Slack / Discord / Feishu / WeChat / Matrix keep
the base no-ops until their bubble UIs are adapted; reasoning silently
drops at dispatch, never as a stray text message.

## AgentHook

Adds ``emit_reasoning_end`` to the hook lifecycle. ``_LoopHook`` tracks
whether a reasoning segment is open and closes it on:

- the first answer delta arriving (so the UI locks the bubble before
  the answer renders below),
- ``on_stream_end``,
- one-shot ``reasoning_content`` / ``thinking_blocks`` after a single
  non-streaming response.

## WebUI

- ``UIMessage.reasoning`` is now a single accumulated string with a
  companion ``reasoningStreaming`` flag.
- ``useNanobotStream`` consumes ``reasoning_delta`` / ``reasoning_end``;
  legacy ``kind: "reasoning"`` is auto-translated to a delta + end.
- New ``ReasoningBubble``: shimmer header + auto-expanded while
  streaming, collapses to a clickable "Thinking" pill once closed,
  respects ``prefers-reduced-motion``.
- Answer deltas adopt the reasoning placeholder so the bubble and the
  answer share one assistant row.

## Tests

- ``tests/channels/test_channel_manager_reasoning.py`` — manager routes
  delta + end, drops on channel opt-out, expands one-shot back-compat.
- ``tests/channels/test_websocket_channel.py`` — new ``reasoning_delta``
  / ``reasoning_end`` frames, empty-chunk safety, no-subscriber safety,
  back-compat expansion.
- ``tests/agent/test_runner_reasoning.py`` — runner closes the segment
  on streaming answer start and after one-shot reasoning.
- WebUI ``useNanobotStream`` + ``message-bubble`` cover the new
  protocol and the shimmer styling.

## Docs

``docs/configuration.md`` and ``docs/websocket.md`` document the new
events and the plugin contract.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 07:13:43 +00:00
Xubin Ren
a6b059d379 refactor(reasoning): make channel plugins own reasoning rendering
Reasoning was being shipped to every channel as a generic progress
message with a `_reasoning: true` flag. Two problems with that:

1. Channels without a low-emphasis UI primitive (Telegram, Slack,
   Discord, Feishu...) would dump raw model thoughts as ordinary
   replies, polluting the conversation.
2. The agent loop double-gated by inspecting `channels_config`, which
   coupled the loop to display policy.

Treat reasoning as its own plugin action — `BaseChannel.send_reasoning`
defaults to a documented no-op; channels that have a fitting affordance
override. ChannelManager routes `_reasoning` outbounds to that method
only when the channel opts in via `show_reasoning` (camelCase alias
`showReasoning` mirrors `sendProgress`). Plugins that don't override
silently drop reasoning — "no fit, no leak" is the contract.

Reference implementation lands for WebSocket / WebUI: a new
`kind: "reasoning"` frame, parked on the active assistant bubble as a
collapsible `Thinking` group above the answer. CLI keeps its existing
direct path (it doesn't go through the bus). `ChannelsConfig.show_reasoning`
flips to `true` by default — only adapted channels surface anything,
others stay quiet.

Loop net diff is -3 lines: the `channels_config.show_reasoning` check
moves out, leaving emit_reasoning a one-liner that publishes and trusts
the channel to decide.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 06:27:53 +00:00
chengyongru
05e0106592 refactor(logging): preserve tracebacks and add channel context
- Preserve tracebacks: logger.error in except blocks → logger.exception
- Channel context: BaseChannel injects self.logger = logger.bind(channel=name)
- Third-party bridge: redirect_lib_logging() replaces ad-hoc stdlib-to-loguru bridges
- Log levels: network timeouts downgraded from ERROR → WARNING
- Fix --verbose flag to actually work with loguru (set handler to DEBUG)
2026-05-06 21:17:45 +08:00
chengyongru
74270bb8a8 refactor(channels): resolve progress overrides at init-time like transcription 2026-04-29 16:43:09 +08:00
k
123d69bfb7 fix: allow specifying transcription language 2026-04-22 12:41:32 +08:00
flobo3
1826ab44fa feat(transcription): add language parameter for Groq Whisper STT 2026-04-22 12:41:32 +08:00
Mohamed Elkholy
ce5272c153 fix(transcription): honor api_base for OpenAI transcription provider
Complete the symmetry left by #3214: ChannelManager._resolve_transcription_base
already resolves providers.openai.api_base, but BaseChannel.transcribe_audio
instantiated OpenAITranscriptionProvider without forwarding it, and the provider
__init__ did not accept the parameter. Self-hosted OpenAI-compatible Whisper
endpoints (LiteLLM, vLLM, etc.) configured via config.json were therefore
ignored for the OpenAI backend.

- OpenAITranscriptionProvider.__init__ now accepts api_base with env fallback
  (OPENAI_TRANSCRIPTION_BASE_URL) matching the Groq pattern.
- BaseChannel.transcribe_audio forwards self.transcription_api_base to OpenAI.
- Tests mirror the existing Groq coverage: manager propagation for provider
  "openai", BaseChannel-to-provider argument passing, and provider default vs
  override for api_url.

Fully backward-compatible: when api_base is None and the env var is unset,
the default https://api.openai.com/v1/audio/transcriptions is used.

Refs #3213, follow-up to #3214.
2026-04-17 13:46:51 +08:00
flobo3
0401ca9dbc fix: pass apiBase from config to GroqTranscriptionProvider 2026-04-17 13:46:51 +08:00
Xubin Ren
1f33df1ea6 fix: preserve empty dict allow_from handling
Keep dict-backed channel configs compatible with both allow_from and allowFrom without losing empty-list semantics, and add focused regression coverage for the allow-list boundary.

Made-with: Cursor
2026-04-15 01:26:51 +08:00
samy
73cf9a220b fix: handle dict config in is_allowed() and _validate_allow_from()
getattr() on a dict never finds custom keys — it only searches
object attributes, not dict keys. When channel config is loaded as
a Pydantic extra field (which is a plain dict), getattr(config,
'allow_from', []) always returns the default [], causing all access
to be denied regardless of the allowFrom configuration.

Fix both is_allowed() and _validate_allow_from() to use isinstance
checks, falling back to dict.get() for dict configs while preserving
getattr() for object-style configs.
2026-04-15 01:26:51 +08:00
Xubin Ren
019eaff225 simplify: remove transcription fallback, respect explicit config
Configured provider is the only one used — no silent fallback.

Made-with: Cursor
2026-04-06 06:13:43 +00:00
Xubin Ren
3bf1fa5225 feat: auto-fallback to other transcription provider on failure
When the primary transcription provider fails (bad key, API error, etc.),
automatically try the other provider if its API key is available.

Made-with: Cursor
2026-04-06 06:10:08 +00:00
Xubin Ren
35dde8a30e refactor: unify voice transcription config across all channels
- Move transcriptionProvider to global channels config (not per-channel)
- ChannelManager auto-resolves API key from matching provider config
- BaseChannel gets transcription_provider attribute, no more getattr hack
- Remove redundant transcription fields from WhatsAppConfig
- Update README: document transcriptionProvider, update provider table

Made-with: Cursor
2026-04-06 06:07:30 +00:00
comadreja
db50dd8a77 feat(whatsapp): add voice message transcription via OpenAI/Groq Whisper
Automatically transcribe WhatsApp voice messages using OpenAI Whisper
or Groq. Configurable via transcriptionProvider and transcriptionApiKey.

Config:
  "whatsapp": {
    "transcriptionProvider": "openai",
    "transcriptionApiKey": "sk-..."
  }
2026-03-26 21:46:31 -05:00
Xubin Ren
33abe915e7 fix telegram streaming message boundaries 2026-03-26 02:35:12 +00:00
Xubin Ren
f0f0bf02d7 refactor(channel): centralize retry around explicit send failures
Make channel delivery failures raise consistently so retry policy lives in ChannelManager rather than being split across individual channels. Tighten Telegram stream finalization, clarify sendMaxRetries semantics, and align the docs with the behavior the system actually guarantees.
2026-03-25 22:37:11 +08:00
chengyongru
556b21d011 refactor(channels): abstract login() into BaseChannel, unify CLI commands
Move channel-specific login logic from CLI into each channel class via a
new `login(force=False)` method on BaseChannel. The `channels login <name>`
command now dynamically loads the channel and calls its login() method.

- WeixinChannel.login(): calls existing _qr_login(), with force to clear saved token
- WhatsAppChannel.login(): sets up bridge and spawns npm process for QR login
- CLI no longer contains duplicate login logic per channel
- Update CHANNEL_PLUGIN_GUIDE to document the login() hook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 01:11:33 +08:00
Xubin Ren
bd621df57f feat: add streaming channel support with automatic fallback
Provider layer: add chat_stream / chat_stream_with_retry to all providers
(base fallback, litellm, custom, azure, codex). Refactor shared kwargs
building in each provider.

Channel layer: BaseChannel gains send_delta (no-op) and supports_streaming
(checks config + method override). ChannelManager routes _stream_delta /
_stream_end to send_delta, skips _streamed final messages.

AgentLoop._dispatch builds bus-backed on_stream/on_stream_end callbacks
when _wants_stream metadata is set. Non-streaming path unchanged.

CLI: clean up spinner ANSI workarounds, simplify commands.py flow.
Made-with: Cursor
2026-03-23 10:20:41 +08:00
Xubin Ren
dbdb43faff feat: channel plugin architecture with decoupled configs
- Add plugin discovery via Python entry_points (group: nanobot.channels)
- Move 11 channel Config classes from schema.py into their own channel modules
- ChannelsConfig now only keeps send_progress + send_tool_hints (extra=allow)
- Each built-in channel parses dict->Pydantic in __init__, zero internal changes
- All channels implement default_config() for onboard auto-population
- nanobot onboard injects defaults for all discovered channels (built-in + plugins)
- Add nanobot plugins list CLI command
- Add Channel Plugin Guide (docs/CHANNEL_PLUGIN_GUIDE.md)
- Fully backward compatible: existing config.json and sessions work as-is
- 340 tests pass, zero regressions
2026-03-14 16:13:38 +08:00
Re-bin
254cfd48ba refactor: auto-discover channels via pkgutil, eliminate hardcoded registry 2026-03-11 14:23:19 +00:00
Re-bin
057927cd24 fix(auth): prevent allowlist bypass via sender_id token splitting 2026-03-07 16:36:12 +00:00
Re-bin
bbfc1b40c1 security: deny-by-default allowFrom with wildcard support and startup validation 2026-03-02 06:13:37 +00:00
chengyongru
d447be5ca2 security: deny by default in is_allowed for all channels
When allow_from is not configured, block all access by default
instead of allowing everyone. This prevents unauthorized access
when channels are enabled without explicit allow lists.
2026-03-02 13:18:43 +08:00
JK_Lu
977ca725f2 style: unify code formatting and import order
- Remove trailing whitespace and normalize blank lines
- Unify string quotes and line breaks for long lines
- Sort imports alphabetically across modules
2026-02-28 20:55:43 +08:00
Re-bin
2b983c708d refactor: pass session_key as explicit param instead of via metadata 2026-02-23 13:10:47 +00:00
Paul
1f7a81e5ee feat(slack): isolate session context per thread
Each Slack thread now gets its own conversation session instead of
sharing one session per channel. DM sessions are unchanged.

Added as a generic feature to also support if Feishu threads support
is added in the future.
2026-02-23 10:23:55 +00:00
Nikolas de Hor
f19baa8fc4 fix: convert remaining f-string logger calls to loguru native format
Follow-up to #864. Three f-string logger calls in base.py and dingtalk.py
were missed in the original sweep. These can cause KeyError if interpolated
values contain curly braces, since loguru interprets them as format placeholders.
2026-02-20 10:01:38 -03:00
Re-bin
c5191eed1a refactor: unify workspace restriction for file tools, remove redundant checks, fix SECURITY.md 2026-02-06 09:16:20 +00:00
copilot-swe-agent[bot]
8b4e0a8868 Security audit: Fix critical dependency vulnerabilities and add security controls
Co-authored-by: kingassune <6126851+kingassune@users.noreply.github.com>
2026-02-03 22:08:33 +00:00
codeLzq
1663acd1a1 feat: enhance sender ID handling in Telegram channel
- Update sender ID construction to prioritize user ID while maintaining username for allowlist compatibility.
- Improve allowlist checking in BaseChannel to support sender IDs with multiple parts separated by '|'.
2026-02-02 13:07:35 +00:00
Re-bin
d4cc48afd5 🐈nanobot: hello world! 2026-02-01 07:36:42 +00:00