Add signed media URLs to live WebSocket replies and teach the WebUI to classify and render video attachments, so bot-sent videos can play inline in both live chats and session history.
Made-with: Cursor
Telegram previously sent all video files as documents via send_document,
so users saw a file icon instead of an inline player. WebSocket only
accepted image MIME types, rejecting video uploads entirely.
Telegram:
- Recognize video extensions (mp4/mov/avi/mkv/webm/3gp) in _get_media_type
- Route videos through send_video with supports_streaming=True
- Add VIDEO/VIDEO_NOTE/ANIMATION to inbound message filters
- Add video MIME mappings to _get_extension
- Fix: local file sends now use _call_with_retry (previously no retry)
WebSocket:
- Expand upload MIME whitelist with video/mp4, video/webm, video/quicktime
- Add per-type size limits (_MAX_VIDEO_BYTES=20MB, _MAX_VIDEOS_PER_MESSAGE=1)
- Expand media serving endpoint to serve video with correct Content-Type
Agent:
- Add "video" to message tool media parameter description
- Add .mp4 example to identity.md system prompt
Made-with: Cursor
``InlineKeyboardButton(label, callback_data=label)`` fails Telegram's
API when the label exceeds 64 bytes UTF-8. An LLM-generated long
option (realistic in multilingual flows) used to 400 the ``send_message``
call silently — user got nothing, agent heard a successful retry-then-drop.
Decouple display from wire: button text keeps the full label, callback_data
gets truncated at a UTF-8 char boundary. Tap echoes the prefix back as the
user message; the LLM understands a prefix of its own option just fine,
and the display the user saw was always the full string.
Locks: helper boundary behavior (ASCII, CJK, short labels pass through)
and end-to-end ``_build_keyboard`` integration with an over-cap label.
Made-with: Cursor
Buttons are semantic options, not a separate channel protocol: a user
who taps "Yes" and a user who types "yes" arrive at the agent as the
same string. Dropping ``msg.buttons`` when ``inline_keyboards=False``
was the worst of both worlds — the agent got told "Message sent with
N button(s)" while the user saw a question with no options.
Splice the labels into the message text instead. The LLM produces the
same ``message(buttons=...)`` call regardless of channel; the channel
layer picks the richest rendering it can afford — native keyboard when
enabled, bracketed inline text otherwise. Layout is preserved (one row
per line). Other channels can adopt the same helper incrementally.
Locks: canonical ``_buttons_as_text`` format, flag-off send-path
splices labels, flag-on send-path keeps content clean and rides
``reply_markup``.
Made-with: Cursor
Non-priority slash commands (e.g. /new, /help, /dream-log) arriving
while a session has an active LLM turn were silently queued into the
pending injection buffer and later injected as raw user messages into
the LLM conversation. This caused the model to respond to "/new" as
plain text instead of executing the command.
Root cause: the run() loop only checked priority commands (/stop,
/restart, /status) before routing messages to the pending queue. All
other command tiers (exact, prefix) bypassed command dispatch entirely.
Changes:
- Add CommandRouter.is_dispatchable_command() to match exact/prefix
tiers, mirroring the existing is_priority() pattern.
- In run(), intercept dispatchable commands before pending queue
insertion and dispatch them directly via _dispatch_command_inline().
- Extract _cancel_active_tasks() from cmd_stop for reuse; cmd_new now
cancels active tasks before clearing the session to prevent shared
mutable state corruption from concurrent asyncio coroutines.
- Update /new semantics: stops active task first, then clears session.
- Update documentation in help text, docs, and Discord command list.
Problem:
Modern LLMs (GPT-5.4, Claude, Gemini) produce markdown-heavy responses with
numbered lists, headers, and nested formatting. The Telegram channel's
_markdown_to_telegram_html() converter has gaps that leave these poorly
formatted:
1. Numbered lists (1. 2. 3.) have zero handling — sent as raw text
2. Headers (# Title) are stripped to plain text, losing visual hierarchy
3. Mid-stream edits send raw markdown (users see **bold** and ### headers
while the response generates, before the final HTML conversion)
Root Cause:
_markdown_to_telegram_html() handles bullets (- *) but skips numbered lists
entirely. Headers are stripped of # but not given any emphasis. The streaming
path in send_delta() sends buf.text as-is during mid-stream edits (plain
text, no parse_mode) — only the final _stream_end edit converts to HTML.
Fix:
1. Headers now render as <b>bold</b> in the final HTML (using placeholder
markers that survive HTML escaping, restored after all other processing)
2. Numbered lists are normalized (extra whitespace after the dot is cleaned)
3. New _strip_md_block() function strips markdown syntax for readable
plain-text preview during streaming mid-edits
The final _stream_end HTML conversion is unchanged — it still produces
full HTML with parse_mode=HTML. Only the intermediate edits are improved.
Tests:
Added 10 new tests covering:
- Headers converting to bold HTML
- Numbered list preservation and whitespace normalization
- Headers with HTML special characters
- Mixed formatting (headers + bullets + numbers + bold)
- _strip_md_block for inline formatting, headers, bullets, numbers, links
- Streaming mid-edit markdown stripping (initial send + edit)
- Fix critical plain-text fallback that was sending raw HTML tags to
users: keep raw markdown available for the fallback path
- Extract TELEGRAM_HTML_MAX_LEN (4096) constant to replace hardcoded
magic number and document the difference from TELEGRAM_MAX_MESSAGE_LEN
- Add fallback to _send_text for extra HTML chunks when HTML parse fails
- Add missing @pytest.mark.asyncio decorator on
test_send_delta_stream_end_html_expansion_does_not_overflow
Cherry-picked from #3311 (stutiredboy). Streaming edits called
edit_message_text(text=buf.text) without chunking, so once accumulated
deltas crossed Telegram's 4096-char limit an ongoing stream would fail
with BadRequest.
Extracts _flush_stream_overflow helper that edits the first chunk in
place, sends any middle chunks, and re-anchors the buffer to a new
message for the tail so subsequent deltas keep streaming.
Co-Authored-By: stutiredboy <stutiredboy@users.noreply.github.com>
Cherry-picked from #3316 (himax12). When streaming completes in send_delta(),
the code was splitting raw markdown text by 4000, then converting to HTML.
The markdown-to-HTML conversion adds 10-33% characters, which could push
the result over Telegram's 4096 character limit.
The fix converts markdown to HTML first, then splits by 4096 (actual Telegram
limit), ensuring the edited message always fits.
Fixes#3315
Replaces inline dedup logic with the existing helper to match the
style of _is_self_address and other reject branches, and to keep the
_processed_uids eviction logic in one place.
Previously the Discord channel dropped every message from any bot
account via `if message.author.bot`, which prevented legitimate
multi-agent setups (one bot asking another for help, bot-to-bot
@mentions, etc.) from working.
Narrow the guard to only drop messages from this bot's own account
by comparing against self._bot_user_id (already populated in on_ready).
Self-loop protection is preserved — each bot instance still ignores
its own outbound messages.
Co-authored with Claude Opus 4.7
Skip inbound emails that come from the bot's own configured addresses so a mailbox wired to the same SMTP/IMAP account does not trigger infinite reply loops.
Complete the symmetry left by #3214: ChannelManager._resolve_transcription_base
already resolves providers.openai.api_base, but BaseChannel.transcribe_audio
instantiated OpenAITranscriptionProvider without forwarding it, and the provider
__init__ did not accept the parameter. Self-hosted OpenAI-compatible Whisper
endpoints (LiteLLM, vLLM, etc.) configured via config.json were therefore
ignored for the OpenAI backend.
- OpenAITranscriptionProvider.__init__ now accepts api_base with env fallback
(OPENAI_TRANSCRIPTION_BASE_URL) matching the Groq pattern.
- BaseChannel.transcribe_audio forwards self.transcription_api_base to OpenAI.
- Tests mirror the existing Groq coverage: manager propagation for provider
"openai", BaseChannel-to-provider argument passing, and provider default vs
override for api_url.
Fully backward-compatible: when api_base is None and the env var is unset,
the default https://api.openai.com/v1/audio/transcriptions is used.
Refs #3213, follow-up to #3214.
Add `allow_channels` config option to DiscordConfig that restricts
bot responses to specific Discord channels. When the list is empty
(default), the bot responds in all channels (backward compatible).
- Add `allow_channels: list[str]` field to DiscordConfig schema
- Add channel ID check in _handle_message_create after user filtering
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Default Microsoft Teams inbound auth validation to enabled, update the README to match, and prevent denied senders from persisting conversation refs before allowlist checks pass.
Made-with: Cursor
- Check both jwt and cryptography in MSTEAMS_AVAILABLE guard so
partial installs fail early with a clear message instead of at runtime
- Add aclose() to test FakeHttpClient so stop() won't crash
- Move MSTEAMS.md into README.md following the same details/summary
pattern used by every other channel
- Note in README that validateInboundAuth defaults to false
Warn when validate_inbound_auth is disabled (default) so operators are
aware the webhook accepts unverified requests. Restore pymupdf to the
dev optional-dependencies group — its removal in the original PR was
unrelated to the Teams channel feature.
channel_id is already assigned from self._channel_key(message.channel)
earlier in the same function. The second identical assignment on line 453
is dead code left over from a copy-paste.
Keep dict-backed channel configs compatible with both allow_from and allowFrom without losing empty-list semantics, and add focused regression coverage for the allow-list boundary.
Made-with: Cursor
getattr() on a dict never finds custom keys — it only searches
object attributes, not dict keys. When channel config is loaded as
a Pydantic extra field (which is a plain dict), getattr(config,
'allow_from', []) always returns the default [], causing all access
to be denied regardless of the allowFrom configuration.
Fix both is_allowed() and _validate_allow_from() to use isinstance
checks, falling back to dict.get() for dict configs while preserving
getattr() for object-style configs.
When Slack resolves a named target to another conversation, do not reuse the origin thread timestamp on the destination send, and keep reaction cleanup anchored to the source conversation.
Made-with: Cursor
Feishu streaming cards auto-close after 10 minutes from creation,
regardless of update activity. With resuming enabled, a single card
lives across multiple tool-call rounds and can exceed this limit,
causing the final response to be silently lost.
Remove the _resuming logic from send_delta so each tool-call round
gets its own short-lived streaming card (well under 10 min). Add a
fallback that sends a regular interactive card when the final
streaming update fails.
The catch-all except Exception in QQ send() was swallowing
aiohttp.ClientError and OSError that _send_media correctly
re-raises. Add explicit catch for network errors before the
generic handler.
Audited all channel implementations for overly broad exception handling
that causes retry amplification or silent message loss during network
errors. This is the same class of bug as #3050 (Telegram _send_text).
Fixes by channel:
Telegram (send_delta):
- _stream_end path used except Exception for HTML edit fallback
- Network errors (TimedOut, NetworkError) triggered redundant plain
text edit, doubling connection demand during pool exhaustion
- Changed to except BadRequest, matching the _send_text fix
Discord:
- send() caught all exceptions without re-raising
- ChannelManager._send_with_retry() saw successful return, never retried
- Messages silently dropped on any send failure
- Added raise after error logging
DingTalk:
- _send_batch_message() returned False on all exceptions including
network errors — no retry, fallback text sent unnecessarily
- _read_media_bytes() and _upload_media() swallowed transport errors,
causing _send_media_ref() to cascade through doomed fallback attempts
- Added except httpx.TransportError handlers that re-raise immediately
WeChat:
- Media send failure triggered text fallback even for network errors
- During network issues: 3×(media + text) = 6 API calls per message
- Added specific catches: TimeoutException/TransportError re-raise,
5xx HTTPStatusError re-raises, 4xx falls back to text
QQ:
- _send_media() returned False on all exceptions
- Network errors triggered fallback text instead of retry
- Added except (aiohttp.ClientError, OSError) that re-raises
Tests: 331 passed (283 existing + 48 new across 5 channel test files)
Fixes: #3054
Related: #3050, #3053
Previously _send_text() caught all exceptions (except Exception) when
sending HTML-formatted messages, falling back to plain text even for
network errors like TimedOut and NetworkError. This caused connection
demand to double during pool exhaustion scenarios (3 retries × 2
fallback attempts = 6 calls per message instead of 3).
Now only catches BadRequest (HTML parse errors), letting network errors
propagate immediately to the retry layer where they belong.
Fixes: HKUDS/nanobot#3050
Add 'domain' field to FeishuConfig (Literal['feishu', 'lark'], default 'feishu').
Pass domain to lark.Client.builder() and lark.ws.Client to support Lark global
(open.larksuite.com) in addition to Feishu China (open.feishu.cn).
Existing configs default to 'feishu' for backward compatibility.
Also add documentation for domain field in README.md and add tests for
domain config.
The plain reply() uses cmd="reply" which does not support "text" msgtype
and causes WeCom API to return errcode=40008 (invalid message type).
Unify both progress and final text messages to use reply_stream()
(cmd="aibot_respond_msg"), differentiating via finish flag.
Fixes#2999
- Use asyncio.to_thread for file I/O to avoid blocking event loop
- Add 200MB upload size limit with early rejection
- Fix file handle leak by using context manager
- Use memoryview for upload chunking to reduce peak memory
- Add inbound download size check to prevent OOM
- Use asyncio.to_thread for write_bytes in download path
- Extract inline media_type detection to _guess_wecom_media_type()
- Use asyncio.to_thread for file I/O to avoid blocking event loop
- Add 200MB upload size limit with early rejection
- Fix file handle leak by using context manager
- Free raw bytes early after chunking to reduce memory pressure
- Add file attachments to media_paths (was text-only, inconsistent with image)
- Use robust _sanitize_filename() instead of os.path.basename() for path safety
- Remove re-raise in send() for consistency with QQ channel
- Fix truncated media_id logging for short IDs
QQ channel improvements (on top of nightly):
- Add top-level try/except in _on_message and send() for resilience
- Use defensive getattr() for attachment attributes (botpy version compat)
- Skip file_name for image uploads to avoid QQ rendering as file attachment
- Extract only file_info from upload response to avoid extra fields
- Handle protocol-relative URLs (//...) in attachment downloads
WeCom channel improvements:
- Add _upload_media_ws() for WebSocket 3-step media upload protocol
- Send media files (image/video/voice/file) via WeCom rich media API
- Support progress messages (plain reply) vs final response (streaming)
- Support proactive send when no frame available (cron push)
- Pass media_paths to message bus for downstream processing