nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-04-30 23:05:51 +00:00

Author	SHA1	Message	Date
hussein1362	1572626100	fix(heartbeat): inject delivered messages into channel session for reply continuity When heartbeat delivers output to a channel (e.g. Telegram), the message is a raw OutboundMessage that bypasses the channel's session. If the user replies, their reply enters a different session with no context about the heartbeat message, so the agent cannot follow through. This change injects the delivered heartbeat message as an assistant turn into the target channel's session before publishing the outbound. When the user replies, the channel session has conversational context. Handles unified_session mode by resolving to UNIFIED_SESSION_KEY when enabled, matching the agent loop's own session routing. No changes to agent/loop.py, session/manager.py, channels, providers, or config schema — uses existing add_message() and save() APIs.	2026-04-26 20:08:21 +08:00
Xubin Ren	1e11b35b45	fix(providers): tighten local endpoint detection Parse the endpoint host before disabling keepalive so public hostnames that merely contain private-network substrings keep the default connection pool behavior. Made-with: Cursor	2026-04-26 16:14:24 +08:00
hussein1362	5943ab386d	fix(providers): disable HTTP keepalive for local/LAN endpoints Local model servers (Ollama, llama.cpp, vLLM) often close idle HTTP connections before the client-side keepalive timer expires. When two LLM calls happen seconds apart — for example the heartbeat _decide() phase followed immediately by process_direct() — the second call grabs a now-dead pooled connection, causing a transient APIConnectionError on every first attempt. The fix detects local endpoints via: - ProviderSpec.is_local (Ollama, LM Studio, vLLM, OVMS) - Private-network URL patterns (localhost, 127.x, 192.168.x, 10.x, 172.16-31.x, host.docker.internal, [::1]) For these endpoints, the AsyncOpenAI client is created with a custom httpx.AsyncClient that sets keepalive_expiry=0, forcing a fresh TCP connection for each request. This is cheap on LAN (sub-5ms connect) and eliminates the stale-connection retry tax entirely. Cloud providers (OpenAI, Anthropic, OpenRouter, etc.) keep the default 5-second keepalive, which is fine for high-frequency API usage. The private-network heuristic also covers the common case where users configure provider='openai' but point apiBase at a LAN IP running llama.cpp — the spec says is_local=False, but the URL clearly is.	2026-04-26 16:14:24 +08:00
Xubin Ren	d0e1b1393a	fix(feishu): scope streaming buffers by message Keep concurrent Feishu group replies from sharing one streaming card buffer when sessions are split by topic or top-level message. Made-with: Cursor	2026-04-26 16:09:31 +08:00
chengyongru	39eea1b762	feat(feishu): per-message session for group top-level messages Align with deer-flow: group top-level messages (no root_id) now get their own session keyed by message_id instead of sharing a single group-wide session. Topic replies continue to share session via root_id.	2026-04-26 16:09:31 +08:00
chengyongru	0e92936cf3	chore(test): remove stale reaction_id from test metadata The production code no longer reads reaction_id from metadata, so remove the leftover key from the test_no_removal_when_message_id_missing test case.	2026-04-26 16:09:31 +08:00
chengyongru	3eb8838dd9	fix(test): update reaction cleanup test for _reaction_ids dict The stream-end reaction cleanup now reads from _reaction_ids instead of metadata, so pre-populate the dict in the test instead of passing reaction_id via metadata.	2026-04-26 16:09:31 +08:00
chengyongru	2a9fc9392b	fix(feishu): use message_id as reply target and fix keyword-only arg Align reply targeting with deer-flow: always reply to the inbound message_id (not root_id). The Feishu Reply API keeps responses in the same topic automatically when the target message is inside a topic. Also fix run_in_executor calls that passed reply_in_thread as a positional arg to a keyword-only parameter, and route standalone tool hints through the reply API for group chats.	2026-04-26 16:09:31 +08:00
chengyongru	d36fba8bf5	feat(feishu): add reply_in_thread for visual topic grouping When reply_to_message config is enabled, the bot's first reply now uses reply_in_thread=True to create a visual topic/thread in the Feishu client. Subsequent chunks fall back to regular create. The reply_to_message default remains False for backward compatibility. Failed replies still fall back to regular send — messages are never silently dropped.	2026-04-26 16:09:31 +08:00
chengyongru	13bb31c789	feat(feishu): add thread-scoped session isolation for group chats Thread replies (messages with root_id != message_id) in group chats now get their own session key: feishu:{chat_id}:{root_id}. This means each Feishu thread has an independent conversation context. Top-level group messages and all private chat messages keep the default session key (no override), consistent with Telegram and Slack channel behavior. Co-authored-by: shenchengtsi <228445050+shenchengtsi@users.noreply.github.com>	2026-04-26 16:09:31 +08:00
T3chC0wb0y	fd3d7ea752	fix(msteams): normalize nbsp in inbound text	2026-04-26 00:56:06 +08:00
T3chC0wb0y	722d935d37	fix(msteams): prune bad notify refs	2026-04-26 00:56:06 +08:00
T3chC0wb0y	7e65884acb	fix(msteams): send threaded replies via replyToId	2026-04-26 00:56:06 +08:00
Xubin Ren	403ce23d22	fix(agent): tighten ask_user CLI handling Made-with: Cursor	2026-04-25 22:10:19 +08:00
Xubin Ren	3b1ea99ee1	fix(agent): render ask_user options without buttons Made-with: Cursor	2026-04-25 22:10:19 +08:00
Xubin Ren	cfc76ffbbf	feat(agent): add ask_user tool Made-with: Cursor	2026-04-25 22:10:19 +08:00
Xubin Ren	39a5a77874	fix(feishu): send videos with media message type	2026-04-24 20:00:56 +00:00
yorkhellen	076e4166d7	fix(agent): add LLM request timeout to prevent session lock starvation	2026-04-25 03:40:34 +08:00
Xubin Ren	e52fe2a8e2	feat(webui): render video media attachments Add signed media URLs to live WebSocket replies and teach the WebUI to classify and render video attachments, so bot-sent videos can play inline in both live chats and session history. Made-with: Cursor	2026-04-25 03:20:40 +08:00
Xubin Ren	be05189f39	feat(channels): add video support for Telegram and WebSocket Telegram previously sent all video files as documents via send_document, so users saw a file icon instead of an inline player. WebSocket only accepted image MIME types, rejecting video uploads entirely. Telegram: - Recognize video extensions (mp4/mov/avi/mkv/webm/3gp) in _get_media_type - Route videos through send_video with supports_streaming=True - Add VIDEO/VIDEO_NOTE/ANIMATION to inbound message filters - Add video MIME mappings to _get_extension - Fix: local file sends now use _call_with_retry (previously no retry) WebSocket: - Expand upload MIME whitelist with video/mp4, video/webm, video/quicktime - Add per-type size limits (_MAX_VIDEO_BYTES=20MB, _MAX_VIDEOS_PER_MESSAGE=1) - Expand media serving endpoint to serve video with correct Content-Type Agent: - Add "video" to message tool media parameter description - Add .mp4 example to identity.md system prompt Made-with: Cursor	2026-04-25 02:20:13 +08:00
Xubin Ren	3441d5f89c	test(anthropic): cover remaining opus-4-7 temperature branches The existing test only verified the adaptive path. Add two more cases: - enabled thinking (high): temperature must also be omitted - no thinking (None): temperature must still be omitted Made-with: Cursor	2026-04-24 15:33:59 +08:00
04cb	9239429a00	fix(anthropic): omit temperature for opus-4-7 (#3417)	2026-04-24 15:33:59 +08:00
Xubin Ren	7f1913f619	fix(provider): add DeepSeek thinking toggle; backfill reasoning_content on legacy messages Two issues with DeepSeek V4 thinking mode support: 1. Missing thinking parameter injection. DeepSeek V4 requires `extra_body: {"thinking": {"type": "enabled/disabled"}}` — identical to VolcEngine/BytePlus. The code had this for volcengine, byteplus, dashscope, minimax, and kimi but not DeepSeek. This means `reasoning_effort=minimal` (thinking off) silently has no effect. Root cause: the thinking-style→wire-format mapping was an if/elif chain on provider names. DeepSeek was forgotten. Fix: make the mapping declarative via `ProviderSpec.thinking_style`: - "thinking_type" → {"thinking": {"type": "..."}} (DeepSeek, Volc, BytePlus) - "enable_thinking" → {"enable_thinking": bool} (DashScope) - "reasoning_split" → {"reasoning_split": bool} (MiniMax) `_build_kwargs` now does a single dict lookup. Adding a new provider with an existing wire format requires zero changes to the function. 2. Legacy session messages crash thinking-mode requests. When a session was started without thinking mode (or with a different model), assistant messages lack reasoning_content. DeepSeek V4 in thinking mode rejects these with 400: "The reasoning_content in the thinking mode must be passed back to the API." This affects ALL assistant messages, not just those with tool_calls (despite the docs only mentioning the tool_calls case). Fix: `_build_kwargs` backfills `reasoning_content: ""` on every assistant message missing it, but only when thinking mode is active. This is semantically neutral — the model treats empty reasoning_content as "no thinking happened on that turn". The backfill only touches the in-memory request copy; session files on disk are untouched. Tests: +5 (3 thinking toggle, 2 backfill). Full suite: 2377 passed. Made-with: Cursor	2026-04-24 15:06:39 +08:00
Xubin Ren	4531167c12	fix(agent): bound remaining memory/history pollution paths from #3412 #3412 stopped the headline raw_archive bloat but left four adjacent leaks on the same pollution chain: - archive() success path appended uncapped LLM summaries to history.jsonl, so a misbehaving LLM could re-open the #3412 bug from the happy path. - maybe_consolidate_by_tokens did not advance last_consolidated when archive() fell back to raw_archive, causing duplicate [RAW] dumps of the same chunk on every subsequent call. - Dream's Phase 1/2 prompt injected MEMORY.md / SOUL.md / USER.md and each history entry without caps, so any legacy oversized record (or an unbounded user edit) would blow past the context window every dream. - append_history itself had no default cap, leaving future new callers one forgotten-cap-away from the same vector. Changes: - Cap LLM-produced summaries at 8K chars (_ARCHIVE_SUMMARY_MAX_CHARS) before writing to history.jsonl. - Advance session.last_consolidated after archive() regardless of whether it summarized or raw-archived — both outcomes materialize the chunk; still break the round loop on fallback so a degraded LLM isn't hammered. - Truncate MEMORY.md / SOUL.md / USER.md and each history entry in Dream's Phase 1 prompt preview (Phase 2 still reaches full files via read_file). - Add _HISTORY_ENTRY_HARD_CAP (64K) as belt-and-suspenders default in append_history with a once-per-store warning, so any new caller that forgets its own tighter cap gets caught and observable. Layer the caps by scope: raw_archive=16K, archive summary=8K, append_history default=64K. Tight per-caller values cover expected payloads; the wide default only catches regressions. Tests: +9 regression tests covering each fix. Full suite: 2372 passed. Made-with: Cursor	2026-04-24 04:17:19 +08:00
Xubin Ren	81a5af2352	test(consolidation): add regression tests for tiktoken truncation path and history char cap Cover two untested boundaries from #3412: - _truncate_to_token_budget with positive budget exercises tiktoken - _MAX_HISTORY_CHARS caps Recent History section in system prompt Made-with: Cursor	2026-04-24 03:57:59 +08:00
chengyongru	2848f69897	fix(agent): prevent history.jsonl bloat from raw_archive and stuck consolidation Root cause: when consolidation LLM fails, raw_archive() dumped full message content (~1MB) into history.jsonl with no size limit. Since build_system_prompt() injects history.jsonl into every system prompt, all subsequent LLM calls exceeded the 200K context window with error 1261. Additionally, _cap_consolidation_boundary's 60-message cap caused consolidation to get stuck on sessions with long tool chains (200+ iterations), triggering the raw_archive fallback in the first place. Three-layer fix: - Remove _cap_consolidation_boundary: let pick_consolidation_boundary drive chunk sizing based solely on token budget - Truncate archive() input: use tiktoken to cap formatted text to the model's input token budget before sending to consolidation LLM - Truncate raw_archive() output: cap history.jsonl entries at 16K chars	2026-04-24 03:57:59 +08:00
Xubin Ren	469fc90fe6	fix(agent): on_progress tool_events only when callback accepts; align progress tests with main Made-with: Cursor	2026-04-23 20:06:11 +08:00
Pablo Cabeza	c23d719780	feat(agent): emit structured _tool_events progress metadata Extend the existing on_progress callback to carry structured tool-event payloads alongside the plain-text hint, so channels can render rich tool execution state (start/finish/error, arguments, results, file attachments) rather than only the pre-formatted hint string. Changes ------- - AgentLoop._tool_event_start_payload() — builds a version-1 start payload from a ToolCallRequest - AgentLoop._tool_event_result_extras() — extracts files/embeds from a tool result dict - AgentLoop._tool_event_finish_payloads() — maps tool_calls + tool_results + tool_events from AgentHookContext into finish payloads - _LoopHook.before_execute_tools() — passes tool_events=[...] to on_progress together with the existing tool_hint flag - _LoopHook.after_iteration() — emits a second on_progress call with the finish payloads once tool results are available - _bus_progress() — forwards tool_events as _tool_events in OutboundMessage metadata so channel implementations can read them - on_progress type widened to Callable[..., Awaitable[None]] on all public entry points; _cli_progress updated to accept and ignore tool_events The contract is additive: callers that only accept (content, *, tool_hint) continue to work unchanged. Callers that also accept tool_events receive the structured data. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 20:06:11 +08:00
Xubin Ren	06503cd0fc	fix(telegram): keep callback_data under Telegram's 64-byte cap ``InlineKeyboardButton(label, callback_data=label)`` fails Telegram's API when the label exceeds 64 bytes UTF-8. An LLM-generated long option (realistic in multilingual flows) used to 400 the ``send_message`` call silently — user got nothing, agent heard a successful retry-then-drop. Decouple display from wire: button text keeps the full label, callback_data gets truncated at a UTF-8 char boundary. Tap echoes the prefix back as the user message; the LLM understands a prefix of its own option just fine, and the display the user saw was always the full string. Locks: helper boundary behavior (ASCII, CJK, short labels pass through) and end-to-end ``_build_keyboard`` integration with an over-cap label. Made-with: Cursor	2026-04-23 13:26:06 +08:00
Xubin Ren	6bc2983ab1	fix(telegram): fall back buttons to inline text when keyboard disabled Buttons are semantic options, not a separate channel protocol: a user who taps "Yes" and a user who types "yes" arrive at the agent as the same string. Dropping ``msg.buttons`` when ``inline_keyboards=False`` was the worst of both worlds — the agent got told "Message sent with N button(s)" while the user saw a question with no options. Splice the labels into the message text instead. The LLM produces the same ``message(buttons=...)`` call regardless of channel; the channel layer picks the richest rendering it can afford — native keyboard when enabled, bracketed inline text otherwise. Layout is preserved (one row per line). Other channels can adopt the same helper incrementally. Locks: canonical ``_buttons_as_text`` format, flag-off send-path splices labels, flag-on send-path keeps content clean and rides ``reply_markup``. Made-with: Cursor	2026-04-23 13:26:06 +08:00
Xubin Ren	b9b81d9301	test(telegram): pin inline-keyboards flag gate and buttons validation Two kill-switch tests for the new inline-keyboards path. Neither is flashy — they just make sure the next unrelated refactor can't quietly regress two narrow contracts the PR relies on. 1. TelegramChannel._build_keyboard returns None whenever TelegramConfig.inline_keyboards is False, even if buttons are supplied. The flag defaults off; if someone ever flips that default the change should fail this test before it reaches prod bots. 2. MessageTool rejects malformed `buttons` payloads (non-list, mixed list/str row, non-str label, None label) up front instead of letting them slip into the channel layer where Telegram would silently 400 the send. Parametrized over four shapes the guard needs to reject. No production code touched. Made-with: Cursor	2026-04-23 13:26:06 +08:00
Xubin Ren	707c0d7f3a	fix(websocket): scrub partial media batches, nosniff /api/media	2026-04-23 00:07:27 +08:00
Xubin Ren	61a28c2c0a	feat(webui): support image uploads in composer and message bubbles	2026-04-23 00:07:27 +08:00
Xubin Ren	c1e7aa5504	refactor(config): resolve env vars via in-place Pydantic walk Replace the dump→resolve→model_validate roundtrip with a recursive walk that substitutes ${VAR} in string values directly on BaseModel / __pydantic_extra__ / dict / list nodes. Identity is preserved on any subtree with no references, so the original Config instance is returned unchanged when nothing needs resolving. Side effects: - exclude=True fields (e.g. DreamConfig.cron) now survive even when other fields in the same config contain ${VAR} references, closing the edge case left open by the previous fast-path-only fix. - _has_env_refs is dropped (the walker short-circuits naturally). - Added a regression test pairing cron with a resolved providers.groq api_key to lock the coexistence case. Made-with: Cursor	2026-04-22 22:31:40 +08:00
Saimon Ventura	c9a21d96d8	fix(config): preserve excluded fields in resolve_config_env_vars `resolve_config_env_vars` unconditionally dumped the config via `model_dump(mode="json")` and revalidated it, which silently dropped any field declared with `exclude=True` (e.g. `DreamConfig.cron` — introduced by the Dream rename refactor in #2717). Result: `agents.defaults.dream.cron` was never honored at runtime — the gateway always fell back to the default `every 2h` schedule even when `cron` was set in config.json. Fix: skip the roundtrip entirely when the config has no `${VAR}` references. Env-var interpolation still works unchanged when refs exist; the legacy `cron` override now survives the common case of fully-resolved config. Regression test covers the bug path.	2026-04-22 22:31:40 +08:00
Xubin Ren	239e91a4d6	test(anthropic): pin tool_result image_url conversion regression Adds a focused regression test so the fix for tool_result image handling cannot silently revert. Two cases: - list content with an image_url + text block -> image_url is translated to a native Anthropic image block, sibling text passes through unchanged - plain string content passes through untouched (the new list branch must not alter the string path) These cover the exact symptom surface (silent image drop with a "Non-transient LLM error with image content" warning) and the only two content shapes tool results actually take today. Made-with: Cursor	2026-04-22 22:10:53 +08:00
chengyongru	42c4af2118	fix(agent): prevent duplicate responses when sub-agents complete concurrently When the main agent spawns multiple sub-agents, each completion independently triggered a new _dispatch, causing 3-4 user-visible responses instead of a single comprehensive report. - Extend _drain_pending to block-wait on pending_queue when sub-agents are still running, keeping the runner loop alive for in-order injection - Pass pending_queue in the system message path so subsequent sub-agent results can still be injected mid-turn via a new dispatch	2026-04-22 20:02:19 +08:00
Xubin Ren	79247545ac	Merge remote-tracking branch 'origin/main' into pr-3379	2026-04-22 08:08:05 +00:00
Xubin Ren	427deb4a70	test(providers): add regression tests for GitHub Copilot /responses routing Locks in the four behaviors introduced by the fix so they can't silently revert: - _should_use_responses_api accepts github_copilot on its non-OpenAI base - _build_responses_body strips the 'github_copilot/' routing prefix - /responses failures on github_copilot do not fall back to /chat/completions Made-with: Cursor	2026-04-22 06:53:37 +00:00
Peixian Gong	dd26b4407d	fix(providers): make GitHub Copilot backend work with GPT-5/o-series models Calling GitHub Copilot with `gpt-5.` / `o` models (e.g. `github_copilot/gpt-5.4`, `github_copilot/gpt-5.4-mini`) failed with a chain of misleading errors: 1. `Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.` 2. `model "gpt-5.4-mini" is not accessible via the /chat/completions endpoint` (`unsupported_api_for_model`). 3. `The requested model is not supported.` (`model_not_supported`) even after routing to /responses. Root causes (each one masked the next): * The `github_copilot` ProviderSpec did not opt into `supports_max_completion_tokens`, so `_build_kwargs` always sent the legacy `max_tokens` parameter that GPT-5/o-series reject. * `_should_use_responses_api` was hard-gated to `spec.name == "openai"` plus a direct-OpenAI base URL, so the GitHub Copilot backend always went through /chat/completions even for models the Copilot gateway exposes only via /responses (e.g. `gpt-5.4-mini`). * When /responses did fail on github_copilot, the existing "compatibility marker" heuristic silently fell back to /chat/completions — which can never succeed for these models — so the real upstream error was hidden. * `_build_responses_body` did not honour `spec.strip_model_prefix`, so the request body sent `model="github_copilot/gpt-5.4-mini"` (with the routing prefix), which the Copilot gateway rejects with `model_not_supported`. (`_build_kwargs` already stripped it; this branch was missed.) Fix: * registry.py: set `supports_max_completion_tokens=True` on the `github_copilot` spec so requests use `max_completion_tokens`. * openai_compat_provider.py: - `_should_use_responses_api` now also allows the `github_copilot` spec, and skips the direct-OpenAI base check for it (the Copilot gateway is its own base URL). - `_build_responses_body` now strips the model routing prefix when `spec.strip_model_prefix` is set, matching `_build_kwargs`. - `chat` / `chat_stream` no longer fall back from /responses to /chat/completions on the `github_copilot` spec: the fallback cannot succeed for GPT-5/o-series and would mask the real gateway error. Tests: * tests/cli/test_commands.py: switched the `test_github_copilot_provider_refreshes_client_api_key_before_chat` fixture model from `gpt-5.1` to `gpt-4` so it continues to exercise the /chat/completions code path it was designed for (gpt-5.1 now correctly routes to /responses on github_copilot). * `pytest tests/providers/ tests/cli/test_commands.py` — 314 passed. * Verified end-to-end against the live Copilot gateway with both `github_copilot/gpt-5.4` and `github_copilot/gpt-5.4-mini`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-22 14:28:19 +08:00
k	03ec28dd49	fix(mcp): avoid WinError 193 for Windows stdio launchers	2026-04-22 14:50:55 +09:00
hussein1362	0932189860	fix: handle Windows PermissionError on directory fsync On Windows, opening a directory with O_RDONLY raises PermissionError. Wrap the directory fsync in a try/except PermissionError — NTFS journals metadata synchronously so the directory sync is unnecessary there. Also adjust test assertions to expect 1 fsync call (file only) on Windows vs 2 (file + directory) on POSIX.	2026-04-22 13:19:53 +08:00
hussein1362	512bf59b3c	fix(session): fsync sessions on graceful shutdown to prevent data loss On filesystems with write-back caching (rclone VFS, NFS, FUSE mounts) the OS page cache may buffer recent session writes. If the process is killed before the cache flushes, the most recent conversation turns are silently lost — causing the agent to "forget" recent context and respond to stale history on the next startup. Changes: - session/manager.py: add fsync=True option to save() that flushes the file and its parent directory to durable storage. Add flush_all() that re-saves every cached session with fsync. Default save() behavior is unchanged (no fsync) to avoid performance regression in normal operation. - cli/commands.py: call agent.sessions.flush_all() in the gateway shutdown finally block, after stopping heartbeat/cron/channels. - tests/session/test_session_fsync.py: 8 tests covering fsync flag behavior, flush_all with empty/multiple/errored sessions, and data survival across simulated process restart. - tests/cli/test_commands.py: add sessions attribute to _FakeAgentLoop so the gateway health endpoint test passes with the new shutdown flush.	2026-04-22 13:19:53 +08:00
Xubin Ren	ef8bbab7b3	test(cli): lock _render_interactive_ansi force_terminal to isatty Made-with: Cursor	2026-04-22 13:12:29 +08:00
Xubin Ren	88c619901e	review(providers): tighten comments in reasoning_effort normalize path Made-with: Cursor	2026-04-22 12:49:55 +08:00
hlg	28c42628b0	fix: normalize DashScope reasoning_effort (minimal vs minimum) DashScope rejects the OpenAI-style value "minimal" with `'reasoning_effort.effort' must be one of: 'none', 'minimum', 'low', 'medium', 'high', 'xhigh'`, but nanobot was passing the string through verbatim. Users who tried the documented "minimal" to disable thinking got a 400; users who tried the DashScope-native "minimum" to work around it got `enable_thinking=True` because the internal comparison was a hard string match on "minimal". Introduce a semantic/wire split in `_build_kwargs`: - `semantic_effort` is the internal canonical form (OpenAI vocabulary). "minimum" on the way in is normalized to "minimal" here so both spellings share one meaning. - `wire_effort` is what we actually serialize. For DashScope with semantic_effort == "minimal" we translate to "minimum" on the way out; other providers are unchanged. - `thinking_enabled` and the Kimi thinking branch now compare on `semantic_effort`, so either user spelling correctly disables provider-side thinking. Tests: - Strengthen `test_dashscope_thinking_disabled_for_minimal` to assert the wire value is "minimum" in addition to the extra_body signal; the original version only checked extra_body and let the invalid-value bug slip through. - Add `test_dashscope_thinking_disabled_for_minimum_alias` so a user who read the DashScope docs and configured "minimum" still gets thinking off. - Add `test_non_dashscope_minimal_not_retranslated` to pin down that the DashScope-specific translation does not leak to OpenAI et al.	2026-04-22 12:49:55 +08:00
chengyongru	f6a417e77d	fix(transcription): harden language parameter validation and tests - Add ISO-639 pattern validation (2-3 lowercase letters) to schema - Normalize empty language to None in provider constructors - Extract shared httpx mock stubs, parameterize provider tests - Add test for language=None omitting field from multipart body - Add test for Pydantic pattern validation rejecting invalid codes	2026-04-22 12:41:32 +08:00
k	123d69bfb7	fix: allow specifying transcription language	2026-04-22 12:41:32 +08:00
k	e5b288c6eb	fix: map MiniMax reasoning_effort to reasoning_split	2026-04-22 00:52:56 +08:00
aiguozhi123456	53ba410e49	feat(read_file): add DOCX, XLSX, PPTX support via document.extract_text() Wire up the existing office document extractors in document.py to ReadFileTool by adding an extension guard and _read_office_doc() method that follows the established PDF pattern. Handles missing libraries, corrupt files, empty documents, and 128K truncation consistently.	2026-04-21 22:12:19 +08:00
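
Commits 5943ab386d and 1e11b35b45 above describe detecting local/LAN endpoints and then tightening that check by parsing the endpoint host, so that public hostnames which merely contain a private-network substring are not misclassified. The following is a minimal sketch of that idea using only the standard library. The names `is_local_endpoint` and `_LOCAL_HOSTNAMES` are hypothetical and not taken from the nanobot source, the sketch assumes the URL includes a scheme, and `ipaddress`'s `is_private` covers somewhat more ranges than the patterns listed in the commit message.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical helper names for illustration; not the nanobot implementation.
_LOCAL_HOSTNAMES = {"localhost", "host.docker.internal"}


def is_local_endpoint(api_base: str) -> bool:
    """Return True when the URL's host is a loopback/private IP or a known local name."""
    # .hostname is lowercased and has IPv6 brackets stripped, e.g. "[::1]" -> "::1".
    host = urlsplit(api_base).hostname
    if not host:
        return False
    if host in _LOCAL_HOSTNAMES:
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: treat as a public hostname, even if the text
        # happens to contain "10." or "192.168." somewhere.
        return False
    return addr.is_loopback or addr.is_private
```

Under these assumptions, `is_local_endpoint("http://192.168.1.5:8000/v1")` is true while `is_local_endpoint("https://192.168.1.5.example.com/v1")` is false. A caller could use the result to decide whether to build the HTTP client with keepalive disabled, as the commit message describes.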

633 Commits
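
Commit 06503cd0fc above keeps Telegram `callback_data` under the 64-byte cap by truncating the label at a UTF-8 character boundary while the button text keeps the full label. One way to sketch that truncation is shown below. The function name `truncate_utf8` is hypothetical and the approach (slice the encoded bytes, then drop any trailing partial character) is an assumption about how such a helper could work, not the code from the commit.

```python
def truncate_utf8(label: str, max_bytes: int = 64) -> str:
    """Return the longest prefix of `label` whose UTF-8 encoding fits in `max_bytes`."""
    raw = label.encode("utf-8")
    if len(raw) <= max_bytes:
        return label  # short labels pass through unchanged
    # Slicing bytes may cut a multi-byte character in half; since `raw` was
    # valid UTF-8, errors="ignore" only drops that trailing partial sequence.
    return raw[:max_bytes].decode("utf-8", errors="ignore")


# Display text keeps the full label; only the wire value is shortened.
full_label = "中" * 30                 # 90 bytes in UTF-8
wire_value = truncate_utf8(full_label)  # 21 characters, 63 bytes
```

The prefix is what would be echoed back when the button is tapped, which matches the commit's note that the model can still recognize a prefix of its own option.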