nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-06-15 15:24:06 +00:00

Author	SHA1	Message	Date
moran	9ed638ad70	feat(transcription): add SiliconFlow as transcription provider - Register SiliconFlow in transcription registry with default model FunAudioLLM/SenseVoiceSmall and alias 'silicon' - Reuse existing OpenAITranscriptionProvider adapter (Whisper-compatible) - Add generic key/base resolution: fallback to registry env_key and default_api_base when provider config is absent - Add tests for registry entry, alias, adapter, default model, and config resolution with env var fallback	2026-06-10 23:05:12 +08:00
chengyongru	bc4bb508a1	fix: continue recovered streams in a new segment maintainer edit: streamed timeout recovery was returning the retried response internally while the channel still treated the final outbound as already streamed. End the current stream segment before retry/fallback recovery so subsequent deltas are delivered in a new segment.	2026-06-10 18:10:44 +08:00
aiguozhi123456	2c5a4e0703	fix(providers): allow retry and fallback on stream stalled timeout When a stream stalls mid-response, both the retry layer and FallbackProvider blocked recovery because content had already been emitted via on_content_delta. This left users with truncated replies and no automatic recovery. For error_kind="timeout" specifically: - _run_with_retry now suppresses delta callbacks and retries the same model instead of returning immediately - FallbackProvider now allows failover to a different model with delta callbacks suppressed Non-timeout errors retain the original "skip retry/failover after streamed content" behavior to avoid duplicate output.	2026-06-10 18:10:44 +08:00
Xubin Ren	62a35c21b8	fix(asr): normalize StepFun transcription endpoint	2026-06-10 15:50:38 +08:00
moran	7930058348	feat(asr): add StepFun ASR SSE transcription provider - Add StepFunTranscriptionProvider class in nanobot/providers/transcription.py - New _post_stepfun_asr_with_retry() function handling SSE stream parsing (transcript.text.delta → transcript.text.done event sequence) - Register 'stepfun' in transcription_registry.py with default model stepaudio-2.5-asr - Reuse existing stepfun provider config (apiBase can point to Plan endpoint) - Add 17 tests covering SSE parsing, retry contract, empty-text edge case, and registry integration - Update docs/configuration.md with stepfun ASR documentation StepFun ASR uses a dedicated SSE endpoint (/v1/audio/asr/sse) rather than the chat-completions or Whisper multipart formats used by other providers. Users on Step Plan can set apiBase to the Plan endpoint.	2026-06-10 15:50:38 +08:00
chengyongru	99f7f371fa	fix: cover o1 max-completion token fallback Maintainer edit: keep the GPT-5/o-series fallback on slug-boundary matching so unrelated model names are not caught by substring checks, and include o1 alongside o3/o4 because it is also an o-series chat model.	2026-06-10 14:47:10 +08:00
04cb	a779e7c29e	fix(providers): use max_completion_tokens for gpt-5/o-series on flagless specs (#4261 )	2026-06-10 14:47:10 +08:00
chengyongru	0a396aa6e2	Improve tool call validation strictness (#4190 ) * Improve tool call validation strictness Reject near-miss tool names without executing suggested tools. Require object-shaped tool parameters while preserving only lossless JSON wire-shape normalization. * Tighten tool call argument validation * Simplify tool argument validation tests * Improve tool name suggestions * Simplify tool suggestion helpers * Limit tool suggestions to canonical matches * Allow repair only for tool history replay * Clarify non-object tool argument errors * Inline replay tool argument normalization * Track only successful tool executions * Reject JSON null tool arguments	2026-06-09 14:50:40 +08:00
comadreja	f3eb2aa08b	feat(transcription): add AssemblyAI as transcription provider Add AssemblyAI as a third transcription provider option alongside OpenAI and Groq. AssemblyAI offers better accuracy for certain audio types (distant voices, noisy environments) and serves as a reliable fallback when other providers struggle. Changes: - Add AssemblyAITranscriptionProvider class in providers/transcription.py - Add 'assemblyai' option in base channel's transcribe_audio() - Per-channel configuration via transcriptionProvider in config Usage: Set transcriptionProvider: 'assemblyai' and provide an AssemblyAI API key via transcriptionApiKey in the channel config.	2026-06-09 05:33:18 +08:00
NanoBot	c20ecc52d7	feat(transcription): add Xiaomi MiMo ASR provider (mimo-v2.5-asr) Add support for Xiaomi MiMo ASR as a third transcription backend alongside Groq and OpenAI Whisper. Xiaomi ASR uses the /v1/chat/completions endpoint with base64-encoded audio input, rather than the standard Whisper multipart upload format. Co-Authored-By:连 <lian@tangping.homes>	2026-06-09 04:29:09 +08:00
Ilia Breitburg	0eb3010e40	feat(transcription): configurable STT model + OpenRouter provider Add a `transcriptionModel` channel setting and an OpenRouter transcription backend so voice messages can be transcribed through OpenRouter's speech-to-text endpoint (e.g. nvidia/parakeet-tdt-0.6b-v3, openai/whisper-1), alongside the existing Groq/OpenAI Whisper providers. - schema: add channels.transcriptionModel (None = provider default) - providers/transcription: extract a shared POST/retry skeleton; add a JSON+base64 OpenRouterTranscriptionProvider; make the STT model a constructor param on all providers instead of hardcoding it - channels: route transcriptionProvider="openrouter" and thread the model through the manager to each channel - docs + tests Only dedicated STT models work on OpenRouter's transcription endpoint; chat LLMs (e.g. google/gemini-3.5-flash) are rejected there. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 04:01:37 +08:00
axelray-dev	28f3a20d64	feat(providers): add extra_query config for OpenAI-compatible providers Adds ProviderConfig.extra_query, threaded into AsyncOpenAI(default_query) so that Azure-style gateways requiring query params like api-version can be configured without URL hacks. Also updates provider_signature to track extra_query changes so per-turn refresh rebuilds the provider when the value changes. Addresses the extra_query portion of #4204. The max_completion_tokens model-awareness enhancement is intentionally left separate.	2026-06-09 03:18:14 +08:00
Xubin Ren	9c81280300	feat(transcription): add shared voice input support (#4232 ) * feat(webui): add voice transcription input * feat(webui): render ANSI output in code blocks * refactor(webui): isolate voice recorder logic * refactor(transcription): keep websocket ingress thin * refactor(transcription): resolve channel audio settings on demand * style(webui): neutralize voice waveform color * feat(webui): add voice input tooltip * feat(webui): add voice input keyboard shortcut * fix(webui): distinguish voice shortcut platforms * fix(webui): place voice button after model selector * refactor(webui): share voice hold recording helpers * fix(desktop): allow microphone voice input * fix(webui): stabilize token usage month labels * feat(webui): show voice input on settings overview * fix(webui): label voice capability as recognition * fix(webui): align capability overview status * refactor(webui): isolate transcription socket handling * fix(webui): soften silent voice waveform * refactor(audio): clarify transcription service location * docs(transcription): clarify audio and provider boundaries * fix(exec): reduce session output polling flake	2026-06-09 01:08:49 +08:00
chengyongru	631fdb4a46	test: cover empty reasoning_content history preservation maintainer edit: add SDK-object and tool-call history regressions so the empty-string reasoning_content fix is covered across both parse branches and the sanitized request path.	2026-06-08 01:08:27 +08:00
michaelxer	05de864f5b	fix: preserve empty-string reasoning_content instead of coercing to None Custom providers (e.g. DeepSeek) may return reasoning_content as an empty string "" to explicitly indicate no reasoning occurred. The previous truthiness checks (, ) treated "" as falsy and converted it to None, which caused the field to be dropped from the message history entirely. Providers that require reasoning_content on all assistant messages then rejected subsequent requests. Replace truthiness checks with identity checks () so that empty-string reasoning_content is preserved as-is. The streaming path is unchanged since an empty join genuinely means no chunks received. Fixes #4105	2026-06-08 01:08:27 +08:00
Xubin Ren	ab9f49970d	feat(desktop): polish desktop shell and shared WebUI surfaces (#4195 ) * feat(desktop): add native host scaffold * feat(webui): track turns and usage in gateway * feat(webui): polish desktop chat experience * feat(apps): add ArcGIS and Joplin logos * feat(desktop): polish shell and shared surfaces * fix(webui): avoid preview chips for glob references * test: align CI expectations for token fallback * feat(webui): preview prompt rail entries * feat(webui): add prompt navigator drawer * style(webui): refine prompt navigator placement * style(webui): align prompt navigator with header actions * style(webui): simplify prompt navigator header * refactor(webui): clean thread resource refresh * feat(desktop): add native reply notifications * fix(webui): preserve desktop restart and replay state * fix(desktop): harden gateway proxy startup * fix(web): fall back when readability is unavailable * fix(desktop): hide window instead of closing on macos * fix(webui): unify desktop header actions * fix(webui): simplify prompt history rows * fix(desktop): log notification delivery failures * chore(desktop): clean source package artifacts * fix(cron): support one-time relative reminders * fix(webui): reveal scroll button in place * Revert "fix(cron): support one-time relative reminders" This reverts commit 4c4661da120a3c7283e0768412bae48604e7390b. * refactor(webui): extract token usage heatmap * docs(desktop): clarify contributor guides --------- Co-authored-by: chengyongru <2755839590@qq.com>	2026-06-06 19:49:33 +08:00
Xubin Ren	a1b9577224	test(image): cover dropping null OpenAI image params	2026-06-06 19:35:46 +08:00
chengyongru	d435cb0b21	fix: harden custom image provider compatibility Maintainer edit: preserve provider-specific size hints for custom image generation endpoints while keeping the default 1K mapping compatible. Clarify the custom provider contract in docs and cover response_format/size overrides in tests.	2026-06-05 15:56:03 +08:00
chengyongru	ae17a79bdf	fix: harden custom image generation config Maintainer edit: require providers.custom.apiBase before making custom image requests and allow unauthenticated local endpoints by omitting Authorization when no apiKey is configured.	2026-06-05 15:56:03 +08:00
axelray-dev	748b28da01	feat(image): support custom image generation provider Addresses #4132. Add CustomImageGenerationClient for any OpenAI-compatible image generation API (POST {apiBase}/images/generations). Uses the existing providers.custom config slot. No schema changes required. Tests: 54 passed, ruff clean. Signed-off-by: axelray-dev <110029405+axelray-dev@users.noreply.github.com>	2026-06-05 15:56:03 +08:00
Kunal Karmakar	fa423dffbc	Remove check from the test	2026-06-05 01:17:34 +08:00
Kunal Karmakar	c849ff6eec	Address PR review comments	2026-06-05 01:17:34 +08:00
Kunal Karmakar	ba3fa38e97	Add support for Azure AAD based Auth	2026-06-05 01:17:34 +08:00
Xubin Ren	3dcf511c84	feat(webui): refine output timeline and model controls (#4108 ) * feat(webui): refine output timeline and composer queue * feat(webui): add provider model picker * fix(webui): polish model settings and heartbeat checks * chore: keep heartbeat changes out of webui pr * refactor(webui): isolate settings routes * fix(providers): align minimax anthropic test * fix(providers): keep minimax anthropic base sdk-compatible * fix(providers): normalize anthropic base urls	2026-05-30 23:45:26 +08:00
chengyongru	b2e43955e3	fix: add regression tests for bare-dict coercion, update stale comment	2026-05-30 15:35:04 +08:00
04cb	9d3fe7c34b	fix(providers): surface clear arrearage warning on quota/billing errors (#3006 )	2026-05-29 15:31:17 +08:00
Xubin Ren	3a420136bb	feat(webui): add project workspaces and access controls (#4007 ) * feat(webui): add project workspaces and access controls * feat(webui): add project workspaces and access controls * refactor(tools): centralize workspace access resolution * refactor(webui): remove unused workspace host state * fix(webui): hide estimated file edit label * fix(webui): clarify file edit deletion feedback * fix(webui): label deleted file activity * fix(webui): flatten file edit activity rows * fix(core): remove path-only patch deletion * fix(core): keep apply patch non-destructive * refactor(webui): trim workspace host plumbing * fix(tools): register exec with tools config	2026-05-29 03:42:53 +08:00
hamb1y	0df60416ba	fix(agent): address session and streaming concurrency bugs	2026-05-28 22:54:46 +08:00
yeounhyeok	ac8bef76f6	fix(provider): honor NANOBOT_STREAM_IDLE_TIMEOUT_S in Codex provider Every other streaming provider (anthropic, bedrock, openai_compat, litellm) reads NANOBOT_STREAM_IDLE_TIMEOUT_S with a 90s default. The Codex provider hardcoded 60s in _request_codex, so it could not be tuned the same way and aborted streams sooner than its peers. Read the same env var with the same default and pass it as the httpx client timeout. The variable name and int parsing match anthropic / openai_compat / bedrock verbatim. #4009 normalized the error response when the timeout fires; this PR fixes the timeout knob itself.	2026-05-28 02:17:15 +08:00
EunHyunsu	18567daaa0	Handle blank Codex transport errors	2026-05-27 03:01:32 +08:00
outlook84	92f2ff3a33	test: Add test to ensure responses API is used regardless of circuit breaker state	2026-05-25 01:23:36 +08:00
outlook84	c433d60681	feat: Enhance OpenAI provider configuration with extraBody support and apiType validation	2026-05-25 01:23:36 +08:00
outlook84	d472595417	feat: Add OpenAI API type configuration and update provider settings	2026-05-25 01:23:36 +08:00
Yuxin Lou	3f0098839e	fix(provider): preserve OpenAI-compatible tool call ids	2026-05-24 20:53:14 +08:00
04cb	ef2ef4f789	fix(transcription): normalize chat-style apiBase to audio endpoint (#3637 )	2026-05-23 17:32:59 +08:00
Jiajun Xie	3e6f9907fe	feat: Add Zhipu (智谱) image generation provider	2026-05-23 17:06:36 +08:00
Xubin Ren	143224e25a	Merge remote-tracking branch 'origin/main' into codex/review-pr-3929	2026-05-22 22:15:46 +08:00
Yuxin Lou	055c9be359	fix: dedupe Responses replay item ids Ensure converted Responses API input items use unique replay ids when restoring assistant messages and function calls. This prevents Codex from rejecting resumed conversations with duplicate rs_* item ids while preserving call_id-based tool result linkage.	2026-05-22 22:14:07 +08:00
Xubin Ren	f5534bcaa0	Merge origin/main into fix-ollama-image-generation	2026-05-22 21:15:42 +08:00
Xubin Ren	8c0b2c1a29	fix(image-generation): clamp OpenAI sizes by model family	2026-05-22 17:42:01 +08:00
ZegWe	ffd85a8611	fix image generation provider settings	2026-05-22 17:42:01 +08:00
ZegWe	3483141ed7	feat(providers): add OpenAI and OpenAI Codex image generation providers Add two new image generation providers: - `openai` — uses the standalone OpenAI Images API (`/v1/images/generations`) with an API key. Supports DALL-E and gpt-image-* models, with automatic parameter adjustment (gpt-image models don't accept response_format or n). - `openai_codex` — uses the Codex Responses API with the `image_generation` tool, authenticated via OAuth subscription token. The same mechanism ChatGPT uses internally. Also remove the API key pre-check in ImageGenerationTool so providers that handle their own auth fallback (like Codex OAuth) can work without a configured key. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-22 17:42:01 +08:00
A.G. Bocsardi	9b2f452b6e	fix: drop redundant reasoning_effort for Kimi thinking models Moonshot's API rejects requests that carry both 'reasoning_effort' (top-level kwarg) and 'thinking' (extra_body) at the same time. After the unified thinking-style injection loop injects the native 'thinking' param for kimi models, pop 'reasoning_effort' from kwargs since it is redundant and causes a 400 error. Uses _model_slug() + _KIMI_THINKING_MODELS lookup to stay consistent with the refactored code (the old _is_kimi_thinking_model helper was removed in 4f895e63). Existing kimi tests updated to assert 'reasoning_effort' is absent. Xiaomi MiMo models are unaffected — their API accepts both params. Closes #3939	2026-05-22 03:36:28 +08:00
Xubin Ren	8281cd1946	test(providers): cover Novita gateway fallback	2026-05-21 16:16:32 +08:00
Alex-wuhu	e5476573f4	test(providers): align Novita provider coverage	2026-05-21 16:16:32 +08:00
Alex-wuhu	0d1d23b5fb	feat: add Novita AI provider	2026-05-21 16:16:32 +08:00
Xubin Ren	23d5148a57	fix(provider): dedupe repeated tool ids in history	2026-05-21 15:33:49 +08:00
Haisam Abbas	84603f4cf2	Add Ollama image generation support	2026-05-21 12:06:08 +05:00
Xubin Ren	4f895e6307	refactor(providers): centralize gateway reasoning control	2026-05-21 14:41:50 +08:00
olgagaga	0cd2f626c0	fix(providers): inject OpenRouter `reasoning.effort` for thinking models Follow-up to #3851: that PR added `extra_body.thinking={type: disabled}` for MiMo via OpenRouter, but OR doesn't forward provider-specific thinking shapes to upstream — it strips unknown extra_body fields and uses its own unified `reasoning` parameter. So MiMo via OR kept thinking despite the injection (reproduced by @ClearPlume on #3851 with identical kwargs but provider switched from openrouter → xiaomi_mimo). For known thinking-capable models (Kimi, MiMo) routed via the openrouter spec, also inject `extra_body.reasoning = {effort: <effort>}` in OR's documented enum ("none"\|"minimal"\|"low"\|"medium"\|"high"\|"xhigh"). OR translates this to the upstream model's native shape. Existing tests updated to expect both fields on the OR path. The direct xiaomi_mimo and moonshot paths are unchanged (the new branch is gated on spec.name == "openrouter"). Flash and non-MiMo models on OR continue to receive no injection.	2026-05-21 14:41:50 +08:00

1 2 3 4

163 Commits