61 Commits

Author SHA1 Message Date
Xubin Ren
b6d63fb1ec fix: normalize responses circuit breaker keys
Made-with: Cursor
2026-04-19 20:16:25 +08:00
Mohamed Elkholy
baba3b2160 fix(providers): add circuit breaker for Responses API fallback
When the Responses API fails repeatedly (3 consecutive compatibility
errors), skip it and fall back directly to Chat Completions.  Unlike a
permanent disable, the circuit re-probes after 5 minutes so recovery
is automatic when the API comes back.  Success resets the counter.

Keyed per (model, reasoning_effort) so a failure with one model does
not affect others.
2026-04-19 20:16:25 +08:00
Xubin Ren
b8d327dc41 test + docs: lock should_execute_tools guard semantics (#3220)
Two small follow-ups to the guard:

1. Fix the should_execute_tools docstring so it matches the actual code.
   The previous version said "Only execute when finish_reason explicitly
   signals tool intent" but the code also accepts finish_reason == "stop".
   Explain why (some compliant providers emit "stop" with legitimate tool
   calls — openai_compat_provider.py already mirrors this at lines ~633 /
   ~678 where ("tool_calls", "stop") are both treated as the terminal
   tool-call state). Without this, a strict "tool_calls"-only guard would
   regress 15 existing runner tests that construct LLMResponse with
   tool_calls but no explicit finish_reason (default = "stop").

2. Add tests/providers/test_llm_response.py. This locks the three cases:
   - no tool calls                  -> never executes
   - tool calls + "tool_calls"/stop -> executes
   - tool calls + refusal / content_filter / error / length / ... -> blocked

   These are exactly the boundary cases the #3220 fix is about; without a
   test here a future refactor could silently revert the guard.

Body + tests only, no behavior change beyond the existing PR's intent.

Made-with: Cursor
2026-04-17 20:39:46 +08:00
chengyongru
8c0c4e5b31 refactor(agent): tighten comments, extract constant, strengthen edge case test
- Extract synthetic user message string to module-level constant
- Tighten comments in _snip_history recovery branch
- Strengthen no-user edge case test to verify safety net interaction
2026-04-17 16:20:53 +08:00
chengyongru
44b526c4ee fix(agent): preserve user message in _snip_history to prevent GLM error 1214
When _snip_history truncates the message history and the only user message
ends up outside the kept window, providers like GLM reject the resulting
system→assistant sequence with error 1214 ("messages 参数非法").

Two-layer fix:
1. _snip_history now walks backwards through non_system messages to recover
   the nearest user message when none exists in the kept window.
2. _enforce_role_alternation inserts a synthetic user message
   "(conversation continued)" when the first non-system message is a bare
   assistant (no tool_calls), serving as a safety net for any edge cases
   that slip through.

Co-authored-by: darlingbud <darlingbud@users.noreply.github.com>
2026-04-17 16:20:53 +08:00
Xubin Ren
a6ea06e6bf docs(providers): explain MiniMax thinking endpoint
Document why MiniMax thinking mode uses a separate Anthropic-compatible provider and list the matching base URLs. Add a small registry test so the new provider stays wired to the expected backend and API key.

Made-with: Cursor
2026-04-16 01:00:45 +08:00
04cb
eacc9fbb5f refactor(providers): drop unreachable GenerationSettings fallback 2026-04-15 23:52:38 +08:00
04cb
54f7ad3752 fix(providers): guard chat_with_retry against explicit None max_tokens (#3102) 2026-04-15 23:52:38 +08:00
razzh
9e2278826f feat(provider): enable Kimi thinking via extra_body for k2.5 and k2.6
- Inject `thinking={"type": "enabled|disabled"}` via extra_body for
  Kimi thinking-capable models (kimi-k2.5, k2.6-code-preview).
- Add _is_kimi_thinking_model helper to handle both bare slugs and
  OpenRouter-style prefixed names (e.g. moonshotai/kimi-k2.5).
- reasoning_effort="minimal" maps to disabled; any other value enables it.
- Add tests for enabled/disabled states and OpenRouter prefix handling.
2026-04-15 01:59:32 +08:00
Xubin Ren
a0812ad60e test: cover retry termination notifications
Lock the new interaction-channel retry termination hints so both exhausted standard retries and persistent identical-error stops keep emitting the final progress message.

Made-with: Cursor
2026-04-15 01:55:57 +08:00
Xubin Ren
b60e8dc0ba test: cover missing tool-call arguments normalization
Lock the strict-provider sanitization path so assistant tool calls without function.arguments are normalized to {} instead of being forwarded as missing values.

Made-with: Cursor
2026-04-15 01:37:41 +08:00
Michael-lhh
f293ff7f18 fix: normalize tool-call arguments for strict providers
Ensure assistant tool-call function.arguments is always emitted as valid JSON text so strict OpenAI-compatible backends (including Alibaba code models) do not reject requests. Add regressions for dict and malformed-string argument payloads in message sanitization.

Made-with: Cursor
2026-04-15 01:37:41 +08:00
chengyongru
ac714803f6 fix(provider): recover trailing assistant message as user to prevent empty request
When a subagent result is injected with current_role="assistant",
_enforce_role_alternation drops the trailing assistant message, leaving
only the system prompt. Providers like Zhipu/GLM reject such requests
with error 1214 ("messages parameter invalid"). Now the last popped
assistant message is recovered as a user message when no user/tool
messages remain.
2026-04-13 12:54:39 +08:00
haosenwang1018
3573109408 fix(provider): preserve static error helper compatibility 2026-04-13 09:37:31 +08:00
haosenwang1018
c68b3edb9d fix(provider): clarify local 502 recovery hints 2026-04-13 09:37:31 +08:00
Xubin Ren
217e1fc957 test(retry): lock in-place image fallback behavior
Add a focused regression test for the successful no-image retry path so the original message history stays stripped after fallback and the repeated retry loop cannot silently return.

Made-with: Cursor
2026-04-12 20:10:06 +08:00
yanghan-cyber
b261201985 fix(retry): strip images in-place to prevent repeated error-retry cycles
When a non-transient LLM error occurs with image content, the retry
mechanism strips images from a copy but never updates the original
conversation history. Subsequent iterations rebuild context from the
unmodified history, causing the same error-retry cycle to repeat
every iteration until max_iterations is reached.

Add _strip_image_content_inplace() that mutates the original message
content lists in-place after a successful no-image retry, so callers
sharing those references (e.g. the runner's conversation history)
also see the stripped version.
2026-04-12 20:10:06 +08:00
Xubin Ren
2bef9cb650 fix(agent): preserve interrupted tool-call turns
Keep tool-call assistant messages valid across provider sanitization and avoid trailing user-only history after model errors. This prevents follow-up requests from sending broken tool chains back to the gateway.
2026-04-10 05:37:25 +00:00
Xubin Ren
dadf453097 Merge origin/main into fix/sanitize-messages-non-claude
Resolved conflict in azure_openai_provider.py by keeping main's
Responses API implementation (role alternation not needed for the
Responses API input format).

Made-with: Cursor
2026-04-09 04:45:45 +00:00
Xubin Ren
d084d10dc2 feat(openai): auto-route direct reasoning requests with responses fallback 2026-04-08 15:21:08 +00:00
Xubin Ren
63acfc4f2f test: fix trailing-space mismatch and add regression tests for normal models
- Fix assertion in streaming dict fallback test (trailing space in data
  not reflected in expected value).
- Add two regression tests proving that models with reasoning_content
  (e.g. DeepSeek-R1) and standard models (no reasoning fields) are
  completely unaffected by the reasoning fallback.

Made-with: Cursor
2026-04-08 00:59:39 +08:00
moranfong
9e7c07ac89 test(provider): add StepFun reasoning field fallback tests
Add comprehensive tests for the StepFun Plan API compatibility fix:
- _parse dict branch: content and reasoning_content fallback to reasoning
- _parse SDK object branch: same fallback for pydantic response objects
- _parse_chunks dict branch: reasoning field handled in streaming mode
- _parse_chunks SDK branch: reasoning fallback for SDK delta objects
- Precedence tests: reasoning_content field takes priority over reasoning

Refs: fix(provider): support StepFun Plan API reasoning field fallback
2026-04-08 00:59:39 +08:00
Xubin Ren
1e8a6663ca test(anthropic): add regression tests for thinking modes incl. adaptive
Also update schema comment to mention 'adaptive' as a valid value.

Made-with: Cursor
2026-04-07 22:53:43 +08:00
Xubin Ren
35f53a721d refactor: consolidate _parse_retry_after_headers into base class
Merge the three retry-after header parsers (base, OpenAI, Anthropic)
into a single _extract_retry_after_from_headers on LLMProvider that
handles retry-after-ms, case-insensitive lookup, and HTTP date.

Remove the per-provider _parse_retry_after_headers duplicates and
their now-unused email.utils / time imports. Add test for retry-after-ms.

Made-with: Cursor
2026-04-06 08:44:52 +00:00
Xubin Ren
aeba9a23e6 refactor: remove dead _error_response wrapper in Anthropic provider
Fold _error_response back into _handle_error to match OpenAI/Azure
convention. Update all call sites and tests accordingly.

Made-with: Cursor
2026-04-06 08:35:02 +00:00
Xubin Ren
b575aed20e Merge origin/main into fix/structured-retry-classification-main
Made-with: Cursor
2026-04-06 08:28:20 +00:00
Xubin Ren
ebf29d87ae fix: include byteplus providers, guard None reasoning_effort, merge extra_body
- Add byteplus and byteplus_coding_plan to thinking param providers
- Only send extra_body when reasoning_effort is explicitly set
- Use setdefault().update() to avoid clobbering existing extra_body
- Add 7 regression tests for thinking params

Made-with: Cursor
2026-04-06 16:12:08 +08:00
Xubin Ren
77a88446fb Merge remote-tracking branch 'origin/main' into pr-2722 2026-04-04 13:51:59 +00:00
Xubin Ren
17d9d74ccc fix(provider): omit temperature for GPT-5 models 2026-04-04 20:18:22 +08:00
Lingao Meng
519911456a test(provider): fix incorrect assertion in reasoning_content sanitize test
The test test_openai_compat_strips_message_level_reasoning_fields was
added in fbedf7a and incorrectly asserted that reasoning_content and
extra_content should be stripped from messages. This contradicts the
intent of b5302b6 which explicitly added these fields to _ALLOWED_MSG_KEYS
to preserve them through sanitization.

Rename the test and fix assertions to match the original design intent:
reasoning_content and extra_content at message level should be preserved,
and extra_content inside tool_calls should also be preserved.

Signed-off-by: Lingao Meng <menglingao@xiaomi.com>
2026-04-04 20:08:44 +08:00
pikaxinge
31d3061a0a fix(retry): classify 429 as WAIT vs STOP using semantic signals 2026-04-04 05:23:21 +00:00
pikaxinge
cabf093915 Merge remote-tracking branch 'origin/main' into fix/structured-retry-classification-main
# Conflicts:
#	nanobot/providers/anthropic_provider.py
#	nanobot/providers/base.py
#	nanobot/providers/openai_compat_provider.py
2026-04-04 05:04:43 +00:00
Xubin Ren
7229a81594 fix(providers): disable Azure SDK retries by default
Made-with: Cursor
2026-04-04 12:36:45 +08:00
pikaxinge
dbdf7e5955 fix: prevent retry amplification by disabling SDK retries 2026-04-04 12:36:45 +08:00
Xubin Ren
91a9b7db24 Merge origin/main into fix/retry-after-robust
Made-with: Cursor
2026-04-03 19:07:30 +00:00
Lingao Meng
a05f83da89 test(providers): cover reasoning_content extraction in OpenAI compat provider
Add regression tests for the non-streaming (_parse dict branch) and
streaming (_parse_chunks dict and SDK-object branches) paths that extract
reasoning_content, ensuring the field is populated when present and None
when absent.

Signed-off-by: Lingao Meng <menglingao@xiaomi.com>
2026-04-04 02:09:57 +08:00
pikaxinge
b951b37c97 fix: use structured error metadata for app-layer retry 2026-04-02 18:42:20 +00:00
pikaxinge
5d1ea43858 fix: robust Retry-After extraction across provider backends 2026-04-02 18:39:24 +00:00
Xubin Ren
714a4c7bb6 fix(runtime): address review feedback on retry and cleanup 2026-04-02 10:57:12 +00:00
Xubin Ren
eefd7e60f2 Merge remote-tracking branch 'origin/main' into feat/runtime-hardening 2026-04-02 10:40:49 +00:00
Xubin Ren
cc33057985 refactor(providers): rename openai responses helpers 2026-04-02 13:43:34 +08:00
Xubin Ren
ded0967c18 fix(providers): sanitize azure responses input messages 2026-04-02 13:43:34 +08:00
Kunal Karmakar
61d7411238 Fix failing test 2026-04-02 13:43:34 +08:00
Kunal Karmakar
76226274bf Failing test 2026-04-02 13:43:34 +08:00
Kunal Karmakar
e206cffd7a Add tests and handle json 2026-04-02 13:43:34 +08:00
Kunal Karmakar
ac2ee58791 Add tests and logs 2026-04-02 13:43:34 +08:00
Kunal Karmakar
8c0607e079 Use SDK for stream 2026-04-02 13:43:34 +08:00
Kunal Karmakar
0417c3f03b Use OpenAI responses API 2026-04-02 13:43:34 +08:00
Xubin Ren
a3e4c77fff fix(providers): normalize anthropic cached token usage 2026-04-02 12:51:45 +08:00
chengyongru
da08dee144 feat(provider): show cache hit rate in /status (#2645) 2026-04-02 12:51:45 +08:00