2112 Commits

Author SHA1 Message Date
Pablo Cabeza
c23d719780 feat(agent): emit structured _tool_events progress metadata
Extend the existing on_progress callback to carry structured tool-event
payloads alongside the plain-text hint, so channels can render rich
tool execution state (start/finish/error, arguments, results, file
attachments) rather than only the pre-formatted hint string.

Changes
-------
- AgentLoop._tool_event_start_payload() — builds a version-1 start
  payload from a ToolCallRequest
- AgentLoop._tool_event_result_extras() — extracts files/embeds from a
  tool result dict
- AgentLoop._tool_event_finish_payloads() — maps tool_calls +
  tool_results + tool_events from AgentHookContext into finish payloads
- _LoopHook.before_execute_tools() — passes tool_events=[...] to
  on_progress together with the existing tool_hint flag
- _LoopHook.after_iteration() — emits a second on_progress call with
  the finish payloads once tool results are available
- _bus_progress() — forwards tool_events as _tool_events in OutboundMessage
  metadata so channel implementations can read them
- on_progress type widened to Callable[..., Awaitable[None]] on all
  public entry points; _cli_progress updated to accept and ignore
  tool_events

The contract is additive: callers that only accept (content, *, tool_hint)
continue to work unchanged. Callers that also accept tool_events receive
the structured data.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 20:06:11 +08:00
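The additive callback contract above can be sketched as a small dispatcher that passes ``tool_events`` only to callbacks that declare it. This is a minimal illustration under assumptions: ``emit_progress`` and its signature inspection are hypothetical names, not the actual AgentLoop wiring.

```python
import inspect

async def emit_progress(on_progress, content, *, tool_hint=False, tool_events=None):
    # Forward tool_events only to callbacks that declare it (or **kwargs),
    # so legacy (content, *, tool_hint) callbacks keep working unchanged.
    params = inspect.signature(on_progress).parameters
    accepts_events = "tool_events" in params or any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_events:
        await on_progress(content, tool_hint=tool_hint, tool_events=tool_events)
    else:
        await on_progress(content, tool_hint=tool_hint)
```

A channel that wants the structured payloads simply widens its callback signature; everything else is untouched.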
Xubin Ren
185a8fd34d fix(webui): opaque composer, equal-width message area, cleaner user pill 2026-04-23 07:48:32 +00:00
Xubin Ren
06503cd0fc fix(telegram): keep callback_data under Telegram's 64-byte cap
``InlineKeyboardButton(label, callback_data=label)`` fails Telegram's
API when the label exceeds 64 bytes of UTF-8. An LLM-generated long
option (realistic in multilingual flows) used to silently 400 the
``send_message`` call: the user got nothing, while the agent saw what
looked like a successful send after the retry-then-drop.

Decouple display from wire: button text keeps the full label, callback_data
gets truncated at a UTF-8 char boundary. Tap echoes the prefix back as the
user message; the LLM understands a prefix of its own option just fine,
and the display the user saw was always the full string.

Locks: helper boundary behavior (ASCII, CJK, short labels pass through)
and end-to-end ``_build_keyboard`` integration with an over-cap label.

Made-with: Cursor
2026-04-23 13:26:06 +08:00
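The char-boundary truncation described above fits in a few lines. A minimal sketch (``truncate_utf8`` is a hypothetical helper name; 64 bytes is Telegram's ``callback_data`` cap):

```python
def truncate_utf8(label: str, limit: int = 64) -> str:
    # Keep at most `limit` bytes of UTF-8 without splitting a character:
    # decoding with errors="ignore" drops any trailing partial sequence.
    data = label.encode("utf-8")
    if len(data) <= limit:
        return label
    return data[:limit].decode("utf-8", errors="ignore")
```

Short ASCII labels pass through untouched; a long CJK label is cut to the last whole character that fits under the cap.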
Xubin Ren
6bc2983ab1 fix(telegram): fall back buttons to inline text when keyboard disabled
Buttons are semantic options, not a separate channel protocol: a user
who taps "Yes" and a user who types "yes" arrive at the agent as the
same string. Dropping ``msg.buttons`` when ``inline_keyboards=False``
was the worst of both worlds — the agent was told "Message sent with
N button(s)" while the user saw a question with no options.

Splice the labels into the message text instead. The LLM produces the
same ``message(buttons=...)`` call regardless of channel; the channel
layer picks the richest rendering it can afford — native keyboard when
enabled, bracketed inline text otherwise. Layout is preserved (one row
per line). Other channels can adopt the same helper incrementally.

Locks: canonical ``_buttons_as_text`` format, flag-off send-path
splices labels, flag-on send-path keeps content clean and rides
``reply_markup``.

Made-with: Cursor
2026-04-23 13:26:06 +08:00
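The inline-text fallback can be sketched minimally. The bracketed, one-row-per-line format follows the commit's description, but the helper body here is illustrative, not the exact ``_buttons_as_text``:

```python
def buttons_as_text(rows):
    # One keyboard row per line, each label bracketed: "[Yes] [No]"
    return "\n".join(" ".join(f"[{label}]" for label in row) for row in rows)
```

The channel appends this string to the message text when ``inline_keyboards`` is off, so typed and tapped answers converge on the same user-message string.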
Xubin Ren
b9b81d9301 test(telegram): pin inline-keyboards flag gate and buttons validation
Two kill-switch tests for the new inline-keyboards path. Neither is
flashy — they just make sure the next unrelated refactor can't quietly
regress two narrow contracts the PR relies on.

  1. TelegramChannel._build_keyboard returns None whenever
     TelegramConfig.inline_keyboards is False, even if buttons are
     supplied. The flag defaults off; if someone ever flips that default
     the change should fail this test before it reaches prod bots.

  2. MessageTool rejects malformed `buttons` payloads (non-list, mixed
     list/str row, non-str label, None label) up front instead of
     letting them slip into the channel layer where Telegram would
     silently 400 the send. Parametrized over four shapes the guard
     needs to reject.

No production code touched.

Made-with: Cursor
2026-04-23 13:26:06 +08:00
Gunnar Thielebein
8d33c1cb37 feat(telegram): add inline keyboard buttons 2026-04-23 13:26:06 +08:00
Xubin Ren
e3bca929fb fix(webui): left-align prose inside user message pill 2026-04-23 00:07:27 +08:00
Xubin Ren
e493eb09e7 test(webui): realign thread-composer attach test with current types 2026-04-23 00:07:27 +08:00
Xubin Ren
707c0d7f3a fix(websocket): scrub partial media batches, nosniff /api/media 2026-04-23 00:07:27 +08:00
Xubin Ren
61a28c2c0a feat(webui): support image uploads in composer and message bubbles 2026-04-23 00:07:27 +08:00
Xubin Ren
c1e7aa5504 refactor(config): resolve env vars via in-place Pydantic walk
Replace the dump→resolve→model_validate roundtrip with a recursive walk
that substitutes ${VAR} in string values directly on BaseModel /
__pydantic_extra__ / dict / list nodes. Identity is preserved on any
subtree with no references, so the original Config instance is returned
unchanged when nothing needs resolving.

Side effects:
- exclude=True fields (e.g. DreamConfig.cron) now survive even when
  other fields in the same config contain ${VAR} references, closing
  the edge case left open by the previous fast-path-only fix.
- _has_env_refs is dropped (the walker short-circuits naturally).
- Added a regression test pairing cron with a resolved providers.groq
  api_key to lock the coexistence case.

Made-with: Cursor
2026-04-22 22:31:40 +08:00
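The identity-preserving walk can be illustrated on plain dicts and lists. This is a simplified sketch; the real resolver also walks BaseModel and ``__pydantic_extra__`` nodes:

```python
import os
import re

_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def resolve(node):
    # Substitute ${VAR} in string values; return the *same* object for
    # any subtree with no references so untouched config keeps identity.
    if isinstance(node, str):
        out = _REF.sub(lambda m: os.environ.get(m.group(1), m.group(0)), node)
        return node if out == node else out
    if isinstance(node, dict):
        out = {k: resolve(v) for k, v in node.items()}
        return node if all(out[k] is node[k] for k in node) else out
    if isinstance(node, list):
        out = [resolve(v) for v in node]
        return node if all(a is b for a, b in zip(out, node)) else out
    return node
```

Because identity is preserved, ``exclude=True`` fields in unreferenced subtrees never go through a dump/revalidate roundtrip and therefore survive.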
Saimon Ventura
c9a21d96d8 fix(config): preserve excluded fields in resolve_config_env_vars
`resolve_config_env_vars` unconditionally dumped the config via
`model_dump(mode="json")` and revalidated it, which silently dropped
any field declared with `exclude=True` (e.g. `DreamConfig.cron` —
introduced by the Dream rename refactor in #2717). Result:
`agents.defaults.dream.cron` was never honored at runtime — the gateway
always fell back to the default `every 2h` schedule even when `cron`
was set in config.json.

Fix: skip the roundtrip entirely when the config has no `${VAR}`
references. Env-var interpolation still works unchanged when refs
exist; the legacy `cron` override now survives the common case of
fully-resolved config.

Regression test covers the bug path.
2026-04-22 22:31:40 +08:00
Xubin Ren
239e91a4d6 test(anthropic): pin tool_result image_url conversion regression
Adds a focused regression test so the fix for tool_result image
handling cannot silently revert. Two cases:

- list content with an image_url + text block -> image_url is
  translated to a native Anthropic image block, sibling text passes
  through unchanged
- plain string content passes through untouched (the new list branch
  must not alter the string path)

These cover the exact symptom surface (silent image drop with a
"Non-transient LLM error with image content" warning) and the only
two content shapes tool results actually take today.

Made-with: Cursor
2026-04-22 22:10:53 +08:00
lentan
29a08df06a fix(anthropic): convert image_url blocks inside tool_result content
_tool_result_block passed list content through unchanged, so image_url
blocks returned by tools (e.g. read_file on an image file, which
returns OpenAI-format image_url blocks via build_image_content_blocks)
reached the Anthropic API unconverted and were rejected. User-role
messages already ran through _convert_user_content at the call site,
so inbound Telegram photos worked, but tool results did not.

Run _convert_user_content on list content inside _tool_result_block
so image_url blocks become native Anthropic image blocks. Required
making _convert_user_content a @staticmethod (it did not use self)
and calling _convert_image_block via the class to match.

Repro: an agent calling read_file on any image file got a
"Non-transient LLM error with image content, retrying without images"
warning and the image was silently dropped from the conversation.
2026-04-22 22:10:53 +08:00
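The block translation can be sketched like this, assuming the OpenAI-format blocks carry base64 data URLs. Function names here are illustrative stand-ins, not the provider's private helpers:

```python
def convert_image_block(block):
    # OpenAI shape:    {"type": "image_url", "image_url": {"url": "data:<mt>;base64,<b64>"}}
    # Anthropic shape: {"type": "image", "source": {"type": "base64", ...}}
    if block.get("type") != "image_url":
        return block
    header, b64 = block["image_url"]["url"].split(",", 1)
    media_type = header.removeprefix("data:").removesuffix(";base64")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": b64},
    }

def convert_tool_result_content(content):
    # Plain strings pass through untouched; lists get per-block conversion.
    if isinstance(content, str):
        return content
    return [convert_image_block(b) for b in content]
```

Sibling text blocks are returned unchanged, matching the regression test's two content shapes.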
chengyongru
42c4af2118 fix(agent): prevent duplicate responses when sub-agents complete concurrently
When the main agent spawns multiple sub-agents, each completion
independently triggered a new _dispatch, causing 3-4 user-visible
responses instead of a single comprehensive report.

- Extend _drain_pending to block-wait on pending_queue when sub-agents
  are still running, keeping the runner loop alive for in-order injection
- Pass pending_queue in the system message path so subsequent sub-agent
  results can still be injected mid-turn via a new dispatch
2026-04-22 20:02:19 +08:00
Xubin Ren
7c21349828 Merge pull request #3379 from lahuman/fix/3324-windows-mcp-stdio
fix(mcp): avoid WinError 193 for Windows stdio launchers

Co-authored-by: lahuman <6156679+lahuman@users.noreply.github.com>
2026-04-22 08:09:46 +00:00
Xubin Ren
79247545ac Merge remote-tracking branch 'origin/main' into pr-3379 2026-04-22 08:08:05 +00:00
Xubin Ren
f718a71dcc Merge pull request #3380 from gongpx20069/fix/github-copilot-gpt5-support
fix(providers): support GPT-5 models on GitHub Copilot backend

Co-authored-by: gongpx20069 <21985921+gongpx20069@users.noreply.github.com>
2026-04-22 06:53:39 +00:00
Xubin Ren
427deb4a70 test(providers): add regression tests for GitHub Copilot /responses routing
Locks in the behaviors introduced by the fix so they can't silently
revert:
- _should_use_responses_api accepts github_copilot on its non-OpenAI base
- _build_responses_body strips the 'github_copilot/' routing prefix
- /responses failures on github_copilot do not fall back to /chat/completions

Made-with: Cursor
2026-04-22 06:53:37 +00:00
Peixian Gong
dd26b4407d fix(providers): make GitHub Copilot backend work with GPT-5/o-series models
Calling GitHub Copilot with `gpt-5.*` / `o*` models (e.g.
`github_copilot/gpt-5.4`, `github_copilot/gpt-5.4-mini`) failed with a
chain of misleading errors:

  1. `Unsupported parameter: 'max_tokens' is not supported with this
     model. Use 'max_completion_tokens' instead.`
  2. `model "gpt-5.4-mini" is not accessible via the /chat/completions
     endpoint` (`unsupported_api_for_model`).
  3. `The requested model is not supported.` (`model_not_supported`)
     even after routing to /responses.

Root causes (each one masked the next):

  * The `github_copilot` ProviderSpec did not opt into
    `supports_max_completion_tokens`, so `_build_kwargs` always sent the
    legacy `max_tokens` parameter that GPT-5/o-series reject.
  * `_should_use_responses_api` was hard-gated to
    `spec.name == "openai"` plus a direct-OpenAI base URL, so the
    GitHub Copilot backend always went through /chat/completions even
    for models the Copilot gateway exposes only via /responses
    (e.g. `gpt-5.4-mini`).
  * When /responses did fail on github_copilot, the existing
    "compatibility marker" heuristic silently fell back to
    /chat/completions — which can never succeed for these models — so
    the real upstream error was hidden.
  * `_build_responses_body` did not honour `spec.strip_model_prefix`,
    so the request body sent `model="github_copilot/gpt-5.4-mini"`
    (with the routing prefix), which the Copilot gateway rejects with
    `model_not_supported`. (`_build_kwargs` already stripped it; this
    branch was missed.)

Fix:

  * registry.py: set `supports_max_completion_tokens=True` on the
    `github_copilot` spec so requests use `max_completion_tokens`.
  * openai_compat_provider.py:
      - `_should_use_responses_api` now also allows the
        `github_copilot` spec, and skips the direct-OpenAI base check
        for it (the Copilot gateway is its own base URL).
      - `_build_responses_body` now strips the model routing prefix
        when `spec.strip_model_prefix` is set, matching `_build_kwargs`.
      - `chat` / `chat_stream` no longer fall back from /responses to
        /chat/completions on the `github_copilot` spec: the fallback
        cannot succeed for GPT-5/o-series and would mask the real
        gateway error.

Tests:

  * tests/cli/test_commands.py: switched the
    `test_github_copilot_provider_refreshes_client_api_key_before_chat`
    fixture model from `gpt-5.1` to `gpt-4` so it continues to exercise
    the /chat/completions code path it was designed for (gpt-5.1 now
    correctly routes to /responses on github_copilot).
  * `pytest tests/providers/ tests/cli/test_commands.py` — 314 passed.
  * Verified end-to-end against the live Copilot gateway with both
    `github_copilot/gpt-5.4` and `github_copilot/gpt-5.4-mini`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-22 14:28:19 +08:00
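The prefix-stripping piece of the fix is simple to sketch. This is a minimal stand-in for what ``_build_responses_body`` now shares with ``_build_kwargs``:

```python
def strip_model_prefix(model, prefix):
    # "github_copilot/gpt-5.4-mini" -> "gpt-5.4-mini" when the spec's
    # strip_model_prefix is "github_copilot"; other models pass through.
    if prefix and model.startswith(prefix + "/"):
        return model[len(prefix) + 1:]
    return model
```

The routing prefix selects the backend inside nanobot, but the upstream gateway only knows the bare model name, hence the strip before serializing the request body.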
k
03ec28dd49 fix(mcp): avoid WinError 193 for Windows stdio launchers 2026-04-22 14:50:55 +09:00
hussein1362
0932189860 fix: handle Windows PermissionError on directory fsync
On Windows, opening a directory with O_RDONLY raises PermissionError.
Wrap the directory fsync in a try/except PermissionError — NTFS journals
metadata synchronously so the directory sync is unnecessary there.

Also adjust test assertions to expect 1 fsync call (file only) on
Windows vs 2 (file + directory) on POSIX.
2026-04-22 13:19:53 +08:00
hussein1362
512bf59b3c fix(session): fsync sessions on graceful shutdown to prevent data loss
On filesystems with write-back caching (rclone VFS, NFS, FUSE mounts)
the OS page cache may buffer recent session writes. If the process is
killed before the cache flushes, the most recent conversation turns are
silently lost — causing the agent to "forget" recent context and
respond to stale history on the next startup.

Changes:

- session/manager.py: add fsync=True option to save() that flushes the
  file and its parent directory to durable storage. Add flush_all() that
  re-saves every cached session with fsync. Default save() behavior is
  unchanged (no fsync) to avoid performance regression in normal
  operation.

- cli/commands.py: call agent.sessions.flush_all() in the gateway
  shutdown finally block, after stopping heartbeat/cron/channels.

- tests/session/test_session_fsync.py: 8 tests covering fsync flag
  behavior, flush_all with empty/multiple/errored sessions, and
  data survival across simulated process restart.

- tests/cli/test_commands.py: add sessions attribute to _FakeAgentLoop
  so the gateway health endpoint test passes with the new shutdown
  flush.
2026-04-22 13:19:53 +08:00
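The durable-write path described in the two commits above can be sketched as one helper: fsync the file, then its parent directory, tolerating the Windows ``PermissionError``. ``durable_save`` is a hypothetical name, not the session manager's API:

```python
import os

def durable_save(path, data: bytes) -> None:
    # Flush the file itself, then its parent directory, so the entry
    # survives a crash even on write-back-cached filesystems.
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    try:
        dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    except PermissionError:
        # Windows denies O_RDONLY on directories; NTFS journals
        # metadata synchronously, so the directory sync is unnecessary.
        return
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

On POSIX this performs two fsyncs (file + directory), on Windows only one, which is exactly the split the adjusted test assertions expect.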
Xubin Ren
ef8bbab7b3 test(cli): lock _render_interactive_ansi force_terminal to isatty
Made-with: Cursor
2026-04-22 13:12:29 +08:00
wood3n
2e419f9ba2 fix(cli): respect sys.stdout.isatty() in commands.py 2026-04-22 13:12:29 +08:00
Xubin Ren
88c619901e review(providers): tighten comments in reasoning_effort normalize path
Made-with: Cursor
2026-04-22 12:49:55 +08:00
hlg
28c42628b0 fix: normalize DashScope reasoning_effort (minimal vs minimum)
DashScope rejects the OpenAI-style value "minimal" with
`'reasoning_effort.effort' must be one of: 'none', 'minimum', 'low',
'medium', 'high', 'xhigh'`, but nanobot was passing the string through
verbatim. Users who tried the documented "minimal" to disable thinking
got a 400; users who tried the DashScope-native "minimum" to work
around it got `enable_thinking=True` because the internal comparison
was a hard string match on "minimal".

Introduce a semantic/wire split in `_build_kwargs`:

- `semantic_effort` is the internal canonical form (OpenAI vocabulary).
  "minimum" on the way in is normalized to "minimal" here so both
  spellings share one meaning.
- `wire_effort` is what we actually serialize. For DashScope with
  semantic_effort == "minimal" we translate to "minimum" on the way
  out; other providers are unchanged.
- `thinking_enabled` and the Kimi thinking branch now compare on
  `semantic_effort`, so either user spelling correctly disables
  provider-side thinking.

Tests:

- Strengthen `test_dashscope_thinking_disabled_for_minimal` to assert
  the wire value is "minimum" in addition to the extra_body signal;
  the original version only checked extra_body and let the
  invalid-value bug slip through.
- Add `test_dashscope_thinking_disabled_for_minimum_alias` so a user
  who read the DashScope docs and configured "minimum" still gets
  thinking off.
- Add `test_non_dashscope_minimal_not_retranslated` to pin down that
  the DashScope-specific translation does not leak to OpenAI et al.
2026-04-22 12:49:55 +08:00
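The semantic/wire split condenses into a small sketch. This is a standalone illustration; the real logic lives inside ``_build_kwargs``:

```python
def normalize_effort(effort, provider):
    # Semantic form uses the OpenAI vocabulary; "minimum" is an inbound
    # alias for "minimal" so both spellings share one meaning.
    semantic = "minimal" if effort == "minimum" else effort
    # Wire form: DashScope only accepts "minimum", so translate on the
    # way out; every other provider serializes the semantic value.
    if provider == "dashscope" and semantic == "minimal":
        wire = "minimum"
    else:
        wire = semantic
    thinking_enabled = semantic != "minimal"
    return semantic, wire, thinking_enabled
```

Either user spelling disables thinking, and only the DashScope wire value ever diverges from the canonical form.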
chengyongru
f6a417e77d fix(transcription): harden language parameter validation and tests
- Add ISO-639 pattern validation (2-3 lowercase letters) to schema
- Normalize empty language to None in provider constructors
- Extract shared httpx mock stubs, parameterize provider tests
- Add test for language=None omitting field from multipart body
- Add test for Pydantic pattern validation rejecting invalid codes
2026-04-22 12:41:32 +08:00
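The language guard can be sketched as below. ``normalize_language`` is an illustrative name; in the real change the pattern lives on the Pydantic schema and the empty-to-None normalization in the provider constructors:

```python
import re

_LANG_RE = re.compile(r"^[a-z]{2,3}$")  # ISO-639: 2-3 lowercase letters

def normalize_language(lang):
    # Empty/None -> None so the field is omitted from the multipart body.
    if not lang:
        return None
    if not _LANG_RE.fullmatch(lang):
        raise ValueError(f"invalid ISO-639 language code: {lang!r}")
    return lang
```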
k
123d69bfb7 fix: allow specifying transcription language 2026-04-22 12:41:32 +08:00
flobo3
1826ab44fa feat(transcription): add language parameter for Groq Whisper STT 2026-04-22 12:41:32 +08:00
Xubin Ren
f5b8ee9f78 docs: update v0.1.5.post2 release news 2026-04-21 17:50:54 +00:00
Xubin Ren
950dddec49 chore: bump version to 0.1.5.post2 v0.1.5.post2 2026-04-21 17:25:08 +00:00
k
e5b288c6eb fix: map MiniMax reasoning_effort to reasoning_split 2026-04-22 00:52:56 +08:00
Xubin Ren
558aa98491 chore: temporary keep WebUI source-only 2026-04-21 14:33:44 +00:00
aiguozhi123456
53ba410e49 feat(read_file): add DOCX, XLSX, PPTX support via document.extract_text()
Wire up the existing office document extractors in document.py to
ReadFileTool by adding an extension guard and _read_office_doc() method
that follows the established PDF pattern. Handles missing libraries,
corrupt files, empty documents, and 128K truncation consistently.
2026-04-21 22:12:19 +08:00
彭星杰
46864b0911 fix: use try/finally in _extract_xlsx to prevent resource leak 2026-04-21 22:01:17 +08:00
彭星杰
a00beebd06 fix: use context manager in _extract_xlsx to prevent resource leak 2026-04-21 22:01:17 +08:00
chengyongru
e15705b471 fix(tests): add _cancel_active_tasks mock to cmd_new test fixtures
The existing test_unified_session tests construct a SimpleNamespace
loop mock that now needs _cancel_active_tasks since cmd_new calls it.
2026-04-21 21:50:37 +08:00
chengyongru
d4e34f8c67 fix(commands): intercept non-priority commands during active turn
Non-priority slash commands (e.g. /new, /help, /dream-log) arriving
while a session has an active LLM turn were silently queued into the
pending injection buffer and later injected as raw user messages into
the LLM conversation. This caused the model to respond to "/new" as
plain text instead of executing the command.

Root cause: the run() loop only checked priority commands (/stop,
/restart, /status) before routing messages to the pending queue. All
other command tiers (exact, prefix) bypassed command dispatch entirely.

Changes:
- Add CommandRouter.is_dispatchable_command() to match exact/prefix
  tiers, mirroring the existing is_priority() pattern.
- In run(), intercept dispatchable commands before pending queue
  insertion and dispatch them directly via _dispatch_command_inline().
- Extract _cancel_active_tasks() from cmd_stop for reuse; cmd_new now
  cancels active tasks before clearing the session to prevent shared
  mutable state corruption from concurrent asyncio coroutines.
- Update /new semantics: stops active task first, then clears session.
- Update documentation in help text, docs, and Discord command list.
2026-04-21 21:50:37 +08:00
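The tier check mirrors the existing ``is_priority()`` pattern. A minimal sketch, where the command sets are examples rather than the project's full tables:

```python
PRIORITY = {"/stop", "/restart", "/status"}
EXACT = {"/new", "/help", "/dream-log"}   # example exact-tier commands
PREFIXES = ("/model",)                    # hypothetical prefix-tier commands

def is_dispatchable_command(text: str) -> bool:
    # Match the exact tier, then the prefix tier, on the first token
    # only, so mid-sentence slashes don't hijack normal messages.
    stripped = text.strip()
    if not stripped.startswith("/"):
        return False
    token = stripped.split(maxsplit=1)[0]
    return token in EXACT or token.startswith(PREFIXES)
```

The run() loop consults this before inserting into the pending queue, dispatching matches inline instead of letting the LLM see "/new" as plain text.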
hussein1362
f8a023218d fix(telegram): improve markdown rendering for modern LLM output
Problem:
Modern LLMs (GPT-5.4, Claude, Gemini) produce markdown-heavy responses with
numbered lists, headers, and nested formatting. The Telegram channel's
_markdown_to_telegram_html() converter has gaps that leave these poorly
formatted:

1. Numbered lists (1. 2. 3.) have zero handling — sent as raw text
2. Headers (# Title) are stripped to plain text, losing visual hierarchy
3. Mid-stream edits send raw markdown (users see **bold** and ### headers
   while the response generates, before the final HTML conversion)

Root Cause:
_markdown_to_telegram_html() handles bullets (- *) but skips numbered lists
entirely. Headers are stripped of # but not given any emphasis. The streaming
path in send_delta() sends buf.text as-is during mid-stream edits (plain
text, no parse_mode) — only the final _stream_end edit converts to HTML.

Fix:
1. Headers now render as <b>bold</b> in the final HTML (using placeholder
   markers that survive HTML escaping, restored after all other processing)
2. Numbered lists are normalized (extra whitespace after the dot is cleaned)
3. New _strip_md_block() function strips markdown syntax for readable
   plain-text preview during streaming mid-edits

The final _stream_end HTML conversion is unchanged — it still produces
full HTML with parse_mode=HTML. Only the intermediate edits are improved.

Tests:
Added 10 new tests covering:
- Headers converting to bold HTML
- Numbered list preservation and whitespace normalization
- Headers with HTML special characters
- Mixed formatting (headers + bullets + numbers + bold)
- _strip_md_block for inline formatting, headers, bullets, numbers, links
- Streaming mid-edit markdown stripping (initial send + edit)
2026-04-21 21:35:34 +08:00
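The two non-streaming fixes are easy to sketch. ``headers_to_bold`` and ``normalize_numbered`` are illustrative stand-ins for pieces of ``_markdown_to_telegram_html``; the placeholder trick follows the commit's description of markers that survive HTML escaping:

```python
import html
import re

def normalize_numbered(text: str) -> str:
    # "1.   item" -> "1. item": collapse extra whitespace after the dot.
    return re.sub(r"^(\d+)\.\s+", r"\1. ", text, flags=re.M)

def headers_to_bold(text: str) -> str:
    # Wrap header titles in NUL-delimited placeholders that pass through
    # html.escape() untouched, then restore them as <b> tags afterwards.
    B, E = "\x00B\x00", "\x00E\x00"
    text = re.sub(r"^#{1,6}\s+(.*)$", lambda m: B + m.group(1) + E,
                  text, flags=re.M)
    return html.escape(text).replace(B, "<b>").replace(E, "</b>")
```

Escaping first and restoring the markers last is what lets a header containing ``<`` or ``&`` render as bold without producing invalid Telegram HTML.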
chengyongru
37ea8b8f5b fix(retry): recognize ZhiPu 1302 rate-limit error for retry
ZhiPu API returns code 1302 with the Chinese text "速率限制" ("rate
limit") instead of standard HTTP 429 + "rate limit", causing the retry
engine to treat it as non-transient and fail immediately.
2026-04-21 21:23:20 +08:00
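A transient check covering this case might look like the sketch below. This is a hedged illustration; the retry engine's real classifier inspects more signals than these three:

```python
def is_transient_rate_limit(status, code, message):
    # Standard signals: HTTP 429 or an English "rate limit" message.
    if status == 429 or "rate limit" in message.lower():
        return True
    # ZhiPu signals rate limiting with code 1302 and Chinese text.
    return code == 1302 or "速率限制" in message
```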
Xubin Ren
1b692debdc docs(webui): revise README to clarify WebSocket channel setup and sequence of startup steps 2026-04-21 12:46:17 +00:00
Xubin Ren
c1957e14ff refactor(memory): centralize cursor validation behind a single gate
Move the non-int cursor guard out of the two consumer sites and into a
shared ``_iter_valid_entries`` iterator so the invariant lives in one
place.  Closes three gaps left by the original fix:

* ``bool`` is now rejected — ``isinstance(True, int)`` is ``True`` in
  Python, so the previous guard silently treated ``{"cursor": true}`` as
  cursor ``1``.
* Recovery now returns ``max(valid cursors) + 1``.  Under adversarial
  corruption "first int scanning in reverse" is not the same thing, and
  only ``max`` keeps the recovered cursor strictly greater than every
  legitimate cursor still on disk.
* Non-int cursors are logged exactly once per ``MemoryStore``.  Silently
  dropping corrupted entries hides the root cause (an external writer
  to ``memory/history.jsonl``); rate-limiting keeps the log clean when
  the same poisoned file is read every turn.

All 7 tests from the original fix pass unchanged; 3 new tests pin the
invariants above.

Made-with: Cursor
2026-04-21 14:02:53 +08:00
Muata Kamdibe
c0a11c7cf4 fix(memory): harden cursor recovery against non-integer corruption
_next_cursor now checks isinstance(cursor, int) before arithmetic,
falling back to a reverse scan of all entries when the last entry's
cursor is corrupted. read_unprocessed_history skips entries with
non-int cursors instead of crashing on comparison.

Root cause: external callers (cron jobs, plugins) occasionally wrote
string cursors to history.jsonl, which blocked all subsequent
append_history calls with TypeError/ValueError.

Includes 7 regression tests covering string, float, null, and list
cursor types.
2026-04-21 14:02:53 +08:00
chengyongru
409afe1a3d test(tools): add basic regression tests for ContextVar routing context 2026-04-21 13:25:30 +08:00
jr_blue_551
ff8c28d5a8 agent: use ContextVar for tool routing context 2026-04-21 13:25:30 +08:00
Xubin Ren
82aa9efc02 test(mcp): pin CancelledError short-circuits the retry loop
The retry branch is only reachable via `except Exception`, and
`CancelledError` inherits from `BaseException`, so today it naturally
bypasses the retry path and /stop still works.  Add one focused
regression test so any future refactor that widens the retry catch to
`BaseException`, re-orders the handlers, or adds `CancelledError` to
`_TRANSIENT_EXC_NAMES` fails CI instead of silently swallowing /stop.

Made-with: Cursor
2026-04-21 13:24:40 +08:00
hussein1362
368752e707 fix(mcp): retry once on transient connection errors
When an MCP server restarts or a network connection drops between
tool calls, the existing session throws ClosedResourceError,
BrokenPipeError, ConnectionResetError, etc. Currently these are
caught as generic exceptions and returned as permanent failures
to the LLM, which then tells the user 'my tools are broken.'

This change adds a single automatic retry with a 1-second backoff
for transient connection-class errors in MCPToolWrapper,
MCPResourceWrapper, and MCPPromptWrapper. Non-transient errors
(ValueError, RuntimeError, McpError, etc.) are not retried.

The retry is conservative:
- Only 1 retry (not configurable, to keep the change minimal)
- Only for a specific set of connection-class exceptions
- Matched by exception class name to avoid importing anyio/etc.
- 1s sleep between attempts to allow the server to recover
- Clear logging distinguishes retried vs permanent failures

In production this eliminates most 'MCP tool call failed:
ClosedResourceError' noise when MCP bridge processes restart
(e.g. after config changes or OOM kills).

Tests: 22 new tests covering retry, exhaustion, non-transient
bypass, timeout bypass, and all three wrapper types.
2026-04-21 13:24:40 +08:00
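The retry shape described above can be sketched as a wrapper. A minimal illustration (``call_with_retry`` is a hypothetical name; the real code lives inside the three MCP wrappers):

```python
import asyncio

_TRANSIENT_EXC_NAMES = {
    "ClosedResourceError", "BrokenPipeError", "ConnectionResetError",
}

async def call_with_retry(fn, backoff: float = 1.0):
    # One retry, only for connection-class errors matched by class name
    # (matching by name avoids importing anyio just for its exceptions).
    try:
        return await fn()
    except Exception as exc:
        if type(exc).__name__ not in _TRANSIENT_EXC_NAMES:
            raise
        await asyncio.sleep(backoff)  # give the server time to recover
        return await fn()
```

``CancelledError`` inherits from ``BaseException``, so it never reaches the ``except Exception`` handler and /stop still cancels immediately, which is exactly the property the follow-up test pins.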
Xubin Ren
6c24f24e9e feat(models): add support for kimi-k2.6 with temperature override and update documentation 2026-04-20 18:18:06 +00:00
Xubin Ren
009cce78ad fix(anthropic): also enforce leading-user + empty-array recovery
Extend `_merge_consecutive` so the three invariants from
`LLMProvider._enforce_role_alternation` all hold for Anthropic:

1. collapse consecutive same-role turns (unchanged)
2. no trailing assistant — Anthropic rejects prefill (unchanged)
3. no leading assistant — Anthropic requires the first turn be user
4. non-empty messages array — recover the last stripped assistant as a
   user turn when every turn got stripped, so callers don't hit a
   secondary "messages array empty" 400

Anthropic-specific wrinkle: `tool_use` blocks live inside `content` (not
a separate `tool_calls` field) and are illegal inside user turns, so
both recovery paths skip any message carrying them rather than silently
producing a malformed request.

Adds 4 unit tests covering the new branches, including the tool_use
opt-outs, and updates the existing `test_single_assistant_stripped` to
reflect the new rerouting contract.

Made-with: Cursor
2026-04-21 01:32:32 +08:00
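Ignoring the tool_use opt-outs, the four invariants can be sketched over string-content messages. This is a simplified stand-in for ``_merge_consecutive``, not the provider's actual implementation:

```python
def merge_alternation(messages):
    # 1) Collapse consecutive same-role turns into one.
    merged = []
    for m in messages:
        if merged and merged[-1]["role"] == m["role"]:
            merged[-1]["content"] += "\n" + m["content"]
        else:
            merged.append(dict(m))
    # 2) No trailing assistant (Anthropic rejects prefill),
    # 3) no leading assistant (first turn must be user).
    stripped = []
    if merged and merged[-1]["role"] == "assistant":
        stripped.append(merged.pop())
    if merged and merged[0]["role"] == "assistant":
        stripped.append(merged.pop(0))
    # 4) Never return an empty array: recover a stripped assistant
    #    turn as a user turn so callers avoid the secondary 400.
    if not merged and stripped:
        merged = [{"role": "user", "content": stripped[0]["content"]}]
    return merged
```

The Anthropic-specific wrinkle the commit describes (skipping messages whose content carries ``tool_use`` blocks) is omitted here for brevity.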