fix: stop leaking reasoning_content to stream output

The streaming path in OpenAICompatProvider.chat_stream() was passing
reasoning_content deltas through on_content_delta(), causing the
model's internal reasoning to be displayed to the user alongside the
actual response content.

reasoning_content is already collected separately in _parse_chunks()
and stored in LLMResponse.reasoning_content for session history.
It should never be forwarded to the user-facing stream.
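The intended separation can be sketched as follows. This is a minimal, self-contained simulation, not the actual provider code: `Delta`, `Choice`, `Chunk`, and `consume_stream` are hypothetical stand-ins for the OpenAI-style streaming chunk objects and the loop in chat_stream(); only the callback name `on_content_delta` and the `reasoning_content`/`content` attribute names come from the commit.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for OpenAI-style streaming chunk objects.
@dataclass
class Delta:
    content: Optional[str] = None
    reasoning_content: Optional[str] = None

@dataclass
class Choice:
    delta: Delta

@dataclass
class Chunk:
    choices: list

async def consume_stream(chunks, on_content_delta):
    """Forward only user-facing content; collect reasoning separately."""
    reasoning_parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # reasoning_content is retained (e.g. for session history),
        # but never forwarded to the user-facing stream.
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            reasoning_parts.append(reasoning)
        # Only the actual response content reaches on_content_delta.
        text = getattr(delta, "content", None)
        if text:
            await on_content_delta(text)
    return "".join(reasoning_parts)

shown = []

async def on_content_delta(text):
    shown.append(text)

chunks = [
    Chunk([Choice(Delta(reasoning_content="thinking..."))]),
    Chunk([Choice(Delta(content="Hello"))]),
    Chunk([Choice(Delta(content=" world"))]),
]
reasoning = asyncio.run(consume_stream(chunks, on_content_delta))
```

With this split, `shown` ends up as `["Hello", " world"]` while the reasoning text stays out of the stream, mirroring the behavior the fix restores.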
chengyongru 2026-04-05 17:16:54 +08:00 committed by Xubin Ren
parent 2cecaf0d5d
commit 5479a44691

@@ -671,9 +671,6 @@ class OpenAICompatProvider(LLMProvider):
                 break
             chunks.append(chunk)
             if on_content_delta and chunk.choices:
-                text = getattr(chunk.choices[0].delta, "reasoning_content", None)
-                if text:
-                    await on_content_delta(text)
                 text = getattr(chunk.choices[0].delta, "content", None)
                 if text:
                     await on_content_delta(text)