Xubin Ren b8d327dc41 test + docs: lock should_execute_tools guard semantics (#3220)
Two small follow-ups to the guard:

1. Fix the should_execute_tools docstring so it matches the actual code.
   The previous version said "Only execute when finish_reason explicitly
   signals tool intent" but the code also accepts finish_reason == "stop".
   Explain why (some compliant providers emit "stop" with legitimate tool
   calls — openai_compat_provider.py already mirrors this at lines ~633 /
   ~678 where ("tool_calls", "stop") are both treated as the terminal
   tool-call state). Without this, a strict "tool_calls"-only guard would
   regress 15 existing runner tests that construct LLMResponse with
   tool_calls but no explicit finish_reason (default = "stop").

2. Add tests/providers/test_llm_response.py. This locks the three cases:
   - no tool calls                  -> never executes
   - tool calls + "tool_calls"/stop -> executes
   - tool calls + refusal / content_filter / error / length / ... -> blocked

   These are exactly the boundary cases the #3220 fix is about; without a
   test here a future refactor could silently revert the guard.

Body + tests only, no behavior change beyond the existing PR's intent.
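
The guard semantics and the three locked cases described above can be sketched as follows. This is a minimal reconstruction for illustration, not the real code: the `LLMResponse` fields and the `should_execute_tools` signature are assumptions inferred from the commit text (the actual dataclass and module layout may differ).

```python
from dataclasses import dataclass, field

# Assumed terminal tool-call states, mirroring the commit's note that
# openai_compat_provider.py treats both "tool_calls" and "stop" as such.
_TOOL_FINISH_REASONS = ("tool_calls", "stop")


@dataclass
class LLMResponse:
    """Hypothetical minimal stand-in for the real LLMResponse."""
    content: str = ""
    tool_calls: list = field(default_factory=list)
    finish_reason: str = "stop"  # default per the commit: no explicit reason -> "stop"

    def should_execute_tools(self) -> bool:
        # Execute only when tool calls are present AND finish_reason is a
        # terminal tool-call state. "stop" is accepted alongside "tool_calls"
        # because some compliant providers emit "stop" with legitimate tool
        # calls; a strict "tool_calls"-only check would block those.
        return bool(self.tool_calls) and self.finish_reason in _TOOL_FINISH_REASONS


# The three boundary cases the new test file locks:
assert not LLMResponse().should_execute_tools()                       # no tool calls -> never
assert LLMResponse(tool_calls=[{"name": "f"}]).should_execute_tools() # tool calls + default "stop" -> executes
assert LLMResponse(tool_calls=[{"name": "f"}],
                   finish_reason="tool_calls").should_execute_tools() # explicit "tool_calls" -> executes
for reason in ("content_filter", "error", "length"):
    assert not LLMResponse(tool_calls=[{"name": "f"}],
                           finish_reason=reason).should_execute_tools()  # anything else -> blocked
```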

Made-with: Cursor
2026-04-17 20:39:46 +08:00