The streaming API currently logs backend exceptions but still emits the
same `finish_reason: "stop"` + `[DONE]` terminator used for successful
responses. That makes a failed streamed request look successful to
OpenAI-compatible clients.
This keeps the fix narrow: track whether the stream backend failed and
suppress the success terminator in that case. A regression test locks in
the expected behavior.
Constraint: Keep the non-streaming response path untouched
Constraint: Follow up on the known limitation called out during PR #3222 review without redesigning the SSE protocol
Rejected: Introduce a custom SSE error event shape in the same patch | expands API surface and review scope
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If explicit streamed error events are added later, keep them distinct from the success stop+[DONE] terminator to preserve client retry semantics
Tested: PYTHONPATH=$PWD pytest -q tests/test_api_stream.py /Users/jh0927/Workspace/nanobot-validation-artifacts-2026-04-18/test_api_stream_error_regression.py
Not-tested: Full repository test suite
Related: #3260
Related: #3222
Three fixes in the API upload handling:
1. Multipart uploads now prefix filenames with a UUID to prevent
overwrites when two requests upload files with the same name.
2. JSON image_url content blocks with remote HTTPS URLs now return
a 400 error instead of silently dropping the image.
3. Model validation runs for both JSON and multipart requests,
fixing an inconsistency where multipart bypassed the check.
Move extract_documents() to nanobot.utils.document as a reusable helper
and call it once in AgentLoop._process_message, the single entry point
for all message processing (API + all channels).
This replaces the previous API-only _extract_documents() in server.py,
ensuring Telegram, Feishu, Slack, WeChat, and all other channels also
benefit from automatic document text extraction.
Adds a configurable max_file_size guard (default 50 MB) to skip
oversized files gracefully, preventing unbounded memory/CPU usage
from channel-downloaded attachments.
- server.py: removed _extract_documents and related imports
- document.py: added extract_documents() with size limit
- loop.py: calls extract_documents() at the top of _process_message
- Tests updated: 70 related tests pass
Made-with: Cursor
ContextBuilder._build_user_content now only handles images (its original
responsibility). Document text extraction (PDF, DOCX, XLSX, PPTX) is
performed by the new _extract_documents() helper in server.py, called
before process_direct(). This keeps the core context builder free of
format-specific dependencies and makes the API boundary the single place
where uploaded files are pre-processed.
Tests updated to reflect the new responsibility boundary.
Made-with: Cursor
Keep the API file upload branch current with main, enforce the documented JSON base64 per-file limit, and avoid leaking document extraction error strings into user prompts.
Made-with: Cursor
Reject mismatched models and require a single user message so the OpenAI-compatible endpoint reflects the fixed-session nanobot runtime without extra compatibility noise.
Read serve host, port, and timeout from config by default, keep CLI flags higher priority, and bind the API to localhost by default for safer local usage.
Expose OpenAI-compatible chat completions and models endpoints through a single persistent API session, keeping the integration simple without adding multi-session isolation yet.