fix(api): remove enable_compression to restore real SSE streaming

With enable_compression() set, aiohttp's compressor buffered the
SSE chunks until the stream ended, so clients received the whole
response in one batch instead of incrementally. SSE payloads are
small and frequent, so compression provides negligible benefit
while breaking real-time delivery.
zhonghongwei 2026-05-07 10:26:51 +08:00 committed by Xubin Ren
parent 536c456e5e
commit 6a3069514c


@@ -239,7 +239,6 @@ async def handle_chat_completions(request: web.Request) -> web.Response:
 resp.content_type = "text/event-stream"
 resp.headers["Cache-Control"] = "no-cache"
 resp.headers["Connection"] = "keep-alive"
-resp.enable_compression()
 await resp.prepare(request)
 chunk_id = f"chatcmpl-{uuid.uuid4().hex[:12]}"