nanobot

mirror/nanobot

Fork 0

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-20 16:42:25 +00:00

Commit Graph

Author	SHA1	Message	Date
chengyongru	af9f8d54b8	perf: optimize gateway cold start from ~6.9s to ~460ms (#3918 ) Channel lazy load: discover_enabled() only imports enabled channel modules instead of all 18 modules with heavy SDKs (telegram, discord, slack, etc). discover_all() now delegates to discover_enabled(). Lazy OpenAI client: defer AsyncOpenAI() + httpx construction to _ensure_client() with asyncio.Lock double-checked locking. openai and httpx imports moved from module-level into _ensure_client(). Minor: lazy Nanobot/RunResult and CronService exports via __getattr__. Benchmark: 6910ms → 460ms (-93.3%)	2026-05-20 12:02:23 +08:00
Xubin Ren	1e11b35b45	fix(providers): tighten local endpoint detection Parse the endpoint host before disabling keepalive so public hostnames that merely contain private-network substrings keep the default connection pool behavior. Made-with: Cursor	2026-04-26 16:14:24 +08:00
hussein1362	5943ab386d	fix(providers): disable HTTP keepalive for local/LAN endpoints Local model servers (Ollama, llama.cpp, vLLM) often close idle HTTP connections before the client-side keepalive timer expires. When two LLM calls happen seconds apart — for example the heartbeat _decide() phase followed immediately by process_direct() — the second call grabs a now-dead pooled connection, causing a transient APIConnectionError on every first attempt. The fix detects local endpoints via: - ProviderSpec.is_local (Ollama, LM Studio, vLLM, OVMS) - Private-network URL patterns (localhost, 127.x, 192.168.x, 10.x, 172.16-31.x, host.docker.internal, [::1]) For these endpoints, the AsyncOpenAI client is created with a custom httpx.AsyncClient that sets keepalive_expiry=0, forcing a fresh TCP connection for each request. This is cheap on LAN (sub-5ms connect) and eliminates the stale-connection retry tax entirely. Cloud providers (OpenAI, Anthropic, OpenRouter, etc.) keep the default 5-second keepalive, which is fine for high-frequency API usage. The private-network heuristic also covers the common case where users configure provider='openai' but point apiBase at a LAN IP running llama.cpp — the spec says is_local=False, but the URL clearly is.	2026-04-26 16:14:24 +08:00

Author

SHA1

Message

Date

chengyongru

af9f8d54b8

perf: optimize gateway cold start from ~6.9s to ~460ms (#3918 )

Channel lazy load: discover_enabled() only imports enabled channel
modules instead of all 18 modules with heavy SDKs (telegram, discord,
slack, etc). discover_all() now delegates to discover_enabled().

Lazy OpenAI client: defer AsyncOpenAI() + httpx construction to
_ensure_client() with asyncio.Lock double-checked locking. openai
and httpx imports moved from module-level into _ensure_client().

Minor: lazy Nanobot/RunResult and CronService exports via __getattr__.

Benchmark: 6910ms → 460ms (-93.3%)

2026-05-20 12:02:23 +08:00

Xubin Ren

1e11b35b45

fix(providers): tighten local endpoint detection

Parse the endpoint host before disabling keepalive so public hostnames that merely contain private-network substrings keep the default connection pool behavior.

Made-with: Cursor

2026-04-26 16:14:24 +08:00

hussein1362

5943ab386d

fix(providers): disable HTTP keepalive for local/LAN endpoints

Local model servers (Ollama, llama.cpp, vLLM) often close idle HTTP
connections before the client-side keepalive timer expires.  When two
LLM calls happen seconds apart — for example the heartbeat _decide()
phase followed immediately by process_direct() — the second call grabs
a now-dead pooled connection, causing a transient APIConnectionError
on every first attempt.

The fix detects local endpoints via:
- ProviderSpec.is_local (Ollama, LM Studio, vLLM, OVMS)
- Private-network URL patterns (localhost, 127.x, 192.168.x, 10.x,
  172.16-31.x, host.docker.internal, [::1])

For these endpoints, the AsyncOpenAI client is created with a custom
httpx.AsyncClient that sets keepalive_expiry=0, forcing a fresh TCP
connection for each request.  This is cheap on LAN (sub-5ms connect)
and eliminates the stale-connection retry tax entirely.

Cloud providers (OpenAI, Anthropic, OpenRouter, etc.) keep the default
5-second keepalive, which is fine for high-frequency API usage.

The private-network heuristic also covers the common case where users
configure provider='openai' but point apiBase at a LAN IP running
llama.cpp — the spec says is_local=False, but the URL clearly is.

2026-04-26 16:14:24 +08:00

3 Commits