mirror of https://github.com/HKUDS/nanobot.git
synced 2026-06-15 07:14:08 +00:00
docs: clarify streamed timeout fallback behavior
maintainer edit: update fallback docs and provider docstring to describe the new stream-stall timeout recovery exception.
parent bc4bb508a1
commit c00371c761
@@ -1268,7 +1268,7 @@ Inline fallback object:

 Use inline objects only when a fallback is not worth naming as a reusable preset. `fallbackModels` belongs under `agents.defaults`, not inside individual `modelPresets` entries.

-Failover only runs when the primary provider returns a retryable model/provider error before any answer text has been streamed. Typical fallback cases include timeouts, connection errors, 5xx server errors, 429 rate limits, overloads, and quota/balance exhaustion. It does not run for malformed requests, authentication/permission errors, content filtering/refusals, or context-length/message-format errors.
+Failover normally runs when the primary provider returns a retryable model/provider error before any answer text has been streamed. Stream-stall timeouts are the recovery exception: if the provider already emitted partial answer text and then stalls, nanobot closes the current stream segment and retries/fails over in a new segment. Typical fallback cases include timeouts, connection errors, 5xx server errors, 429 rate limits, overloads, and quota/balance exhaustion. It does not run for malformed requests, authentication/permission errors, content filtering/refusals, or context-length/message-format errors.

 If fallback candidates use smaller `contextWindowTokens` values, nanobot builds context using the smallest window in the active chain so every candidate can receive the same prompt.
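
The failover rule described in the updated docs paragraph can be read as a small decision function. The sketch below is illustrative only: `ErrorKind`, `RETRYABLE`, and `should_failover` are hypothetical names that do not come from the nanobot source, and it assumes a timeout raised after partial text counts as a stream stall.

```python
from enum import Enum, auto


class ErrorKind(Enum):
    """Hypothetical error categories, mirroring the cases the docs list."""
    TIMEOUT = auto()
    CONNECTION = auto()
    SERVER_5XX = auto()
    RATE_LIMIT_429 = auto()
    OVERLOADED = auto()
    QUOTA_EXHAUSTED = auto()
    MALFORMED_REQUEST = auto()
    AUTH = auto()
    CONTENT_FILTER = auto()
    CONTEXT_LENGTH = auto()


# Categories the docs describe as typical fallback cases.
RETRYABLE = frozenset({
    ErrorKind.TIMEOUT,
    ErrorKind.CONNECTION,
    ErrorKind.SERVER_5XX,
    ErrorKind.RATE_LIMIT_429,
    ErrorKind.OVERLOADED,
    ErrorKind.QUOTA_EXHAUSTED,
})


def should_failover(kind: ErrorKind, content_streamed: bool) -> bool:
    """Decide whether a failed request may move on to a fallback model."""
    if kind not in RETRYABLE:
        return False  # malformed request, auth, refusal, context length
    if not content_streamed:
        return True  # normal case: nothing has reached the user yet
    # Recovery exception (assumption: a timeout after partial text is a
    # stream stall): the caller closes the segment and failover continues.
    return kind is ErrorKind.TIMEOUT
```

Under these assumptions a 429 before any text triggers failover, a 429 after partial text does not, and a timeout triggers it in both cases.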
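
The last context line of the hunk above says context is built against the smallest window in the active chain. A minimal sketch of that selection, assuming the chain is the primary model plus its fallback candidates (`effective_context_window` is a hypothetical helper, not a nanobot function):

```python
def effective_context_window(primary_tokens: int, fallback_tokens: list[int]) -> int:
    """Return the window every candidate in the chain can accept.

    Assumption: the "active chain" is the primary model followed by its
    fallback candidates, each described by its contextWindowTokens value.
    """
    return min([primary_tokens, *fallback_tokens])


# A 200k-token primary with 128k and 32k fallbacks is budgeted at 32k.
budget = effective_context_window(200_000, [128_000, 32_000])
```

Sizing the prompt once against the minimum means a fallback never has to rebuild or truncate the context mid-request.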
@@ -58,14 +58,17 @@ _FALLBACK_ERROR_TOKENS = (
 class FallbackProvider(LLMProvider):
     """Wrap a primary provider and transparently failover to fallback models.

-    When the primary model returns an error and no content has been streamed yet,
-    the wrapper tries each fallback model in order. Each fallback model may
-    reside on a different provider — a factory callable creates the underlying
-    provider on-the-fly.
+    When the primary model returns a fallbackable error before content has been
+    streamed, the wrapper tries each fallback model in order. Streamed timeout
+    errors are the recovery exception: the caller may close the current stream
+    segment, then the wrapper continues failover with later deltas in a new
+    segment. Each fallback model may reside on a different provider — a factory
+    callable creates the underlying provider on-the-fly.

     Key design:
     - Failover is request-scoped (the wrapper itself is stateless between turns).
-    - Skipped when content was already streamed to avoid duplicate output.
+    - Skipped when content was already streamed to avoid duplicate output,
+      except timeout recovery can resume in a new stream segment.
     - Recursive failover is prevented by the factory returning plain providers.
     - Primary provider is circuit-broken after repeated failures to avoid
       wasting requests on a known-bad endpoint.
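
The docstring's design points (ordered failover, segment-based timeout recovery, a factory that returns plain providers, and a circuit breaker on the primary) can be illustrated with a toy synchronous wrapper. This is a sketch under stated assumptions, not the actual `FallbackProvider`: `FallbackSketch`, `StreamStall`, `RetryableError`, and `breaker_threshold` are all hypothetical, and the failure counter is only one guess at how the breaker might track repeated failures.

```python
from typing import Callable, Iterator

# Hypothetical provider shape: a callable mapping a prompt to text deltas.
Provider = Callable[[str], Iterator[str]]


class StreamStall(Exception):
    """Hypothetical: the provider stopped emitting deltas mid-stream."""


class RetryableError(Exception):
    """Hypothetical: a retryable provider error (5xx, 429, overload, ...)."""


class FallbackSketch:
    def __init__(self, primary: Provider, fallback_models: list[str],
                 factory: Callable[[str], Provider], breaker_threshold: int = 3):
        self.primary = primary
        self.fallback_models = fallback_models
        self.factory = factory  # returns plain providers, so no recursion
        self.breaker_threshold = breaker_threshold
        self.primary_failures = 0  # assumed breaker state

    def _candidates(self):
        # Skip the primary once the (assumed) breaker threshold is reached.
        if self.primary_failures < self.breaker_threshold:
            yield True, lambda: self.primary
        for model in self.fallback_models:
            yield False, lambda m=model: self.factory(m)

    def stream(self, prompt: str) -> list[str]:
        """Return the stream segments that made up the answer."""
        segments: list[str] = []
        for is_primary, get_provider in self._candidates():
            text = ""
            try:
                for delta in get_provider()(prompt):
                    text += delta
                segments.append(text)
                if is_primary:
                    self.primary_failures = 0
                return segments
            except (RetryableError, StreamStall) as exc:
                if is_primary:
                    self.primary_failures += 1
                if text:
                    if not isinstance(exc, StreamStall):
                        raise  # text already streamed: no failover
                    segments.append(text)  # close the stalled segment
        raise RuntimeError("all candidates failed")


def stalling_primary(prompt: str) -> Iterator[str]:
    yield "Hello, "
    raise StreamStall()


def healthy_fallback(prompt: str) -> Iterator[str]:
    yield "world"


wrapper = FallbackSketch(stalling_primary, ["backup"],
                         lambda model: healthy_fallback, breaker_threshold=2)
segments = wrapper.stream("hi")  # ["Hello, ", "world"]
```

In this toy run the stalled primary contributes a closed first segment and the fallback answers in a second one. After two primary failures the assumed breaker skips the primary and later calls go straight to the fallback.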