mirror of
https://github.com/HKUDS/nanobot.git
synced 2026-06-15 15:24:06 +00:00
docs: clarify streamed timeout fallback behavior
maintainer edit: update fallback docs and provider docstring to describe the new stream-stall timeout recovery exception.
parent bc4bb508a1
commit c00371c761
@@ -1268,7 +1268,7 @@ Inline fallback object:
 
 Use inline objects only when a fallback is not worth naming as a reusable preset. `fallbackModels` belongs under `agents.defaults`, not inside individual `modelPresets` entries.
 
-Failover only runs when the primary provider returns a retryable model/provider error before any answer text has been streamed. Typical fallback cases include timeouts, connection errors, 5xx server errors, 429 rate limits, overloads, and quota/balance exhaustion. It does not run for malformed requests, authentication/permission errors, content filtering/refusals, or context-length/message-format errors.
+Failover normally runs when the primary provider returns a retryable model/provider error before any answer text has been streamed. Stream-stall timeouts are the recovery exception: if the provider already emitted partial answer text and then stalls, nanobot closes the current stream segment and retries/fails over in a new segment. Typical fallback cases include timeouts, connection errors, 5xx server errors, 429 rate limits, overloads, and quota/balance exhaustion. It does not run for malformed requests, authentication/permission errors, content filtering/refusals, or context-length/message-format errors.
 
 If fallback candidates use smaller `contextWindowTokens` values, nanobot builds context using the smallest window in the active chain so every candidate can receive the same prompt.
 
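The rules in the updated paragraph can be read as a small decision function. The sketch below is illustrative only and is not taken from the nanobot source: the category strings, `should_fail_over`, and `effective_context_window` are hypothetical names chosen to mirror the cases the documentation lists.

```python
# Hypothetical error categories, named after the cases listed in the docs.
RETRYABLE = {
    "timeout", "connection", "server_5xx",
    "rate_limit_429", "overloaded", "quota_exhausted",
}


def should_fail_over(kind: str, text_streamed: bool, stream_stalled: bool = False) -> bool:
    """Return True if a fallback candidate should be tried for this error."""
    if kind not in RETRYABLE:
        # Malformed requests, auth errors, refusals, context-length errors, etc.
        return False
    if not text_streamed:
        # Normal case: nothing reached the user yet, so a retry cannot duplicate output.
        return True
    # Recovery exception: partial text exists, but the stream stalled and timed out.
    return kind == "timeout" and stream_stalled


def effective_context_window(windows: list[int]) -> int:
    """Smallest contextWindowTokens in the active chain, so every candidate fits."""
    return min(windows)
```

Under these assumptions, a 5xx error after partial output is not retried, while a stall timeout after partial output is, which matches the "recovery exception" wording in the new text.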
@@ -58,14 +58,17 @@ _FALLBACK_ERROR_TOKENS = (
 class FallbackProvider(LLMProvider):
     """Wrap a primary provider and transparently failover to fallback models.
 
-    When the primary model returns an error and no content has been streamed yet,
-    the wrapper tries each fallback model in order. Each fallback model may
-    reside on a different provider — a factory callable creates the underlying
-    provider on-the-fly.
+    When the primary model returns a fallbackable error before content has been
+    streamed, the wrapper tries each fallback model in order. Streamed timeout
+    errors are the recovery exception: the caller may close the current stream
+    segment, then the wrapper continues failover with later deltas in a new
+    segment. Each fallback model may reside on a different provider — a factory
+    callable creates the underlying provider on-the-fly.
 
     Key design:
     - Failover is request-scoped (the wrapper itself is stateless between turns).
-    - Skipped when content was already streamed to avoid duplicate output.
+    - Skipped when content was already streamed to avoid duplicate output,
+      except timeout recovery can resume in a new stream segment.
     - Recursive failover is prevented by the factory returning plain providers.
     - Primary provider is circuit-broken after repeated failures to avoid
       wasting requests on a known-bad endpoint.
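To make the docstring's segment behavior concrete, here is a minimal sketch of a request-scoped failover loop. It is not the actual `FallbackProvider` implementation: `StreamStallTimeout`, `RetryableError`, `SEGMENT_BREAK`, and `stream_with_fallback` are hypothetical names, and providers are modeled as plain callables that return an iterator of text deltas. The circuit breaker mentioned in the docstring is left out.

```python
class StreamStallTimeout(Exception):
    """Hypothetical: the provider stopped sending deltas mid-stream."""


class RetryableError(Exception):
    """Hypothetical: any other fallbackable error (5xx, 429, ...)."""


# Hypothetical marker telling the caller to close the current stream segment.
SEGMENT_BREAK = object()


def stream_with_fallback(models, factory):
    """Yield text deltas, trying each model in order.

    factory(model) returns a plain provider: a zero-argument callable that
    returns an iterator of str deltas. Because the factory never returns
    another fallback wrapper, failover cannot nest.
    """
    last_error = None
    for model in models:
        provider = factory(model)
        streamed = False  # tracks output of the current segment only
        try:
            for delta in provider():
                streamed = True
                yield delta
            return  # this model finished cleanly
        except StreamStallTimeout as exc:
            last_error = exc
            if streamed:
                # Recovery exception: close the partial segment, then continue
                # with the next model in a fresh segment.
                yield SEGMENT_BREAK
        except RetryableError as exc:
            if streamed:
                raise  # avoid duplicate output after partial text
            last_error = exc
    raise last_error if last_error else RuntimeError("no models configured")
```

A caller consuming this generator would treat `SEGMENT_BREAK` as a signal to finalize whatever partial text it has shown before rendering the deltas that follow. All state lives in local variables, so nothing carries over between requests.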