docs: clarify streamed timeout fallback behavior

maintainer edit: update fallback docs and provider docstring to describe the new stream-stall timeout recovery exception.
chengyongru 2026-06-10 16:21:52 +08:00 committed by Xubin Ren
parent bc4bb508a1
commit c00371c761
2 changed files with 9 additions and 6 deletions

@@ -1268,7 +1268,7 @@ Inline fallback object:
 Use inline objects only when a fallback is not worth naming as a reusable preset. `fallbackModels` belongs under `agents.defaults`, not inside individual `modelPresets` entries.
-Failover only runs when the primary provider returns a retryable model/provider error before any answer text has been streamed. Typical fallback cases include timeouts, connection errors, 5xx server errors, 429 rate limits, overloads, and quota/balance exhaustion. It does not run for malformed requests, authentication/permission errors, content filtering/refusals, or context-length/message-format errors.
+Failover normally runs when the primary provider returns a retryable model/provider error before any answer text has been streamed. Stream-stall timeouts are the recovery exception: if the provider already emitted partial answer text and then stalls, nanobot closes the current stream segment and retries/fails over in a new segment. Typical fallback cases include timeouts, connection errors, 5xx server errors, 429 rate limits, overloads, and quota/balance exhaustion. It does not run for malformed requests, authentication/permission errors, content filtering/refusals, or context-length/message-format errors.
 If fallback candidates use smaller `contextWindowTokens` values, nanobot builds context using the smallest window in the active chain so every candidate can receive the same prompt.
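
The failover rule described in this hunk can be read as a small decision function. The following is a minimal sketch only, not nanobot's actual implementation: the `ErrorKind` enum, `should_failover`, and `effective_context_window` are hypothetical names chosen for illustration, and the error categories simply mirror the cases listed in the documentation text.

```python
from enum import Enum
from typing import Sequence


class ErrorKind(Enum):
    """Hypothetical error categories, mirroring the cases named in the docs."""
    TIMEOUT = "timeout"
    CONNECTION = "connection"
    SERVER_5XX = "server_5xx"
    RATE_LIMIT_429 = "rate_limit_429"
    OVERLOADED = "overloaded"
    QUOTA_EXHAUSTED = "quota_exhausted"
    MALFORMED_REQUEST = "malformed_request"
    AUTH = "auth"
    CONTENT_FILTER = "content_filter"
    CONTEXT_LENGTH = "context_length"


# Categories the docs describe as typical fallback cases.
RETRYABLE = {
    ErrorKind.TIMEOUT,
    ErrorKind.CONNECTION,
    ErrorKind.SERVER_5XX,
    ErrorKind.RATE_LIMIT_429,
    ErrorKind.OVERLOADED,
    ErrorKind.QUOTA_EXHAUSTED,
}


def should_failover(kind: ErrorKind, text_already_streamed: bool) -> bool:
    """Decide whether to try the next fallback candidate (illustrative only)."""
    if kind not in RETRYABLE:
        # Malformed requests, auth errors, refusals, etc. never fail over.
        return False
    if not text_already_streamed:
        # Normal case: retryable error before any answer text was streamed.
        return True
    # Recovery exception: a timeout after partial text (a stream stall).
    # The caller is assumed to close the current stream segment first.
    return kind is ErrorKind.TIMEOUT


def effective_context_window(windows: Sequence[int]) -> int:
    """Smallest contextWindowTokens in the active chain (assumed non-empty)."""
    return min(windows)
```

Under these assumptions, a 429 that arrives after partial text does not fail over, while a timeout in the same position does, because only the stall case can resume in a new stream segment.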

@@ -58,14 +58,17 @@ _FALLBACK_ERROR_TOKENS = (
 class FallbackProvider(LLMProvider):
     """Wrap a primary provider and transparently failover to fallback models.
-    When the primary model returns an error and no content has been streamed yet,
-    the wrapper tries each fallback model in order. Each fallback model may
-    reside on a different provider; a factory callable creates the underlying
-    provider on-the-fly.
+    When the primary model returns a fallbackable error before content has been
+    streamed, the wrapper tries each fallback model in order. Streamed timeout
+    errors are the recovery exception: the caller may close the current stream
+    segment, then the wrapper continues failover with later deltas in a new
+    segment. Each fallback model may reside on a different provider; a factory
+    callable creates the underlying provider on-the-fly.
     Key design:
     - Failover is request-scoped (the wrapper itself is stateless between turns).
-    - Skipped when content was already streamed to avoid duplicate output.
+    - Skipped when content was already streamed to avoid duplicate output,
+      except timeout recovery can resume in a new stream segment.
     - Recursive failover is prevented by the factory returning plain providers.
     - Primary provider is circuit-broken after repeated failures to avoid
       wasting requests on a known-bad endpoint.
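
The design points in this docstring (ordered fallback, a factory that returns plain providers, and circuit-breaking the primary) can be sketched in a few lines. This is an illustrative, non-streaming approximation and not the real `FallbackProvider`: `FallbackSketch`, `RetryableError`, the callable-based `Provider` type, and the `failure_threshold` default are all assumptions made for the example, and the diff does not show how the actual wrapper tracks primary failures.

```python
from typing import Callable, Optional, Sequence

# Hypothetical provider shape for this sketch: prompt in, answer text out.
Provider = Callable[[str], str]


class RetryableError(Exception):
    """Stand-in for a fallbackable provider error (timeout, 5xx, 429, ...)."""


class FallbackSketch:
    def __init__(
        self,
        primary: Provider,
        fallback_models: Sequence[str],
        factory: Callable[[str], Provider],
        failure_threshold: int = 3,  # assumed value, not taken from the source
    ) -> None:
        self.primary = primary
        self.fallback_models = list(fallback_models)
        self.factory = factory
        self.failure_threshold = failure_threshold
        self.primary_failures = 0  # consecutive primary failures seen so far

    def complete(self, prompt: str) -> str:
        last_error: Optional[Exception] = None
        # Circuit breaker: skip the primary once it has failed repeatedly.
        if self.primary_failures < self.failure_threshold:
            try:
                answer = self.primary(prompt)
                self.primary_failures = 0  # a success resets the counter
                return answer
            except RetryableError as exc:
                self.primary_failures += 1
                last_error = exc
        # Try each fallback model in order. The factory returns a plain
        # provider, never another wrapper, so failover cannot recurse.
        for model in self.fallback_models:
            provider = self.factory(model)
            try:
                return provider(prompt)
            except RetryableError as exc:
                last_error = exc
        raise last_error or RetryableError("no provider available")
```

With `failure_threshold=2`, a primary that always times out is called on the first two requests and skipped afterwards, while each request is still answered by the first working fallback.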