Adds ProviderConfig.extra_query, threaded into AsyncOpenAI(default_query)
so that Azure-style gateways requiring query params like api-version can
be configured without URL hacks.
Also updates provider_signature to track extra_query changes so per-turn
refresh rebuilds the provider when the value changes.
Addresses the extra_query portion of #4204. The max_completion_tokens
model-awareness enhancement is intentionally left separate.
Resolve fallbackModels as preset references or explicit inline provider configs so failover uses complete model settings without exposing fallback logic to the agent loop.
Co-authored-by: Cursor <cursoragent@cursor.com>
Bind fallback model chains to the active model configuration so defaults and presets do not inherit or merge fallback behavior implicitly. Require explicit fallback providers while preserving per-fallback generation overrides and context-window safety.
Co-authored-by: Cursor <cursoragent@cursor.com>
When the primary model returns a non-transient error and no content
has been streamed yet, the runner now tries each model listed in the
active preset's fallback_models in order. Each fallback model may
reside on a different provider — a temporary provider instance is
created on-the-fly via make_provider(config, model=...).
Key design:
- Failover is request-scoped (does not affect subagents/dream/consolidator)
- Provider is restored via try/finally after each fallback attempt
- Skipped when content was already streamed to avoid duplicate output
- Recursive failover prevented by clearing fallback_models on fallback spec
- Circuit breaker trips open after 3 consecutive primary failures (60s cooldown)
- Cross-provider routing: fallback model prefix (e.g. groq/) determines provider
Fixes: cross-provider fallback was broken because the factory passed the
original preset (with provider forced to primary's provider) when creating
fallback providers. Now uses provider="auto" so the model string prefix
correctly routes to the right provider.
Also fixes: log messages now distinguish between primary-failed,
previous-fallback-failed, and circuit-open scenarios.
closes: https://github.com/HKUDS/nanobot/issues/3376