Complete the symmetry left by #3214: ChannelManager._resolve_transcription_base
already resolves providers.openai.api_base, but BaseChannel.transcribe_audio
instantiated OpenAITranscriptionProvider without forwarding it, and the provider
__init__ did not accept the parameter. Self-hosted OpenAI-compatible Whisper
endpoints (LiteLLM, vLLM, etc.) configured via config.json were therefore
ignored for the OpenAI backend.
- OpenAITranscriptionProvider.__init__ now accepts api_base with env fallback
(OPENAI_TRANSCRIPTION_BASE_URL) matching the Groq pattern.
- BaseChannel.transcribe_audio forwards self.transcription_api_base to OpenAI.
- Tests mirror the existing Groq coverage: manager propagation for provider
"openai", BaseChannel-to-provider argument passing, and provider default vs
override for api_url.
Fully backward-compatible: when api_base is None and the env var is unset,
the default https://api.openai.com/v1/audio/transcriptions is used.
Refs #3213, follow-up to #3214.
Follow-up to #3212, fully backward compatible:
- Extract the 14-day staleness threshold as `_STALE_THRESHOLD_DAYS` module
constant and pass it into the Phase 1 prompt template as
`{{ stale_threshold_days }}`. The number lived in three places before
(code threshold, prompt instruction, docstring); now there is one.
- Add `DreamConfig.annotate_line_ages` (default True = current behavior)
and propagate it through `Dream.__init__` and the gateway wiring in
cli/commands.py. Gives users a knob to disable the feature without a
code patch if an LLM reacts poorly to the `← Nd` suffix.
- Harden `_annotate_with_ages` against dirty working trees: when HEAD
blob line count disagrees with the working-tree content length, skip
annotation entirely instead of assigning ages to the wrong lines. The
previous `i >= len(ages)` guard only handled one direction of the
mismatch.
- Inline-comment the `max_iterations` 10→15 bump with a pointer to
exp002 so future blame has context.
- Add 4 regression tests: end-to-end `← 30d` reaches prompt, 14/15
threshold boundary, `annotate_line_ages=False` bypasses git entirely
(verified via `assert_not_called`), length-mismatch defense, and
template-var rendering.
Made-with: Cursor
Three improvements to Dream's memory consolidation:
1. Per-line git-blame age annotations: MEMORY.md lines get `← Nd` suffixes
(N>14) from dulwich annotate. SOUL.md/USER.md excluded as permanent.
LLM uses content judgment, not just age, to decide what to prune.
2. Dedup-aware Phase 1 prompt: reframed as dual-task (extract facts +
deduplicate existing files) with explicit redundancy patterns to scan for.
Validated through 20 experiments (exp-002 prompt + max_iter=15 was best,
averaging -1643 chars/5.4% compression per run).
3. Phase 1 analysis as commit body: dream git commits now include the full
Phase 1 analysis for transparency via /dream-log.
4. max_iterations raised from 10 to 15: 30% improvement over 10 with no
risk; 20 showed diminishing returns (exp-020: -701 vs exp-017: -1643).
Locks in the two key boundaries of the new channel-based filter:
1. When an incoming channel id is in allow_channels, messages are forwarded.
2. When an incoming channel id is not in allow_channels, messages are
silently dropped.
The empty-list backward-compatible path is already covered by every
existing test that omits allow_channels (default_factory=list).
Made-with: Cursor
Add a built-in tool that lets the agent inspect and modify its own
runtime state (model, iterations, context window, etc.).
Key features:
- inspect: view current config, usage stats, and subagent status
- modify: adjust parameters at runtime (protected by type/range validation)
- Subagent observability: inspect running subagent tasks (phase,
iteration, tool events, errors) — subagents are no longer a black box
- Watchdog corrects out-of-bounds values on each iteration
- Enabled by default in read-only mode (self_modify: false)
- All changes are in-memory only; restart restores defaults
- Comprehensive test suite (90 tests)
Includes a self-awareness skill (always-on) with progressive disclosure:
SKILL.md for core rules, references/examples.md for detailed scenarios.
get_definitions() sorts tools on every LLM iteration for prompt cache
stability. Cache the sorted result and invalidate on register/unregister
so the sort only runs when the tool set actually changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Pass resolved self.context_window_tokens to Consolidator instead of
raw parameter that could be None, preventing consolidation failures
- Calculate percentage against input budget (ctx - max_completion - 1024)
instead of raw context window, consistent with Consolidator/snip formulas
- Pass actual max_completion_tokens from provider to build_status_content
- Cap percentage display at 999 to prevent runaway values
- Add tests for budget-based percentage and cap behavior
Default Microsoft Teams inbound auth validation to enabled, update the README to match, and prevent denied senders from persisting conversation refs before allowlist checks pass.
Made-with: Cursor
- Check both jwt and cryptography in MSTEAMS_AVAILABLE guard so
partial installs fail early with a clear message instead of at runtime
- Add aclose() to test FakeHttpClient so stop() won't crash
- Move MSTEAMS.md into README.md following the same details/summary
pattern used by every other channel
- Note in README that validateInboundAuth defaults to false
PyJWT and cryptography are optional msteams deps; they should not be
bundled into the generic dev install. Tests now skip the entire file
when the deps are missing, following the dingtalk pattern.
Document why MiniMax thinking mode uses a separate Anthropic-compatible provider and list the matching base URLs. Add a small registry test so the new provider stays wired to the expected backend and API key.
Made-with: Cursor
- Convert skills summary from verbose XML (4-5 lines/skill) to compact
markdown list (1 line/skill) with inline path for read_file lookup
- Exclude always-loaded skills (e.g. memory) from the skills index to
avoid duplicating content already in the Active Skills section
- Skip injecting the Memory section when MEMORY.md still matches the
bundled template (i.e. Dream hasn't populated it yet)
- Inject `thinking={"type": "enabled|disabled"}` via extra_body for
Kimi thinking-capable models (kimi-k2.5, k2.6-code-preview).
- Add _is_kimi_thinking_model helper to handle both bare slugs and
OpenRouter-style prefixed names (e.g. moonshotai/kimi-k2.5).
- reasoning_effort="minimal" maps to disabled; any other value enables it.
- Add tests for enabled/disabled states and OpenRouter prefix handling.
Lock the new interaction-channel retry termination hints so both exhausted standard retries and persistent identical-error stops keep emitting the final progress message.
Made-with: Cursor
Lock the /status task counter to the actual stop scope by asserting it sums unfinished dispatch tasks with running subagents for the current session.
Made-with: Cursor
Lock the strict-provider sanitization path so assistant tool calls without function.arguments are normalized to {} instead of being forwarded as missing values.
Made-with: Cursor
Ensure assistant tool-call function.arguments is always emitted as valid JSON text so strict OpenAI-compatible backends (including Alibaba code models) do not reject requests. Add regressions for dict and malformed-string argument payloads in message sanitization.
Made-with: Cursor
Keep dict-backed channel configs compatible with both allow_from and allowFrom without losing empty-list semantics, and add focused regression coverage for the allow-list boundary.
Made-with: Cursor
Bug 1: _drain_pending did not call extract_documents on follow-up
messages arriving mid-turn. Documents attached to queued messages were
silently dropped because _build_user_content only handles images.
Fix: call extract_documents before _build_user_content in _drain_pending.
Bug 2: extract_documents read the entire file into memory (up to 50 MB)
just to check 16 bytes of magic header for MIME detection.
Fix: read only the first 16 bytes via open()+read(16) instead of
Path.read_bytes().
Added regression tests for both bugs.
Made-with: Cursor
Move extract_documents() to nanobot.utils.document as a reusable helper
and call it once in AgentLoop._process_message, the single entry point
for all message processing (API + all channels).
This replaces the previous API-only _extract_documents() in server.py,
ensuring Telegram, Feishu, Slack, WeChat, and all other channels also
benefit from automatic document text extraction.
Adds a configurable max_file_size guard (default 50 MB) to skip
oversized files gracefully, preventing unbounded memory/CPU usage
from channel-downloaded attachments.
- server.py: removed _extract_documents and related imports
- document.py: added extract_documents() with size limit
- loop.py: calls extract_documents() at the top of _process_message
- Tests updated: 70 related tests pass
Made-with: Cursor
ContextBuilder._build_user_content now only handles images (its original
responsibility). Document text extraction (PDF, DOCX, XLSX, PPTX) is
performed by the new _extract_documents() helper in server.py, called
before process_direct(). This keeps the core context builder free of
format-specific dependencies and makes the API boundary the single place
where uploaded files are pre-processed.
Tests updated to reflect the new responsibility boundary.
Made-with: Cursor
Keep the API file upload branch current with main, enforce the documented JSON base64 per-file limit, and avoid leaking document extraction error strings into user prompts.
Made-with: Cursor
When Slack resolves a named target to another conversation, do not reuse the origin thread timestamp on the destination send, and keep reaction cleanup anchored to the source conversation.
Made-with: Cursor
Feishu streaming cards auto-close after 10 minutes from creation,
regardless of update activity. With resuming enabled, a single card
lives across multiple tool-call rounds and can exceed this limit,
causing the final response to be silently lost.
Remove the _resuming logic from send_delta so each tool-call round
gets its own short-lived streaming card (well under 10 min). Add a
fallback that sends a regular interactive card when the final
streaming update fails.
The hand-rolled line-by-line YAML parser treated each line independently,
so YAML multiline scalars (folded `>` and literal `|`) were captured as
the literal characters ">" or "|" instead of the actual text content.
Bind the gateway health listener to localhost by default and reduce the probe response to a minimal status payload so accidental public exposure leaks less information.
Made-with: Cursor
Keep the gateway health endpoint patch current with the latest gateway runtime changes, and lock the new HTTP routes in with CLI regression coverage and README guidance.
Made-with: Cursor