nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-19 16:12:30 +00:00

Go to file

feat(goal): /goal command & long-running tasks (long_task)

* feat(long-task): add LongTaskTool for multi-step agent tasks

Implements a meta-ReAct loop where long-running tasks are broken into
sequential subagent steps, each starting fresh with the original goal
and progress from the previous step. This prevents context drift when
agents work on complex, multi-step tasks.

- Extract build_tool_registry() from SubagentManager for reuse
- Add run_step() for synchronous subagent execution (no bus announcement)
- Add HandoffTool and CompleteTool as signal mechanisms via shared dict
- Add LongTaskTool orchestrator with simplified prompt (8 iterations/step)
- Register LongTaskTool in main agent loop
- Add _extract_handoff_from_messages fallback for robustness

* fix(long-task): add debug logging for step-level observability

* feat(long-task): major overhaul with structured handoffs, validation, and observability

- Structured HandoffState: HandoffTool now accepts files_created,
  files_modified, next_step_hint, and verification fields instead of
  a plain string. Progress is passed between steps as structured data.

- Completion validation round: After complete() is called, a dedicated
  validator step runs to verify the claim against the original goal.
  If validation fails, the task continues rather than returning
  a false completion.

- Dynamic prompt system: 3 Jinja2 templates (step_start, step_middle,
  step_final) selected based on step number. Final steps get tighter
  budget and stronger "wrap up" guidance.

- Automatic file change tracking: Extracts write_file/edit_file events
  from tool_events and injects them into the next step's context if
  the subagent forgot to report them explicitly.

- Budget tracking & adaptive strategy: Cumulative token usage is tracked
  across steps. Per-step tool budget drops from 8 to 4 in the last
  two steps to force handoff/completion.

- Crash retry with graceful degradation: A step that crashes is retried
  once. Persistent crashes terminate the task and return partial progress.

- Full observability hooks for future WebUI integration:
  - set_hooks() with on_step_start, on_step_complete, on_handoff,
    on_validation_started, on_validation_passed, on_validation_failed,
    on_task_complete, on_task_error, and catch-all on_event.
  - Readable state properties: current_step, total_steps, status,
    last_handoff, cumulative_usage, goal.
  - inject_correction() allows external code to send user corrections
    that are injected into the next step's prompt.

- run_step() accepts optional max_iterations for dynamic budget control.

All 27 long-task tests and 11 subagent tests pass.

* test(long-task): add boundary tests and fix race conditions

- Add 7 edge-case tests: validation crash resilience, hook exception safety, mid-run correction injection, FIFO correction ordering, explicit file changes overriding auto-detection, final budget for max_steps=1, and dynamic budget switching boundaries

- Fix assertion in test_long_task_completes_after_multiple_handoffs to match exact prompt format

- Remove asyncio timing hack from test_state_exposure

- Add asyncio.sleep(0) yield in test_inject_correction_during_execution to prevent race between signal injection and step continuation

- All 34 tests passing

* fix(long-task): address code review findings

- Declare _scopes = {"core"} explicitly to prevent recursive nesting in subagent scope
- Document fragile coupling in _extract_file_changes: path extraction depends on
  write_file/edit_file detail format; add debug log for unexpected formats
- Align final-template threshold (max_steps - 2) with budget switch threshold
- Eliminate hasattr(self, "_state") in _reset_state by initializing in __init__

* fix(long-task): honor final signal and file tracking

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(long-task): improve prompt structure and agent contract

- Expand LongTaskTool.description to instruct parent agent on goal
  construction, return value semantics, and how to handle results.
- Expand CompleteTool.description to emphasize that the summary IS the
  final answer returned to the parent agent.
- Prefix validated return value with an explicit "final answer" directive
  to stop parent agent from re-running work.
- Redesign step_start.md: Step 1 is now explicitly for exploration,
  planning, and skeleton-building. complete() is discouraged.
- Remove bulky payload debug logging from _emit(); add targeted
  info/warning/error logs at key state transitions instead.
- Add signal_type to HandoffState for cleaner signal detection.

* test(long-task): expect wrapped completion message after validation

Align assertions with LongTaskTool final return shape on main.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): turn timing strip, latency, and session-switch restore

- Agent loop: publish goal_status run/idle for WebSocket turns; attach
  wall-clock latency_ms on turn_end and persisted assistant metadata.
- WebSocket channel: forward goal_status and latency fields to clients.
- NanobotClient: track goal_status started_at per chat without requiring
  onChat; useNanobotStream restores run strip when returning to a chat.
- Thread UI: composer/shell viewport hooks for run duration and latency;
  format helpers and i18n strings.
- MessageBubble: drop trailing StreamCursor (layout artifact vs block markdown).
- Builtin / tests: model command coverage, websocket and loop tests.

Covers multi-session UX and round-trip timing visibility for the WebUI.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: keep message-tool file attachments after canonical history hydrate

- MessageTool records per-turn media paths delivered to the active chat.
- nanobot.utils.session_attachments stages out-of-media-root files and
  merges into the last assistant message before save (loop stays a thin call).
- WebUI MediaCell: use a signed URL as a real download link when present.

Fixes attachments flashing then vanishing on turn_end when paths lived
outside get_media_dir (e.g. workspace files).

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): agent activity cluster, stable keys, LTR sheen labels

- Group reasoning and tool traces in AgentActivityCluster with i18n summaries
- Stabilize React list keys for activity clusters (first message id anchor)
- Replace background-clip shimmer with overlay sheen for streaming labels
- ThreadMessages/MessageList integration and locale strings

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): render assistant reasoning with Markdown + deferred stream

- Use MarkdownText for ReasoningBubble body (same GFM/KaTeX path as replies)
- Apply muted/italic prose tokens so thinking stays visually subordinate
- useDeferredValue while reasoningStreaming to ease parser work during deltas
- Preload markdown chunk when trace opens; add regression test with preloaded renderer

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): default-collapse agent activity cluster while Working

Outer fold no longer auto-expands during isTurnStreaming; user opens to see traces.
Header sheen and live summary unchanged.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(long_task): cumulative run history, file union, and prompt tuning

Inject cross-step summaries and merged file paths into middle/final step
templates so chains do not lose early context. Strip the last run-history
block when it duplicates Previous Progress to save tokens. Add optional
cumulative_prompt_max_chars and cumulative_step_body_max_chars parameters
with clamped defaults.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): session switch keeps in-flight thread and replays buffered WS

Save the prior chat message list to the per-chat cache in a layout effect
when chatId changes (before stale writes could corrupt another chat).
Skip one post-switch layout cache tick so we do not snapshot the wrong tab.

Buffer inbound events per chat_id when no onChat subscriber is registered
(e.g. user focused another session) and drain on resubscribe up to a cap,
so streaming deltas are not lost while off-tab.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): snap thread scroll to bottom on session open (no smooth glide)

Use scroll-behavior auto on the viewport, instant programmatic scroll when
following new messages and on scrollToBottomSignal. Keep smooth only for
the explicit scroll-to-bottom button.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): respect manual scroll-up after opening a session

Track when the user leaves the bottom with a ref and skip ResizeObserver
and deferred bottom snaps until they return or the conversation is reset.
Remove the time-based force-bottom window that overrode atBottom.

Multi-frame scrollToBottom honours the same guard unless force (scroll button).

Co-authored-by: Cursor <cursoragent@cursor.com>

* Publish long_task UI snapshots on outbound metadata

- Add OUTBOUND_META_AGENT_UI (_agent_ui) for channel-agnostic structured state
- LongTaskTool publishes {kind: long_task, data: snapshot} on the bus with _progress
- WebSocket send forwards metadata as agent_ui for WebUI clients
- Tests for bus payload, WS frame, and progress assertions
- Fix loop progress tests: ignore _goal_status in streaming final filter and
  avoid brittle outbound[-1] ordering after goal status idle messages

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat: WebUI long_task activity card and resilient history merge

Add optional ui_summary to the long_task tool for one-line UI labels. Stream
long_task agent_ui into a dedicated message row with timeline, markdown peek,
and a right sheet for details. Merge canonical history after turn_end while
re-inserting long_task rows before the final assistant reply. Collapse
duplicate task_start/step_start steps in the timeline and extend i18n.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor: align long_task with thread_goal and drop orchestrator UI

- Persist sustained objectives via session metadata (long_task / complete_goal); no subagent wiring or tool-driven agent_ui payloads.\n- Remove WebUI long-task activity UI, types, and translations; history merge preserves trace replay only, with legacy long_task rows normalized to traces.\n- Drop long_task prompt templates and get_long_task_run_dir; add webui thread disk helper for gateway persistence tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(agent): thread goal runtime context, tools, and skill

- Add thread_goal_state helper and mirror active objectives into Runtime Context
- Wire loop/context/memory/events as needed for goal metadata in turns
- Expand long_task / complete_goal semantics (pivot/cancel/honest recap)
- Add always-on thread-goal SKILL.md; align /goal command prompt
- Tests for context builder and thread goal state
- Remove unused webui ChatPane component

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(thread-goal): add websocket snapshot helper and publish goal updates from long_task

Introduce thread_goal_ws_blob for bounded JSON snapshots, attach snapshots to
websocket turn_end metadata in AgentLoop, and let long_task fan-out dedicated
thread_goal frames on the websocket channel after persisting session metadata.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(channels): websocket thread_goal frames, turn_end replay, and session API scrub for subagent inject

Emit thread_goal events and optional thread_goal on turn_end; scrub persisted
subagent announce blobs on GET /api/sessions/.../messages and shorten session
list previews so WebUI does not surface full Task/Summarize scaffolding.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): merge ephemeral traces per user turn when reconciling canonical history

Preserve disk/live trace rows inside the matching user–assistant segment instead
of stacking every trace before the final assistant reply (fixes inflated tool
counts after refresh or session switch).

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): show assistant reply copy only on the last slice before the next user turn

Avoid duplicate copy affordances on intermediate assistant bubbles that precede
more agent activity in the same turn (tools or further assistant text).

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui): thread_goal stream plumbing, composer goal strip, sky glow, and client-side subagent scrub projection

Track thread_goal and turn_goal snapshots in NanobotClient, hydrate React state
from thread_goal frames and turn_end, surface objective/elapsed in the composer,
add breathing sky halo CSS while goals are active, mirror server scrub logic on
history hydration and webui_thread snapshots, and extend tests/client mocks.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(channels): add Slack Socket Mode connect timeout with actionable timeout errors

Abort hung websockets.connect handshakes after a bounded wait, log REST-vs-WSS
guidance, surface RuntimeError to channel startup, and log successful WSS setup.

Co-authored-by: Cursor <cursoragent@cursor.com>

* webui: expand thread goal in composer bottom sheet

Add ChevronUp control on the run/goal strip that opens a bottom Sheet
with full ui_summary and objective. Inline preview logic in RunElapsedStrip,
add i18n strings across locales, and a composer unit test.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(webui): widen dedupeToolCallsForUi input for session API typing

fetchSessionMessages types tool_calls as unknown; accept unknown so tsc
build passes when passing message.tool_calls through.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(agent): extract WebSocket turn run status to webui_turn_helpers

* refactor(skills): rename thread-goal to long-task and document idempotent goals

* feat(skills): rename sustained-goal skill to long-goal and tighten long_task guidance

* chore: remove unused subagent/context/router helpers

* feat(session): rename sustained goal to goal_state and align WS/WebUI

- Move helpers from agent/thread_goal_state to session/goal_state:
  GOAL_STATE_KEY, goal_state_runtime_lines, goal_state_ws_blob, parse_goal_state.
- Session metadata now uses "goal_state"; still read legacy "thread_goal";
  long_task writes drop the legacy key after save.
- WebSocket: event/field goal_state, _goal_state_sync; turn_end carries goal_state;
  accept legacy _thread_goal_sync/thread_goal inbound metadata for dispatch.
- WebUI: GoalStateWsPayload, goalState hook/client props, i18n keys goalState*.
- Runtime Context copy uses "Goal (active):" instead of "Thread goal".

* feat(agent): stream Anthropic thinking deltas and fix stream idle timeout

* refactor(webui): transcript jsonl as sole timeline source

* fix(agent): reject mismatched WS message chat_id and stream reasoning deltas

* feat(webui): hydrate sustained goal and run timer after websocket subscribe

* chore(webui,websocket): remove unused fetch helpers and legacy thread_goal WS paths

* Raise default max_tokens and context window in agent schema.

Align AgentDefaults and ModelPresetConfig with typical Claude-scale usage
(32k completion budget, 256k context window) and update migration tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(gateway): bootstrap prefers in-memory model; clarify websocket naming

* fix(websocket): websocket _handle_message passes is_dm; refresh /status test expectations

---------

Co-authored-by: chengyongru <2755839590@qq.com>
Co-authored-by: chengyongru <chengyongru.ai@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-16 01:14:11 +08:00

.agent

docs: refine AI contributor guidance

2026-05-09 14:00:32 +08:00

.github

ci: optimize Test Suite workflow (safe subset)

2026-05-09 08:27:46 +00:00

bridge

fix: support WhatsApp voice message download

2026-05-04 11:44:25 +08:00

case

fix filename

2026-02-04 14:09:43 -05:00

docs

docs(pairing): consolidate access control docs — MECE allowFrom + pairing

2026-05-15 15:46:44 +08:00

images

docs: update README and webui documentation for WebUI development workflow

2026-04-19 13:10:36 +00:00

nanobot

feat(goal): /goal command & long-running tasks (long_task)

2026-05-16 01:14:11 +08:00

tests

feat(goal): /goal command & long-running tasks (long_task)

2026-05-16 01:14:11 +08:00

webui

feat(goal): /goal command & long-running tasks (long_task)

2026-05-16 01:14:11 +08:00

.dockerignore

feat: add Dockerfile with uv-based installation

2026-02-02 08:55:21 +00:00

.gitattributes

fix(docker): strip Windows CRLF from entrypoint.sh

2026-04-07 13:32:01 +08:00

.gitignore

feat(tools): introduce plugin-based tool discovery and runtime context protocol

2026-05-12 11:28:20 +08:00

CLAUDE.md

docs: add CLAUDE.md and .agent/ guides for AI contributors

2026-05-09 14:00:32 +08:00

COMMUNICATION.md

add feishu & wechat group

2026-02-01 18:17:56 +00:00

CONTRIBUTING.md

ci: optimize Test Suite workflow and document free-tier rule

2026-05-09 08:15:27 +00:00

core_agent_lines.sh

feat(memory): harden legacy history migration and Dream UX

2026-04-04 08:41:46 +00:00

docker-compose.yml

security: bind api port to localhost by default

2026-04-06 16:20:20 +08:00

Dockerfile

fix(docker): strip Windows CRLF from entrypoint.sh

2026-04-07 13:32:01 +08:00

entrypoint.sh

fix(docker): fix volume mount path and add permission error guidance

2026-04-06 13:15:40 +00:00

LICENSE

docs: clarify maintainer and contribution licensing

2026-04-26 18:01:55 +00:00

pyproject.toml

feat(tools): introduce plugin-based tool discovery and runtime context protocol

2026-05-12 11:28:20 +08:00

README.md

chore: update README with news for v0.1.5.post4 release

2026-04-29 11:12:50 +00:00

SECURITY.md

docs: clarify bwrap sandbox is Linux-only

2026-04-05 19:28:46 +00:00

THIRD_PARTY_NOTICES.md

build: ship THIRD_PARTY_NOTICES and fix webui packaging in wheel

2026-04-20 08:22:10 +00:00

README.md

🐈 nanobot is an open-source and ultra-lightweight AI agent in the spirit of OpenClaw, Claude Code, and Codex. It keeps the core agent loop small and readable while still supporting chat channels, memory, MCP and practical deployment paths, so you can go from local setup to a long-running personal agent with minimal overhead.

📢 News

2026-04-29 🚀 Released v0.1.5.post3 — Smarter threads on Feishu, Discord, Slack, and Teams; DeepSeek-V4; Hugging Face & Olostep; choices, /history, and steadier long chats. Please see release notes for details.
2026-04-28 🌐 Olostep web search, Hugging Face provider, safer workspace-tool interruptions.
2026-04-27 💬 /history command, smarter session replay caps, smoother Discord / Slack threads.
2026-04-26 🧭 Natural cron reminders, thread-aware restarts, safer local provider and shell behavior.
2026-04-25 🧩 ask_user choices, macOS LaunchAgent deployment, MSTeams stale-reference cleanup.
2026-04-24 🎥 Video attachments for channels, DeepSeek thinking control, faster document startup.
2026-04-23 🧵 Discord thread sessions, Telegram inline buttons, structured tool progress updates.
2026-04-22 🔎 GitHub Copilot GPT-5 / o-series support, configurable web fetch, WebUI image uploads.
2026-04-21 🚀 Released v0.1.5.post2 — Windows & Python 3.14 support, Office document reading, SSE streaming for the OpenAI-compatible API, and stronger reliability across sessions, memory, and channels. Please see release notes for details.
2026-04-20 🎨 Kimi K2.6 support, Telegram long-message split, WebUI typography & dark-mode polish.
2026-04-19 🌐 WebUI i18n locale switcher, atomic session writes with auto-repair.
2026-04-18 🧪 Initial WebUI chat, smarter setup wizard menus, WebSocket multi-chat multiplexing.
2026-04-17 🪟 Windows & Python 3.14 CI, Dream line-age memory, email self-loop guard.
2026-04-16 📡 SSE streaming for OpenAI-compatible API, Discord channel allow-list.
2026-04-15 🎛️ LM Studio & nullable API keys, MiniMax thinking endpoint, runtime SelfTool.
2026-04-14 🚀 Released v0.1.5.post1 — Dream skill discovery, mid-turn follow-up injection, WebSocket channel, and deeper channel integrations. Please see release notes for details.
2026-04-13 🛡️ Agent turn hardened — user messages persisted early, auto-compact skips active tasks.
2026-04-12 🔒 Lark global domain support, Dream learns discovered skills, shell sandbox tightened.
2026-04-11 ⚡ Context compact shrinks sessions on the fly; Kagi web search; QQ & WeCom full media.

Earlier news

2026-04-10 📓 Notebook editing tool, multiple MCP servers, Feishu streaming & done-emoji.
2026-04-09 🔌 WebSocket channel, unified cross-channel session, disabled_skills config.
2026-04-08 📤 API file uploads, OpenAI reasoning auto-routing with Responses fallback.
2026-04-07 🧠 Anthropic adaptive thinking, MCP resources & prompts exposed as tools.
2026-04-06 🛰️ Langfuse observability, unified Whisper transcription, email attachments.
2026-04-05 🚀 Released v0.1.5 — sturdier long-running tasks, Dream two-stage memory, production-ready sandboxing and programming Agent SDK. Please see release notes for details.
2026-04-04 🚀 Jinja2 response templates, Dream memory hardened, smarter retry handling.
2026-04-03 🧠 Xiaomi MiMo provider, chain-of-thought reasoning visible, Telegram UX polish.
2026-04-02 🧱 Long-running tasks run more reliably — core runtime hardening.
2026-04-01 🔑 GitHub Copilot auth restored; stricter workspace paths; OpenRouter Claude caching fix.
2026-03-31 🛰️ WeChat multimodal alignment, Discord/Matrix polish, Python SDK facade, MCP and tool fixes.
2026-03-30 🧩 OpenAI-compatible API tightened; composable agent lifecycle hooks.
2026-03-29 💬 WeChat voice, typing, QR/media resilience; fixed-session OpenAI-compatible API.
2026-03-28 📚 Provider docs refresh; skill template wording fix.
2026-03-27 🚀 Released v0.1.4.post6 — architecture decoupling, litellm removal, end-to-end streaming, WeChat channel, and a security fix. Please see release notes for details.
2026-03-26 🏗️ Agent runner extracted and lifecycle hooks unified; stream delta coalescing at boundaries.
2026-03-25 🌏 StepFun provider, configurable timezone, Gemini thought signatures.
2026-03-24 🔧 WeChat compatibility, Feishu CardKit streaming, test suite restructured.
2026-03-23 🔧 Command routing refactored for plugins, WhatsApp/WeChat media, unified channel login CLI.
2026-03-22 ⚡ End-to-end streaming, WeChat channel, Anthropic cache optimization, /status command.
2026-03-21 🔒 Replace litellm with native openai + anthropic SDKs. Please see commit.
2026-03-20 🧙 Interactive setup wizard — pick your provider, model autocomplete, and you're good to go.
2026-03-19 💬 Telegram gets more resilient under load; Feishu now renders code blocks properly.
2026-03-18 📷 Telegram can now send media via URL. Cron schedules show human-readable details.
2026-03-17 ✨ Feishu formatting glow-up, Slack reacts when done, custom endpoints support extra headers, and image handling is more reliable.
2026-03-16 🚀 Released v0.1.4.post5 — a refinement-focused release with stronger reliability and channel support, and a more dependable day-to-day experience. Please see release notes for details.
2026-03-15 🧩 DingTalk rich media, smarter built-in skills, and cleaner model compatibility.
2026-03-14 💬 Channel plugins, Feishu replies, and steadier MCP, QQ, and media handling.
2026-03-13 🌐 Multi-provider web search, LangSmith, and broader reliability improvements.
2026-03-12 🚀 VolcEngine support, Telegram reply context, /restart, and sturdier memory.
2026-03-11 🔌 WeCom, Ollama, cleaner discovery, and safer tool behavior.
2026-03-10 🧠 Token-based memory, shared retries, and cleaner gateway and Telegram behavior.
2026-03-09 💬 Slack thread polish and better Feishu audio compatibility.
2026-03-08 🚀 Released v0.1.4.post4 — a reliability-packed release with safer defaults, better multi-instance support, sturdier MCP, and major channel and provider improvements. Please see release notes for details.
2026-03-07 🚀 Azure OpenAI provider, WhatsApp media, QQ group chats, and more Telegram/Feishu polish.
2026-03-06 🪄 Lighter providers, smarter media handling, and sturdier memory and CLI compatibility.
2026-03-05 ⚡️ Telegram draft streaming, MCP SSE support, and broader channel reliability fixes.
2026-03-04 🛠️ Dependency cleanup, safer file reads, and another round of test and Cron fixes.
2026-03-03 🧠 Cleaner user-message merging, safer multimodal saves, and stronger Cron guards.
2026-03-02 🛡️ Safer default access control, sturdier Cron reloads, and cleaner Matrix media handling.
2026-03-01 🌐 Web proxy support, smarter Cron reminders, and Feishu rich-text parsing improvements.
2026-02-28 🚀 Released v0.1.4.post3 — cleaner context, hardened session history, and smarter agent. Please see release notes for details.
2026-02-27 🧠 Experimental thinking mode support, DingTalk media messages, Feishu and QQ channel fixes.
2026-02-26 🛡️ Session poisoning fix, WhatsApp dedup, Windows path guard, Mistral compatibility.
2026-02-25 🧹 New Matrix channel, cleaner session context, auto workspace template sync.
2026-02-24 🚀 Released v0.1.4.post2 — a reliability-focused release with a redesigned heartbeat, prompt cache optimization, and hardened provider & channel stability. See release notes for details.
2026-02-23 🔧 Virtual tool-call heartbeat, prompt cache optimization, Slack mrkdwn fixes.
2026-02-22 🛡️ Slack thread isolation, Discord typing fix, agent reliability improvements.
2026-02-21 🎉 Released v0.1.4.post1 — new providers, media support across channels, and major stability improvements. See release notes for details.
2026-02-20 🐦 Feishu now receives multimodal files from users. More reliable memory under the hood.
2026-02-19 ✨ Slack now sends files, Discord splits long messages, and subagents work in CLI mode.
2026-02-18 ⚡️ nanobot now supports VolcEngine, MCP custom auth headers, and Anthropic prompt caching.
2026-02-17 🎉 Released v0.1.4 — MCP support, progress streaming, new providers, and multiple channel improvements. Please see release notes for details.
2026-02-16 🦞 nanobot now integrates a ClawHub skill — search and install public agent skills.
2026-02-15 🔑 nanobot now supports OpenAI Codex provider with OAuth login support.
2026-02-14 🔌 nanobot now supports MCP! See MCP section for details.
2026-02-13 🎉 Released v0.1.3.post7 — includes security hardening and multiple improvements. Please upgrade to the latest version to address security issues. See release notes for more details.
2026-02-12 🧠 Redesigned memory system — Less code, more reliable. Join the discussion about it!
2026-02-11 ✨ Enhanced CLI experience and added MiniMax support!
2026-02-10 🎉 Released v0.1.3.post6 with improvements! Check the updates notes and our roadmap.
2026-02-09 💬 Added Slack, Email, and QQ support — nanobot now supports multiple chat platforms!
2026-02-08 🔧 Refactored Providers—adding a new LLM provider now takes just 2 simple steps! Check here.
2026-02-07 🚀 Released v0.1.3.post5 with Qwen support & several key improvements! Check here for details.
2026-02-06 ✨ Added Moonshot/Kimi provider, Discord integration, and enhanced security hardening!
2026-02-05 ✨ Added Feishu channel, DeepSeek provider, and enhanced scheduled tasks support!
2026-02-04 🚀 Released v0.1.3.post4 with multi-provider & Docker support! Check here for details.
2026-02-03 ⚡ Integrated vLLM for local LLM support and improved natural language task scheduling!
2026-02-02 🎉 nanobot officially launched! Welcome to try 🐈 nanobot!

💡 Key Features of nanobot

Ultra-lightweight: stable long-running agent behavior with a small, readable core.
Research-ready: the codebase is intentionally simple enough to study, modify, and extend.
Practical: chat channels, API, memory, MCP, and deployment paths are already built in.
Hackable: you can start fast, then go deeper through repo docs instead of a monolithic landing page.

📦 Install

Important

If you want the newest features and experiments, install from source.

If you want the most stable day-to-day experience, install from PyPI or with uv.

Install from source

git clone https://github.com/HKUDS/nanobot.git
cd nanobot
pip install -e .

Install with uv

uv tool install nanobot-ai

Install from PyPI

pip install nanobot-ai

🚀 Quick Start

1. Initialize

nanobot onboard

2. Configure (~/.nanobot/config.json)

Configure these two parts in your config (other options have defaults). Add or merge the following blocks into your existing config instead of replacing the whole file.

Set your API key (e.g. OpenRouter, recommended for global users):

{
  "providers": {
    "openrouter": {
      "apiKey": "sk-or-v1-xxx"
    }
  }
}

Set your model (optionally pin a provider — defaults to auto-detection):

{
  "agents": {
    "defaults": {
      "provider": "openrouter",
      "model": "anthropic/claude-opus-4-6"
    }
  }
}

3. Chat

nanobot agent

Want different LLM providers, web search, MCP, security settings, or more config options? See Configuration
Want to run nanobot in chat apps like Telegram, Discord, WeChat or Feishu? See Chat Apps
Want Docker or Linux service deployment? See Deployment

🧪 WebUI (Development)

Note

The WebUI development workflow currently requires a source checkout and is not yet shipped together with the official packaged release. See WebUI Document for full WebUI development docs and build steps.

1. Enable the WebSocket channel in ~/.nanobot/config.json

{ "channels": { "websocket": { "enabled": true } } }

2. Start the gateway

nanobot gateway

3. Start the webui dev server

cd webui
bun install
bun run dev

🏗️ Architecture

🐈 nanobot stays lightweight by centering everything around a small agent loop: messages come in from chat apps, the LLM decides when tools are needed, and memory or skills are pulled in only as context instead of becoming a heavy orchestration layer. That keeps the core path readable and easy to extend, while still letting you add channels, tools, memory, and deployment options without turning the system into a monolith.

✨ Features

📈 24/7 Real-Time Market Analysis	🚀 Full-Stack Software Engineer	📅 Smart Daily Routine Manager	📚 Personal Knowledge Assistant

Discovery • Insights • Trends	Develop • Deploy • Scale	Schedule • Automate • Organize	Learn • Memory • Reasoning

📚 Docs

Browse the repo docs for the latest features and GitHub development version, or visit nanobot.wiki for the stable release documentation.

Talk to your nanobot with familiar chat apps: Chat Apps
Configure providers, web search, MCP, and runtime behavior: Configuration
Integrate nanobot with local tools and automations: OpenAI-Compatible API · Python SDK
Run nanobot with Docker or as a Linux service: Deployment

🤝 Contribute & Roadmap

PRs welcome! The codebase is intentionally small and readable. 🤗

Branching Strategy

Branch	Purpose
`main`	Stable releases — bug fixes and minor improvements
`nightly`	Experimental features — new features and breaking changes

Unsure which branch to target? See CONTRIBUTING.md for details.

Roadmap — Pick an item and open a PR!

Multi-modal — See and hear (images, voice, video)
Long-term memory — Never forget important context
Better reasoning — Multi-step planning and reflection
More integrations — Calendar and more
Self-improvement — Learn from feedback and mistakes

Contact

This project was started by Xubin Ren as a personal open-source project and continues to be maintained in an individual capacity using personal resources, with contributions from the open-source community. Feel free to contact xubinrencs@gmail.com for questions, ideas, or collaboration.

Contributors

⭐ Star History

Thanks for visiting ✨ nanobot!

Languages

Python 84.9%

TypeScript 14.5%

Shell 0.2%

CSS 0.1%

HTML 0.1%

README.md Unescape Escape

📢 News

💡 Key Features of nanobot

📦 Install

🚀 Quick Start

🧪 WebUI (Development)

🏗️ Architecture

✨ Features

📚 Docs

🤝 Contribute & Roadmap

Branching Strategy

Contact

Contributors

⭐ Star History

README.md