nanobot/docs/concepts.md
chengyongru 4a58b83acc
docs: make onboarding friendlier for beginners (#4177)
* docs: make onboarding friendlier for beginners

* docs: build clearer documentation paths

Maintainer edit: turn the onboarding follow-up into a layered docs structure for first-time setup, provider selection, troubleshooting, CLI reference, and source-level architecture. This keeps quick start focused while giving advanced users precise reference paths.

* docs: render architecture flow with mermaid

Maintainer edit: replace the ASCII architecture sketch with a GitHub-rendered Mermaid flowchart so the core runtime path is easier to scan in the PR and README docs.

* docs: recommend model presets for model config

Maintainer edit: make named modelPresets the primary model configuration path and expand fallback preset examples so string fallbacks are clearly preset names, not raw model IDs.

* docs: document api base urls and langfuse setup

Maintainer edit: explain when users need apiBase/base URL in quick start and provider docs, and add Langfuse tracing setup with troubleshooting links.

* docs: use python module pip consistently

Maintainer edit: keep install commands tied to the active Python interpreter by using python -m pip in the Azure optional dependency notes too.

* docs: add non-technical getting started path

Maintainer edit: add a wizard-first guide for users without terminal or JSON background, including a text TUI menu example and links from the main docs entrypoints.

* docs: avoid hard-wrapped prose in user docs

Maintainer edit: unwrap ordinary prose across user-facing documentation while preserving markdown structure, code blocks, tables, lists, and prompt/template files.

* docs: keep desktop list continuations nested

Maintainer edit: preserve list nesting after unwrapping prose in the desktop WebUI sync guide.

* docs: add one-command installer

Maintainer edit: add auditable macOS/Linux and Windows install scripts that install nanobot-ai and start the onboarding wizard, then document the commands in the main onboarding entrypoints.

* docs: add installer dry run mode

Maintainer edit: add --dry-run to the one-command installer scripts so users can preview Python detection, install source, pip command, and wizard behavior without changing their environment.

* docs: clean installer error output

Maintainer edit: make PowerShell installer failures print a concise Error: message instead of Write-Error call-site details.

* docs: add provider setup cookbook

Maintainer edit: add pasteable provider recipes for common hosted, local, fallback, runtime switching, and Langfuse setups, then link the cookbook from onboarding and troubleshooting entrypoints.

* docs: address review feedback

* docs: clarify reader paths

* docs: explain terminal basics for beginners

* docs: clarify wizard navigation

* docs: avoid duplicate onboarding steps

* docs: add setup status check

* docs: explain status output

* docs: remove provider recommendation wording

* docs: explain status diagnostics

* docs: reduce hard-wrapped guidance

* docs: migrate config examples to presets

* docs: clarify python command fallbacks

* docs: improve installer failure recovery

* docs: expand install troubleshooting

* docs: cover installer download failures

* docs: put stable install paths first

* docs: add bundled webui quick path

* docs: clarify provider-neutral setup

* docs: clarify gateway setup for chat surfaces

* docs: improve docs navigation paths

* docs: add configuration quick jump

* docs: clarify provider secret variables

* chore: request PR review acknowledgement

Empty commit: please read the PR review comments and reply on the PR to confirm that you have received them.

This commit intentionally changes no files; it exists only to notify the remote Codex run so it can end its active goal.

* docs: add README start here guide

* docs: avoid provider recommendation wording

* docs: guide next steps after first reply

* docs: explain merging JSON snippets

* docs: add CLI command chooser

* docs: add configuration task map

* docs: add deployment readiness guide

* docs: simplify WebUI entry paths

* docs: add provider recipe chooser

* docs: fix provider factual references

Update OpenRouter and LongCat model examples, align Bedrock guidance, and make fallback snippets schema-valid.

Also correct group policy wording and image-generation provider lists to match the current code.

* fix: keep PowerShell installer from closing caller shell

* docs: mention self-guided configuration
2026-06-10 00:36:22 +08:00

152 lines
7.5 KiB
Markdown

# Concepts
Use this page when you want to understand nanobot before changing advanced settings. It explains the moving parts without requiring you to read the source first.
If you want source-file ownership and extension points, read [`architecture.md`](./architecture.md) after this page.
## Runtime Shape
nanobot has one small core loop and several ways to enter it:
| Part | What it does |
|---|---|
| Agent loop | Builds context, selects the session, calls the provider, runs tools, and publishes replies |
| Providers | LLM backends such as OpenRouter, Anthropic, OpenAI, Bedrock, Ollama, vLLM, and other OpenAI-compatible APIs |
| Channels | User-facing transports such as CLI, WebUI/WebSocket, Telegram, Discord, Slack, Feishu, WeChat, Email, and others |
| Tools | Capabilities the model may call, including files, shell, web search/fetch, MCP, cron, image generation, and subagents |
| Memory | Workspace files and session history that keep useful context across turns |
| Gateway | Long-running process that connects enabled channels and serves the health endpoint |
The simplest path is `nanobot agent -m "Hello!"`: one inbound message goes through the agent loop and prints the reply in your terminal. The long-running path is `nanobot gateway`: channels receive messages from chat apps or the WebUI, publish them to the same agent loop, and send replies back to the originating channel.
## Config vs Workspace
The default instance lives under `~/.nanobot/`:
| Path | Meaning |
|---|---|
| `~/.nanobot/config.json` | Instance configuration: providers, model defaults, channels, tools, gateway, API, and runtime options |
| `~/.nanobot/workspace/` | Agent workspace: memory, sessions, heartbeat tasks, cron jobs, skills, and generated artifacts |
You can override both with command flags:
```bash
nanobot onboard --config ./bot-a/config.json --workspace ./bot-a/workspace
nanobot agent --config ./bot-a/config.json --workspace ./bot-a/workspace -m "Hello"
nanobot gateway --config ./bot-a/config.json --workspace ./bot-a/workspace
```
The config file controls what nanobot may use. The workspace is where nanobot keeps state for that instance.
## Config Format
`config.json` accepts both camelCase and snake_case keys. The docs use camelCase because nanobot writes config back to disk with camelCase aliases, for example `apiKey`, `modelPresets`, `intervalS`, and `maxToolResultChars`.
Most examples are partial snippets. Merge them into the existing file created by `nanobot onboard`; do not replace the whole file unless you want to reset the instance.
## One Agent Turn
A normal turn follows this flow:
1. A channel receives a user message and publishes it to the message bus.
2. The agent loop chooses a session key and builds context from the workspace, skills, memory, recent messages, channel metadata, and runtime settings.
3. The provider receives the model request.
4. If the model asks for tools, the runner executes them and feeds results back to the model.
5. The final reply is saved to the session and sent back through the channel.
That flow is the same whether the message starts in the CLI, WebUI, Telegram, Discord, or another channel.
## CLI, Gateway, API, and WebUI
| Entry point | Command | Use it for |
|---|---|---|
| CLI one-shot | `nanobot agent -m "..."` | First-run checks, scripts, and quick local questions |
| CLI interactive | `nanobot agent` | Terminal chat with persistent session history |
| Gateway | `nanobot gateway` | Chat apps, WebUI, heartbeat, Dream, and long-running service mode |
| OpenAI-compatible API | `nanobot serve` | Programmatic access through `/v1/chat/completions` |
| WebUI | `nanobot gateway` plus WebSocket channel | Browser workbench served by the WebSocket channel on port `8765` |
The gateway health endpoint is on `gateway.port` (`18790` by default). The browser WebUI is served by the WebSocket channel (`8765` by default), not by the health endpoint.
## Provider and Model Selection
The active model should normally come from a named `modelPresets` entry selected by `agents.defaults.modelPreset`. Direct `agents.defaults.provider` and `agents.defaults.model` still form the implicit `default` preset for older or minimal configs. The active provider is resolved in this order:
1. If the active preset provider or implicit default provider is not `"auto"`, nanobot uses that provider.
2. If provider is `"auto"`, nanobot tries to infer the provider from the model name, configured API keys, local provider base URLs, or gateway providers.
3. OAuth providers such as OpenAI Codex and GitHub Copilot require explicit login and explicit provider/model selection inside the active preset.
Pin the provider inside the preset when setting up for the first time. It is easier to debug:
```json
{
"modelPresets": {
"primary": {
"provider": "openrouter",
"model": "anthropic/claude-opus-4.5"
}
},
"agents": {
"defaults": {
"modelPreset": "primary"
}
}
}
```
See [`providers.md`](./providers.md) for practical examples and [`configuration.md#providers`](./configuration.md#providers) for the full provider reference.
## Channels and Sessions
Each channel maps inbound messages to a session key. That lets independent conversations keep separate history. The WebUI also supports multiple chats and workspace-scoped metadata for project workspaces.
`agents.defaults.unifiedSession` can intentionally share one session across channels for a single-user multi-device setup. Leave it off if you expect separate people, groups, channels, or projects to keep separate context.
## Memory, Sessions, and Dream
nanobot uses two related stores:
| Store | Location | Purpose |
|---|---|---|
| Sessions | `<workspace>/sessions/*.jsonl` | Recent conversation turns replayed into context |
| Memory | `<workspace>/memory/MEMORY.md` and `<workspace>/memory/history.jsonl` | Long-term facts and consolidated history |
Dream is a periodic consolidation job. It reads accumulated history and updates workspace memory so useful context can survive beyond short session replay.
See [`memory.md`](./memory.md) for the detailed design.
## Tools and Safety
Tools are discovered automatically from built-in modules and plugin entry points. Common tool groups include:
- file read/write/edit and patching;
- shell execution with configurable sandboxing;
- web search and web fetch with SSRF checks;
- MCP servers;
- cron reminders and heartbeat tasks;
- image generation;
- subagents and runtime self-inspection.
Security-sensitive controls live in [`configuration.md#security`](./configuration.md#security). For production or shared chat apps, also configure channel access controls such as `allowFrom`, pairing, or WebSocket tokens.
## Background Jobs
When `nanobot gateway` starts, it creates workspace-scoped cron storage at `<workspace>/cron/jobs.json` and registers system jobs:
- `dream`, when `agents.defaults.dream.enabled` is true;
- `heartbeat`, when `gateway.heartbeat.enabled` is true.
Heartbeat reads `<workspace>/HEARTBEAT.md`. If the file has tasks under `## Active Tasks`, nanobot executes them and sends useful results to the most recently active chat target.
User-created reminders use the same cron service but are not the same as the protected heartbeat system job.
## Where to Go Next
| Need | Read |
|---|---|
| First working install | [`quick-start.md`](./quick-start.md) |
| Provider/model setup | [`providers.md`](./providers.md) |
| Chat app setup | [`chat-apps.md`](./chat-apps.md) |
| Complete config reference | [`configuration.md`](./configuration.md) |
| Runtime debugging | [`troubleshooting.md`](./troubleshooting.md) |