nanobot/docs/openai-api.md
chengyongru 4a58b83acc
docs: make onboarding friendlier for beginners (#4177)
2026-06-10 00:36:22 +08:00


OpenAI-Compatible API

nanobot can expose a minimal OpenAI-compatible endpoint for local integrations:

python -m pip install "nanobot-ai[api]"
nanobot agent -m "Hello!"
nanobot serve

Run the CLI check first: if nanobot agent -m "Hello!" fails, fix the provider or config setup before debugging the API server. By default, the API binds to 127.0.0.1:8900. You can change this bind address in config.json.

For setup help, see quick-start.md, providers.md, and troubleshooting.md.

Behavior

  • Session isolation: pass "session_id" in the request body to isolate conversations; omit it to use the shared default session (api:default)
  • Single-message input: each request must contain exactly one user message
  • Fixed model: omit model, or pass the same model shown by /v1/models
  • Streaming: set stream=true to receive Server-Sent Events (text/event-stream) with OpenAI-compatible delta chunks, terminated by data: [DONE]; omit or set stream=false for a single JSON response
  • File uploads: supports images, PDF, Word (.docx), Excel (.xlsx), and PowerPoint (.pptx) files via JSON base64 or multipart/form-data (max 10 MB per file)
  • API requests run in the synthetic api channel, so the message tool does not automatically deliver to Telegram/Discord/etc. To proactively send to another chat, call message with an explicit channel and chat_id for an enabled channel.

Example tool call for cross-channel delivery from an API session:

{
  "content": "Build finished successfully.",
  "channel": "telegram",
  "chat_id": "123456789"
}

If channel names a channel that is not enabled in your config, nanobot queues the outbound event, but nothing is delivered to any platform.
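The request rules listed under Behavior can be captured in a small helper. This is a minimal sketch, not part of nanobot: `build_chat_payload` is a hypothetical name, and it uses only the request fields described above (messages, session_id, stream).

```python
from typing import Any, Optional


def build_chat_payload(
    text: str,
    session_id: Optional[str] = None,
    stream: bool = False,
) -> dict[str, Any]:
    """Build a JSON body with exactly one user message (hypothetical helper)."""
    payload: dict[str, Any] = {
        # The API expects exactly one user message per request.
        "messages": [{"role": "user", "content": text}],
    }
    if session_id is not None:
        # Omitting session_id falls back to the shared api:default session.
        payload["session_id"] = session_id
    if stream:
        # stream=true switches the response to Server-Sent Events.
        payload["stream"] = True
    # "model" is left out on purpose, so the server uses its fixed model.
    return payload
```

The returned dictionary can be passed as the JSON body of a POST to /v1/chat/completions.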

Endpoints

  • GET /health
  • GET /v1/models
  • POST /v1/chat/completions
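Before sending chat requests, a script can poll GET /health until the server is up. The sketch below uses only the standard library and assumes that /health answers with HTTP 200 once nanobot serve is ready; the response body is not documented here, so it is ignored. The names `is_healthy` and `wait_until_healthy` are illustrative.

```python
import time
import urllib.request
from typing import Callable

BASE_URL = "http://127.0.0.1:8900"  # default bind address


def is_healthy(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Return True if GET /health answers with HTTP 200 (assumed success code)."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False


def wait_until_healthy(
    probe: Callable[[], bool] = is_healthy,
    attempts: int = 10,
    delay: float = 0.5,
) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` with the defaults waits roughly five seconds in total before giving up.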

curl

curl http://127.0.0.1:8900/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "hi"}],
    "session_id": "my-session"
  }'

File Upload (JSON base64)

Send images inline using the OpenAI multimodal content format:

curl http://127.0.0.1:8900/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": [
      {"type": "text", "text": "Describe this image"},
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBOR..."}}
    ]}]
  }'
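The same request body can be assembled in Python. This is a minimal sketch of building the data URL shown in the curl example; `image_part` and `image_message` are hypothetical helper names, and the MIME type must be supplied by the caller.

```python
import base64


def image_part(data: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes as an OpenAI-style image_url content part."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def image_message(prompt: str, data: bytes, mime_type: str = "image/png") -> dict:
    """Build one user message that pairs a text prompt with one image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            image_part(data, mime_type),
        ],
    }
```

For example, `image_message("Describe this image", Path("chart.png").read_bytes())` (with `Path` from `pathlib`) produces the single entry for the `messages` list.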

File Upload (multipart/form-data)

Upload any supported file type (images, PDF, Word, Excel, PowerPoint) via multipart:

# Single file
curl http://127.0.0.1:8900/v1/chat/completions \
  -F "message=Summarize this report" \
  -F "files=@report.docx"

# Multiple files with session isolation
curl http://127.0.0.1:8900/v1/chat/completions \
  -F "message=Compare these files" \
  -F "files=@chart.png" \
  -F "files=@data.xlsx" \
  -F "session_id=my-session"

Supported file types:

  • Images: PNG, JPEG, GIF, WebP (sent to the model as base64 for vision analysis)
  • Documents: PDF, Word (.docx), Excel (.xlsx), PowerPoint (.pptx) (text is extracted and sent to the model)
  • Text: TXT, Markdown, CSV, JSON, etc. (read directly)
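The multipart upload can also be sent from Python. The sketch below mirrors the curl form fields (message, files, session_id) and adds a client-side size check. Several details are assumptions: the helper names are hypothetical, "10MB" is read conservatively as 10,000,000 bytes, and the reply is expected to be the same JSON shape as a normal chat completion.

```python
from contextlib import ExitStack
from pathlib import Path
from typing import Optional

API_URL = "http://127.0.0.1:8900/v1/chat/completions"
# Assumption: the 10MB per-file limit is treated as 10,000,000 bytes.
MAX_FILE_BYTES = 10_000_000


def oversized(sizes: dict[str, int], limit: int = MAX_FILE_BYTES) -> list[str]:
    """Return the names whose size in bytes exceeds the limit."""
    return [name for name, size in sizes.items() if size > limit]


def upload(message: str, paths: list[str], session_id: Optional[str] = None) -> dict:
    """POST a message plus files as multipart/form-data and return the JSON reply."""
    import requests  # imported lazily so oversized() works without requests

    too_big = oversized({p: Path(p).stat().st_size for p in paths})
    if too_big:
        raise ValueError(f"files over the per-file limit: {too_big}")

    fields = {"message": message}
    if session_id is not None:
        fields["session_id"] = session_id

    with ExitStack() as stack:
        # Repeat the "files" field once per file, as the curl example does.
        files = [
            ("files", (Path(p).name, stack.enter_context(open(p, "rb"))))
            for p in paths
        ]
        resp = requests.post(API_URL, data=fields, files=files, timeout=120)
    resp.raise_for_status()
    return resp.json()
```

A call such as `upload("Compare these files", ["chart.png", "data.xlsx"], session_id="my-session")` corresponds to the second curl command above.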

Python (requests)

import requests

resp = requests.post(
    "http://127.0.0.1:8900/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "hi"}],
        "session_id": "my-session",  # optional: isolate conversation
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
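Streaming responses can be consumed with requests as well. This is a sketch under stated assumptions: each event arrives as a `data:` line carrying an OpenAI-style chunk with `choices[].delta.content`, and the stream ends with `data: [DONE]`, as described under Behavior. `iter_content` and `stream_chat` are illustrative names.

```python
import json
from typing import Iterable, Iterator, Optional

API_URL = "http://127.0.0.1:8900/v1/chat/completions"


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from SSE lines until the data: [DONE] marker."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Assumed OpenAI chunk shape: choices[].delta.content
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_chat(text: str, session_id: Optional[str] = None) -> Iterator[str]:
    """POST with stream=true and yield the reply piece by piece."""
    import requests  # imported lazily so iter_content() works without requests

    payload = {"messages": [{"role": "user", "content": text}], "stream": True}
    if session_id is not None:
        payload["session_id"] = session_id
    with requests.post(API_URL, json=payload, stream=True, timeout=120) as resp:
        resp.raise_for_status()
        resp.encoding = "utf-8"  # decode event lines as UTF-8
        yield from iter_content(resp.iter_lines(decode_unicode=True))
```

To print the reply as it arrives, loop over the generator: `for piece in stream_chat("hi"): print(piece, end="", flush=True)`.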

Python (openai)

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8900/v1",
    api_key="dummy",  # placeholder; the OpenAI client requires some api_key value
)

resp = client.chat.completions.create(
    model="MiniMax-M2.7",  # must match the model shown by /v1/models
    messages=[{"role": "user", "content": "hi"}],
    extra_body={"session_id": "my-session"},  # optional: isolate conversation
)
print(resp.choices[0].message.content)
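Because the server uses a fixed model, one way to avoid hard-coding the model name is to read it from /v1/models first. The sketch below assumes the endpoint returns the usual OpenAI model list, so `client.models.list().data[0].id` is the served model; `served_model` and `ask` are hypothetical helpers that take an already constructed client.

```python
from typing import Optional


def served_model(client) -> str:
    """Return the id of the first model listed by GET /v1/models."""
    return client.models.list().data[0].id


def ask(client, text: str, session_id: Optional[str] = None) -> str:
    """Send one user message using whichever model the server advertises."""
    extra = {"session_id": session_id} if session_id is not None else None
    resp = client.chat.completions.create(
        model=served_model(client),
        messages=[{"role": "user", "content": text}],
        extra_body=extra,  # session_id is not a standard OpenAI field
    )
    return resp.choices[0].message.content


# Usage with the client from the example above:
#   client = OpenAI(base_url="http://127.0.0.1:8900/v1", api_key="dummy")
#   print(ask(client, "hi", session_id="my-session"))
```

Looking the model up on every call costs an extra request; caching the result once per process is a reasonable refinement.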