diff --git a/docs/image-generation.md b/docs/image-generation.md
index cd1ac2c89..dc6f270d2 100644
--- a/docs/image-generation.md
+++ b/docs/image-generation.md
@@ -48,6 +48,26 @@ AIHubMix example:
 }
 ```
 
+MiniMax example:
+
+```json
+{
+  "providers": {
+    "minimax": {
+      "apiKey": "${MINIMAX_API_KEY}"
+    }
+  },
+  "tools": {
+    "imageGeneration": {
+      "enabled": true,
+      "provider": "minimax",
+      "model": "image-01",
+      "defaultAspectRatio": "1:1"
+    }
+  }
+}
+```
+
 Gemini example (Imagen 4):
 
 ```json
@@ -91,7 +111,7 @@ The WebUI hides provider storage details from the user. The agent sees the saved
 | Option | Type | Default | Description |
 |--------|------|---------|-------------|
 | `tools.imageGeneration.enabled` | boolean | `false` | Register the `generate_image` tool |
-| `tools.imageGeneration.provider` | string | `"openrouter"` | Image provider name. Supported values: `openrouter`, `aihubmix`, `gemini` |
+| `tools.imageGeneration.provider` | string | `"openrouter"` | Image provider name. Supported values: `openrouter`, `aihubmix`, `minimax`, `gemini` |
 | `tools.imageGeneration.model` | string | `"openai/gpt-5.4-image-2"` | Provider model name |
 | `tools.imageGeneration.defaultAspectRatio` | string | `"1:1"` | Default ratio when the prompt/tool call does not specify one |
 | `tools.imageGeneration.defaultImageSize` | string | `"1K"` | Default size hint, for example `1K`, `2K`, `4K`, or `1024x1024` |
@@ -161,6 +181,28 @@ Configure:
 
 `quality: low` is optional. It can make free image models faster and less likely to time out, but it is not required for correctness.
 
+### MiniMax
+
+MiniMax `image-01` supports text-to-image and reference-image (subject reference) edits. Supported aspect ratios are `1:1`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, and `21:9`.
+
+```json
+{
+  "providers": {
+    "minimax": {
+      "apiKey": "${MINIMAX_API_KEY}"
+    }
+  },
+  "tools": {
+    "imageGeneration": {
+      "enabled": true,
+      "provider": "minimax",
+      "model": "image-01",
+      "defaultAspectRatio": "1:1"
+    }
+  }
+}
+```
+
 ### Gemini
 
 nanobot supports two Gemini image generation model families via Google's Generative Language API:
@@ -245,7 +287,7 @@ Use the reference image. Keep the same robot and composition, change the palette
 |---------|-------|
 | `generate_image` is not available | Set `tools.imageGeneration.enabled` to `true` and restart the gateway |
 | Missing API key error | Configure `providers..apiKey`; if using `${VAR_NAME}`, confirm the environment variable is visible to the gateway process |
-| `unsupported image generation provider` | Use `openrouter`, `aihubmix`, or `gemini` |
+| `unsupported image generation provider` | Use `openrouter`, `aihubmix`, `minimax`, or `gemini` |
 | AIHubMix says `Incorrect model ID` | Use `model: "gpt-image-2-free"`; nanobot expands it to the required `openai/gpt-image-2-free` model path internally |
 | Generation times out | Try a smaller/default image size, set AIHubMix `extraBody.quality` to `"low"`, or retry later |
 | Reference image rejected | Reference image paths must be inside the workspace or nanobot media directory and must be valid image files |
diff --git a/nanobot/skills/image-generation/SKILL.md b/nanobot/skills/image-generation/SKILL.md
index f0309e68b..0559651f6 100644
--- a/nanobot/skills/image-generation/SKILL.md
+++ b/nanobot/skills/image-generation/SKILL.md
@@ -42,73 +42,6 @@ For follow-up edits, pass the prior artifact `path` to `reference_images`. If th
 
 Do not include internal replay markers such as `[Message Time: ...]`, `[image: /local/path]`, `generate_image(...)`, or `message(...)` in user-facing replies.
 
-## Provider Notes
-
-Do not ask users to paste API keys into chat. If configuration is needed, describe the fields; LLM provider and BYOK changes are hot-reloaded for new turns.
-
-For OpenRouter, the image tool expects:
-
-```json
-{
-  "providers": {
-    "openrouter": {
-      "apiKey": "sk-or-..."
-    }
-  },
-  "tools": {
-    "imageGeneration": {
-      "enabled": true,
-      "provider": "openrouter",
-      "model": "openai/gpt-5.4-image-2"
-    }
-  }
-}
-```
-
-For AIHubMix, the image tool expects:
-
-```json
-{
-  "providers": {
-    "aihubmix": {
-      "apiKey": "sk-..."
-    }
-  },
-  "tools": {
-    "imageGeneration": {
-      "enabled": true,
-      "provider": "aihubmix",
-      "model": "gpt-image-2-free"
-    }
-  }
-}
-```
-
-AIHubMix `gpt-image-2-free` uses AIHubMix's unified predictions endpoint internally (`/v1/models/openai/gpt-image-2-free/predictions`), not the OpenAI Images `/v1/images/generations` endpoint. If it fails with "Incorrect model ID", do not assume the key lacks permission until the provider config, model name, and gateway restart have been checked.
-
-`providers.aihubmix.extraBody` can be used for provider-specific options. For example, `"extraBody": {"quality": "low"}` is optional but can make `gpt-image-2-free` faster and less likely to time out.
-
-For Gemini, the image tool supports two model families. Imagen 4 (`imagen-4.0-generate-001`) supports text-to-image only. Gemini Flash (`gemini-2.5-flash-image`) also supports reference-image edits. Configuration:
-
-```json
-{
-  "providers": {
-    "gemini": {
-      "apiKey": "AIza..."
-    }
-  },
-  "tools": {
-    "imageGeneration": {
-      "enabled": true,
-      "provider": "gemini",
-      "model": "imagen-4.0-generate-001"
-    }
-  }
-}
-```
-
-For Gemini models, `defaultImageSize` has no effect; use `defaultAspectRatio` instead. Imagen 4 supports `1:1`, `9:16`, `16:9`, `3:4`, and `4:3`.
-
 ## Examples
 
 Generate a new image: