The iLink sendmessage API frequently returns ret=-2 (parameter error / rate
limit / expired token) even when HTTP status is 200. The openclaw reference
plugin ignores the JSON body for sendmessage entirely and only checks HTTP
status. Our previous strict ret checking turned ret=-2 into RuntimeError,
triggering ChannelManager retries, which only made things worse.
Changes:
- _send_text: swallow ret=-2 after one retry without context_token.
Log request body + response at warning level for diagnostics.
- _send_media_file: same ret=-2 swallowing.
- _generate_client_id: change format to ``nanobot:{timestamp}-{hex}`` to
match openclaw-weixin ``{prefix}:{Date.now()}-{hex}``.
- Update tests to expect swallowing instead of raising for ret=-2.
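
The new client ID shape can be sketched as below. This is a minimal illustration, not the actual `_generate_client_id` body: the millisecond timestamp mirrors JavaScript's `Date.now()`, and the 8-character hex suffix is an assumption, since the suffix length is not stated here.

```python
import secrets
import time


def generate_client_id(prefix: str = "nanobot") -> str:
    """Build an ID shaped like ``{prefix}:{timestamp}-{hex}``."""
    # Date.now() returns milliseconds since the epoch, so use ms here too.
    timestamp_ms = int(time.time() * 1000)
    # Assumption: 4 random bytes (8 hex characters) for the suffix.
    suffix = secrets.token_hex(4)
    return f"{prefix}:{timestamp_ms}-{suffix}"


client_id = generate_client_id()
```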
When the iLink API returns ret=-2 (parameter error), the cause is often
an expired context_token rather than a malformed payload. After a
gateway restart, the cached token can become stale within ~90 seconds if
no new inbound message refreshes it, causing all outbound replies to fail
silently.
Changes:
- _send_text: retry once without context_token when ret=-2 and a token
was present; if the retry succeeds, clear the expired token from cache.
- Remove leftover @staticmethod on _check_response_error so self.logger
and the body parameter work correctly.
- Bump WEIXIN_CHANNEL_VERSION from 2.1.1 to 2.1.7 to match the reference
openclaw-weixin plugin.
- Add tests covering the ret=-2 retry path, failure path, and no-token
path.
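
The retry path described above might look roughly like the following sketch. The `post` callable, the `token_cache` dict, and the `to`/`text` body fields are hypothetical stand-ins for the real HTTP call, token cache, and payload; only the ret=-2 handling follows the description.

```python
from typing import Any, Callable

RET_PARAM_ERROR = -2


def send_text_with_retry(
    post: Callable[[dict[str, Any]], dict[str, Any]],
    token_cache: dict[str, str],
    user_id: str,
    text: str,
) -> dict[str, Any]:
    """Send once; on ret=-2 with a cached token, retry once without it."""
    token = token_cache.get(user_id)
    body: dict[str, Any] = {"to": user_id, "text": text}  # hypothetical fields
    if token:
        body["context_token"] = token

    data = post(body)
    if data.get("ret") == RET_PARAM_ERROR and token:
        # Retry exactly once with the token stripped from the body.
        retry_body = {k: v for k, v in body.items() if k != "context_token"}
        data = post(retry_body)
        if data.get("ret", 0) == 0:
            # The retry worked, so the cached token was stale: drop it.
            token_cache.pop(user_id, None)

    if data.get("ret", 0) != 0:
        raise RuntimeError(f"sendmessage failed: ret={data.get('ret')}")
    return data
```

When no token is cached, the first ret=-2 is raised directly, because there is nothing to strip for a second attempt.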
References:
- openclaw/openclaw#61174 (context_token expiry after long agent turns)
- hermes-agent#21011 (ret=-2 rate limiting / parameter error)
The iLink API signals failures through either `ret` or `errcode`.
`_poll_once` already checked both, but `_send_text` and `_send_media_file`
only checked `errcode`. When the API returned `ret != 0` with
`errcode == 0`, the send appeared successful but the message was never
delivered, causing the "still losing messages" issue.
- Add `_check_response_error` helper that validates both fields
- Use it in `_send_text` and `_send_media_file`
- Add debug log after successful text send for observability
- Add test for nonzero ret with zero errcode
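
A helper that validates both fields could be sketched as follows. Treating a missing field as 0 and reading an `errmsg` field are assumptions made for illustration; the real `_check_response_error` may differ.

```python
from typing import Any


def check_response_error(data: dict[str, Any], action: str) -> None:
    """Raise if either ``ret`` or ``errcode`` signals a failure."""
    # Assumption: an absent field means "no error" and is treated as 0.
    ret = data.get("ret", 0)
    errcode = data.get("errcode", 0)
    if ret != 0 or errcode != 0:
        errmsg = data.get("errmsg", "")  # hypothetical field name
        raise RuntimeError(
            f"{action} failed: ret={ret} errcode={errcode} errmsg={errmsg!r}"
        )
```

With this shape, a body of `{"ret": -2, "errcode": 0}` raises instead of being reported as a successful send.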
Refs: previous inbound fix (suppress -> explicit try/except)
Replace `with suppress(Exception)` in `_poll_once` message processing
and the `start()` poll loop with explicit `try/except` blocks that
log errors via `logger.exception`. Previously, any exception during
message processing (e.g. in `_handle_message`) was swallowed silently,
causing inbound messages to disappear without a trace.
Also add tests verifying that:
- `_poll_once` logs and continues when `_process_message` fails
- the poll loop logs and continues when `_poll_once` fails
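
The change from `suppress(Exception)` to an explicit handler can be illustrated with this small sketch. `process_batch` and its `handle` callback are hypothetical names, not the channel's actual methods; the point is that `logger.exception` records the traceback while the loop keeps going.

```python
import logging
from typing import Any, Callable, Iterable

logger = logging.getLogger("weixin_sketch")


def process_batch(
    messages: Iterable[dict[str, Any]],
    handle: Callable[[dict[str, Any]], None],
) -> int:
    """Handle each message; log failures with a traceback and continue."""
    handled = 0
    for msg in messages:
        try:
            handle(msg)
            handled += 1
        except Exception:
            # Unlike suppress(Exception), this leaves a trace at ERROR level.
            logger.exception("failed to process message %r", msg.get("id"))
    return handled
```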
_send_text() swallowed API errors (non-zero errcode) with just a
warning log, and send() had three silent return paths (no client,
session paused, no context_token). Neither triggered ChannelManager's
retry logic, causing persistent message loss until a new inbound
message refreshed the context_token.
Now all failure paths raise RuntimeError, matching BaseChannel's
contract and enabling proper retry behavior.
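
As a rough sketch of the raise-on-failure contract, the three formerly silent return paths become guard clauses that raise. The function signature and the `deliver` callback are hypothetical simplifications of the real `send()`.

```python
from typing import Callable, Optional


def send(
    client: Optional[object],
    session_paused: bool,
    context_token: Optional[str],
    deliver: Callable[[str, str], None],
    text: str,
) -> None:
    """Raise on every failure path so the caller's retry logic can see it."""
    if client is None:
        raise RuntimeError("client not initialised")
    if session_paused:
        raise RuntimeError("session paused")
    if not context_token:
        raise RuntimeError("no context_token")
    deliver(context_token, text)
```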
Audited all channel implementations for overly broad exception handling
that causes retry amplification or silent message loss during network
errors. This is the same class of bug as #3050 (Telegram _send_text).
Fixes by channel:
Telegram (send_delta):
- _stream_end path used except Exception for HTML edit fallback
- Network errors (TimedOut, NetworkError) triggered a redundant plain-text
edit, doubling connection demand during pool exhaustion
- Changed to except BadRequest, matching the _send_text fix
Discord:
- send() caught all exceptions without re-raising
- ChannelManager._send_with_retry() saw successful return, never retried
- Messages silently dropped on any send failure
- Added raise after error logging
DingTalk:
- _send_batch_message() returned False on all exceptions, including
network errors, so nothing was retried and fallback text was sent
unnecessarily
- _read_media_bytes() and _upload_media() swallowed transport errors,
causing _send_media_ref() to cascade through doomed fallback attempts
- Added except httpx.TransportError handlers that re-raise immediately
WeChat:
- Media send failure triggered text fallback even for network errors
- During network issues: 3 attempts × (media + text) = 6 API calls per message
- Added specific catches: TimeoutException/TransportError and 5xx
HTTPStatusError re-raise; 4xx HTTPStatusError falls back to text
QQ:
- _send_media() returned False on all exceptions
- Network errors triggered fallback text instead of retry
- Added except (aiohttp.ClientError, OSError) that re-raises
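
The WeChat rule above (re-raise transport errors and 5xx, fall back to text only on 4xx) can be sketched as follows. To keep the example dependency-free, `TransportError` and `HTTPStatusError` here are local stand-ins for the httpx exceptions of the same names, and `send_media_with_fallback` is a hypothetical helper, not the channel's actual method.

```python
from typing import Callable


class TransportError(Exception):
    """Stand-in for a network-level failure (timeouts, connection errors)."""


class HTTPStatusError(Exception):
    """Stand-in for an HTTP error response carrying a status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def send_media_with_fallback(
    send_media: Callable[[], None],
    send_text: Callable[[str], None],
    caption: str,
) -> str:
    """Return "media" or "text" depending on which send went through."""
    try:
        send_media()
        return "media"
    except TransportError:
        # Network trouble: a text fallback would fail too, so let the
        # caller's retry logic handle it.
        raise
    except HTTPStatusError as exc:
        if exc.status_code >= 500:
            raise  # server-side failure: retry rather than fall back
        # 4xx: the media request itself was rejected, so send text instead.
        send_text(caption)
        return "text"
```

Under this split, a network outage costs one failed call per attempt instead of a media call plus a doomed text call.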
Tests: 331 passed (283 existing + 48 new across 5 channel test files)
Fixes: #3054
Related: #3050, #3053
1. Fix full_url path for non-image media to require an AES key and skip the download when the key is missing,
instead of persisting encrypted bytes as valid media.
2. Trigger the quoted media fallback only when no top-level media item exists,
not when top-level media download/decryption fails.
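
The two rules can be expressed as small predicates. This is only a sketch: the `type` and `aes_key` field names and both function names are hypothetical, and letting images through without a key is an assumption drawn from the fix being scoped to non-image media.

```python
from typing import Any


def should_download_full_url(item: dict[str, Any]) -> bool:
    """Rule 1: non-image media needs an AES key before it is downloaded."""
    if item.get("type") == "image":
        # Assumption: images on the full_url path do not need a key.
        return True
    # Without a key the bytes could not be decrypted, so skip the download.
    return bool(item.get("aes_key"))


def use_quoted_media(top_level_media: list[dict[str, Any]]) -> bool:
    """Rule 2: fall back to quoted media only if there is no top-level media."""
    # A failed download or decryption of existing top-level media does not
    # count as "no media", so it must not trigger the fallback.
    return len(top_level_media) == 0
```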
Fetch and cache typing tickets so the Weixin channel shows a typing indicator while nanobot is processing and clears it after the final reply.
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- Prevent repeated retries on expired sessions in the polling thread
- Stop sending messages to invalid agent sessions to eliminate noisy logs and unnecessary requests