Xubin Ren 92d6fca323 refactor: centralize document extraction in AgentLoop._process_message
Move extract_documents() to nanobot.utils.document as a reusable helper
and call it once in AgentLoop._process_message, the single entry point
for all message processing (API + all channels).

This replaces the previous API-only _extract_documents() in server.py,
ensuring Telegram, Feishu, Slack, WeChat, and all other channels also
benefit from automatic document text extraction.

Adds a configurable max_file_size guard (default 50 MB) to skip
oversized files gracefully, preventing unbounded memory/CPU usage
from channel-downloaded attachments.

- server.py: removed _extract_documents and related imports
- document.py: added extract_documents() with size limit
- loop.py: calls extract_documents() at the top of _process_message
- Tests updated: 70 related tests pass

Made-with: Cursor
2026-04-14 13:10:03 +00:00
..