nanobot

mirror of https://github.com/HKUDS/nanobot.git synced 2026-05-21 17:12:32 +00:00


Matt Van Horn ee14e2df56 perf(document): lazy-import heavy document parsers

Move pypdf, python-docx, openpyxl, and python-pptx imports from module
level into the _extract_pdf / _extract_docx / _extract_xlsx /
_extract_pptx functions that actually use them. These four libraries
became core dependencies in v0.1.5.post2 (~25 MB combined), so every
nanobot startup paid their import cost even when the session needed
no document parsing.

The module-level SUPPORTED_EXTENSIONS set and the extract_text()
dispatch stay as-is; the "[error: <lib> not installed]" branches move
from the old module-level None sentinels into the corresponding
extractor's try/except ImportError block. Behavior for the error
message and for successful parses is identical.

All 20 tests in tests/test_document_parsing.py pass unchanged.

Fixes #3422

2026-04-25 02:10:30 +08:00

__init__.py

fix(agent): address code review findings for tool hint enhancement

2026-04-07 15:15:07 +08:00

document.py

perf(document): lazy-import heavy document parsers

2026-04-25 02:10:30 +08:00

evaluator.py

style: revert unrelated Black-style formatting churn (#3220)

2026-04-17 20:39:46 +08:00

gitstore.py

fix: handle git worktrees in GitStore nested repo protection