Cherry-pick c4c0ac8 from nightly-26-03-29, which adds InputLimitsConfig
(max_input_images, max_input_image_bytes), image size/existence checks,
and wiring through AgentLoop/CLI. The cherry-pick is merged with the
existing audio/video multimodal handling, timezone support, and
supports_* capability flags.
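
For reviewers, a rough sketch of the kind of check this change describes.
It is illustrative only and is not the code from c4c0ac8: the field names
InputLimitsConfig, max_input_images, and max_input_image_bytes come from
the summary above, while the default values, the check_image_inputs
helper, and its error strings are assumptions.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Sequence


@dataclass(frozen=True)
class InputLimitsConfig:
    # Field names follow the commit summary; the defaults are assumed.
    max_input_images: int = 4
    max_input_image_bytes: int = 5 * 1024 * 1024


def check_image_inputs(paths: Sequence[str], limits: InputLimitsConfig) -> List[str]:
    """Hypothetical helper: return a list of problems, empty if inputs pass."""
    problems: List[str] = []

    # Count check: reject requests that carry more images than allowed.
    if len(paths) > limits.max_input_images:
        problems.append(
            f"too many images: {len(paths)} > {limits.max_input_images}"
        )

    for raw in paths:
        path = Path(raw)
        # Existence check: the path must point at a regular file.
        if not path.is_file():
            problems.append(f"image not found: {raw}")
            continue
        # Size check: compare the on-disk size against the byte limit.
        size = path.stat().st_size
        if size > limits.max_input_image_bytes:
            problems.append(
                f"image too large: {raw} "
                f"({size} bytes > {limits.max_input_image_bytes})"
            )
    return problems
```

Returning a list of problems rather than raising on the first failure is
a choice made for this sketch so that a caller such as a CLI could report
every rejected input at once; the actual change may behave differently.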