Add comprehensive audio and video support across the agent pipeline:
- Generalize media placeholder system: _strip_image_content → _strip_media_content,
  _media_placeholder with type-specific labels, unified across providers
- Add detect_audio_mime with magic-byte detection and filename fallback
- Add _AUDIO_FORMAT_MAP for correct MIME-to-API-format conversion
- Add InputLimitsConfig with count limits (max_input_audios/videos) and byte limits
- Support input_audio blocks in context builder with OpenAI-compatible format
- Support video_url blocks with base64 inline data
- Add audio/video passthrough in Codex provider, placeholder fallback in Anthropic provider
- Thread supports_vision/audio/video capability flags through AgentLoop
- Unify placeholder format: [audio: path]/[video: path] instead of generic [file: path]
- Optimize file I/O: a single read_bytes() call instead of separate header and full-file reads
- Extract _STRIP_MEDIA_TYPES as class constant to avoid per-call allocation
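
For reviewers, a minimal sketch of the detection and block-building flow
described in the list above. It is not the code in this change: the
signature of detect_audio_mime, the contents of AUDIO_FORMAT_MAP, and the
helper build_input_audio_block are assumptions made for illustration. Only
the input_audio block shape and the [audio: path] placeholder come from
the items above.

```python
import base64
import mimetypes
from typing import Optional

# Hypothetical MIME-to-format table; the real _AUDIO_FORMAT_MAP may differ.
AUDIO_FORMAT_MAP = {
    "audio/mpeg": "mp3",
    "audio/wav": "wav",
    "audio/x-wav": "wav",
}


def detect_audio_mime(data: bytes, filename: Optional[str] = None) -> Optional[str]:
    """Guess an audio MIME type from magic bytes, then fall back to the filename."""
    # WAV: RIFF container whose form type at offset 8 is "WAVE".
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "audio/wav"
    # MP3: ID3v2 tag, or an MPEG frame sync (11 set bits).
    if data[:3] == b"ID3":
        return "audio/mpeg"
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "audio/mpeg"
    if data[:4] == b"fLaC":
        return "audio/flac"
    if data[:4] == b"OggS":
        return "audio/ogg"
    # Filename fallback, accepted only when it maps to an audio type.
    if filename:
        guessed, _ = mimetypes.guess_type(filename)
        if guessed and guessed.startswith("audio/"):
            return guessed
    return None


def build_input_audio_block(data: bytes, path: str) -> dict:
    """Build an OpenAI-compatible input_audio block, or a text placeholder."""
    mime = detect_audio_mime(data, path)
    fmt = AUDIO_FORMAT_MAP.get(mime or "")
    if fmt is None:
        # Unknown or unmapped format: degrade to the [audio: path] placeholder.
        return {"type": "text", "text": f"[audio: {path}]"}
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "input_audio", "input_audio": {"data": encoded, "format": fmt}}
```

The caller reads the file once (for example with Path.read_bytes()) and
passes the same bytes to both detection and encoding, which is the point of
the single-read item above.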

Merge the three retry-after header parsers (base, OpenAI, Anthropic)
into a single _extract_retry_after_from_headers on LLMProvider that
handles retry-after-ms, case-insensitive header lookup, and HTTP-date
values. Remove the per-provider _parse_retry_after_headers duplicates
and their now-unused email.utils / time imports. Add a test for
retry-after-ms.
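
A rough sketch of what a merged parser along these lines could look like,
for illustration only. The standalone function name, the optional now
parameter, the preference for retry-after-ms over retry-after, and the
clamping of negative delays to zero are all assumptions, not the
LLMProvider method itself.

```python
import email.utils
import time
from datetime import timezone
from typing import Mapping, Optional


def extract_retry_after_from_headers(
    headers: Mapping[str, str], now: Optional[float] = None
) -> Optional[float]:
    """Return a retry delay in seconds, or None if no usable header is present."""
    # Case-insensitive lookup: normalize header names once.
    lowered = {key.lower(): value for key, value in headers.items()}

    # Assumption: the millisecond header wins when both are present.
    ms = lowered.get("retry-after-ms")
    if ms is not None:
        try:
            return max(float(ms) / 1000.0, 0.0)
        except ValueError:
            pass  # Malformed value; fall through to retry-after.

    value = lowered.get("retry-after")
    if value is None:
        return None

    # Form 1: delay in seconds.
    try:
        return max(float(value), 0.0)
    except ValueError:
        pass

    # Form 2: HTTP date, converted to a delay relative to the current time.
    try:
        when = email.utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    current = time.time() if now is None else now
    return max(when.timestamp() - current, 0.0)
```

Passing now explicitly keeps the HTTP-date branch deterministic in tests;
production callers would leave it unset.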
Made-with: Cursor