I use yt-dlp on Windows writing to a Linux system via SMB over a
10GbE connection and downloading via 400 Mbps cable internet. I
have observed that downloads often seem to start very fast (40+
MiB/sec) but then throttle down to 8-20 MiB/sec. I also observed
far more disk thrashing than I would expect from such a large array
handling a small amount of data that is supposedly being written
sequentially.
The problem is two-fold. Downloaded fragments are stored using a
very short-lived *-FragX file, then immediately appended to the
stream upon fragment completion, and deleted. Both operations use
small write buffers. When the OS write buffers start to flush, the
two streams of writes, combined with the sheer number of small
writes, force the queued writes to compete for completion in
different areas of the volume.
By default, Python sizes its write buffer to the underlying device's
"block size", falling back to io.DEFAULT_BUFFER_SIZE. In practical
terms, this means a write buffer of 4096 or 8192 bytes. This
commit increases most write buffers to 65536 (64 KiB) using the
open() buffering=X option, significantly speeding up writes of
larger chunks of data and reducing potential fragmentation in low
disk space conditions. With these changes, I consistently see fast
downloads and the array thrashing is noticeably lessened.
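As a rough illustration of the buffering change described above, the following sketch appends a fragment file to a destination file using a 64 KiB buffer passed through `open()`'s `buffering` argument. The helper name `append_fragment` and the constant `WRITE_BUFFER_SIZE` are hypothetical and are not taken from the yt-dlp code base; only `open(..., buffering=...)` is the mechanism named in this commit.

```python
import os

# 64 KiB, the buffer size named in this commit message.
WRITE_BUFFER_SIZE = 64 * 1024


def append_fragment(dest_path, fragment_path, bufsize=WRITE_BUFFER_SIZE):
    """Append fragment_path to dest_path, then delete the fragment.

    Hypothetical helper: buffering=bufsize sets the size of the internal
    buffer of the buffered file objects, so small reads and writes are
    coalesced into larger ones before they reach the OS.
    """
    with open(fragment_path, 'rb', buffering=bufsize) as src, \
            open(dest_path, 'ab', buffering=bufsize) as dst:
        while True:
            chunk = src.read(bufsize)
            if not chunk:
                break
            dst.write(chunk)
    os.remove(fragment_path)
```

The same `buffering=` value can be passed wherever a file is opened for writing; whether it helps in practice depends on the storage and OS involved.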
The shell escape function now properly escapes `%`, `\\` and `\n`. `utils.Popen` as well as `%q` output template expansion have been patched accordingly.
Prior to this fix, using `--exec` together with `%q` on Windows could allow remote code execution. See https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-hjq6-52gw-2g7p for more details.
Authored by: Grub4K
Reverts 22e4dfacb61f62dfbb3eb41b31c7b69ba1059b80
Despite being documented as `Kbit/s`, the extractors/manifests were returning bitrates in SI units of kilobits/sec.
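To make the unit difference concrete, here is a small sketch of how a size estimate changes depending on whether a bitrate is read as SI kilobits (1000 bits) or as 1024-bit units. The function name `approx_filesize_bytes` and its parameters are hypothetical and only illustrate the arithmetic; they are not the project's actual code.

```python
def approx_filesize_bytes(bitrate_kbps, duration_s, bits_per_kilobit=1000):
    """Estimate a file size in bytes from a bitrate and a duration.

    Hypothetical helper: bits_per_kilobit=1000 treats the bitrate as SI
    kilobits/sec, which is what this commit says the extractors/manifests
    return; pass 1024 to see the estimate under the other interpretation.
    """
    return int(bitrate_kbps * bits_per_kilobit / 8 * duration_s)


si_estimate = approx_filesize_bytes(128, 60)            # SI kilobits
binary_estimate = approx_filesize_bytes(128, 60, 1024)  # 1024-bit units
```

For a 128 kbit/s stream lasting 60 seconds, the two interpretations differ by about 2.4%.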
Authored by: seproDev, pukkandan
The shell escape function now uses `""` instead of `\"`. `utils.Popen` has been patched to properly quote commands.
Prior to this fix, using `--exec` together with `%q` on Windows could allow remote code execution. See https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-42h4-v29r-42qg for reference.
Authored by: Grub4K
This also adds the following test runners:
- `3.12-dev` on `ubuntu-latest`
- `3.12-dev` on `windows-latest`
- `pypy-3.10` on `ubuntu-latest`
Authored by: Grub4K
The new networking interface consists of a `RequestDirector` that
directs each `Request` to an appropriate `RequestHandler` and returns
the `Response` or raises `RequestError`. The handlers define adapters
to transform their internal Request/Response/Errors to our interfaces.
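The dispatch flow described above can be sketched roughly as follows. Only the class names `RequestDirector`, `Request`, `RequestHandler`, `Response` and `RequestError` come from this commit message; every attribute and method here (`can_handle`, `send`, `add_handler`, `SUPPORTED_SCHEMES`) and the `EchoHandler` class are assumptions made for illustration and should not be read as the actual yt-dlp API.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Request:
    url: str
    headers: dict = field(default_factory=dict)


@dataclass
class Response:
    url: str
    status: int
    body: bytes = b''


class RequestError(Exception):
    """Raised when no handler accepts a request or a handler fails."""


class RequestHandler:
    # Hypothetical attribute: URL schemes this handler can serve.
    SUPPORTED_SCHEMES = ()

    def can_handle(self, request):
        return urlsplit(request.url).scheme in self.SUPPORTED_SCHEMES

    def send(self, request):
        raise NotImplementedError


class EchoHandler(RequestHandler):
    """Toy handler for illustration: returns the URL as the body."""
    SUPPORTED_SCHEMES = ('echo',)

    def send(self, request):
        return Response(url=request.url, status=200,
                        body=request.url.encode())


class RequestDirector:
    def __init__(self):
        self.handlers = []

    def add_handler(self, handler):
        self.handlers.append(handler)

    def send(self, request):
        for handler in self.handlers:
            if not handler.can_handle(request):
                continue
            try:
                return handler.send(request)
            except RequestError:
                raise
            except Exception as exc:
                # Adapt a handler-internal error to the shared error type.
                raise RequestError(str(exc)) from exc
        raise RequestError(f'no handler for {request.url}')
```

In this sketch, callers only ever see `Response` or `RequestError`, regardless of which handler served the request.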
User-facing changes:
- Fix issues with per request proxies on redirects for urllib
- Support for `ALL_PROXY` environment variable for proxy setting
- Support for `socks5h` proxy
- Closes https://github.com/yt-dlp/yt-dlp/issues/6325, https://github.com/ytdl-org/youtube-dl/issues/22618, https://github.com/ytdl-org/youtube-dl/pull/28093
- Raise error when using `https` proxy instead of silently converting it to `http`
Authored by: coletdjnz