Comments (5)
Hi @amulepeweichan,
thank you for the kind words about dlm, I am happy it is handling your use case properly!
> as I am downloading many files from the same host, so it is a lot faster let reqwest keep it's connection pool
Is it something you have actually witnessed when changing the code locally?
AFAIK, setting pool_max_idle_per_host to zero does not disable connection pooling per se.
It instructs reqwest not to keep idle connections around.
How long a connection may stay idle is controlled by the following configuration knob:
/// Set an optional timeout for idle sockets being kept-alive.
///
/// Pass `None` to disable timeout.
///
/// Default is 90 seconds.
pub fn pool_idle_timeout<D>(mut self, val: D) -> ClientBuilder
So the current behavior is, I believe, that all connections idle for at least 90 seconds are terminated.
The number of concurrent downloads is set at the application level, so the same number of connections is always in use, with no time to become idle. A connection is reused right away at the end of a download for the next file.
This is my mental model for the current internals of dlm.
I am happy to change things if this appears to not reflect your experience.
I realized that my answer does not cover the case where multiple hosts are targeted.
In that case, depending on the order of the links in the input files, connections could be recreated.
However, keeping a potentially unbounded number of idle connections open is not something desirable at scale.
A practical workaround is to sort the links in the input file by host, so that the warm connections are used as much as possible.
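The sort-by-host workaround can be sketched in a few lines of standard-library Rust. This is only an illustration under stated assumptions: `host_of` and `sort_by_host` are hypothetical helpers, not part of dlm, and the host is extracted with plain string handling rather than a real URL parser. The example URLs are made up.

```rust
/// Hypothetical helper: return the host (authority) part of a URL using
/// plain string handling. A real tool would use a proper URL parser.
fn host_of(url: &str) -> &str {
    // Drop the scheme ("http://") if there is one.
    let rest = url.split_once("://").map_or(url, |(_, r)| r);
    // The host ends at the first '/', '?' or '#'.
    rest.split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or(rest)
}

/// Hypothetical helper: group links by host. The sort is stable, so links
/// to the same host keep their original relative order.
fn sort_by_host(links: &mut Vec<String>) {
    links.sort_by(|a, b| host_of(a).cmp(host_of(b)));
}

fn main() {
    let mut links = vec![
        "http://b.example/1.txt".to_string(),
        "http://a.example/1.txt".to_string(),
        "http://b.example/2.txt".to_string(),
        "http://a.example/2.txt".to_string(),
    ];
    sort_by_host(&mut links);
    for link in &links {
        println!("{link}");
    }
}
```

With the links grouped this way, consecutive downloads target the same host, so a connection that has just finished a download is the one most likely to be needed next.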
You're right I should definitely benchmark & test it to confirm. I ran some tests locally, just by adding to my nginx config outside the server block:
log_format connections '[$time_local] "$request" $connection $connection_requests';
and inside the server block:
access_log /var/log/nginx/connections.log connections;
Now the last number in each log line is the number of requests served so far on that connection, so any value above 1 means the connection was reused. Then, in my document root, I ran: for i in `seq 1 256`; do echo $i > $i.txt; done
Then I made an input file for dlm: for i in `seq 1 256`; do echo http://localhost/$i.txt >> filelist.txt; done
Then I ran time dlm -i filelist.txt -o out/ -M 1 2>/dev/null > /dev/null
The nginx log shows that it makes a new connection for every request.
I also did the same with my build of dlm without .pool_max_idle_per_host(0).
Now the nginx log shows that it reuses each connection for 100 requests before making a new one. I don't know if the limit of 100 is from nginx or reqwest.
I ran each several times (doing rm out/* between runs), and the official build consistently takes 1 second, while the build without .pool_max_idle_per_host(0) consistently takes 0.5 seconds.
And that is for a server running on localhost. I assume that for a webserver across the internet, where reconnecting adds more latency, the speed difference will be larger.
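For anyone repeating this measurement, the log comparison can be automated. The sketch below assumes the log_format shown earlier, where the last two whitespace-separated fields of each line are $connection and $connection_requests. The helper names and the sample log lines are hypothetical, written only to match that format.

```rust
use std::collections::HashSet;

/// Hypothetical helper: read ($connection, $connection_requests) from the
/// last two whitespace-separated fields of a log line.
fn parse_line(line: &str) -> Option<(u64, u64)> {
    let mut fields = line.split_whitespace().rev();
    let requests = fields.next()?.parse().ok()?;
    let connection = fields.next()?.parse().ok()?;
    Some((connection, requests))
}

/// Number of distinct connection ids seen in the log.
fn distinct_connections(log: &str) -> usize {
    log.lines()
        .filter_map(parse_line)
        .map(|(connection, _)| connection)
        .collect::<HashSet<_>>()
        .len()
}

fn main() {
    // Made-up sample lines, not copied from a real log.
    let no_reuse = "\
[01/Jan/2022:00:00:00 +0000] \"GET /1.txt HTTP/1.1\" 11 1
[01/Jan/2022:00:00:00 +0000] \"GET /2.txt HTTP/1.1\" 12 1
[01/Jan/2022:00:00:00 +0000] \"GET /3.txt HTTP/1.1\" 13 1";
    let reuse = "\
[01/Jan/2022:00:00:00 +0000] \"GET /1.txt HTTP/1.1\" 21 1
[01/Jan/2022:00:00:00 +0000] \"GET /2.txt HTTP/1.1\" 21 2
[01/Jan/2022:00:00:00 +0000] \"GET /3.txt HTTP/1.1\" 21 3";
    println!("without reuse: {} connections", distinct_connections(no_reuse));
    println!("with reuse: {} connections", distinct_connections(reuse));
}
```

A run where the number of distinct connections equals the number of requests means nothing was reused; a much smaller number means keep-alive connections were picked up again.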
> The number of concurrent downloads is set at the application level, so there is always the same amount of connections used, with no time to becoming idle. A connection is reused right away at the end of a download for the next file.
I think that with pool_max_idle_per_host set to 0, the connection is closed as soon as the request ends, before the next request is made, even if the next request follows immediately. Maybe a value of 1 would keep it open for the next request if it's to the same host, while keeping the pool of open connections small when making requests to different hosts.
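The difference between a cap of 0 and a cap of 1 can be illustrated with a toy model. This is not reqwest's implementation: `ToyPool` and `connections_opened` are hypothetical names, and the model only assumes that a finished connection is kept when the idle cap allows it and dropped otherwise. It uses one download at a time, like the -M 1 run above.

```rust
/// Toy model of a per-host idle pool. Not reqwest's actual implementation.
struct ToyPool {
    max_idle: usize, // cap on idle connections kept for the host
    idle: usize,     // idle connections currently kept
    opened: usize,   // total connections opened so far
}

impl ToyPool {
    fn new(max_idle: usize) -> Self {
        ToyPool { max_idle, idle: 0, opened: 0 }
    }

    /// Take an idle connection if one exists, otherwise open a new one.
    fn checkout(&mut self) {
        if self.idle > 0 {
            self.idle -= 1;
        } else {
            self.opened += 1;
        }
    }

    /// Hand a connection back; it is kept only if the cap allows it.
    fn checkin(&mut self) {
        if self.idle < self.max_idle {
            self.idle += 1;
        }
    }
}

/// Run `requests` sequential requests to one host and count new connections.
fn connections_opened(max_idle: usize, requests: usize) -> usize {
    let mut pool = ToyPool::new(max_idle);
    for _ in 0..requests {
        pool.checkout();
        pool.checkin();
    }
    pool.opened
}

fn main() {
    println!("cap 0: {} connections", connections_opened(0, 256)); // 256
    println!("cap 1: {} connections", connections_opened(1, 256)); // 1
}
```

In this model a cap of 0 forces a new connection for every request, which matches what the nginx log showed for the official build, while a cap of 1 is already enough to reuse a single connection for sequential downloads from one host.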
Another unrelated thing I did, that's small so I'll just mention here rather than open another issue, is enable compression.
I added to Cargo.toml reqwest = { version = "0.11.11", features = ["gzip"] }
Brotli compression is probably better and faster, but in my case the server doesn't support it. No code changes are needed: when reqwest is built with that feature, it sends the Accept-Encoding: gzip header by default.
For my current downloads, this sped things up by 20%.
Thank you for your investigation.
Given the time you have spent on this issue, I have decided to remove the .pool_max_idle_per_host(0) constraint and rely on reqwest's defaults.
Regarding the gzip feature, I am happy to enable it if it helps.
EDIT: I took the liberty of editing your messages due to formatting issues.
Fixed in https://github.com/agourlay/dlm/releases/tag/v0.3.0