Comments (5)

agourlay avatar agourlay commented on June 6, 2024

Hi @amulepeweichan 👋 ,

thank you for the kind words about dlm, I am happy it is handling your use case properly!

> as I am downloading many files from the same host, so it is a lot faster let reqwest keep it's connection pool

Is it something you have actually witnessed when changing the code locally?

AFAIK, setting pool_max_idle_per_host to zero does not disable connection pooling per se.
It instructs reqwest not to keep idle connections around.

How long an idle connection is kept alive is controlled by the following configuration knob:

    /// Set an optional timeout for idle sockets being kept-alive.
    ///
    /// Pass `None` to disable timeout.
    ///
    /// Default is 90 seconds.
    pub fn pool_idle_timeout<D>(mut self, val: D) -> ClientBuilder
    where
        D: Into<Option<Duration>>,

So the current behavior is, I believe, that all connections idle for at least 90 seconds are terminated.

The number of concurrent downloads is set at the application level, so the same number of connections is always in use, with no time to become idle. A connection is reused right away at the end of a download for the next file.

This is my mental model for the current internals of dlm.

I am happy to change things if this appears to not reflect your experience.

from dlm.

agourlay avatar agourlay commented on June 6, 2024

I realized that my answer does not cover the case where multiple hosts are targeted.

In that case, depending on the order of the links in the input files, connections could be recreated.

However, keeping a potentially unbounded number of idle connections open is not something desirable at scale.

A practical workaround is to sort the links in the input file by host, to make the best use of warm connections.
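
The sorting workaround can be sketched with the standard library alone. This is a minimal illustration, not part of dlm: `authority_of` and `sort_links_by_host` are hypothetical helper names, and the URL parsing is deliberately simplified (no userinfo or IPv6 literals). It groups links by `host:port`, on the assumption that connections to different ports cannot be shared.

```rust
/// Return the `host[:port]` part of a URL such as `http://example.com:8080/a.txt`.
/// Simplified parsing: userinfo and IPv6 literals are not handled.
fn authority_of(url: &str) -> &str {
    // Drop the scheme ("http://", "https://") if there is one.
    let rest = url.split_once("://").map_or(url, |(_, r)| r);
    // The authority ends at the first '/', '?' or '#'.
    rest.split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or(rest)
}

/// Stable sort: links to the same host become adjacent and keep
/// their original relative order within each host.
fn sort_links_by_host(links: &mut [String]) {
    links.sort_by(|a, b| authority_of(a).cmp(authority_of(b)));
}
```

Calling `sort_links_by_host` on the lines of the input file before writing it back keeps all requests to one host together, so a warm connection is more likely to be reused before it is dropped.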

 avatar commented on June 6, 2024

You're right I should definitely benchmark & test it to confirm. I ran some tests locally, just by adding to my nginx config outside the server block:

    log_format connections '[$time_local] "$request" $connection $connection_requests';

and inside the server block:

    access_log /var/log/nginx/connections.log connections;

The last number in each log line ($connection_requests) is now the number of requests served so far on that connection. In my document root I then created 256 small files:

    for i in `seq 1 256`; do echo $i > $i.txt; done

Then I made the input file for dlm:

    for i in `seq 1 256`; do echo http://localhost/$i.txt >> filelist.txt; done

and ran:

    time dlm -i filelist.txt -o out/ -M 1 2>/dev/null > /dev/null

The nginx log shows that it makes a new connection for every request.
I also did the same with my build of dlm without .pool_max_idle_per_host(0).

Now the nginx log shows that each connection is reused for 100 requests before a new one is made. I don't know if the limit of 100 comes from nginx or reqwest.
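
Reading reuse out of the log by eye gets tedious with 256 lines, so here is a small sketch that summarises it. It assumes the `connections` log format defined above, where the last two whitespace-separated fields are `$connection` and `$connection_requests`; `parse_line` and `requests_per_connection` are hypothetical names, and the code uses only the standard library.

```rust
use std::collections::BTreeMap;

/// Parse the two trailing fields of a `connections` log line:
/// `$connection` (serial number) and `$connection_requests`.
/// Returns `None` for lines that do not end in two integers.
fn parse_line(line: &str) -> Option<(u64, u64)> {
    let mut fields = line.split_whitespace().rev();
    let requests: u64 = fields.next()?.parse().ok()?;
    let connection: u64 = fields.next()?.parse().ok()?;
    Some((connection, requests))
}

/// Map each connection serial number to the highest request count seen on it.
fn requests_per_connection(log: &str) -> BTreeMap<u64, u64> {
    let mut per_conn: BTreeMap<u64, u64> = BTreeMap::new();
    for (conn, reqs) in log.lines().filter_map(parse_line) {
        let max = per_conn.entry(conn).or_insert(0);
        *max = (*max).max(reqs);
    }
    per_conn
}
```

If the map has as many entries as there were requests and every value is 1, no connection was reused. A handful of entries with values capped at 100 would match the second run described above.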

I ran each several times (doing rm out/* between runs), and the official build consistently takes 1 second, while the build without .pool_max_idle_per_host(0) consistently takes 0.5 seconds.

And that is for a server running on localhost. For a web server across the internet, where reconnecting adds more latency, I assume the speed difference will be larger.

> The number of concurrent downloads is set at the application level, so the same number of connections is always in use, with no time to become idle. A connection is reused right away at the end of a download for the next file.

I think that with pool_max_idle_per_host(0), the connection is closed as soon as the request ends, before the next request is made, even if the next request follows immediately. Maybe a value of 1 would keep it open for the next request if it's to the same host, while keeping the pool of open connections small when making requests to different hosts.
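
The hypothesis about pool_max_idle_per_host can be made concrete with a toy model. This is not reqwest's implementation, only a sketch under the assumption that a finished connection goes back to a per-host idle list if there is room and is dropped otherwise. `ToyPool` and `connections_opened` are hypothetical names, requests run strictly one after another (as with `-M 1`), and the 100-requests-per-connection cap seen in the nginx log is ignored.

```rust
use std::collections::HashMap;

/// Toy per-host idle pool. NOT reqwest's implementation.
struct ToyPool {
    max_idle_per_host: usize,
    idle: HashMap<String, usize>, // host -> idle connections currently kept
    opened: usize,                // total connections opened so far
}

impl ToyPool {
    fn new(max_idle_per_host: usize) -> Self {
        ToyPool { max_idle_per_host, idle: HashMap::new(), opened: 0 }
    }

    /// Perform one sequential request to `host`.
    fn request(&mut self, host: &str) {
        let idle = self.idle.entry(host.to_string()).or_insert(0);
        if *idle > 0 {
            *idle -= 1; // reuse an idle connection
        } else {
            self.opened += 1; // nothing idle: open a new connection
        }
        // Request finished: keep the connection only if the idle list has room.
        if *idle < self.max_idle_per_host {
            *idle += 1;
        }
    }
}

/// Number of connections opened for a sequence of requests.
fn connections_opened(max_idle_per_host: usize, hosts: &[&str]) -> usize {
    let mut pool = ToyPool::new(max_idle_per_host);
    for host in hosts {
        pool.request(host);
    }
    pool.opened
}
```

In this model, 256 sequential requests to one host open 256 connections with a limit of 0 and a single connection with a limit of 1, which is consistent with what the nginx log showed for the two builds. With a limit of 1, alternating between two hosts still opens only one connection per host.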

Another unrelated thing I did, which is small enough that I'll just mention it here rather than open another issue, is enable compression.
I added this to Cargo.toml:

    reqwest = { version = "0.11.11", features = ["gzip"] }

Brotli compression is probably better and faster, but in my case the server doesn't support it. No code changes are needed: when reqwest is built with that feature, it sends the Accept-Encoding: gzip header by default.

For my current downloads, it has sped things up by 20%.

agourlay avatar agourlay commented on June 6, 2024

Thank you for your investigation 👍

Given the time you have spent on this issue, I have decided to remove the .pool_max_idle_per_host(0) constraint and rely on reqwest's defaults.

Regarding the gzip feature, I am happy to enable it if it helps.

EDIT: I took the liberty of editing your messages to fix formatting issues.

agourlay avatar agourlay commented on June 6, 2024

Fixed in https://github.com/agourlay/dlm/releases/tag/v0.3.0
