Git Product home page Git Product logo

Comments (7)

jsha avatar jsha commented on July 24, 2024 2

At the HTTP protocol level, something has to read the full response body before the TCP connection is ready to send connections on again. In ureq, that's done with into_reader, into_string, etc. Reading diem/diem#4735 it looks like you do read the body, but only on 200's. That means 403's, 500s, etc will not get returned to the pool and get closed when the Response is dropped. If you're seeing something other than that, please let us know. It seems like what you most likely want is to unconditionally call into_string() and then check the status code.

We've talked recently about a possible breaking change to the API to use Result. I'm planning to write up a detailed proposal soon, but I think that returning an Err for connection-breaking errors would perhaps make it clearer that anything receiving an Ok needs to read the stream.

Also on diem/diem#4735 I see someone mention implementing drop. I assume that means "read all outstanding response bytes when Response is dropped." I think that's not the right approach, since that can wind up performing large amounts of work or even blocking indefinitely.

from ureq.

algesten avatar algesten commented on July 24, 2024 1

The connection is auto-put back in the pool on drop (if possible), we could make some explicit API for that.

On the Response object we could add something like this:

  fn is_poolable(&self) -> bool {
    // check if response is exhausted
  }

  fn return_to_pool(self) {
    // 1. exhaust response into void
    // 2. by virtue of taking `self`, the rest just happens
  }

The idea is that it would aid workflows where pooling is desirable.

from ureq.

tv42 avatar tv42 commented on July 24, 2024 1

For what it's worth, in a similar scenario of wanting to reuse a socket with unread response leftovers, Go's net/http consumes & discards up to 2 KiB of data, abandoning the socket if it would have to read more. If that doesn't result in the end of response, the socket is thrown out. It also checks Content-Length and doesn't bother reading at all if it looks like there'd be too much to consume.

https://github.com/golang/go/blob/go1.15.6/src/net/http/client.go#L693-L701

(Tangent: this is only true for HTTP/1, HTTP/2, /3 are different beasts.. Not that that would matter for ureq as it is today.)

from ureq.

davidiw avatar davidiw commented on July 24, 2024

Yeah, in my case, it is obvious that recycling should happen without any issues, but I can totally see that as @jsha alluded this cannot be the default behavior -- if someone attempts to download a file just to see if it exists, for instance, a naive approach would end up downloading the entire file /just/ to reuse a connection.

from ureq.

jsha avatar jsha commented on July 24, 2024

I'm not keen on adding an explicit return_to_pool, when we already have various functions that accomplish the same goal, like into_string(). I would rather make the API more obvious / easier to use in a way that maximizes efficiency.

Here are a couple of other interventions that could increase connection pooling in cases where the user isn't necessarily calling into_string() or into reader():

  • If there is no response body (that is, the response headers do not contain Content-Length or Transfer-Encoding), go ahead and return the connection to the pool immediately without waiting for into_string() / into_reader(). Store some internal state to remember that we gave up the Stream and return a zero-length reader if asked.
  • If the response body is known to be zero-length (Content-Length: 0), do the same thing.
  • Similarly, if the response body is known to be small (Content-Length: N for N < 1024), proactively read the body into an internal buffer, and return a Cursor over that buffer when asked for into_reader().

By the way, thinking about this made me realize another trickiness of the API. Since into_reader() / into_string() consume the Response, the status code and headers become unavailable to users once those methods are called. I think in many cases the user may want to do something like:

  let resp = ureq::get("http://example.com").call();
  let resp_body = resp.into_string()?;
  println!("response status {}, body {}", resp.status(), resp_body);

The above fails because resp was moved / consumed by into_string(). A user can work around it by storing a copy of the status code and headers before consuming the body, but it's a bit awkward.

I don't have a particular recommendation for how to change this API right now but will think about it some more.

from ureq.

algesten avatar algesten commented on July 24, 2024

easier to use in a way that maximizes efficiency.

I like these ideas. Repooling early is not a thought that crossed my mind.

If we're considering breaking changes, one idea could be to use the informal rust standardized http crate. In hreq I made that a design goal, and it has some advantages for the user.

  • It's standard.
  • It's agnostic wrt body.
  • Body is separate.

A common approach with http would be into_parts(). Here's some fantasy API:

use ureq::Body;
let resp: http::Response<Body> = ureq::get("http://example.com").call();
let (parts, body) = resp.into_parts();

println!("response status {}, body {}", parts.status, body.into_string()?);

And we could use extension traits to make it even more ergonomic.

use ureq::ResponseExt;
use ureq::Body;
let resp: http::Response<Body> = ureq::get("http://example.com").call();

println!("response status {}, body {}", resp.status(), resp.read_string()?);

from ureq.

jsha avatar jsha commented on July 24, 2024

That's a very helpful reference. Thank you!

from ureq.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.