Comments (7)
At the HTTP protocol level, something has to read the full response body before the TCP connection is ready to send requests on again. In ureq, that's done with into_reader, into_string, etc. Reading diem/diem#4735 it looks like you do read the body, but only on 200s. That means the connections behind 403s, 500s, etc. will not be returned to the pool and will get closed when the Response is dropped. If you're seeing something other than that, please let us know. It seems like what you most likely want is to unconditionally call into_string() and then check the status code.
We've talked recently about a possible breaking change to the API to use Result. I'm planning to write up a detailed proposal soon, but I think that returning an Err for connection-breaking errors would perhaps make it clearer that anything receiving an Ok needs to read the stream.
Also on diem/diem#4735 I see someone mention implementing Drop. I assume that means "read all outstanding response bytes when Response is dropped." I think that's not the right approach, since it can wind up performing large amounts of work or even blocking indefinitely.
from ureq.
The connection is automatically put back in the pool on drop (if possible); we could make an explicit API for that. On the Response object we could add something like this:
```rust
fn is_poolable(&self) -> bool {
    // check if response is exhausted
}

fn return_to_pool(self) {
    // 1. exhaust response into void
    // 2. by virtue of taking `self`, the rest just happens
}
```
The idea is that it would aid workflows where pooling is desirable.
from ureq.
For what it's worth, in a similar scenario of wanting to reuse a socket with unread response leftovers, Go's net/http consumes and discards up to 2 KiB of data; if that doesn't reach the end of the response, the socket is thrown out. It also checks Content-Length and doesn't bother reading at all if it looks like there'd be too much to consume.
https://github.com/golang/go/blob/go1.15.6/src/net/http/client.go#L693-L701
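The bounded-drain idea can be sketched in Rust over a plain `io::Read`. This is only an illustration of the policy, not ureq code; the `MAX_DRAIN` constant mirrors Go's 2 KiB budget, and `drain_for_reuse` is a hypothetical helper name:

```rust
use std::io::Read;

/// Maximum number of leftover body bytes we are willing to discard
/// before giving up on reusing the connection (Go uses 2 KiB).
const MAX_DRAIN: usize = 2048;

/// Drain up to MAX_DRAIN bytes from `body`. Returns true if the body
/// was fully consumed within the budget (socket is safe to reuse),
/// false if data remained or a read failed (socket should be closed).
fn drain_for_reuse<R: Read>(body: &mut R) -> bool {
    let mut buf = [0u8; 256];
    let mut drained = 0usize;
    loop {
        match body.read(&mut buf) {
            Ok(0) => return true, // EOF: body exhausted, reusable
            Ok(n) => {
                drained += n;
                if drained > MAX_DRAIN {
                    return false; // too much leftover: abandon socket
                }
            }
            Err(_) => return false, // read error: abandon socket
        }
    }
}

fn main() {
    // 100 leftover bytes fit in the budget: socket is reusable.
    let mut small = std::io::Cursor::new(vec![0u8; 100]);
    assert!(drain_for_reuse(&mut small));

    // 10 000 leftover bytes blow the budget: throw the socket out.
    let mut large = std::io::Cursor::new(vec![0u8; 10_000]);
    assert!(!drain_for_reuse(&mut large));
}
```

The budget keeps the worst case bounded: draining a small trailer is cheaper than a fresh TCP (and possibly TLS) handshake, but draining an abandoned multi-megabyte download is not.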
(Tangent: this is only true for HTTP/1, HTTP/2, /3 are different beasts.. Not that that would matter for ureq as it is today.)
from ureq.
Yeah, in my case, it is obvious that recycling should happen without any issues, but I can totally see that as @jsha alluded this cannot be the default behavior -- if someone attempts to download a file just to see if it exists, for instance, a naive approach would end up downloading the entire file /just/ to reuse a connection.
from ureq.
I'm not keen on adding an explicit return_to_pool when we already have various functions that accomplish the same goal, like into_string(). I would rather make the API more obvious / easier to use in a way that maximizes efficiency.
Here are a couple of other interventions that could increase connection pooling in cases where the user isn't necessarily calling into_string() or into_reader():
- If there is no response body (that is, the response headers do not contain Content-Length or Transfer-Encoding), go ahead and return the connection to the pool immediately without waiting for into_string() / into_reader(). Store some internal state to remember that we gave up the Stream and return a zero-length reader if asked.
- If the response body is known to be zero-length (Content-Length: 0), do the same thing.
- Similarly, if the response body is known to be small (Content-Length: N for N < 1024), proactively read the body into an internal buffer, and return a Cursor over that buffer when asked for into_reader().
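The three interventions above share one decision: given the declared Content-Length, either detach the stream immediately or leave it attached. A minimal sketch, with `reader_for_body` and the 1024-byte threshold as assumptions rather than ureq internals:

```rust
use std::io::{Cursor, Read};

/// Threshold below which we proactively buffer the whole body so the
/// connection can go back to the pool immediately (hypothetical value).
const SMALL_BODY: u64 = 1024;

/// Decide how to expose the response body. In a real agent, the
/// "buffered" branches would also return the connection to the pool.
fn reader_for_body<R: Read + 'static>(
    content_length: Option<u64>,
    stream: R,
) -> std::io::Result<Box<dyn Read>> {
    match content_length {
        // No body, or an explicitly empty one: hand back a
        // zero-length reader; the connection is free right away.
        None | Some(0) => Ok(Box::new(Cursor::new(Vec::new()))),
        // Small body: drain it into a buffer up front, then pool.
        Some(n) if n < SMALL_BODY => {
            let mut buf = Vec::with_capacity(n as usize);
            stream.take(n).read_to_end(&mut buf)?;
            Ok(Box::new(Cursor::new(buf)))
        }
        // Large body: leave the stream attached; pooling only
        // happens once the caller reads it to the end.
        Some(_) => Ok(Box::new(stream)),
    }
}

fn main() -> std::io::Result<()> {
    // Small body: eagerly buffered, reads back in full.
    let mut r = reader_for_body(Some(5), Cursor::new(b"hello world".to_vec()))?;
    let mut s = String::new();
    r.read_to_string(&mut s)?;
    assert_eq!(s, "hello");

    // No Content-Length header: zero-length reader.
    let mut r = reader_for_body(None, Cursor::new(Vec::new()))?;
    let mut s = String::new();
    r.read_to_string(&mut s)?;
    assert!(s.is_empty());
    Ok(())
}
```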
By the way, thinking about this made me realize another trickiness of the API. Since into_reader() / into_string() consume the Response, the status code and headers become unavailable to users once those methods are called. I think in many cases the user may want to do something like:
```rust
let resp = ureq::get("http://example.com").call();
let resp_body = resp.into_string()?;
println!("response status {}, body {}", resp.status(), resp_body);
```
The above fails because resp was moved / consumed by into_string(). A user can work around it by storing a copy of the status code and headers before consuming the body, but it's a bit awkward.
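The workaround can be shown with a toy stand-in for the response type; the struct and methods below only model the move semantics and are not ureq's real API:

```rust
// Toy stand-in for ureq's Response, just to illustrate the move
// semantics; fields and methods here are hypothetical.
struct Response {
    status: u16,
    body: String,
}

impl Response {
    fn status(&self) -> u16 {
        self.status
    }
    // Consumes self, like ureq's into_string().
    fn into_string(self) -> String {
        self.body
    }
}

fn main() {
    let resp = Response { status: 200, body: "hello".into() };

    // Workaround: copy out the status (and any headers you need)
    // *before* consuming the response.
    let status = resp.status();
    let body = resp.into_string();

    // `resp.status()` would no longer compile here: `resp` was moved.
    println!("response status {}, body {}", status, body);
}
```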
I don't have a particular recommendation for how to change this API right now but will think about it some more.
from ureq.
> easier to use in a way that maximizes efficiency.
I like these ideas. Repooling early is not a thought that crossed my mind.
If we're considering breaking changes, one idea could be to use the informally standardized Rust http crate. In hreq I made that a design goal, and it has some advantages for the user.
- It's standard.
- It's agnostic wrt body.
- Body is separate.
A common approach with http would be into_parts(). Here's some fantasy API:
```rust
use ureq::Body;

let resp: http::Response<Body> = ureq::get("http://example.com").call();
let (parts, body) = resp.into_parts();
println!("response status {}, body {}", parts.status, body.into_string()?);
```
And we could use extension traits to make it even more ergonomic.
```rust
use ureq::ResponseExt;
use ureq::Body;

let resp: http::Response<Body> = ureq::get("http://example.com").call();
println!("response status {}, body {}", resp.status(), resp.read_string()?);
```
from ureq.
That's a very helpful reference. Thank you!
from ureq.