Caching retrieved content into a file,about socketry/falcon

Comments (18)

janko commented on May 22, 2024

I would also like to provide an option to disable caching of the read content, making the input non-rewindable (like Unicorn's rewindable_input setting). This is because in my particular Rack application, rewindability is not needed unless checksums are used for uploaded content. So disabling caching in this case would reduce CPU usage.

Unicorn uses a StringIO on small bodies (up to 16KB), and a Tempfile otherwise. We could do the same thing here, to avoid creating additional file descriptors when they're not needed.

from falcon.

ioquatix commented on May 22, 2024

I am open to this idea.

Currently Falcon::Adapters::Input saves read chunks into memory to make it rewindable.

Yes, an unfortunate design decision on the part of rack, one that is only leveraged in a few places and of questionable benefit with huge implications for memory use/performance.

I'm not sure if you are familiar with this: rack/rack#1148

Rather than making this a setting, I suggest making it possible to inject a different Input adapter into the Rack adapter. I'd prefer to avoid settings.

from falcon.

ioquatix commented on May 22, 2024

I had another idea.

Pending Rack v3, why don't we try to figure out an API that actually works, and avoid rack.input and all the complexities that brings to the table.

Perhaps something as simple as rack.body which can be nil, or the input body, as defined by Async::HTTP::Body::Readable. We could then make a proposal for Rack v3 that is the same or similar to that API.

from falcon.

ioquatix commented on May 22, 2024

It's also tricky considering multi-part.

I have been experimenting with adding Async::HTTP::Body::Multipart which actually ends up as a stream of multiple bodies.

If you have any thoughts or ideas about this, please let me know. Do you use multipart in tus?

from falcon.

janko commented on May 22, 2024

I'm not sure if you are familiar with this: rack/rack#1148

Now I am 😃

Rather than making this a setting, I suggest making it possible to inject a different Input adapter into the Rack adapter. I'd prefer to avoid settings.

Ok, that makes sense. As long as there is a way to somehow change rack.input to a non-rewindable object before the Rack app is called. That's my idea at least, because in that case I wouldn't need to modify tus-ruby-server. That also makes sense considering that in future Rack versions rack.input won't have to be rewindable if rack/rack#1148 is addressed.

Pending Rack v3, why don't we try to figure out an API that actually works, and avoid rack.input and all the complexities that brings to the table.

Perhaps something as simple as rack.body which can be nil, or the input body, as defined by Async::HTTP::Body::Readable. We could then make a proposal for Rack v3 that is the same or similar to that API.

I'm not sure if I understood you correctly, but if web servers would be required to implement all methods of Async::HTTP::Body::Readable, I think that would make things more complex than the 4 methods they need to implement now (3 without #rewind).

I would first like to understand what is the downside of the current #read(length = nil, outbuf = nil) Rack specification. I don't see any usage limitations with it so far, and the benefit is that you can choose how large chunks you want to retrieve (e.g. Rack::Multipart::Parser was sped up significantly by using larger chunk sizes). Obivously, the implementation is a bit more involved, as IO#read has quite specific behaviour, but I find it a good tradeoff for the benefits and don't see why it should change (especially in a backwards incompatible way).

It's also tricky considering multi-part. [...] If you have any thoughts or ideas about this, please let me know. Do you use multipart in tus?

No thoughts at the moment, I've been happy with Rack::Multipart so far (except the performance issues for larger files but that was fixed if I understood correctly).

The tus specification doesn't use multipart, the request body holds only file data and nothing more, additional data is provided via request headers and only one file is being uploaded per request. That way it nicely avoids implementations having to parse the request body.

from falcon.

ioquatix commented on May 22, 2024

I'm not sure if I understood you correctly, but if web servers would be required to implement all methods of Async::HTTP::Body::Readable, I think that would make things more complex than the 4 methods they need to implement now (3 without #rewind).

It's more complex because it needs to be be more complex.

It turns out for streaming input, if you don't care about the input any more, you need some way of signalling this to the protocol (in my case #stop). In HTTP/2, this maps to the server calling stream.close. In HTTP/1, it ends up being socket.close. You could easily rename this to #close if that makes more sense, but it's not so clear how it should work in the context of readable/writable streams.

Keep in mind this requirement is symmetric. If the client stops receiving data (e.g. it closes the socket/stream) the server needs some way to signal to the client code to stop generating data...

It also turns out to be pretty useful to know if the body is empty. Although you could remove this and just use nil. Maybe I'll refactor this.

I'm not really expecting all servers to implement that API, but streaming request and response bodies is more complex than the semantic model given by Rack.

I would first like to understand what is the downside of the current #read(length = nil, outbuf = nil) Rack specification. I don't see any usage limitations with it so far, and the benefit is that you can choose how large chunks you want to retrieve (e.g. Rack::Multipart::Parser was sped up significantly by using larger chunk sizes). Obivously, the implementation is a bit more involved, as IO#read has quite specific behaviour, but I find it a good tradeoff for the benefits and don't see why it should change (especially in a backwards incompatible way).

HTTP already implements buffering in the form of chunked transfer encoding or HTTP/2.0 data frames. HTTP/2.0 can negotiate frame size and priority.

If you want to minimise latency, you will read these chunks as they become available. It's simply not possible with #read as currently defined and that's why #readpartial exists in Ruby (but I don't like it much). It would have made more sense for #read length to be an upper limit.

Additionally, the length argument of #read ensures that somewhere in the implementation, buffering must be done. Because if you read a chunk bigger than the requested size, you must hold on to the remainder, and then concatenate it with the next call. It's a lot of overhead. It increases latency, it increases memory usage, it increases buffering, etc.

The tus specification doesn't use multipart, the request body holds only file data and nothing more, additional data is provided via request headers and only one file is being uploaded per request. That way it nicely avoids implementations having to parse the request body.

That makes total sense to me.

from falcon.

janko commented on May 22, 2024

It's more complex because it needs to be be more complex.

It turns out for streaming input, if you don't care about the input any more, you need some way of signalling this to the protocol (in my case #stop). In HTTP/2, this maps to the server calling stream.close. In HTTP/1, it ends up being socket.close. You could easily rename this to #close if that makes more sense, but it's not so clear how it should work in the context of readable/writable streams.

Ok, that makes sense. But could this be handled by the web server directly? It makes sense to me that Falcon has access to the Async::HTTP::Body::Readable input with necessary interface, while the Rack application only has the limited Falcon::Adapters::Input providing only the interface that the end user needs. Is there a scenario where the user might want to send a "stop" directly, instead of letting Falcon do it?

Keep in mind this requirement is symmetric. If the client stops receiving data (e.g. it closes the socket/stream) the server needs some way to signal to the client code to stop generating data...

I'm almost understanding this, but not quite, probably because my knowledge of HTTP/2 is very limiting. Can't at the moment both client and server signal to each other whether they're interested in continuing the request/response by closing their end of the socket?

I'm not really expecting all servers to implement that API, but streaming request and response bodies is more complex than the semantic model given by Rack.

Yeah, understood. I followed a bit the discussion at rack/rack#1272, but I didn't have any async/Falcon knowledge at that point.

If you want to minimise latency, you will read these chunks as they become available. It's simply not possible with #read as currently defined and that's why #readpartial exists in Ruby (but I don't like it much). It would have made more sense for #read length to be an upper limit.

Additionally, the length argument of #read ensures that somewhere in the implementation, buffering must be done. Because if you read a chunk bigger than the requested size, you must hold on to the remainder, and then concatenate it with the next call. It's a lot of overhead. It increases latency, it increases memory usage, it increases buffering, etc.

I'm not really sure I understand why does it increase latency. Let's say that the TCP buffer size is set to 16KB, and one Socket#readpartial retrieves 16KB. Wouldn't input.read(8*1024); input.read(8*1024) have equal latency to input.read(16*1024)?

I'm almost 100% sure it doesn't need to increase memory usage if one is being diligent with string allocations internally. Yeah, it is extra work, but I really appreciate that in tus-ruby-server I don't need to do this myself. I can stream request body directly to disk via IO.copy_stream(input, "/path/to/destination"), or I can retrieve 5MB of content at a time for AWS S3's multipart upload API (5MB is the minimum part size) by calling input.read(5*1024*1024). And I can rely on the web server doing the internal concatenation of chunks efficiently to allow me to return the amount of data I need.

from falcon.

ioquatix commented on May 22, 2024

Ok, that makes sense. But could this be handled by the web server directly? It makes sense to me that Falcon has access to the Async::HTTP::Body::Readable input with necessary interface, while the Rack application only has the limited Falcon::Adapters::Input providing only the interface that the end user needs. Is there a scenario where the user might want to send a "stop" directly, instead of letting Falcon do it?

The canonical example is: https://github.com/socketry/falcon/blob/master/examples/beer/config.ru

If no one calls body.finish, the response will never terminate. body.finish ensures that a call to read will eventually trigger eof.

In practice, we are generating a response over a few minutes. If the user closes the browser after 30 seconds, do we continue to write to body? It's silly to do so, so calling #write after the client has stopped reading the body should terminate the generation of output. Whether you implement this in a thread or in an async task, it needs to be handled.

There are two ways a stream can stop: The producer finishes producing (either normally or crash), or the consumer stops consuming (it should consume everything or it disconnected for some reason). Both need to be handled.

I'm almost understanding this, but not quite, probably because my knowledge of HTTP/2 is very limiting. Can't at the moment both client and server signal to each other whether they're interested in continuing the request/response by closing their end of the socket?

Yes, that's sort of how it works. But instead of at the socket level, it's at the stream level. HTTP/2 is a very light encapsulation layer on top of TCP, which handles multiple bi-directional streams with HTTP semantics.

from falcon.

ioquatix commented on May 22, 2024

One more symmetry to be aware of is that on the client you have write request body and read response body. On the server you have to read request body and write response body.

from falcon.

janko commented on May 22, 2024

@ioquatix Thanks for the explanation, I understand now what you meant, and also learned about Async::HTTP::Body::Writable 😃.

So, you can use Async::HTTP::Body::Writable as the response body, in which case you need to #finish yourself. But if you use a standard #each-able object, then Falcon will still finish it for you when iteration finishes, so the Rack application doesn't actually need any explicit "close" API, not for this example at least.

I still don't see why the Rack input object would need to have the "stop API", as Falcon itself can invoke it when the Rack app returns a response. But I really don't have strong opinions about it, and I don't think this discussion is necessary here, as the goal of this ticket is just to replace the current in-memory caching with Tempfile caching, without changing the Rack input API.

from falcon.

ioquatix commented on May 22, 2024

stop is required if the client disconnects eg browser is closed while still delivering response.

from falcon.

janko commented on May 22, 2024

I agree, but that's something that the web server takes care of, not the Rack application. To be clear, I think that the Async::HTTP::Body abstraction is great for Falcon to work with, I just don't see any problems in only exposing the Falcon::Adapters::Input wrapper that implements the Rack input specification to the Rack application, at least for the tus-ruby-server application.

Yeah, having to implement rewindability is definitely seems pointless after reading rack/rack#1148, but Falcon::Adapters::Input already implements it anyway, I just want to make it cache the content to disk instead of memory. And when/if the rewindability requirement gets removed from Rack, we can make changes accordingly (removing the caching for that Rack version).

For me this is independent from any potential rack.input/rack.body specification changes that could be added to Rack v3.

from falcon.

ioquatix commented on May 22, 2024

I just don't see any problems in only exposing the Falcon::Adapters::Input wrapper that implements the Rack input specification to the Rack application

How do you inform the client you don't want any more input? E.g. you've run out of disk space or S3 has disconnected, etc.

Yeah, having to implement rewindability is definitely seems pointless

It's not just pointless, it ensures buffering and excessive memory usage. It's a bad design. Saving data to disk doesn't solve this problem (and infect makes the API much more complex and error prone)..

I'm still interested to see what you come up with, but, teeing the input to a file to implement rewind doesn't seem like a good idea: if /tmp is a RAM disk, and the request body is sufficiently large, you'll run into issues either way - and being able to rewind is totally pointless and will never be used (in this case). I'd be more inclined to have Falcon::Adapters::Input and Falcon::Adapters::BufferedInput which is used only if explicitly requested. #rewind in it's current usage could be a no-op with barely any obvious changes.

The only case it would make sense is if you were actually trying to be robust - as in, you were half way through uploading the data to S3, which took a dump, and you could rewind and start trying to upload to S3 again..?

from falcon.

janko commented on May 22, 2024

How do you inform the client you don't want any more input? E.g. you've run out of disk space or S3 has disconnected, etc.

I return a response. For example, when uploading data to tus-ruby-server, it will get streamed directly to S3. If S3 raises an exception, I catch it, note how many bytes were uploaded so far, and return a response saying the current "offset" (that's part of the tus protocol). Then I expect the web server to return the response and immediately close the socket, which marks the end of the input.

It's not just pointless, it ensures buffering and excessive memory usage. It's a bad design. Saving data to disk doesn't solve this problem (and infect makes the API much more complex and error prone)..

I think you're right. I don't actually need rewindability in tus-ruby-server. The only place where it's currently used are checksums (which is opt-in functionality), where the whole request body with file content is read, then it's verified against the checksum provided in a request header, afterwards it's rewinded and sent off to be uploaded to the specified storage. But I can easily rewrite that and do my own caching to disk.

And I think for other use cases it's fine to buffer in memory. I will probably then just add a way to disable buffering/rewindability, which should also be inline with Rack v3 anyway.

from falcon.

ioquatix commented on May 22, 2024

Okay... while I'm not expecting this to change over night, here is a potential solution.

I've implemented Falcon::Adapters::Input#clear which removes data which has already been read.

It's possible to chop out the start of a file at the file system level, according to https://stackoverflow.com/questions/24239916/remove-beginning-of-file-without-rewriting-the-whole-file - So, rack.input that is based on a file on disk can "efficiently" implement this operation too.

In theory at least.

The next part of this, is that I've exposed Falcon::Adapters::Input#body which is the underlying streaming body. This could be exposed as part of the rack env and I'm fine with either approach.

Until we update the rack spec, this is the best we can do without either a/ violating the rack spec or b/ making the server horribly inefficient.

BTW, rewind still works after calling clear, but it only rewinds as far back as the last unread data.

from falcon.

ioquatix commented on May 22, 2024

Unicorn uses a StringIO on small bodies (up to 16KB), and a Tempfile otherwise. We could do the same thing here, to avoid creating additional file descriptors when they're not needed.

This is actually really interesting. Sorry, brain dump follows.

In a way, this policy shouldn't need to be implemented by the web server. It would make more sense to implement this in the user code - like, some systems definitely don't want to cache body to disk (if you are expecting 32Mbytes of JSON, the Ruby JSON parser requires it all in memory anyway). I've been thinking about this, but most APIs are expecting to have access to a complete buffer in memory. Is this stupid? Maybe. We expect the OS to swap to disk if the data is not useful. Is the the same as using a temp file? Maybe.

I was playing around with #gets as you raised that issue. It turns out #gets doesn't need to work the same way as IO#gets according to the rack SPEC. So, I made #gets work like read_partial. It's great because it's far more efficient than #read which forces the user of an intermediate buffer.

As an example #gets operates at the chunk level for HTTP/1 transfer-encoding: chunked and HTTP/2 (it's basically always framed/chunked). So, this is ideal really.

I played around with the idea of having an input.rewindable = false. Not sure if this is good idea. I initially implemented Input#clear but honestly, the use case for that seems a bit stupid (#gets followed by #clear).

I am going to have a go and clarifying the rack spec. Once I've done that, I'd love to have your input on it. This isn't really something I believe I can do myself because I think we need a consensus from the people who are trying to make a streaming request/response a reality and that covers a wide range of use cases, many of which I'll be unfamiliar with.

from falcon.

ioquatix commented on May 22, 2024

BTW, if it wasn't clear, I'd like your input on having Input#clear vs Input#rewindable = false. The more I think about it, the more I think the latter is better, but can you give me some thoughts too?

from falcon.

ioquatix commented on May 22, 2024

I started trying to implement the above and I realise the interactions start to get very complicated.. e.g. what happens when someone calls #rewind followed by #clear. Right now, it's broken.

from falcon.

Caching retrieved content into a file about falcon HOT 18 CLOSED

Comments (18)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent