
Comments (4)

srijs avatar srijs commented on June 9, 2024

A few thoughts about your problem:

  1. resumable.js uploads the chunks in roughly first-to-last order, so the whole issue is not that bad.
  2. With parallel uploading, a strict first-to-last arrival order cannot be guaranteed.
  3. Amazon S3 does exactly the same thing with its multipart upload, and it does not seem to be too bad performance-wise.

To avoid the performance overhead of concatenating all the received chunks, you have a couple of options:

  1. Make the chunk size a multiple of the server-side file system block size (which it should already be in most cases). That way, the file system does not have to copy any data; it only has to link the chunks' existing blocks into a single file. (Of course, this doesn't work with a simple cat; it needs special fs utilities.)
  2. Allocate a large-enough file up front and write each received chunk at its proper offset. resumable.js sends you the total file size and the chunk's number with each chunk, so no matter which chunk comes in first, you'll be able to allocate the full file size up front.
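
A minimal Python sketch of option 2, assuming a POSIX server (os.pwrite is not available on Windows). The function name write_chunk and its arguments are made up for illustration; the values would come from the resumableChunkNumber (1-based), resumableChunkSize and resumableTotalSize request parameters. It assumes every chunk except possibly the last is exactly chunk_size bytes long.

```python
import os

def write_chunk(path, chunk_number, chunk_size, total_size, data):
    """Write one chunk at its final offset in a file sized up front."""
    # Open without truncating; create the file if this is the first chunk to arrive.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Size the file to its final length (skipped once it is already that size).
        if os.fstat(fd).st_size != total_size:
            os.ftruncate(fd, total_size)
        # Chunk numbers start at 1, so chunk n begins at (n - 1) * chunk_size.
        offset = (chunk_number - 1) * chunk_size
        view = memoryview(data)
        while view:
            # pwrite may write fewer bytes than requested, so loop until done.
            written = os.pwrite(fd, view, offset)
            view = view[written:]
            offset += written
    finally:
        os.close(fd)
```

Because pwrite takes an explicit offset, parallel requests writing different chunks do not share a file position. Note that ftruncate typically creates a sparse file; it fixes the size but does not reserve disk blocks.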

from resumable.js.

frank-fan avatar frank-fan commented on June 9, 2024

@srijs sounds good.
Thanks.


steffentchr avatar steffentchr commented on June 9, 2024

Adding a bit onto this: the optimal strategy for handling this varies a bit depending on which server-side software you're using. For example, we're using AOLserver for some of our core application servers. On upload, the contents of a multipart form are automatically spooled to either RAM or a temp file -- and then made available to our code in a single, final chunk. In the RAM scenario, nothing is on disk yet and the full buffer is available, so a good strategy is probably to assign a single file for the entire upload and seek-write the content with an exclusive lock. In the disk scenario, the small chunk file is already on disk -- so the better strategy seems to be background concatenation once the upload is complete. (For weird reasons we need to use the disk method, which was originally a motivation for writing Resumable.)
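
As a rough Python sketch of the "concatenate once the upload is complete" strategy: the helper names and the "<identifier>.part<N>" naming scheme below are assumptions for illustration, not anything resumable.js or AOLserver prescribes. It assumes each chunk has already been stored as its own file, and it would be called from a background job after each chunk arrives.

```python
import os
import shutil

def chunk_path(upload_dir, identifier, number):
    # Hypothetical naming scheme: one file per chunk, numbered from 1.
    return os.path.join(upload_dir, f"{identifier}.part{number}")

def assemble_if_complete(upload_dir, identifier, total_chunks, dest_path):
    """Concatenate the chunk files in order once all of them exist.

    Returns True if the final file was written, False if chunks are missing.
    """
    parts = [chunk_path(upload_dir, identifier, n)
             for n in range(1, total_chunks + 1)]
    if not all(os.path.exists(p) for p in parts):
        return False
    # Build the result under a temporary name, then rename it into place,
    # so readers never see a half-assembled file.
    tmp_path = dest_path + ".assembling"
    with open(tmp_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    os.replace(tmp_path, dest_path)
    for part in parts:
        os.remove(part)
    return True
```

This sketch does not guard against two workers assembling the same upload at once, or against a chunk file that exists but is still being written; a real handler would need a lock or a "chunk finished" marker for that.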

These thoughts also illustrate other scenarios. In Node, you might be receiving data for 10 chunks of the same file simultaneously, and seeks on a multi-GB file might be expensive. So there, multiple small files could be good -- and you could even choose never to concatenate the files, instead abstracting the file open and constructing the buffer from the chunks.
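
To illustrate the "never concatenate" idea, here is a small sketch, written in Python rather than Node to keep the examples in one language. The function read_range is hypothetical; it reads a byte range of the logical file directly from the ordered chunk files, assuming every chunk except possibly the last is exactly chunk_size bytes.

```python
def read_range(parts, chunk_size, offset, length):
    """Read `length` bytes starting at `offset` of the logical file.

    `parts` is the list of chunk file paths in first-to-last order.
    """
    if not parts:
        return b""
    # Find the chunk holding `offset`; clamp to the last chunk, which may be
    # longer than chunk_size.
    index = min(offset // chunk_size, len(parts) - 1)
    pos = offset - index * chunk_size
    out = bytearray()
    while length > 0 and index < len(parts):
        with open(parts[index], "rb") as f:
            f.seek(pos)
            data = f.read(length)
        out += data
        length -= len(data)
        # Any following chunk is read from its beginning.
        index += 1
        pos = 0
    return bytes(out)
```

A file-like wrapper around this could serve downloads or further processing without ever producing the combined file on disk.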

Finally, a number of reverse proxies (nginx and pound, in my own experience) do not offer streaming uploads and instead buffer the entire request before forwarding it (again, a good motivation for Node). In those cases, you'll have the full data immediately upon request -- and the big file + seek approach is probably good.

(Please correct me if these assumptions don't hold, but it is how I've been thinking about this up to now)


faller avatar faller commented on June 9, 2024

Thanks @steffentchr, this helps me a lot:
"Finally, a number of reverse proxies (nginx, pound from my own experience) do not offer streaming uploads and instead buffer the entire request before forwarding it (again, a good motivation for node). In those cases, you'll have full data immediately upon request -- and the big file + seek approach is probably good."

