Comments (4)

argosphil commented on August 20, 2024

You're passing a windowBits parameter value of -15.

From the zlib documentation at https://www.zlib.net/manual.html:

The windowBits parameter is the base two logarithm of the window size (the size of the history buffer). It should be in the range 8..15 for this version of the library. Larger values of this parameter result in better compression at the expense of memory usage. The default value is 15 if deflateInit is used instead.

Internally, a negative value of windowBits is used to switch between gzip and zlib compression modes. We should catch negative values in opts.windowBits and throw an error, but that's about all we can do.

from bun.

MarByteBeep commented on August 20, 2024

That then creates a new problem: my original binary is compressed and starts with a 78 DA header, which is a perfectly valid and common zlib header. I need to be able to recompress said data and end up with that header again. The only setting that did that was windowBits -15. How am I able to recreate the original if you simply prevent that value from being passed?
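For reference, here is a minimal sketch of where the two header bytes come from. It assumes the semantics of Node's `node:zlib` module; Bun's behaviour at the time of this thread may have differed, which is the point of the issue. In a zlib header, the first byte encodes the method and window size (0x78 for deflate with a 32 KiB window, i.e. windowBits 15) and the second byte encodes a compression-level hint plus a check value. The sample input is made up for illustration.

```javascript
const zlib = require("node:zlib");

// Illustrative input only; any buffer works.
const input = Buffer.from("hello hello hello hello");

// zlib-wrapped stream: 2-byte header, deflate data, 4-byte Adler-32 trailer.
const wrapped = zlib.deflateSync(input, { level: 9, windowBits: 15 });

// Raw deflate stream: the same deflate data with no header and no trailer.
const raw = zlib.deflateRawSync(input, { level: 9 });

// Return the first two bytes of a buffer as a hex string.
function headerHex(buf) {
  return buf.subarray(0, 2).toString("hex");
}

console.log(headerHex(wrapped)); // 78da
console.log(wrapped.subarray(2, wrapped.length - 4).equals(raw)); // true
```

Under these assumptions, the 78 DA header goes with a positive windowBits of 15 and a high compression level, while the raw variant carries no header at all.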

MarByteBeep commented on August 20, 2024

Also, any idea why inflateSync fails to decompress a zlib binary with said header, while gunzipSync works? That feels a bit counterintuitive to me 😊

argosphil commented on August 20, 2024

> Internally, a negative value of windowBits is used to switch between gzip and zlib compression modes. We should catch negative values in opts.windowBits and throw an error, but that's about all we can do.

I was entirely and completely wrong. My apologies.

The actual problem is that inflateSync calls zlib like this:

        var reader = zlib.ZlibReaderArrayList.initWithOptions(compressed, &list, allocator, .{
            .windowBits = -15,
        }) catch |err| {

and you'd like to call it with .windowBits = 15. So this almost looks like my favorite kind of bugfix, those that involve removing or changing a single character.

But it's not: the existing code is right for the "raw deflate" format that zlib produces. Per the guidance of the zlib manual, we should default to expecting a zlib header, and assume it is absent only when explicitly told so. But right now, there's no option for that. We could simply check whether there's a valid zlib header, but that would still have many false positives.
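A header check of that kind could be sketched as follows. This is only an illustration based on the zlib header layout in RFC 1950, not anything taken from Bun's source, and the function name `looksLikeZlibHeader` is hypothetical. It tests that the compression method is 8 (deflate), that the window size field is in range, and that the two bytes read as a big-endian number are a multiple of 31.

```javascript
// Hypothetical helper: returns true if the first two bytes form a
// syntactically valid zlib (RFC 1950) header. A raw deflate stream can
// begin with bytes that pass this test, so a "true" result is a hint only.
function looksLikeZlibHeader(buf) {
  if (buf.length < 2) return false;
  const cmf = buf[0];
  const flg = buf[1];
  const method = cmf & 0x0f; // CM: must be 8 for deflate
  const cinfo = cmf >> 4;    // CINFO: log2(window size) - 8, at most 7
  const checkOk = ((cmf << 8) | flg) % 31 === 0; // FCHECK rule
  return method === 8 && cinfo <= 7 && checkOk;
}

console.log(looksLikeZlibHeader(Buffer.from([0x78, 0xda]))); // true
console.log(looksLikeZlibHeader(Buffer.from([0x78, 0xdb]))); // false
```

Because the test inspects only two bytes, it can rule a zlib header out but cannot prove that one is present.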

So, technically, everything is as it should be, except for the function names. inflateSync should be inflateRawDeflatedDataSync, and used very rarely. Inflating zlib data and decompressing gzip files can safely share a function, since in those very rare cases where you want to accept only one of those formats, it's an acceptable burden to just check for the gzip magic number yourself.

Of course there's no way to guess that without improved API documentation.
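To illustrate the "check for the gzip magic number yourself" approach, here is a minimal sketch that again assumes Node's `node:zlib` semantics. The wrapper name `inflateZlibOrGzip` is hypothetical. The check is unambiguous between these two formats because a gzip stream starts with 1F 8B, and 0x1F can never be the first byte of a zlib header (its low four bits are 15, while zlib requires method 8).

```javascript
const zlib = require("node:zlib");

// Hypothetical wrapper: decode either a gzip stream or a zlib stream,
// choosing the decoder from the two-byte gzip magic number.
function inflateZlibOrGzip(buf) {
  const isGzip = buf.length >= 2 && buf[0] === 0x1f && buf[1] === 0x8b;
  return isGzip ? zlib.gunzipSync(buf) : zlib.inflateSync(buf);
}

// Illustrative data only.
const original = Buffer.from("same payload, two containers");
const fromGzip = inflateZlibOrGzip(zlib.gzipSync(original));
const fromZlib = inflateZlibOrGzip(zlib.deflateSync(original));

console.log(fromGzip.equals(original) && fromZlib.equals(original)); // true
```

Node also ships `zlib.unzipSync`, which performs this kind of header detection itself; raw deflate data still needs `zlib.inflateRawSync` and an explicit decision by the caller.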

Again, I'm sorry I was wrong. In summary, there are three compressed formats (raw deflate, zlib, and gzip), only two of which (zlib and gzip) are known to be disjoint. Both may overlap with the third, raw deflate.
