yazl's Introduction

yazl

yet another zip library for node. For unzipping, see yauzl.

Design principles:

  • Don't block the JavaScript thread. Use and provide async APIs.
  • Keep memory usage under control. Don't attempt to buffer entire files in RAM at once.
  • Prefer to open input files one at a time rather than all at once. This is slightly suboptimal for time performance, but avoids OS-imposed limits on the number of simultaneously open file handles.

Usage

var fs = require("fs");
var yazl = require("yazl");

var zipfile = new yazl.ZipFile();
zipfile.addFile("file1.txt", "file1.txt");
// (add only files, not directories)
zipfile.addFile("path/to/file.txt", "path/in/zipfile.txt");
// pipe() can be called any time after the constructor
zipfile.outputStream.pipe(fs.createWriteStream("output.zip")).on("close", function() {
  console.log("done");
});
// alternate apis for adding files:
zipfile.addReadStream(process.stdin, "stdin.txt");
zipfile.addBuffer(Buffer.from("hello"), "hello.txt");
// call end() after all the files have been added
zipfile.end();

API

Class: ZipFile

new ZipFile()

No parameters. Nothing can go wrong.

addFile(realPath, metadataPath, [options])

Adds a file from the file system at realPath into the zipfile as metadataPath. Typically metadataPath would be calculated as path.relative(root, realPath). Unzip programs would extract the file from the zipfile as metadataPath. realPath is not stored in the zipfile.

A valid metadataPath must not be blank. If a metadataPath contains "\" characters, they will be replaced by "/" characters. After this substitution, a valid metadataPath must not start with "/" or /[A-Za-z]:\//, and must not contain ".." path segments. File paths must not end with "/", but see addEmptyDirectory(). After UTF-8 encoding, metadataPath must be at most 0xffff bytes in length.
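
For example, a metadataPath is typically computed with path.relative() (a minimal sketch; root and realPath are hypothetical values):

var path = require("path");

var root = "/home/user/project";
var realPath = "/home/user/project/src/index.js";
// path.relative() yields "src/index.js" here; on Windows it would yield
// "src\index.js", which yazl normalizes to "src/index.js" as described above.
zipfile.addFile(realPath, path.relative(root, realPath));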

options may be omitted or null and has the following structure and default values:

{
  mtime: stats.mtime,
  mode: stats.mode,
  compress: true,
  forceZip64Format: false,
  fileComment: "", // or a UTF-8 Buffer
}

Use mtime and/or mode to override the values that would normally be obtained from the fs.Stats for realPath. The mode is the unix permission bits and file type. The mtime and mode are stored in the zip file in the fields "last mod file time", "last mod file date", and "external file attributes". yazl does not store group and user ids in the zip file.
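
For example, to produce reproducible zip files, the timestamp and permissions can be pinned rather than taken from the file system (a sketch; the paths and values are illustrative):

zipfile.addFile("build/cli.js", "bin/cli.js", {
  mtime: new Date("2000-01-01T00:00:00Z"), // fixed timestamp instead of stats.mtime
  mode: 0o100755, // regular file with executable permission bits, instead of stats.mode
});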

If compress is true, the file data will be deflated (compression method 8). If compress is false, the file data will be stored (compression method 0).

If forceZip64Format is true, yazl will use ZIP64 format in this entry's Data Descriptor and Central Directory Record regardless of whether it is required (this may be useful for testing). Otherwise, yazl will use ZIP64 format only where necessary.

If fileComment is a string, it will be encoded with UTF-8. If fileComment is a Buffer, it should contain UTF-8 encoded text. After UTF-8 encoding, fileComment must be at most 0xffff bytes in length. This becomes the "file comment" field in this entry's central directory file header.

Internally, fs.stat() is called immediately in the addFile function, and fs.createReadStream() is used later when the file data is actually required. Throughout adding and encoding n files with addFile(), the number of simultaneous open files is O(1), probably just 1 at a time.

addReadStream(readStream, metadataPath, [options])

Adds a file to the zip file whose content is read from readStream. See addFile() for info about the metadataPath parameter. options may be omitted or null and has the following structure and default values:

{
  mtime: new Date(),
  mode: 0o100664,
  compress: true,
  forceZip64Format: false,
  fileComment: "", // or a UTF-8 Buffer
  size: 12345, // example value
}

See addFile() for the meaning of mtime, mode, compress, forceZip64Format, and fileComment. If size is given, it will be checked against the actual number of bytes in the readStream, and an error will be emitted if there is a mismatch.
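
For example, a caller that knows the stream's length up front can pass size so yazl can verify the byte count (a sketch; the file path is hypothetical):

fs.stat("data/big-file.bin", function(err, stats) {
  if (err) throw err;
  zipfile.addReadStream(fs.createReadStream("data/big-file.bin"), "big-file.bin", {
    size: stats.size, // a mismatch with the actual stream length will emit an error
    compress: false,  // with size given and compress false, end() can report an exact finalSize
  });
  zipfile.end();
});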

Note that yazl will .pipe() data from readStream, so be careful using .on('data'). In certain versions of node, .on('data') makes .pipe() behave incorrectly.

addBuffer(buffer, metadataPath, [options])

Adds a file to the zip file whose content is buffer. See below for info on the limitations on the size of buffer. See addFile() for info about the metadataPath parameter. options may be omitted or null and has the following structure and default values:

{
  mtime: new Date(),
  mode: 0o100664,
  compress: true,
  forceZip64Format: false,
  fileComment: "", // or a UTF-8 Buffer
}

See addFile() for the meaning of mtime, mode, compress, forceZip64Format, and fileComment.

This method has the unique property that General Purpose Bit 3 will not be used in the Local File Header. This doesn't matter for unzip implementations that conform to the Zip File Spec. However, 7-Zip 9.20 has a known bug where General Purpose Bit 3 is declared an unsupported compression method (even though it really has nothing to do with the compression method). See issue #11. If you would like to create zip files that 7-Zip 9.20 can understand, you must use addBuffer() instead of addFile() or addReadStream() for all entries in the zip file (addEmptyDirectory() is fine too).
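
For example, a 7-Zip-9.20-compatible archive can be built by reading each file into memory first (a sketch; this trades the memory guarantees described in the design principles for compatibility):

fs.readFile("file1.txt", function(err, buffer) {
  if (err) throw err;
  zipfile.addBuffer(buffer, "file1.txt"); // no General Purpose Bit 3 in the Local File Header
  zipfile.addEmptyDirectory("empty-dir"); // empty directories are fine too
  zipfile.end();
});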

Note that even when yazl provides the file sizes in the Local File Header, yazl never uses ZIP64 format for Local File Headers due to the size limit on buffer (see below).

Size limitation on buffer

In order to require the ZIP64 format for a local file header, the provided buffer parameter would need to exceed 0xfffffffe in length, or zlib compression would have to fail to compress the buffer so badly that it inflates the data to more than 0xfffffffe in length. yazl rules out both scenarios by enforcing a size limit on the buffer parameter.

According to this zlib documentation, the worst case compression results in "an expansion of at most 13.5%, plus eleven bytes". Furthermore, some configurations of Node.js impose a size limit of 0x3fffffff on every Buffer object. Running this size through the worst case compression of zlib still produces a size less than 0xfffffffe bytes.

Therefore, yazl enforces that the provided buffer parameter must be at most 0x3fffffff bytes long.
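
The arithmetic behind that limit can be checked directly (1.135 approximates the worst-case 13.5% expansion):

var maxBufferLength = 0x3fffffff;                        // 1,073,741,823 bytes
var worstCase = Math.ceil(maxBufferLength * 1.135) + 11; // about 1.22 billion bytes
console.log(worstCase < 0xfffffffe);                     // true, so ZIP64 is never needed here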

addEmptyDirectory(metadataPath, [options])

Adds an entry to the zip file that indicates a directory should be created, even if no other items in the zip file are contained in the directory. This method is only required if the zip file is intended to contain an empty directory.

See addFile() for info about the metadataPath parameter. If metadataPath does not end with a "/", a "/" will be appended.

options may be omitted or null and has the following structure and default values:

{
  mtime: new Date(),
  mode: 0o40775,
}

See addFile() for the meaning of mtime and mode.
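
For example (a sketch; the directory names are illustrative):

zipfile.addEmptyDirectory("logs"); // stored in the zipfile as "logs/"
zipfile.addEmptyDirectory("cache/", {
  mtime: new Date("2000-01-01T00:00:00Z"), // override the default of new Date()
});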

end([options], [finalSizeCallback])

Indicates that no more files will be added via addFile(), addReadStream(), or addBuffer(), and causes the eventual close of outputStream.

options may be omitted or null and has the following structure and default values:

{
  forceZip64Format: false,
  comment: "", // or a CP437 Buffer
}

If forceZip64Format is true, yazl will include the ZIP64 End of Central Directory Locator and ZIP64 End of Central Directory Record regardless of whether or not they are required (this may be useful for testing). Otherwise, yazl will include these structures if necessary.

If comment is a string, it will be encoded with CP437. If comment is a Buffer, it should contain CP437 encoded text. comment must be at most 0xffff bytes in length and must not include the byte sequence [0x50,0x4b,0x05,0x06]. This becomes the ".ZIP file comment" field in the end of central directory record. Note that in practice, most zipfile readers interpret this field as UTF-8 instead of CP437. If your string uses only codepoints in the range 0x20...0x7e (printable ASCII, with no whitespace except for the single space ' '), then the UTF-8, CP437, and ASCII encodings are all identical. This restriction is recommended for maximum compatibility. To use UTF-8 encoding at your own risk, pass a Buffer into this function; it will not be validated.

If specified and non-null, finalSizeCallback is given the parameter (finalSize) sometime during or after the call to end(). finalSize is of type Number and is either -1 or the guaranteed eventual size in bytes of the output data that can be read from outputStream.

Note that finalSizeCallback is usually called well before outputStream has piped all its data; this callback does not mean that the stream is done.

If finalSize is -1, it means the final size is too hard to guess before processing the input file data. This will happen if and only if the compress option is true on any call to addFile(), addReadStream(), or addBuffer(), or if addReadStream() is called without the optional size option. In other words, clients should know whether they're going to get -1 or a real value by looking at how they are using this library.

The call to finalSizeCallback might be delayed if yazl is still waiting for fs.Stats for an addFile() entry. If addFile() was never called, finalSizeCallback will be called during the call to end(). It is not required to start piping data from outputStream before finalSizeCallback is called. finalSizeCallback will be called only once, and only if this is the first call to end().
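
A minimal sketch of using finalSizeCallback; because the only entry is an uncompressed addBuffer(), finalSize is a real value rather than -1:

var zipfile = new yazl.ZipFile();
zipfile.addBuffer(Buffer.from("hello"), "hello.txt", {compress: false});
zipfile.end(null, function(finalSize) {
  console.log("the zip will be " + finalSize + " bytes"); // called before piping finishes
});
zipfile.outputStream.pipe(fs.createWriteStream("out.zip"));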

outputStream

A readable stream that will produce the contents of the zip file. It is typical to pipe this stream to a writable stream created from fs.createWriteStream().

Internally, large amounts of file data are piped to outputStream using pipe(), which means throttling happens appropriately when this stream is piped to a slow destination.

Data becomes available in this stream soon after calling one of addFile(), addReadStream(), or addBuffer(). Clients can call pipe() on this stream at any time, such as immediately after getting a new ZipFile instance, or long after calling end().

This stream will remain open while you add entries until you end() the zip file.

As a reminder, be careful using both .on('data') and .pipe() with this stream. In certain versions of node, you cannot use both .on('data') and .pipe() successfully.

dateToDosDateTime(jsDate)

jsDate is a Date instance. Returns {date: date, time: time}, where date and time are unsigned 16-bit integers.
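
For example, packing 2014-01-02 03:04:06 (local time) with the conventional MS-DOS bit layout (day in bits 0-4, month in bits 5-8, year minus 1980 in bits 9-15 for date; seconds/2 in bits 0-4, minutes in bits 5-10, hours in bits 11-15 for time):

var dos = yazl.dateToDosDateTime(new Date(2014, 0, 2, 3, 4, 6));
// date: 2 | (1 << 5) | ((2014 - 1980) << 9) === 17442
// time: (6 >> 1) | (4 << 5) | (3 << 11)     === 6275
console.log(dos); // { date: 17442, time: 6275 }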

Regarding ZIP64 Support

yazl automatically uses ZIP64 format to support files and archives over 2^32 - 2 bytes (~4GB) in size and to support archives with more than 2^16 - 2 (65534) files. (See the forceZip64Format option in the API above for more control over this behavior.) ZIP64 format is necessary to exceed the limits inherent in the original zip file format.

ZIP64 format is supported by most popular zipfile readers, but not by all of them. Notably, the Mac Archive Utility does not understand ZIP64 format (as of writing this), and will behave very strangely when presented with such an archive.

Output Structure

The Zip File Spec leaves a lot of flexibility up to the zip file creator. This section explains and justifies yazl's interpretation and decisions regarding this flexibility.

This section is probably not useful to yazl clients, but may be interesting to unzip implementors and zip file enthusiasts.

Disk Numbers

All values related to disk numbers are 0, because yazl has no multi-disk archive support. (The exception being the Total Number of Disks field in the ZIP64 End of Central Directory Locator, which is always 1.)

Version Made By

Always 0x033f == (3 << 8) | 63, which means UNIX (3) and spec version 6.3 (63).

Note that the "UNIX" has implications in the External File Attributes.

Version Needed to Extract

Usually 20, meaning 2.0. This allows filenames and file comments to be UTF-8 encoded.

When ZIP64 format is used, some of the Version Needed to Extract values will be 45, meaning 4.5. When this happens, there may be a mix of 20 and 45 values throughout the zipfile.

General Purpose Bit Flag

Bit 11 is always set. Filenames (and file comments) are always encoded in UTF-8, even if the result is indistinguishable from ASCII.

Bit 3 is usually set in the Local File Header. Because yazl supports both a streaming input API and a streaming output API, it is impossible to know a file's crc32 before processing the file data. When bit 3 is set, Data Descriptors are given after each file's data with this information, as per the spec. But remember that a complete metadata listing is still always available in the central directory record, so if unzip implementations rely on that, as they should, none of this paragraph matters anyway. Even so, some popular unzip implementations do not follow the spec. The Mac Archive Utility requires Data Descriptors to include the optional signature, so yazl includes the optional data descriptor signature. When bit 3 is not used, the Mac Archive Utility requires there to be no data descriptor, so yazl skips it in that case. Additionally, 7-Zip 9.20 does not seem to support bit 3 at all (see issue #11).

All other bits are unset.

Internal File Attributes

Always 0. The "apparently an ASCII or text file" bit is always unset meaning "apparently binary". This kind of determination is outside the scope of yazl, and is probably not significant in any modern unzip implementation.

External File Attributes

Always stats.mode << 16. This is apparently the convention for "version made by" = 0x03xx (UNIX).

Note that for directory entries (see addEmptyDirectory()), it is conventional to use the lower 8 bits for the MS-DOS directory attribute byte. However, the spec says this is only required if the Version Made By is DOS, so this library does not do that.
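
As a worked example of the convention (0o100664 is the default mode used by addBuffer() and addReadStream()):

var mode = 0o100664; // regular file, permission bits rw-rw-r-- (0x81b4)
var externalFileAttributes = (mode << 16) >>> 0; // 0x81b40000; ">>> 0" keeps the value unsigned in JS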

Directory Entries

When adding a metadataPath such as "parent/file.txt", yazl does not add a directory entry for "parent/", because file entries imply the need for their parent directories. Unzip clients seem to respect this style of pathing, and the zip file spec does not specify what is standard in this regard.

In order to create empty directories, use addEmptyDirectory().

Size of Local File and Central Directory Entry Metadata

The spec recommends that "The combined length of any directory record and [the file name, extra field, and comment fields] should not generally exceed 65,535 bytes". yazl makes no attempt to respect this recommendation. Instead, each of the fields is limited to 65,535 bytes due to the length of each being encoded as an unsigned 16 bit integer.

Change History

  • 2.5.1
    • Fix support for old versions of Node and add official support for Node versions 0.10, 4, 6, 8, 10. pull #49
  • 2.5.0
    • Add support for comment and fileComment. pull #44
    • Avoid new Buffer(). pull #43
  • 2.4.3
  • 2.4.2
    • Remove octal literals to make yazl compatible with strict mode. pull #28
  • 2.4.1
    • Fix Mac Archive Utility compatibility issue. issue #24
  • 2.4.0
  • 2.3.1
    • Remove .npmignore from npm package. pull #22
  • 2.3.0
    • metadataPath can have \ characters now; they will be replaced with /. issue #18
  • 2.2.2
  • 2.2.1
    • Fix Mac Archive Utility compatibility issue. issue #14
  • 2.2.0
    • Avoid using general purpose bit 3 for addBuffer() calls. issue #13
  • 2.1.3
    • Fix bug when only addBuffer() and end() are called. issue #12
  • 2.1.2
  • 2.1.1
    • Fixed stack overflow when using addBuffer() in certain ways. issue #9
  • 2.1.0
    • Added addEmptyDirectory().
    • options is now optional for addReadStream() and addBuffer().
  • 2.0.0
    • Initial release.

yazl's People

Contributors: andrewrk, darobin, dbaeumer, joaomoreno, mojodna, popomore, silverwind, thejoshwolfe

yazl's Issues

api to predict final zipfile size

For the purpose of providing streaming zip file downloads (for example), it would be useful to know how big the resulting zipfile will be when it's all done. The details of this API are not clear yet.

Support for adding directories

I believe this is a classic use case and it would be nice to have it directly in this module.

Here's what I currently do:

var fs = require("fs");
var path = require("path");
var yazl = require("yazl");

var noop = Function.prototype;

function addDirectory(zip, realPath, metadataPath, cb) {
  fs.readdir(realPath, function(error, files) {
    if (error == null) {
      var i = files.length;
      if (i === 0) return cb(); // guard: an empty directory would otherwise never invoke cb()
      var resolve = function(error) {
        if (error != null) {
          resolve = noop;
          cb(error);
        } else if (--i === 0) {
          resolve = noop;
          cb();
        }
      };
      files.forEach(function(file) {
        addDirectory(
          zip,
          path.join(realPath, file),
          metadataPath + "/" + file,
          resolve
        );
      });
    } else if (error.code === "ENOTDIR") {
      zip.addFile(realPath, metadataPath);
      cb();
    } else {
      cb(error);
    }
  });
}

var zip = new yazl.ZipFile();

addDirectory(zip, "./my-directory", "my-directory", function(error) {
  if (error) {
    return console.error(error);
  }

  zip.end();
  zip.outputStream.pipe(fs.createWriteStream("./archive.zip"));
});

[Feature Request] Support for AES-256 encryption

It would be awesome to see support for AES-256 encryption in yazl. This would be the last thing I need before I can completely stop calling 7zip via execa, and solve all my zipping needs in pure Node.

I have zero knowledge of how to implement this. Is this very hard to implement?

Thank you very much for this library!!

I have also opened a mirror issue for yauzl: thejoshwolfe/yauzl#121

Issues when extracting with OSX Archive

When extracting zip files generated with yazl 2.4.0 using the Archive app, a filename.zip.cpgz is generated instead of the contents. Zipfiles generated with 2.3.1 work as expected.

I'm using yazl through gulp-zip so it might not be an issue with yazl itself.

output zips cannot be opened by Mac Archive Utility

I noticed this because Groove Basin multi-file downloads were not working for me.

I create a zip like this:

$ echo THIS IS A TEST. > test.txt
$ node test/zip.js test.txt -o test.zip

Then when I try to open the zip with Archive Utility I get this error:

Unable to expand "test.zip" into "yazl".
(Error 2 - No such file or directory.)

Trying to repair the file gives some diagnostic information:

$ zip -FF test.zip --out test-fixed.zip
Fix archive (-FF) - salvage what can
 Found end record (EOCDR) - says expect single disk archive
Scanning for entries...
 copying: test.txt 
    zip warning: no end of stream entry found: test.txt
    zip warning: rewinding and scanning for later entries
    zip warning: zip file empty

7-zip 9.20 fails to extract zip produced by yazl

My script creates the simplest zip file possible:

var yazl = require('yazl');
var fs = require('fs');

var zipfile = new yazl.ZipFile();

var ostream = fs.createWriteStream('out.zip');
zipfile.outputStream.pipe(ostream);

zipfile.addBuffer(new Buffer('hello'), 'hello.txt');
zipfile.end();

I cannot extract this file with 7-zip 9.20 (its latest stable release). This application has a feature that verifies an archive and with the produced zip I get:

(screenshot: 7-Zip's verification error message)

I can successfully extract it using the native Windows Extract... context menu action. I can also extract it using 7-zip 9.38 (their latest beta release).

[Question] .xapk support for zipping/unzipping?

G'day guys, I came across your repository and it seems great, exactly what I'm looking for. I just have a quick question: does this support unzipping .xapk files? Hope I'm not wasting time by asking 🙂

Error: file data stream has unexpected number of bytes

Hi!

For some reason, I occasionally get the following error when trying to create a zip file. I'd be really grateful if you were able to provide any insight.

Uncaught Exception:
Error: file data stream has unexpected number of bytes
    at ByteCounter.<anonymous> (/Applications/Castbridge.app/Contents/Resources/app.asar/node_modules/yazl/index.js:144:99)
    at emitNone (events.js:91:20)
    at ByteCounter.emit (events.js:185:7)
    at endReadableNT (_stream_readable.js:974:12)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)
    at process._tickCallback (internal/process/next_tick.js:98:9)

Thanks!

Windows path issues

I'm using path.normalize() basically everywhere, but yazl is throwing me the following error:

Error: invalid characters in path: fonts\fontawesome-webfont.eot

Is yazl cross-platform/compatible with Windows?

Avoid using general purpose bit 3 when possible

7-zip 9.20 seems to not support general purpose bit 3. It wouldn't be too difficult to avoid relying on general purpose bit 3 sometimes, for example when a file is added with addBuffer().

Enhancing addBuffer() to avoid general purpose bit 3 will enable users to have compatibility with 7-zip 9.20 by buffering their files in RAM and using addBuffer() exclusively.

new Buffer() is deprecated

(node:6537) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.

Please fix this. Thanks!!!

passing a string to addBuffer almost works

Specifically, all the content seems to come through, but the zipfile size metadata is wrong if the string contains Unicode characters (because it uses the string length).

Not really a bug, since you shouldn't pass a string to that function, but it does make debugging harder if you do it by accident. I'd suggest a Buffer.isBuffer(...) check on that parameter.
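
A minimal sketch of that suggested guard, written here as a caller-side wrapper (the helper name is hypothetical):

function addBufferChecked(zipfile, buffer, metadataPath, options) {
  // reject strings early instead of letting the size metadata go wrong
  if (!Buffer.isBuffer(buffer)) {
    throw new Error("expected a Buffer, got " + typeof buffer);
  }
  zipfile.addBuffer(buffer, metadataPath, options);
}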

in-place zip editing support

We have been using yazl and yauzl for over a year and we very much like its stability and pure-JS-ness.
Yet, we face quite a lot use cases of editing an zip in-place. For now, we did what's done in #30: create a temp file with yazl, open the original zip with yauzl, transport all entries (and data) to the temp file and overwrite the original one with the temp file. This has quite a few drawbacks:

  1. performance: memory, IO
  2. sometimes it would be tricky to get a temp file path (original_file+'.tmp' and the global temp folder are both unreliable in some cases)
  3. quite a lot of pitfalls (on macOS, empty folders shall be preserved; externalAttributes blablabla) and lengthy non-reusable code snippets.

We are searching for equivalents of zip -d, zip old.zip new.file, and zip -ur. Delete, add, and update in-place.
Can you share some thoughts?

OS X Archive Utility fails to extract zip file

With the latest version 2.2.0 the following code produces a zip file that OS X's Archive Utility cannot open:

var yazl = require('yazl');
var fs = require('fs');

var zipfile = new yazl.ZipFile();

var ostream = fs.createWriteStream('out.zip');
zipfile.outputStream.pipe(ostream);

zipfile.addBuffer(new Buffer('hello'), 'hello.txt');
zipfile.end();

(screenshot: Archive Utility error dialog)

System log

26/03/15 20:51:23.972 Archive Utility[29825]: bomCopierFatalError:Couldn't read pkzip signature.
26/03/15 20:51:23.973 Archive Utility[29825]: bomCopierFatalError:Not a central directory signature

Memory leak!

It seems the memory is leaking, especially when called addBuffer.

Setting last modified year to older than 1980 produces unexpected results

I'm new to the developer role, please forgive me if I ask questions where the answers may seem obvious.

I'm using gulp-zip, which depends on yazl, and I'm including a modifiedTime option. I need to match the existing code that gulp-zip is replacing by setting the last-modified time to epoch 0. I'm finding that adding a modifiedTime option works fine so long as the time stamp is newer than "1980-01-01T24:00:00Z". If it's anything older, I get a far future date. For example, if I use 0 or "1970-01-01T00:00:00Z", I get a last modified of 1/1/2098.

It looks like the cause is in the dateToDosDateTime function that begins at line 624.

function dateToDosDateTime(jsDate) {
  var date = 0;
  date |= jsDate.getDate() & 0x1f; // 1-31
  date |= ((jsDate.getMonth() + 1) & 0xf) << 5; // 0-11, 1-12
  date |= ((jsDate.getFullYear() - 1980) & 0x7f) << 9; // 0-128, 1980-2108

Is using 1980 instead of 1970 a bug or intentional?

truncated entry

I'm using a variation on the code at #30 (comment) to unzip, edit an entry, then zip again.

I'm having trouble with:

               originalZipFile.on('entry', function(entry) {
                    if (entry.fileName === mdpFileName) {
                       originalZipFile.openReadStream(entry, function(err, readStream) {
                            if (err) {
                                console.error('Failed to read mdp:', err);
                                callback(err);
                            } else {
                                // transformDocument() provides the updated mdp as a string
                                readStream.pipe(
                                    through.obj(transformDocument))
                                    .on('data', function(data) {
                                        console.log("Adding " + data)  // shows the correct data
// but either of the following results in the entry containing truncated data 
                                        //newZipFile.addBuffer(Buffer.from(data), entry.fileName);
                                        newZipFile.addReadStream(intoStream(data), entry.fileName);
                                    }).on('end', () => {
                                        console.info(".. finished mdp")
                                        checkIfDone();
                                      });
                            }
                        });
                    } else {
                        console.info('encountered ' + entry.fileName) // this starts before the previous entry is completed
                        originalZipFile.openReadStream(entry, function(err, readStream) {
                            newZipFile.addReadStream(readStream, entry.fileName);
                            checkIfDone();
                        });
                    }
                });

The code goes on to write the next entry before this entry is finished, and this entry is truncated at the point where the code starts writing the next entry.

I guess there's an easy fix for this, but I'm a novice js/node dev, so any pointers appreciated!

The entry in question here is about 20 KB.

Progress event

Hello. Thank you for the wonderful library!

Is it possible to know how much of the compression is completed (maybe only for the {compress: false} case)? Some kind of 'progress' event.

Maximum call stack size exceeded on OSX

I'm using gulp-atom-shell, which is in turn using gulp-vinyl-zip, which depends on yazl. I'm running into the following problem (OSX 10.10.2):

RangeError: Maximum call stack size exceeded
    at new Buffer (buffer.js)
    at Function.Buffer.concat (buffer.js:199:16)
    at Entry.getLocalFileHeader (.../node_modules/.../yazl/index.js:302:17)
    at pumpEntries (.../node_modules/.../yazl/index.js:166:33)
    at Entry.doFileDataPump (.../node_modules/.../yazl/index.js:81:7)
    at pumpEntries (.../node_modules/.../yazl/index.js:168:11)
    at Entry.doFileDataPump (.../node_modules/.../yazl/index.js:81:7)
    at pumpEntries (.../node_modules/.../yazl/index.js:168:11)
    at Entry.doFileDataPump (.../node_modules/.../yazl/index.js:81:7)
    at pumpEntries (.../node_modules/.../yazl/index.js:168:11)

As best I can tell, it seems to be due to recursion in pumpEntries() (by calling entry.doFileDataPump(), which in turn (possibly?) calls pumpEntries(), etc.).

Maybe it's caused by a very deeply nested file structure, but I'm not sure. I do know that pushing the call to entry.doFileDataPump() to the next frame seems to work, but it feels a little hacky, since I don't actually know why the recursion is occurring.

Thoughts?

What about CLI support

Offering a shebanged binary version would be cool: globbing some files/folders and zipping them.

Documentation: error handling

Hi,

I noticed that you emit errors on the instance of the ZipFile class. So should I catch errors like this?

var zipfile = new yazl.ZipFile();
//...
zipfile.on('error', error=>console.error(error));

This was the only way to prevent termination of the process when, for example, I tried to add to the zip a path that does not exist.

If this is the correct way, could you please add it to the documentation? Also, if this is the correct way, the typings at https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/types/yazl/index.d.ts do not allow this call.

RE: API to predict final zipfile size

This is a follow up of #1 .

I see that we can get the final zipfile size in the callback to .end. However, I would like to know that size way before adding files, so I can pass it along the pipeline before creating the archive.

Would it be possible to add another API to return the predicted size for a list of inputs whose size is provided?

For example:

yazl.size(
  fs.readdirSync("my/folder", { withFileTypes: true }).map(dirEntry => {
    return {
       ...dirEntry,
      size: fs.statSync(dirEntry.name).size,
    };
  }),
  {
    compress: false,
  },
);

Basically, it takes a list of file-like entries, with required name and size properties, and returns the final zipfile size.

It's fine to return -1 when the size can't be determined, but when it can (such as when the size option is passed along with each stream), the size should be computed. It's nicer for yazl to expose this number so that the calculation adheres to its method of archiving.

[info] Small irrelevant benchmark and thanks

Just had an issue with corrupted archives using easy-zip. It turns out that easy-zip2 simply fixes it, but I used the occasion to compare up-to-date solutions, and I want to thank you for your work because your lib came out the clear winner.

Here is the analysis: byteclubfr/copycast#25 (comment)

It's not really rigorous (running the commands only twice each, on a very focused use-case), but I think the results will be reproducible for most cases.

Good job!

ZIP is valid on Linux, but not on Windows

Hi, I've encountered a strange issue with YAZL generated ZIPs. This is described in detail in zadam/trilium#1122 but here are the most relevant parts:

Example ZIP generated by YAZL: root.zip

On Linux it all seems fine:

$ unzip -t root.zip 
Archive:  root.zip
    testing: !!!meta.json             OK
    testing: root/                    OK
    testing: root/Trilium Demo.html   OK
    testing: root/Trilium Demo/       OK
    testing: root/Trilium Demo/Formatting examples/   OK
    testing: root/Trilium Demo/Formatting examples/School schedule.html   OK
...
    testing: root/Trilium Demo/Statistics/Note type count/template/js/renderTable.js   OK
    testing: root/Trilium Demo/Statistics/Note type count/template/js/renderPieChart.clone.html   OK
    testing: root/Trilium Demo/Statistics/Most cloned notes/   OK
    testing: root/Trilium Demo/Statistics/Most cloned notes/template.html   OK
    testing: root/Trilium Demo/Statistics/Most cloned notes/template/   OK
    testing: root/Trilium Demo/Statistics/Most cloned notes/template/js.js   OK
    testing: navigation.html          OK
    testing: index.html               OK
    testing: style.css                OK
No errors detected in compressed data of root.zip.

But on windows most apps seem to report checksum errors:

(screenshots: two Windows archive tools reporting checksum errors)

Any idea what might be causing this?

Any zip created by yazl has a 0kb size and is damaged/corrupted

The title says it all. I'm trying to do a very specific task zipping files into a zip, and this seems to be a big issue I'm having. When I'm zipping the file, it seems to do nothing and just creates an output zip with no contents and no size. It is also a damaged/corrupted zip.

Concurrent calls to addReadStream results in empty files

This standalone code demonstrates the issue.
generateFile returns a readable stream to a file. We use it to create files concurrently (concurrency determines how many in parallel).

If concurrency is 1 - all files are written, but if we use 2 or higher some file will be empty:

var Yazl = require('yazl'), Bluebird = require('bluebird'),
  Stream = require('stream'), Fs = require('fs');

var zipFile = new Yazl.ZipFile(),
  zipStream = zipFile.outputStream;

var concurrency = 5;

console.log('concurrency = ' + concurrency);

// return a readable stream to 1000 bytes file
var generateFile = function(idx) {
  var stream = new Stream.Readable();
  for (var i = 0; i < 1000; i++) {
    stream.push('' + idx);
  }
  stream.push(null);
  return stream;
};

Bluebird.map([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ], function(idx) {

  var fileStream = generateFile(idx);

  zipFile.addReadStream(fileStream, idx + ".txt", { compress: false });

  var totalWritten = 0;
  fileStream.on('data', function(chunk) { totalWritten += chunk.length; });

  var deferred = Bluebird.defer();

  fileStream.on('error', function(err) {
    console.log('fileStream error');
    deferred.reject(err);
  });
  fileStream.on('end', function() {
    console.log("fileStream end: " + totalWritten + " written");
    deferred.resolve();
  });

  return deferred.promise.delay(0);

}, { concurrency: concurrency }).then(function() {
  zipFile.end(function(total) { console.log("zipFile end cb: " + total + " written"); });
}).catch(function(err) {
  zipStream.emit('error', err);
});

var total = 0;

zipStream.on('data', function(chunk) { total += chunk.length });
zipStream.on('end', function() { console.log("zipStream end: " + total + " written"); });

zipStream.pipe(Fs.createWriteStream('tmp.zip'));
>node tmp.js
concurrency = 1
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written
fileStream end: 1000 written
fileStream end: 1000 written
fileStream end: 1000 written
fileStream end: 1000 written
zipFile end cb: 11042 written
zipStream end: 11042 written

>node tmp.js
concurrency = 2
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
zipFile end cb: 6042 written 
zipStream end: 6042 written

>node tmp.js
concurrency = 5
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
fileStream end: 1000 written 
zipFile end cb: 3042 written 
zipStream end: 3042 written                

Manipulating with files in a exist zip archive made by yazl

First of all, thank you for this library; this is one of the best libraries for archiving files.

But I have a question: do you have a task for adding functionality to add, delete, and rename files in an existing zip archive? I have some ideas for how to create this, but I don't think that unzipping and re-zipping all files is the most optimized way to do this.

I know that this library is very old, but the hope is still alive.

Piping streams which come from third party servers

I use the addReadStream API to create a ZIP file based on multiple streams. It works as expected when the streams are locally created. But there is an issue when the streams come from a third party server, and yazl doesn't even detect it. The current implementation takes the first not-done entry, pipes it, and on the "end" event takes the next not-done entry and pipes it, and so on.
When there are many entries to pipe, a significant time interval might pass before some later entry/stream starts being handled. Significant means in this case that the third party server decides the stream has been idle too long, and as a result the stream is just closed by the server and the zip isn't created, because the stream is already aborted. This is what happens when we try to create a zip with yazl based on streams from our S3 server. The solution for such a situation is a new API which doesn't take streams the way addReadStream does, but instead takes a function which creates the stream just before piping of that specific stream/file starts. Something like the following:

ZipFile.prototype.addStreamCreator = function(creator, metadataPath, options) {
   var self = this;
   metadataPath = validateMetadataPath(metadataPath, false);
   if (options == null) options = {};
   var entry = new Entry(metadataPath, false, options);
   self.entries.push(entry);
   entry.setFileDataPumpFunction(async function() {
      creator(metadataPath).then((stream) => {
         entry.state = Entry.FILE_DATA_IN_PROGRESS;
         console.log(`Starting to pump ${metadataPath}`);
         pumpFileDataReadStream(self, entry, stream);
         //pumpEntries(self);
      });
   });
 };

BTW, this is an already-working and tested function.
Thanks

Option for addBuffer to compress if smaller?

It would be nice to have an option to have addBuffer compress the buffer if that makes the resulting file smaller.

I'm not sure it would really work for addFile but it would for addBuffer because the entire buffer is available at once.

I can make the pull request if the API can be decided on. Maybe one of the following?

addBuffer(buffer, path, {
    compress: null
});
addBuffer(buffer, path, {
    compress: 'auto'
});
addBuffer(buffer, path, {
    compress: 'smaller'
});

Zip file created by yazl not openable in Windows Explorer

I'm using Yazl 2.5.1 as the final step of a Node-based build process to package my application. We occasionally find that the files produced will not open in native Windows Explorer (usually Win10), with Windows complaining that 'the compressed (zipped) folder '***' is invalid'.

I am, however, able to open the archive using 7zip (v16.02)

The snippet of code I'm using to zip is

    var yazl = require("yazl");
    var zipfile = new yazl.ZipFile();
    zipfile.addFile(outputFilePath, setupFilename);
    zipfile.outputStream.pipe(fs.createWriteStream(outputFilePath.replace(/\.exe$/,'.zip')));

where variables outputFilePath, setupFilename have been already defined.

fs is using fs-extra

API for random access output

Currently, output is provided by the client piping the outputStream field to the destination (file write stream, socket, etc.). This means yazl can't go back and correct local file headers once we know the file sizes and crc32's. Unfortunately, Mac's Archive Utility is so nonconformant that adding #6 ZIP64 support will break all compatibility with Mac's Archive Utility without correcting local file headers using random access output file creation. This is because of 4 bad decisions in the design of Mac's Archive Utility:

  1. Mac's Archive Utility cares about the redundant information in the local file headers instead of just the authoritative information in the central directory.
  2. Mac's Archive Utility requires the presence of the optional Data Descriptor section (see #7) sometimes.
  3. Mac's Archive Utility requires that there be no dead space between entries where the optional Data Descriptor would go if Mac's Archive Utility thought that it was appropriate for it to be there.
  4. Mac's Archive Utility doesn't recognize ZIP64 format for local file headers with unknown sizes and crc32. Instead, you get confusing and unpredictable behavior (not even an error message) when trying to extract the zip file.

All of these are in violation of the spec. 1-3 are simply stupid design, and I don't know what kind of logic would motivate anyone to write software like that (probably the same logic that designed the zip file spec in the first place). Number 4 above is the most directly relevant problem to this issue, and it isn't quite due to stupid design; it's just oversights and sloppy design.

So basically, Mac's Archive Utility is the Internet Explorer of zip file extractors; it is the notepad.exe of zip file extractors. Horribly designed, nonconformant, unhelpful pain in my ass.

That being ranted, I will probably not have motivation to implement this for a while.

thanks so much

Finally, a zipping library where you can set the header. Thanks so much for making this.

You can close this, ofc.

unzip, edit and re-zip

Hi, not so much of a bug, but a question/request for assistance... I'm trying to create a gulp plugin which edits an entry within a zip file.

I didn't have any luck with yauzl (it complained of an invalid signature when provided with a zip file created by yazl), so I'm using the [unzip](https://github.com/EvanOxfeld/node-unzip) package.

I'd really appreciate it if you could take a quick look at the code below and help me figure out how to update the zip "file" with the updated file, leaving all other files intact.

var through = require('through2');
var yazl = require('yazl');
var unzip = require('unzip');

module.exports = function() {
    function transformZipFile(file, encoding, callback) {

    file.pipe(unzip.Parse())
            .on('entry', function(entry) {
                if (entry.path === 'lib/env-config.js') {
                    entry.pipe(through.obj(transformEnvConfig, function(flushCallback) {
                        // not sure what to do here - I have 2 callbacks which need to be called (not sure what flushCallback does)
                        // I'm not sure if the entry is writable, if I need to update a zip header if I do...
                        // If the entry is not writable, I suppose I need to copy across all of the other files as well as this one to the new output zip archive

                        flushCallback();
                        callback(null, file);
                    }));
                } else {
                    entry.autodrain();
                }
            })
            .on('close', function() {
                console.info('unzip closed');

            });
    }

   function transformEnvConfig(data, encoding, callback) {
        deployEnvConfig.updateEnvConfigData(data.toString()).then( function(newEnvConfigData) {
            console.info('env-config updated');
            // newEnvConfig is correctly updated, now I need to write it back to the zip "file"
            callback(null, newEnvConfigData);
        }, callback);
    }

    return through.obj(transformZipFile);
};

2.5.0 TypeError: this is not a typed array

We use gulp-zip, which uses yazl, and a few days ago our build broke because of changes in 2.5.0. Here's the error:
Here's the error

node_modules/gulp-zip/node_modules/yazl/index.js:111
var eocdrSignatureBuffer = Buffer.from([0x50, 0x4b, 0x05, 0x06]);
                                  ^
TypeError: this is not a typed array.
  at Function.from (native)
  at Object.<anonymous> (node_modules/gulp-zip/node_modules/yazl/index.js:111:35)

I forked gulp-zip and hardcoded yazl to version 2.4.3 till this is fixed. Hopefully it will be fixed soon.

Uncaught TypeError: value is out of bounds

Hi,
We are faced with an issue that throws this error.

At buffer.js:825

TypeError: value is out of bounds
    at checkInt (buffer.js:825:11)
    at Buffer.writeUInt16LE (buffer.js:883:5)
    at getEndOfCentralDirectoryRecord (/Users/adamclason/.atom/packages/Bart/node_modules/yazl/index.js:220:10)
    at pumpEntries (/Users/adamclason/.atom/packages/Bart/node_modules/yazl/index.js:185:33)
    at ByteCounter.<anonymous> (/Users/adamclason/.atom/packages/Bart/node_modules/yazl/index.js:142:5)
    at emitNone (events.js:72:20)
    at ByteCounter.emit (events.js:166:7)
    at endReadableNT (_stream_readable.js:905:12)
    at doNTCallback2 (node.js:465:9)
    at process._tickCallback (node.js:379:17)

Is there some way to increase the count of files?

PS. Thanks for module. :)

addBuffer causes Error: write after end

$ node test/zip.js --buffer index.js -o /dev/null 

events.js:72
        throw er; // Unhandled 'error' event
              ^
Error: write after end
    at writeAfterEnd (_stream_writable.js:132:12)
    at PassThrough.Writable.write (_stream_writable.js:180:5)
    at writeToOutputStream (/home/josh/dev/yazl/index.js:115:21)
    at /home/josh/dev/yazl/index.js:180:9
    at Array.forEach (native)
    at pumpEntries (/home/josh/dev/yazl/index.js:178:20)
    at Object._onImmediate (/home/josh/dev/yazl/index.js:84:9)
    at processImmediate [as _immediateCallback] (timers.js:345:15)
