archiverjs / node-archiver
a streaming interface for archive generation
Home Page: https://www.archiverjs.com
License: MIT License
planning to add the following archive types down the road.
When using the Zip type with the store option set to true and more than 8 files, I get the warning below. I'm using Node v0.10.24 and Archiver v0.5.0-alpha (as well as Archiver v0.4.10).
(node) warning: possible EventEmitter memory leak detected. 11 listeners added. Use emitter.setMaxListeners() to increase limit.
Trace
at EventEmitter.addListener (events.js:160:15)
at Readable.on (_stream_readable.js:689:33)
at ChecksumStream.Readable.pipe (_stream_readable.js:491:8)
at ArchiverZip._processFile (streamer/node_modules/archiver/lib/archiver/zip.js:159:15)
at Archiver._processQueue (streamer/node_modules/archiver/lib/archiver/core.js:121:10)
at Archiver.append (streamer/node_modules/archiver/lib/archiver/core.js:182:8)
The error is not produced when setting the store option to 'false'.
I've traced this down to:
https://github.com/ctalkington/node-archiver/blob/master/lib/modules/zip/index.js#L147
Right now, when addFile() is called with a stream as source, the library consumes it entirely first. I suppose this is because the CRC field of the local header needs to be placed before the actual data.
We could leverage the data descriptor (see the spec) in the zip format and avoid consuming the whole source up front.
Bit 3: If this bit is set, the fields crc-32, compressed
size and uncompressed size are set to zero in the
local header. The correct values are put in the
data descriptor immediately following the compressed
data.
What do you think?
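For illustration, here is roughly what setting that bit looks like when building a local file header by hand. This is a sketch only, with field offsets taken from the APPNOTE spec, not archiver's actual implementation:

```javascript
// Local file header sketch: 30 fixed bytes before the file name.
var header = Buffer.alloc(30); // zero-filled

header.writeUInt32LE(0x04034b50, 0);  // local file header signature "PK\x03\x04"

var flags = 0;
flags |= 1 << 3;                      // bit 3: CRC-32 and sizes deferred
header.writeUInt16LE(flags, 6);       // general purpose bit flag field

// CRC-32 (offset 14), compressed size (18) and uncompressed size (22)
// stay zero; the real values go in a data descriptor written after the
// compressed file data, so the source stream never needs buffering.
```

With bit 3 set, a compliant reader knows to look for the trailing data descriptor instead of trusting the zeroed header fields.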
so I had some ideas about helpers to ease some confusion. looking for some feedback.
trackOutput(stream) - this would allow archiver to hook on to a stream's end/close event, rather than its own, for the finalize callback.
toFile(path) - this would tell archiver to create an internal write stream and output to it. the stream would be exposed so one can still access its events if required.
also tar will have an internal gzip option configured through zlib just as with zip.
i'm using archiver to implement https://github.com/maxogden/dir-tar-stream but ran into what I think is a bug
basically if I .append() a fs.createReadStream('someFileOfLengthZero'), then archiver won't ever finish writing the tar.
here is my fix: https://github.com/maxogden/dir-tar-stream/blob/09eb540990aa6e4c82b852c71477a761b7408e82/index.js#L25-L31
long file paths cause issues with opening
I just recently started to have issues creating ZIP files. I use this module along with Knox to download files that have been previously uploaded to S3. It worked just fine in the past.
Here is the short version of my code:
// Any __variables__ means I get them somewhere else; it doesn't really make a difference
var archive = require('archiver')('zip');

res.contentType('application/octet');
res.header('Content-Disposition', 'attachment; filename="somezip.zip"');
archive.pipe(res);

var added = 0;
var callback = function () {
  added++;
  if (added >= __items__.length) {
    archive.finalize(function () {
      res.end();
    });
  } else {
    add(__items__[added]);
  }
};

var add = function (item) {
  __s3__.getFile(__path__, function (err, s3res) {
    // Some error check here.
    // Here is my problem: the callback stops getting called after a while.
    // Sometimes it's after adding 40 items, other times after 5 (testing with 126 items).
    archive.append(s3res, {
      name: __name__
    }, callback);
  });
};

add(__items__[added]);
// I listen for errors using archive.on('error'), but it doesn't throw any errors.
So it stops randomly after between 5 and 60 items have been appended to the ZIP. However, if I try with a TAR, it works just fine. This method used to work before; it was probably working fine on Node 0.8, but I've upgraded to 0.10.
See the following image: a few attempts at downloading a ZIP, and the same attempts with a TAR file instead (by just changing archiver's first argument and the filename).
Is there anything I can do to provide more debugging information?
Hi,
Sometimes my zip file has errors like this:
file #17: bad zipfile offset (local header sig): 784439
file #18: bad zipfile offset (local header sig): 802437
testing: medias/images/q_444_256x256.jpg OK
testing: medias/images/q_445_256x256.jpg OK
testing: data.json OK
At least one error was detected in B1L12.zip.
See complete unzip -t here: https://gist.github.com/andrecaribe/6144876
I think my problem is in the archive.append method (https://gist.github.com/andrecaribe/6144848), inside the loop. Is it? What's the best way to do this?
Saw that here and thought it's a nice-to-have.
Hi ctalkington,
first of all many thanks for your ongoing efforts to create one unified archiving module for Node.js.
I'm wondering to what extent I can create directories in the archives using this library. From screening parts of the code, it doesn't look like it is supported at the moment; please correct me if I'm wrong.
Do you plan to add a feature to create directories within the archives? If yes, what is the roadmap for this feature? I'd add this feature myself if I could, but unfortunately I don't have in-depth knowledge of the zip format.
Thanks,
bf
I am generating zips where a few of the files end up with deflate data of block type 3, which is invalid. Even zlib.inflateRaw throws an error on its own DeflateRaw-generated data.
This could be an issue with zlib.DeflateRaw.
this may be moved to v0.7, but the plan is to allow one to import an existing stream that contains zip data to be appended to.
will be looking into lzma implementations in JS and making 7zip a supported archiver.
You changed the way archiver.file() works with a call to sanitizePath, and it seems to have broken it. When I pass an absolute path like /path/to/foo, sanitizePath strips off the leading slash, which breaks the code.
Error: invalid file: Users/skkinast/Desktop/foo.dust
at Archiver.file
Notice no leading slash.
I modified pack-zip.js to create the archive with:
var archive = archiver('zip', {zlib: {level: 0}});
Then I zipped two 100MB files that I had created with dd from /dev/zero. The output file was 623 bytes.
Doing the same operation with "zip -0 -r test.zip StuffToZip" gave me a 200MB file, as I expected.
@ampgcat moving the conversation here so as not to take over compress's issues.
i'm looking for a handful of testers that work with archives and node on a regular basis and would be willing to help test new features. i really want to make this module the go-to for all things archiving in node.
add a parser function that returns an array of items within an archive and various details about them.
First off, thanks for the hard work, it was great to find an archive module that functions well with node's streams.
However, after getting set up, I realized I was having an issue on one of my Linode boxes where corrupt zip archives were being generated. I didn't have time to track down why they were corrupt, but I did isolate what was causing it, and I stripped down part of the app to what was breaking:
https://gist.github.com/inolen/5789563
Most importantly, this was the culprit:
https://gist.github.com/inolen/5789563#file-gistfile1-js-L60
The eachLimit function exists to allow at most limit iterators running at any one time. I had originally throttled the each to avoid issues related to opening too many file descriptors, but while reviewing the code (in order to track down the corruption issue) I realized it was silly and just executed each append sequentially using async's eachSeries.
To my surprise, this fixed the issue. Now again, this wasn't affecting my local machine, but was affecting my production linode box; it's 100% reproducible there.
I don't have time right now to dig into the issue, but I wanted to leave this issue in case you want to poke around and see if something is obviously wrong (nothing immediately jumped out at me scrolling through zip.js and core.js). If not, I'll try to circle back to this in the next few weeks.
Edit: if it helps, both my local and the remote machines are running v0.10.11.
in master with node 0.8.18 on OS X; i poked around a little but couldn't see anything obvious, other than the example files not existing, but I was testing with the readme and package.json
I just tried to zip a folder containing files with French characters (é), and while testing the zip file I get the following:
testing: fr/particuliers/recherchez-une-soci+�t+�.ht
error: invalid compressed data to inflate
at this moment, i'm thinking it will be best to support the current stable version v0.8 and the previous stable version v0.6. following that pattern, when v0.10 is released, v0.6 support will be dropped.
the supported versions will always be defined as part of package.json, as some features may require a certain patch version to work properly.
I use this module to compress a folder.
Here is my code (CoffeeScript):
zip = archiver 'zip'
output = fs.createWriteStream targetFile
zip.pipe output

zip.bulk [
  expand: true
  cwd: folder
  src: "*"
]

zip.finalize (err, bytes) ->
  if err then throw err
  callback?()
Is there anything I missed, or am I using it wrong?
When I try to compress files more than 4GB the library crashes. It seems that it doesn't support ZIP64. This is the stack trace:
TypeError: value is out of bounds
at TypeError (<anonymous>)
at checkInt (buffer.js:784:11)
at Buffer.writeUInt32LE (buffer.js:841:5)
at /path-to-app/node_modules/archiver/node_modules/zip-stream/lib/headers.js:41:32
at Array.forEach (native)
at ZipHeaderCentralFooter.ZipHeader.toBuffer (/path-to-app/node_modules/archiver/node_modules/zip-stream/lib/headers.js:28:15)
at Object.exports.encode (/path-to-app/node_modules/archiver/node_modules/zip-stream/lib/headers.js:198:24)
at ZipStream._writeCentralDirectory (/path-to-app/node_modules/archiver/node_modules/zip-stream/lib/zip-stream.js:221:22)
at ZipStream.finalize (/path-to-app/node_modules/archiver/node_modules/zip-stream/lib/zip-stream.js:271:8)
at Zip.finalize (/path-to-app/node_modules/archiver/lib/modules/zip/index.js:27:15)
at Archiver._onQueueEnd (/path-to-app/node_modules/archiver/lib/modules/core/index.js:66:18)
at g (events.js:175:14)
at EventEmitter.emit (events.js:92:17)
at Queue.run (/path-to-app/node_modules/archiver/lib/modules/core/queue.js:49:12)
at Queue.next (/path-to-app/node_modules/archiver/lib/modules/core/queue.js:39:8)
at null.<anonymous> (/path-to-app/node_modules/archiver/lib/modules/core/index.js:85:17)
at onend (/path-to-app/node_modules/archiver/node_modules/zip-stream/lib/zip-stream.js:119:5)
at DeflateRawChecksum.<anonymous> (/path-to-app/node_modules/archiver/node_modules/zip-stream/lib/zip-stream.js:130:5)
at DeflateRawChecksum.g (events.js:175:14)
at DeflateRawChecksum.EventEmitter.emit (events.js:117:20)
at _stream_readable.js:920:16
at process._tickCallback (node.js:415:13)
this appears isolated to projects with a high number of files queued and node v0.8 at the moment. no real debugging info was available but I tracked it down to the pipe process between zlib and self (internal buffer).
for now this will remain an open issue until more cases arise to assist in debugging.
Hey, I'm using archiver to stream zips over http and when the server gets shut down while there's still a transfer in progress, I get this seemingly uncatchable error thrown.
Right now, I'm setting
archive.catchEarlyExitAttached = true;
to avoid getting the handler attached, but this seems pretty hackish. Would an option to control this behavior be reasonable?
Environment:
If the callback passed to finalize() exits the node process, the generated zip file can be corrupted (cannot be read by unzip or similar). This suggests that finalize() is invoking its callback before the output zip file has actually finished writing to disk.
I've put some sample code to reproduce this bug in a repo at https://github.com/townxelliot/node-archiver-issue-40. This seems to reproduce the issue 100% of the time.
(Note that I noticed this bug when attempting to use node-archiver as part of a grunt task, where you need to be able to tell grunt when an asynchronous task is finished. As I was creating a zip file in such a task, I ran into this bug.)
I would like to create folders in my zip, is it possible?
planning to add a gzip option to tar which does the somewhat repetitive pipe through gzip before emitting, to save everyone a little time.
Hi,
I've just started using the master version of your module and have found that it is not possible to decompress the created out.zip file from example/pack.js (I've tried to get back test1.txt and test2.txt using 7-Zip).
i was testing using Node.js 0.8.14 and Node.js 0.9.9 on Linux.
On Node.js 0.8.14 I got an out.zip of 931 bytes, and on Node.js 0.9.9 I got 662 bytes.
I'm using async 0.2.5, lodash 0.10 and readable-stream 0.2.0.
Can you please provide in pack.js example information (comment) about what is the expected output file size and MD5/SHA256 checksum?
Also can you provide an additional example without async library?
I am trying to upload a few files using the node-formidable module. Once I get the part stream for a file, I append it to the archive. In the form's end event I call archive.finalize, but it produces a garbled zip file. Any ideas why?
Below is the full code link
https://gist.github.com/vishr/5675747
@niknah would you mind creating a new simplified PR for the bug you reported for zipstream? now that the name change is done, i think we can nail it down.
Please see: Ziv-Barber/officegen#20
I am a user of officegen, which uses archiver. We run node 0.8.26, on heroku (linux). Using [email protected], everything has been working reliably. We just tried updating to [email protected], and we are seeing a high failure rate where progress just stops partway through the zip file creation, with no events (like error) emitted. The point in the archive where it dies varies run to run, and sometimes it succeeds, but with bigger zip files (say, a multi-slide pptx, vs. a single slide one), it fails almost all the time. Interestingly, using the exact same app code, node version, and module versions, i don't see these failures on my OSX Mavericks machine. So it might be a timing issue, or some platform-specific issue.
I apologize because at this time i don't have a great repro case -- the way i'm seeing this issue involves a ton of code on top of node-archiver. But i wanted to at least raise the issue here to see if anyone has seen this, or to see if it rings any bells on possible causes.
I did skim through the issue backlog in this project, and it seems perhaps similar to cases where the high water mark setting was causing hangs? not sure.
Thanks for your help.
Hello,
during my use of node-archiver I have noticed that the .finalize() function just returns one argument, written, to the callback function. This kind of stands out in the rest of my code, so I wonder if there aren't any errors which could possibly be thrown during the execution of this function. If there are errors which can be thrown, then to what extent can we return them to the callback function? This might be a place to apply the widely used function (err, data) { .. } callback style.
If there are in fact no errors to return from this function, why is .finalize() asynchronous in the first place?
I'm just wondering and hope you can point me in the right direction.
hi, i use archiver to zip a dir, but when i unzip the zip file with the node module "adm-zip" it goes wrong. the error message is "invalid format".
i found out that when "adm-zip" unzips the file, it checks the end-of-zip signature 0x06054b50, but sometimes this check fails, with about a 1/10 probability.
To solve this problem i wrote a validation function, almost the same as in the adm-zip module, to validate the zip format in finalize's callback. but the validation function doesn't work at all.
my valid function is:
function isZip(file) {
  var ENDSIG = 0x06054b50;
  var LOCSIG = 0x04034b50;
  var COMMENT_LENGTH = 0xFFFF;
  var END_HEADER_SIZE = 22;

  var inBuffer = fs.readFileSync(file);
  var i = inBuffer.length - END_HEADER_SIZE,   // END header size
      n = Math.max(0, i - COMMENT_LENGTH),     // 0xFFFF is the max zip file comment length
      endOffset = 0;                           // start offset of the END header

  for (; i >= n; i--) {
    if (inBuffer[i] != 0x50) continue;         // quick check that the byte is 'P'
    if (inBuffer.readUInt32LE(i) == ENDSIG) {  // "PK\005\006"
      endOffset = i;
      break;
    }
  }

  if (endOffset && inBuffer.readUInt32LE(0) == LOCSIG) {
    return true;
  }
  return false;
}
the function is used like:
var files = this.util.find(dir);
var componentZip = './a.zip',
    componentStream = fs.createWriteStream(componentZip);

archive.pipe(componentStream);

for (var i = 0; i < files.length; i++) {
  archive.append(fs.createReadStream(files[i]), { name: files[i].replace(dir, '') });
}

archive.on('error', function (err) {
  callback(err);
});

var zipStream = fs.createReadStream(componentZip);
archive.finalize(function (error, written) {
  if (!error) {
    if (!isZip(componentZip)) {
      // check whether the zipped file is in valid zip format
      callback('sorry, zip failed, please try publish again');
    } else {
      // something else
    }
  }
  // it seems that the validation function reads an incomplete file.
  // if i validate the zip directly it passes, but in the finalize callback it fails.
  // the version of archiver is 0.4.9
});
how can i solve this problem? it is really confusing.
Hello.
All browsers download the zip except Chrome, which displays the binary data in the browser.
Has anyone experienced this? I'm guessing it's a Content-Type issue.
Hi, is it possible to set compression level or disable compression?
var archive = archiver('zip', {level: 0}); - ?
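Judging from the pack-zip snippet quoted earlier in this thread, compression is tuned through the zlib sub-options rather than a top-level level key; a sketch (option names assumed from that snippet and from the store option mentioned above, and they may differ between 0.x releases):

```javascript
var archiver = require('archiver');

// zlib level 0 = no compression (deflate stored blocks)
var noCompression = archiver('zip', { zlib: { level: 0 } });

// some versions also accept store: true for raw stored entries
var stored = archiver('zip', { store: true });
```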
A particularly helpful feature for myself and possibly others would be more abstraction: I'd like to just give a folder path and an output archive file name, and it would build an archive containing all the contents of the folder. (This is my first issue on GitHub and I apologize if I've done it incorrectly; I wanted to label it as an enhancement like the others but I'm not sure how.)
Hello,
It looks like it's using the URL to define the name of the zip; is it possible to explicitly define the name?
Thanks!
Header produced by a command-line tar archive:
0000000: 626c6f636b732f7363726f6c6c626172 blocks/scrollbar
0000010: 2f5f7363726f6c6c6261722e73637373 /_scrollbar.scss
0000020: 00000000000000000000000000000000 ................
0000030: 00000000000000000000000000000000 ................
0000040: 00000000000000000000000000000000 ................
0000050: 00000000000000000000000000000000 ................
0000060: 00000000303030303634340030303031 ....0000644.0001
0000070: 37353000303030313735300030303030 750.0001750.0000
0000080: 30303030343532003132323136303732 0000452.12216072
0000090: 30363200303135373732002030000000 062.015772. 0...
00000a0: 00000000000000000000000000000000 ................
00000b0: 00000000000000000000000000000000 ................
Header produced by node-archiver:
0000000: 626c6f636b732f616c6572742f5f616c blocks/alert/_al
0000010: 6572742e736373730000000000000000 ert.scss........
0000020: 00000000000000000000000000000000 ................
0000030: 00000000000000000000000000000000 ................
0000040: 00000000000000000000000000000000 ................
0000050: 00000000000000000000000000000000 ................
0000060: 00000000303030303636342030303030 ....0000664 0000
0000070: 30303020303030303030302030303030 000 0000000 0000
0000080: 30303030373330203031323231363037 0000730 01221607
0000090: 32303632303136313633002030000000 2062016163. 0...
00000a0: 00000000000000000000000000000000 ................
00000b0: 00000000000000000000000000000000 ................
The files are different, but have the same mtime == "12216072063" (in octal format).
In the node-archiver variant the mtime is prepended with '0' for some strange reason (in the HeaderTarFile.prototype._prepNumeric function).
Some programs (e.g. the Apache Commons Compress Java library) fail to read such tar files; they assume that mtime is a NUL- or space-terminated string.
So it would be nice to fix the mtime serialization.
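A conforming serializer pads with leading zeros only up to the field width minus one, keeping room for the terminator byte. A rough sketch of what a numeric-field writer ought to produce (my own illustration; tarOctal is a hypothetical helper, not the library's _prepNumeric):

```javascript
// USTAR numeric fields are ASCII octal digits, left-padded with '0' and
// terminated by a NUL (or space) so strict readers like Apache Commons
// Compress can find the end of the number.
function tarOctal(value, width) {
  return value.toString(8).padStart(width - 1, '0') + '\0';
}

console.log(JSON.stringify(tarOctal(0o644, 8)));  // "0000644\u0000"
```

The bug described above amounts to padding across the full width, which overwrites the terminator with an extra '0' digit.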
I'm piping the archive to an http response.
Example of a corrupt zip file generated: https://www.dropbox.com/s/jdtgq9s51pjyhgt/bad.zip
Example of a valid zip file generated:
https://www.dropbox.com/s/wqlc8fx4qgy5tbu/good.zip
If I make a request locally everything works fine.
When receiving the http response over a slower network, the zip file is corrupt in some (most) cases.
Trying to create a ZIP archive using async, I ran into this problem:
...\node_modules\archiver\lib\archiver\zip.js:306
file.lastModifiedDate = utils.convertDateOctal(file.date);
^
TypeError: Object # has no method 'convertDateOctal'
Looking at the sources for utils, I just found two convert methods, convertDateTimeDos and convertDateTimeOctal, so it seems to me to be a misspelling.
Is it addFile or append? The examples and docs are completely different, but both ways seem to exist.
think zero-length files are messing with archiver's event flow, as per @F21's issue in gruntjs/grunt-contrib-compress#29.
Hi,
First of all, thanks for this neat and useful module.
I'm generating zip files and now I need to mark a file entry as executable. The zip format does not support this, but the tar format should. However, I don't see how to tell this npm module the permissions of a file entry. I've looked at the source code for the tar format and I only see that you can specify the name and date of the file in the "data" argument. Maybe this argument could be extended to support a "mode", like fs.createWriteStream.
Thank you.
Hi,
I am not sure if it is a bug or expected behaviour in 0.5, but the following code behaves differently in 0.4 and 0.5.0:
var fs = require('fs'),
archiver = require('archiver');
function go() {
var zipStream = fs.createWriteStream('./profile.zip'),
archive = archiver('zip', { forceUTC: true });
zipStream.on('close', function() {
console.log('ok, created');
});
archive.pipe(zipStream);
archive.finalize();
}
go();
In 0.4, on 'close' is called, but not with 0.5. Note that the zip is created.
(I know that creating an empty zip file seems weird... It's just a test scenario actually)
Could you update the current version in the npm registry? If you don't want to replace the current version of node-archiver, you can publish with --tag beta
(Publish Beta Versions of NPM Modules)