mscdex / dicer
A very fast streaming multipart parser for node.js
License: MIT License
If the multipart data looks like this:
POST /member.php?mod=register&inajax=1 HTTP/1.1
Host: domainExample
Accept: text/html, application/xhtml+xml, */*
Connection: Keep-Alive
Content-Length: 522
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryzca7IDMnT6QwqBp7
Referer: http://domainExample/member.php?mod=register
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
------WebKitFormBoundaryzca7IDMnT6QwqBp7
Content-Disposition: form-data; name="regsubmit"
yes
------WebKitFormBoundaryzca7IDMnT6QwqBp7
------WebKitFormBoundaryzca7IDMnT6QwqBp7
Content-Disposition: form-data; name="referer"
http://domainExample/./
------WebKitFormBoundaryzca7IDMnT6QwqBp7
Content-Disposition: form-data; name="activationauth"
------WebKitFormBoundaryzca7IDMnT6QwqBp7
Content-Disposition: form-data; name="seccodemodid"
member::register
------WebKitFormBoundaryzca7IDMnT6QwqBp7--
The second part is empty. As a result, the self._parts count ends up lower than the number of times the this._part.on('end', function() {}) handler fires, and the process hangs.
Test case:
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');
const Dicer = require('dicer');

async function main() {
  const r = new Readable({ read() {} });
  const d = new Dicer({ boundary: 'a' });
  d.on('part', async (part) => {
    part.resume();
  });
  // The second part has no headers, only a body.
  r.push(`--a\r\nA: 1\r\nB: 1\r\n\r\n123\r\n--a\r\n\r\n456\r\n--a--\r\n`);
  setImmediate(() => {
    r.push(null);
  });
  // Fails loudly if the pipeline never completes.
  const timer = setTimeout(() => {
    throw new Error('Should be canceled');
  }, 2000);
  await pipeline(r, d);
  clearTimeout(timer);
}

main();
Thank you for this package!
The last relevant commit is from 2021, so I am not sure the project is still active.
Hello,
there is a need to change new Buffer() to Buffer.from or Buffer.alloc for Node 10
see also https://nodejs.org/en/docs/guides/buffer-constructor-deprecation/
Thanks
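The change requested above can be sketched with Node's built-in Buffer API alone. The variable names and the sample string are illustrative and are not taken from dicer's source.

```javascript
// Deprecated forms: new Buffer('...') and new Buffer(size).
// Replacements from Node's built-in Buffer API:

// From a string (was: new Buffer('\r\n--boundary', 'utf8')).
const fromString = Buffer.from('\r\n--boundary', 'utf8');

// Zero-filled buffer of a given size (was: new Buffer(16)).
const zeroed = Buffer.alloc(16);

// Uninitialized buffer: faster, but it must be fully overwritten before use.
const scratch = Buffer.allocUnsafe(16);
scratch.fill(0);

console.log(fromString.length, zeroed.length, scratch.length);
```

Buffer.alloc is the safe default. Buffer.allocUnsafe only makes sense where every byte is written before the buffer is read.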
The README.md suggests that Dicer will emit a special end event "when all parts have been parsed". Upon reading the code, it seems that in reality a finish event is emitted. Is the README.md wrong, or have I misunderstood something?
Hi, I'm the author of Bop and Qap.
I think your results do not reflect the real performance of the parsing libraries involved in your benchmarks.
I have written a message about it, in nodejs google group:
https://groups.google.com/d/msg/nodejs/pd1n-HpAcbk/4jXVz8T3vuoJ
Hello,
Thanks for writing a streaming upload parser for Node.
I think a nice addition to the module would be a shortcut method for the common case of processing a file upload field. Something like this perhaps:
d.onFileHeader('my-file-upload-field-name', function (header) { ... });
d.onFileData('my-file-upload-field-name', function (data) { ... });
So, if you are looking for a part with a file upload named 'my-file-upload-field-name', you can just declare that, with a bit less syntactic overhead.
I am able to parse this, but why does the part not give me the headers and the content separately? Here is the content being parsed:
--END_OF_PART
Content-Length: 337
Content-Type: application/http
content-id: 1
content-transfer-encoding: binary
POST https://www.googleapis.com/drive/v3/files/<var class="apiparam">fileId</var>/permissions?fields=id
Authorization: Bearer <var class="apiparam">authorization_token</var>
Content-Length: 70
Content-Type: application/json; charset=UTF-8
{
"emailAddress":"[email protected]",
"role":"writer",
"type":"user"
}
--END_OF_PART
Content-Length: 353
Content-Type: application/http
content-id: 2
content-transfer-encoding: binary
POST https://www.googleapis.com/drive/v3/files/<var class="apiparam">fileId</var>/permissions?fields=id&sendNotificationEmail=false
Authorization: Bearer <var class="apiparam">authorization_token</var>
Content-Length: 58
Content-Type: application/json; charset=UTF-8
{
"domain":"appsrocks.com",
"role":"reader",
"type":"domain"
}
--END_OF_PART--
The part I am getting looks like this:
'\r\nPOST https://www.googleapis.com/drive/v3/files/<var class="apiparam">fileId/permissions?fields=id\r\nAuthorization: Bearer <var class="apiparam">authorization_token\r\nContent-Length: 70\r\nContent-Type: application/json; charset=UTF-8\r\n\r\n\r\n{\r\n "emailAddress":"[email protected]",\r\n "role":"writer",\r\n "type":"user"\r\n}'
How can I get the part as headers, params, and body separately?
Can you tell me what issue dicer has here?
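The part data shown above appears to be the nested HTTP request carried inside each application/http part, so it can be split in a second step. A minimal sketch, assuming the part body has been collected into a string. parseHttpPart is a hypothetical helper and not part of dicer, and the sample URL and token are placeholders.

```javascript
// Hypothetical helper: split the body of an application/http part into
// request line, headers, and body.
function parseHttpPart(raw) {
  // Drop leading blank lines (the sample payload begins with '\r\n').
  const text = raw.replace(/^(\r?\n)+/, '');
  // The first blank line separates the header block from the body.
  const sep = text.search(/\r?\n\r?\n/);
  const head = sep === -1 ? text : text.slice(0, sep);
  // Simplification: any extra blank lines before the body are dropped too.
  const body = sep === -1 ? '' : text.slice(sep).replace(/^(\r?\n)+/, '');

  const [requestLine, ...headerLines] = head.split(/\r?\n/);
  const [method, url] = requestLine.split(' ');

  const headers = {};
  for (const line of headerLines) {
    const idx = line.indexOf(':');
    if (idx === -1) continue; // skip malformed header lines
    headers[line.slice(0, idx).trim().toLowerCase()] = line.slice(idx + 1).trim();
  }
  return { method, url, headers, body };
}

// Usage with a placeholder request shaped like the one in the report.
const sample =
  '\r\nPOST https://example.invalid/files/abc/permissions?fields=id\r\n' +
  'Authorization: Bearer token\r\n' +
  'Content-Length: 17\r\n' +
  'Content-Type: application/json; charset=UTF-8\r\n' +
  '\r\n\r\n{"role":"writer"}';

const parsed = parseHttpPart(sample);
// Query parameters come from the URL (this throws if the URL is not absolute).
const params = Object.fromEntries(new URL(parsed.url).searchParams);
console.log(parsed.method, params, parsed.headers, JSON.parse(parsed.body));
```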
It might be nice to add a comparison to multipart-parser, which is a lot older but also claims insane speeds.
stream.pipe(dicer); never fired the dicer 'end' event when broken multipart data was passed.
I had to write this code to solve the problem
stream.on('data', function (chunk) {
  dicer.write(chunk);
});
dicer.on('end', function () {
  dicer.__ended__ = true;
  // ok
});
stream.on('end', function () {
  dicer.end();
  if ( dicer.__ended__ ) {
    return;
  }
  // ERROR
});
I expected a SyntaxError or some other error on the dicer 'error' event.
TL;DR: The last part of a stream is discarded when a terminating boundary is missing. When a Content-Length is known for the part (e.g. Motion JPEG streams over HTTP), we propose passing the last part on when its length matches the Content-Length given in its headers.
We are using dicer to transform a http multipart Motion JPEG stream into individual JPEG frames. During the creation of the testsuite for our 'MjpegReader' we created a sample. This will give our CI bot something to work on without requiring access to a camera. We created the sample.mjpeg using curl, e.g.
curl -m 1 http://somecamera/video.cgi > out.mjpg
The -m 1 switch 'cuts' the download after 1 second, which results in a captured mjpeg stream of 1 second and 15 frames.
Feeding this sample to dicer causes the last frame to drop. After investigating, we found that dicer is expecting an ending boundary, e.g.
--myboundary--
as per RFC 1341 (http://www.w3.org/Protocols/rfc1341/7_2_Multipart.html)
We noticed that your test suite has a test case for this scenario, but it is limited to multipart form data for an uploaded file, which differs from our use case of streaming MJPEG data.
In our use case, connections will drop, so the terminating boundary will most likely always be missing. We have tried listening to the emitted events, but we were not able to coax dicer into giving us that (potentially complete) data.
But since we know the Content-Length of the part in question, we (or perhaps even Dicer itself) can ascertain that the part is complete and pass it on anyway.
--myboundary
Content-Type: image/jpeg
Content-Length: 68463
<partdata>
Insights, feedback and comments are very welcome.
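The completeness check proposed above can be sketched as follows. isPartComplete is a hypothetical helper, not a dicer API. It assumes the part's header arrives as an object mapping lower-cased header names to arrays of string values, and that the part's data chunks have been collected by the caller.

```javascript
// Hypothetical helper: decide whether a part whose terminating boundary never
// arrived is nevertheless complete, by comparing the number of bytes received
// with the part's own Content-Length header.
function isPartComplete(header, chunks) {
  // Assumption: header names are lower-cased and values are arrays of strings.
  const values = header['content-length'];
  if (!values || values.length === 0) return false;

  const expected = Number.parseInt(values[0], 10);
  if (!Number.isInteger(expected) || expected < 0) return false;

  // Sum the byte lengths of all Buffer chunks collected for this part.
  const received = chunks.reduce((total, chunk) => total + chunk.length, 0);
  return received === expected;
}

// Usage with made-up data: a 5-byte part delivered in two chunks.
const header = { 'content-type': ['image/jpeg'], 'content-length': ['5'] };
const chunks = [Buffer.from('abc'), Buffer.from('de')];
console.log(isPartComplete(header, chunks));
```

The strict equality is a design choice for this sketch. A real reader might also need to tolerate a trailing CRLF when the connection is cut just after the payload.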
While running a Snyk analysis of our dependencies at work, we found that your library has a vulnerability, which was disclosed on 8 Dec 2021. You can find more information on Snyk.
The specific CVE details and replication instructions can be found: CVE-2022-24434
First of all thanks a lot for this library!
There is currently a high risk security finding recognized by audit-ci, which blocks our pipeline. A fix would be very appreciated.
Dicer will not process any input if you don't set a boundary in the constructor. It looks like this might have been broken in ac6da3b.
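One way to make sure a boundary is always available at construction time is to read it from the request's Content-Type header first. A minimal sketch follows. extractBoundary is a hypothetical helper and the regular expression is an illustration, not the one dicer uses.

```javascript
// Hypothetical helper: pull the boundary parameter out of a multipart
// Content-Type header value. Returns null when no boundary is present.
function extractBoundary(contentType) {
  const match = /^multipart\/[^;]+;.*?\bboundary=(?:"([^"]+)"|([^\s;]+))/i
    .exec(contentType || '');
  // Group 1 holds a quoted boundary, group 2 an unquoted one.
  return match ? (match[1] || match[2]) : null;
}

const boundary = extractBoundary(
  'multipart/form-data; boundary=----WebKitFormBoundaryzca7IDMnT6QwqBp7'
);
console.log(boundary);
// The result would then be passed to the constructor,
// e.g. new Dicer({ boundary }).
```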
Submit a file named 中文.txt in a form. The filename in the part header comes back garbled:
Part header: k: 'content-disposition', v: [ 'form-data; name="filefield"; filename="ä¸æ�.js"' ]
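The garbled filename looks like UTF-8 bytes that were decoded one byte per character (latin1). Assuming that is what happens here, the original name can be recovered by reversing that step. latin1ToUtf8 is a hypothetical helper, not part of dicer.

```javascript
// Hypothetical helper: reinterpret a header value that was decoded as latin1
// (one character per byte) as UTF-8.
function latin1ToUtf8(value) {
  return Buffer.from(value, 'latin1').toString('utf8');
}

// Simulate the garbling: UTF-8 bytes of the filename read back as latin1.
const garbled = Buffer.from('中文.txt', 'utf8').toString('latin1');

// Reversing the step restores the original filename.
const restored = latin1ToUtf8(garbled);
console.log(restored);
```

This only works while the garbled string still holds every original byte. Once a byte has been replaced by a substitute character, as can happen when the text is copied through another encoding, the name cannot be fully recovered.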
This affects all versions of the dicer package. A malicious attacker can send a modified form to the server and crash the node.js service. An attacker could send the payload again and again so that the service continuously crashes.
Detected by:
Black Duck (SCA)
Scan date:
Jun 5, 2024, 12:14 PM
Please find the link below for the GitHub security alert.
Hi guys,
anyone working on this https://security.snyk.io/vuln/SNYK-JS-DICER-2311764 ?
Affected versions of this package are vulnerable to Denial of Service (DoS). A malicious attacker can send a modified form to the server and crash the node.js service. An attacker could send the payload again and again so that the service continuously crashes.
Hi,
I am getting this error often, but I don't know the exact scenario that triggers it.
Any idea how to catch this error?
Error: Part terminated early due to unexpected end of multipart data
at /srv/storage-server/node_modules/dicer/lib/Dicer.js:65:36
at _combinedTickCallback (internal/process/next_tick.js:131:7)
at process._tickDomainCallback (internal/process/next_tick.js:218:9)
Hi,
Veracode finds the following vulnerability in all available versions of the library.
CVE-2022-24434
Denial of Service (DoS): dicer is vulnerable to denial of service. The vulnerability exists in the parseHeader function in HeaderParser.js due to the use of a variable h, which allows an attacker to send a modified form to the server and crash the service.
This library supports parsing multipart requests very well, but I don't see any routines for constructing such requests. What do you think of adding this function? Or did you intend for this to go purely in one direction?