Comments (11)
@bug249286 Clients should be sending non-latin1 header parameter values using the format (encoded words) defined by RFC5987. If they don't send that, then the values are assumed to be encoded as latin1.
You can safely convert the filename to utf-8 since latin1 preserves individual bytes. For example: Buffer.from(filename, 'latin1').toString('utf8') or using TextDecoder; node supports both.
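A minimal sketch of the conversion described above, assuming the client really did send UTF-8 bytes in the filename parameter. The helper names latin1ToUtf8 and latin1ToUtf8Strict are hypothetical and are not part of busboy.

```javascript
// Re-interpret a string that was decoded as latin1 as UTF-8 instead.
// Assumption: the client sent raw UTF-8 bytes in the filename parameter.
function latin1ToUtf8(filename) {
  // latin1 maps each byte 0x00-0xFF to one code point, so this recovers the original bytes.
  return Buffer.from(filename, 'latin1').toString('utf8');
}

// Same idea with TextDecoder. With { fatal: true } it throws a TypeError on
// byte sequences that are not valid UTF-8 instead of inserting U+FFFD.
function latin1ToUtf8Strict(filename) {
  const bytes = Buffer.from(filename, 'latin1');
  return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
}

// Simulate what a latin1-assuming parser would report for a UTF-8 filename.
const sentName = 'ทดสอบภาษาไทย.xlsx';
const reportedName = Buffer.from(sentName, 'utf8').toString('latin1');

console.log(latin1ToUtf8(reportedName) === sentName); // true
console.log(latin1ToUtf8Strict(reportedName) === sentName); // true
```

The strict variant can be useful when you are not in control of every client, since a filename that genuinely was latin1 will usually fail to decode as UTF-8 rather than silently turning into replacement characters.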
from busboy.
Parsing fails if filename contains UTF-8 characters
Content-Disposition: form-data; name="file"; filename="ทดสอบภาษาไทย.xlsx"
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
const bb = busboy({ headers: req.headers, defCharset: 'utf8' });
info {
filename: 'à¸\x97à¸\x94สà¸à¸\x9Aภาษาà¹\x84à¸\x97ย.xlsx',
encoding: '7bit',
mimeType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
}
Filename parsing fails.
How can I fix this?
Thank you.
from busboy.
I agree that it is a valid solution (and I am using it already), but my question was more about how we are supposed to do this if we wanted to do it "properly". My use cases are limited to common browsers (and react-native to some extent), and no client sends UTF-8 data the way described in RFC5987 (using extended parameters).
This, in addition to the comment in RFC7578 regarding not using extended parameters for filename in form data, leaves me confused as to what the proper way to handle this is. In that sense, starting to add filename* into formdata client side would seem to go in the opposite direction from everyone else (in addition to being tedious).
Anyway, I just now see that this is a quite old issue, so it's probably not the correct place to discuss this. I only found out recently through an update of the multer library. Sorry for the noise.
from busboy.
I'm confused, both of your curl statements are the same. Where is the utf-8 filename?
from busboy.
Yeah, sorry about that, I've updated the report.
from busboy.
Ok, this should be fixed in master now. Can you give it a try?
from busboy.
It seems to be working fine now, thank you very much.
from busboy.
@mscdex Thank you very much.
from busboy.
I am confused about how we are supposed to send UTF-8 (or other) strings. While RFC5987 does mention extended parameters, RFC7578 discourages their use, and actual browsers (including Chrome, Edge and Firefox) do not send the extra filename*, instead putting the utf-8 name in filename.
As it is, we can do the aforementioned conversion by hand outside of busboy (or in my case multer), but is that really something that falls outside the scope of this library, seeing that the "supported" method of using filename* is not used much in the wild?
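For reference, here is a rough sketch of what an RFC5987-style filename* value looks like when built by hand on the client. The function name encodeExtValue is hypothetical, and the sketch always uses UTF-8 with an empty language tag; it is only meant to illustrate the extended-parameter format being discussed, not something browsers or busboy do for you.

```javascript
// Build an RFC5987 ext-value: charset, an (empty) language tag, then the
// percent-encoded UTF-8 bytes of the value.
function encodeExtValue(value) {
  const encoded = encodeURIComponent(value)
    // encodeURIComponent leaves ' ( ) * unescaped, but RFC5987 does not allow them raw.
    .replace(/['()*]/g, (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase());
  return "UTF-8''" + encoded;
}

console.log(encodeExtValue('naïve file.txt'));
// UTF-8''na%C3%AFve%20file.txt

// How the parameter would appear in a header line:
console.log('Content-Disposition: form-data; name="file"; filename*=' + encodeExtValue('naïve file.txt'));
```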
from busboy.
@CleyFaye Writing a library like this that works for everyone everywhere is basically impossible. IMO it's safer to err on the side of history for compatibility purposes until all clients overwhelmingly assume UTF-8 values for filenames. The workaround I provided earlier in this issue is a valid solution if you are in control of the client and want to assume UTF-8 (or any other charset for that matter).
from busboy.
@CleyFaye If you're using HTML forms, I would say either set the page's encoding to utf-8 or set the form's accept-charset attribute to utf-8 (it defaults to the page encoding if not set). If nothing else, another potential solution that would work for HTML and non-HTML would be to send the encoding as the first field in the form.
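A rough sketch of the "encoding as the first field" idea, under a few assumptions: the field name _charset_ is chosen here for illustration (it mirrors the hidden-field convention from HTML forms), createFilenameDecoder is a hypothetical helper rather than a busboy API, and filenames arrive already decoded as latin1.

```javascript
// Tracks the charset announced by an early form field and uses it to
// re-decode filenames that the parser decoded as latin1.
function createFilenameDecoder(charsetFieldName = '_charset_') {
  let decoder = null;
  return {
    // Call this for every non-file field, in the order the fields arrive.
    onField(name, value) {
      if (name === charsetFieldName && decoder === null) {
        // TextDecoder throws a RangeError for labels it does not recognise.
        decoder = new TextDecoder(value);
      }
    },
    // Call this for every filename; it is returned unchanged if no charset was announced.
    decode(filename) {
      if (decoder === null) return filename;
      return decoder.decode(Buffer.from(filename, 'latin1'));
    },
  };
}

const names = createFilenameDecoder();
names.onField('_charset_', 'utf-8');
const reported = Buffer.from('ทดสอบ.xlsx', 'utf8').toString('latin1');
console.log(names.decode(reported) === 'ทดสอบ.xlsx'); // true
```

With busboy this would presumably be wired up by calling onField from the 'field' event handler and decode on info.filename in the 'file' event handler. It only works if the client sends the charset field before any file parts.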
from busboy.