Comments (3)
Nice job! I'm happy to have run across this.
Thanks!
I see some of your conformance test exceptions (e.g. not-wf-sa-173) say "claims to test an illegal char, but tests the wrong char". How are these the wrong character?
Good catch. It looks like not-wf-sa-173 may have been skipped by mistake, since it does in fact appear to test the correct char.
not-wf-sa-168 is an example of a test case whose description claims that it tests the presence of an unpaired surrogate, D800, but D800 doesn't actually appear in the test case. I'm not sure exactly why this is, but I suspect some of the test cases (which I downloaded directly from https://www.w3.org/XML/Test/xmlconf-20020606.htm) may have encoding issues. Either that or I'm missing something?
Good catch.
Actually, just a random catch. I just happened to poke there.
not-wf-sa-168
This is a weird one. The file is UTF-8 and contains the bytes 0xED 0xA0 0x80, which is what you get when U+D800 is pushed through the UTF-8 encoding scheme (https://www.compart.com/en/unicode/U+D800). So I think the test case does contain the right character?
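To make the byte arithmetic concrete, here is a minimal sketch that unpacks the three bytes by hand using the standard three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx). The variable names are only illustrative and nothing here comes from parse-xml itself.

```javascript
// The three bytes found in the not-wf-sa-168 test file.
const surrogateBytes = [0xED, 0xA0, 0x80];

// Three-byte UTF-8 layout: 4 payload bits from the lead byte,
// then 6 payload bits from each continuation byte.
const surrogateCodePoint =
  ((surrogateBytes[0] & 0x0F) << 12) |
  ((surrogateBytes[1] & 0x3F) << 6) |
  (surrogateBytes[2] & 0x3F);

console.log(surrogateCodePoint.toString(16).toUpperCase()); // D800
```

The bit pattern does carry U+D800, even though a conforming UTF-8 decoder is required to reject encoded surrogates.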
Ah, I think I see what's happening. When the test case is read by fs.readFile() and decoded as UTF-8, 0xED, 0xA0, and 0x80 each become U+FFFD, because Node's Buffer#toString() replaces invalid UTF-8 byte sequences with the replacement character instead of producing a lone surrogate. Since U+FFFD is valid in an XML document, parsing succeeds when the test expects it to fail.
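The decoding behavior described above can be reproduced without the test file. This is a rough sketch that builds the same bytes in memory. It assumes a standard Node.js build where the global TextDecoder supports the fatal option, and the variable names are illustrative.

```javascript
// Bytes of "<doc>" + 0xED 0xA0 0x80 + "</doc>", built in memory so no file is needed.
const docBytes = Buffer.concat([
  Buffer.from('<doc>', 'ascii'),
  Buffer.from([0xED, 0xA0, 0x80]),
  Buffer.from('</doc>', 'ascii'),
]);

// Lossy decode: invalid UTF-8 is replaced with U+FFFD, so the surrogate is gone.
const lossyText = docBytes.toString('utf8');
const lossyInner = lossyText.slice('<doc>'.length, -'</doc>'.length);
console.log([...lossyInner].map((ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase()));

// Strict decode: with fatal set, the decoder throws instead of substituting U+FFFD.
let strictDecodeFailed = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(docBytes);
} catch (err) {
  strictDecodeFailed = true;
}
console.log(strictDecodeFailed);
```

Either way, the string that reaches the parser never contains U+D800, which matches the test passing when it should fail.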
When I manually construct an XML doc in the Node.js REPL containing U+D800 and try to parse it, parseXml fails as it should:
> parseXml('<doc>\uD800</doc>')
Uncaught Error: Invalid character (line 1, column 6)
<doc>�</doc>
I'm not sure if there's a good way to read this test file and feed it to parseXml while preserving the invalid byte sequence, but I suppose I could convert it to a manual test. Will give this some thought.
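One possible way to keep the invalid sequence, sketched here as an assumption rather than anything parse-xml does: read the file as a raw Buffer (fs.readFile() with no encoding argument) and decode it with a lenient hand-written decoder that lets encoded surrogates through. The decodeLenientUtf8 helper below is hypothetical, and it skips checks a real decoder needs, such as rejecting overlong encodings.

```javascript
// Hypothetical helper: decode UTF-8 bytes but keep encoded surrogates
// (e.g. 0xED 0xA0 0x80) as lone surrogate code units instead of U+FFFD.
// Simplified sketch: it does not reject overlong encodings.
function decodeLenientUtf8(bytes) {
  let out = '';
  let i = 0;
  while (i < bytes.length) {
    const lead = bytes[i];
    let codePoint;
    let length;
    if (lead < 0x80) {
      codePoint = lead;
      length = 1;
    } else if ((lead & 0xE0) === 0xC0) {
      codePoint = lead & 0x1F;
      length = 2;
    } else if ((lead & 0xF0) === 0xE0) {
      codePoint = lead & 0x0F;
      length = 3;
    } else if ((lead & 0xF8) === 0xF0) {
      codePoint = lead & 0x07;
      length = 4;
    } else {
      throw new Error(`Invalid lead byte at offset ${i}`);
    }
    if (i + length > bytes.length) {
      throw new Error(`Truncated sequence at offset ${i}`);
    }
    // Fold in 6 payload bits from each continuation byte.
    for (let j = 1; j < length; j++) {
      const next = bytes[i + j];
      if ((next & 0xC0) !== 0x80) {
        throw new Error(`Invalid continuation byte at offset ${i + j}`);
      }
      codePoint = (codePoint << 6) | (next & 0x3F);
    }
    if (codePoint > 0x10FFFF) {
      throw new Error(`Code point out of range at offset ${i}`);
    }
    // String.fromCodePoint accepts surrogate code points and emits them as-is.
    out += String.fromCodePoint(codePoint);
    i += length;
  }
  return out;
}

// Same bytes as the test file's problem area, built in memory for the example.
const sampleBytes = Buffer.concat([
  Buffer.from('<doc>', 'ascii'),
  Buffer.from([0xED, 0xA0, 0x80]),
  Buffer.from('</doc>', 'ascii'),
]);
const lenientText = decodeLenientUtf8(sampleBytes);
console.log(lenientText === '<doc>\uD800</doc>'); // true
```

Passing the resulting string to parseXml should then hit the same "Invalid character" error as the REPL session above, since the lone surrogate survives decoding.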