
Comments (2)

jpmckinney commented on July 21, 2024

Thanks for the explanation! I’m convinced it will be simpler to just collapse the two steps in the end :)

from ijson.

rtobar commented on July 21, 2024

@jpmckinney if I understand your use case correctly from the description above, you basically want to iterate over your target object in the JSON content twice: once just to extract its raw bytes from the stream, then to actually parse it.

The feature you propose is actually far from trivial to implement. Almost all (if not all) parsing backends return already-parsed values, so in order to return the bytes at the given prefix we'd have to reconstruct the original bytes from those values, which is not always possible. Even if all backends gave us access to the underlying bytes and offsets required to keep track of the original contents, note that "return the bytes from the given prefix, without parsing them" is self-contradictory: there is no way to detect the end of a prefix's content without actually parsing the content that follows it. You can get away without creating the actual objects, but the lexing and parsing will still be necessary. Finally, even if all of this were implemented, you'd end up parsing the JSON content twice. In your case that might not be much of a concern given that you are dealing with small amounts of data, but it just doesn't sound necessary.
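To illustrate why the end of a value can't be found without parsing it, here is a minimal sketch using only the standard library's `json.JSONDecoder.raw_decode`, which returns the decoded value together with the index at which it ends. This is not how ijson works internally; the sample document and the way the start offset is located are assumptions made purely for illustration.

```python
import json

# Hypothetical document: the value under "a" contains a "}" inside a string,
# so scanning for the first closing brace would cut the value short.
text = '{"a": {"b": [1, 2, {"c": "}"}]}, "d": 0}'

# Assumed for the example: we already know where the value under "a" starts.
start = text.index("{", 1)

# raw_decode has to parse the whole value to report where it ends.
value, end = json.JSONDecoder().raw_decode(text, start)

# Only after parsing can the raw text of the value be sliced out.
raw = text[start:end]

# A naive scan for the first "}" stops too early, inside the string.
naive = text[start:text.index("}", start) + 1]
```

Here `raw` holds the original text of the nested object, but obtaining `end` required the decoder to lex and parse everything in between, and parsing `raw` again later would be the second pass mentioned above.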

All in all this sounds like a huge amount of work, with a high chance of not being possible to implement in the first place, and for little benefit. So unfortunately I'll have to mark this as "wont fix".

Back to your problem at hand, it's not entirely clear to me why collapsing the two steps into one is undesirable. On the one hand you mentioned you already have some code that accepts the extra prefix, but on the other you say that this results in having to maintain "multiple pipelines". I'm not saying you don't have valid reasons, it's just that I don't understand them with the given information.

A third alternative worth considering is event interception: use ijson.parse to consume and ignore all the events leading up to the beginning of the extra prefix element, then pass the remaining events to ijson.items with your full prefix. I'm not sure if this falls into your category of collapsing two steps into one, so it might be as undesirable for your pipeline design as the first option you proposed.
