I don't know if this is too narrow a use case for this library, or if there is another


Feature request: Return bytes at prefix about ijson HOT 2 CLOSED

icrar commented on July 21, 2024

Feature request: Return bytes at prefix

from ijson.

Comments (2)

jpmckinney commented on July 21, 2024

Thanks for the explanation! I’m convinced it will be simpler to just collapse the two steps in the end :)


rtobar commented on July 21, 2024

@jpmckinney if I understand your use case correctly from the description above, you basically want to iterate over your target object in the JSON content twice: once just to extract its raw bytes from the stream, then to actually parse it.

The feature you propose is actually far from trivial to implement. Almost all (if not all) parsing backends return already-parsed values, so in order to return the bytes at the given prefix we'd have to reconstruct the original bytes from those values, which is not always possible. Even if every backend gave us access to the underlying bytes and offsets needed to keep track of the original contents, note that "return the bytes from the given prefix, without parsing them" is self-contradictory: there is no way to detect where a prefix's content ends without actually parsing that content. You can avoid creating the actual objects, but the lexing and parsing will still be necessary. Finally, even if all of this were implemented, you'd end up parsing the JSON content twice. In your case that might not be much of a concern given that you are dealing with small amounts of data, but it just doesn't sound necessary.
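To illustrate the point about lexing, here is a minimal standard-library sketch (not part of ijson) of what finding the end of a value's raw bytes involves. The function name `find_container_end` and the sample document are hypothetical, and the sketch assumes the value is an object or array in a complete, valid UTF-8 buffer. Even without building any Python objects, it has to track string and escape state to avoid being fooled by brackets inside strings:

```python
import json


def find_container_end(data: bytes, start: int) -> int:
    """Return the index just past the JSON object/array that starts at data[start]."""
    if data[start:start + 1] not in (b"{", b"["):
        raise ValueError("expected '{' or '[' at the start offset")
    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(data)):
        ch = data[i]  # an int, since we index into bytes
        if in_string:
            if escaped:
                escaped = False      # the escaped character is skipped
            elif ch == 0x5C:         # backslash starts an escape
                escaped = True
            elif ch == 0x22:         # unescaped quote closes the string
                in_string = False
        elif ch == 0x22:             # quote opens a string
            in_string = True
        elif ch in b"{[":
            depth += 1
        elif ch in b"}]":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("input ended before the value was closed")


# Hypothetical input: the nested string contains '}' and ']' on purpose.
doc = b'{"a": {"s": "} \\" ]", "n": [1, 2]}, "b": 3}'
start = doc.index(b"{", 1)           # offset of the value under "a"
raw = doc[start:find_container_end(doc, start)]
parsed = json.loads(raw)             # the second pass over the same bytes
```

The final `json.loads` call is the second parse mentioned above: the bytes were already scanned once just to find where they end.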

All in all this sounds like a huge amount of work, with a high chance of not being possible to implement in the first place, and for little benefit. So unfortunately I'll have to mark this as "wont fix".

Back to your problem at hand, it's not entirely clear to me why collapsing the two steps into one is undesirable. On the one hand you mentioned you already have some code that accepts the extra prefix, but on the other you say that this results in having to maintain "multiple pipelines". I'm not saying you don't have valid reasons, it's just that I don't understand them with the given information. A third alternative worth considering is using events interception: use ijson.parse to consume all the events leading up to the beginning of the extra prefix element and ignore them, then pass the rest to ijson.items with your full prefix. I'm not sure if this falls into your category of collapsing two steps into one, so it might be as undesirable for your pipeline design as the first option you proposed.

