Comments (9)

rtobar commented on August 23, 2024

@prrvchr thanks for opening this.

Could you have a look at #44? It sounds like exactly the same thing you're asking for here in terms of what you'd like to see supported. As mentioned there, there is currently no support for working with generators as sources of data, only file-like objects. That is unfortunate, because internally we do turn file-like objects into generators. You'll also see in that issue why supporting this is not immediately trivial, though it should be possible.

At the moment the only way to work around this limitation is to write a file-like class that internally advances your generator on every call to read, then pass an object of that class to ijson. I'm sure I've seen or used something like that before; I'll post an update here if I find something relevant.
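A minimal sketch of such a wrapper follows. It is not taken from ijson or from the linked issues; the class name `IterReader` and its attributes are hypothetical, and it assumes the generator yields bytes chunks.

```python
class IterReader(object):
    """Hypothetical file-like wrapper exposing read() over an iterable of bytes chunks."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buffer = b""

    def read(self, size=-1):
        # Pull chunks until enough bytes are buffered, or the source runs out.
        while size < 0 or len(self._buffer) < size:
            try:
                self._buffer += next(self._chunks)
            except StopIteration:
                break
        if size < 0:
            # No size given: hand back everything that is left.
            data, self._buffer = self._buffer, b""
        else:
            data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data  # b"" signals end of input
```

An instance could then be passed wherever ijson expects a file object, for example `ijson.items(IterReader(chunks), "item")`, where `chunks` and the `"item"` prefix stand in for your own generator and document layout.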

Please confirm that these two issues are one and the same, and I'll close this as a duplicate of #44.

from ijson.

prrvchr commented on August 23, 2024

Hi rtobar,

Thanks for your API...
It does appear to be the same enhancement request.

I've looked at the code, and it looks like if one could bypass the file_source(f, buf_size=64*1024) function and use the generator directly, that would do it...

rtobar commented on August 23, 2024

@prrvchr thanks for confirming that, I'm closing this issue as a duplicate of #44 then, we can continue further discussions there.

I'll say this before closing though: as I mentioned before, the change is not immediately trivial. Yes, what you found is the generator I mentioned earlier that wraps file-like objects. That's not the issue. The problem is that when the main API functions inspect their input argument to decide which mode to work in (i.e., is this a file-like object, an async file-like object, etc.?), generators are already treated in a particular way to support event interception. See for example the line

elif is_iterable(source):

and the code around it. It's this change in behaviour, together with the peculiarities of the yajl2_c backend, which is implemented in C and needs to be taken care of separately (it doesn't use that file-object-to-generator function, for instance), that makes adding the support you'd like to see more difficult than you'd anticipate.

rtobar commented on August 23, 2024

See #58 (comment) for an (untested) example of a simple file-like wrapper around a generator.

prrvchr commented on August 23, 2024

I got lost: it's not support for generators that I'm looking for, just the possibility of parsing the content of an HTTP JSON response in several chunks.

I own the generator, it's the Requests iter_content() function that I get through the UNO API as a com.sun.star.container.XEnumeration interface.

But in fact the API does not seem to be designed to be used this way, since I would first have to initialize the API with a parser (initialize a buffer) and then make successive calls, one for each chunk of JSON, to an API function that parses these chunks.

It's a shame, especially since I have to remain compatible with Python 2.7 for OpenOffice and there are not many choices left for JSON streaming...

rtobar commented on August 23, 2024

Yes, I get that you own the generator, but the problem is that you can't pass a generator to ijson -- but you can wrap it in a file-like class like the one I linked above, and pass that to ijson.

Alternatively (I should have mentioned this before) you can iterate over your iterator yourself and pass the individual chunks to ijson; see the "push interfaces" section of the docs. That's a more complex API, but it should work if you'd rather go in that direction.
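As a rough sketch of the push approach, assuming the `ijson.sendable_list` and `ijson.items_coro` names described in the "push interfaces" section of the docs: the helper `iter_items`, the `chunks` argument and the prefix value are illustrative names, not part of ijson.

```python
def iter_items(chunks, prefix):
    """Yield objects found under `prefix` while bytes chunks are pushed in.

    `chunks` is any iterable of bytes (hypothetical; e.g. pieces of an
    HTTP response body). ijson is imported inside the function so this
    sketch can be loaded even where ijson is not installed.
    """
    import ijson

    events = ijson.sendable_list()            # receives the objects built by the coroutine
    coro = ijson.items_coro(events, prefix)
    for chunk in chunks:
        coro.send(chunk)                      # push one chunk into the parser
        for obj in events:                    # hand over whatever was completed so far
            yield obj
        del events[:]                         # empty the list before the next chunk
    coro.close()                              # flush; raises if the document is incomplete
    for obj in events:                        # anything completed only at the final flush
        yield obj
```

For example, `list(iter_items(response_chunks, "item"))` would collect the elements of a top-level JSON array, with `response_chunks` standing in for the caller's own iterator.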

prrvchr commented on August 23, 2024

Thanks for the "push interface", I use the ijson.sendable_list and it works perfectly (sample code)

I had to use a modified version (one that uses only relative imports) in order to integrate and use the ijson package in my LibreOffice / OpenOffice .oxt extension file.

Maybe this deserves an enhancement request, so that I could do without my modified version...

Anyway thank you for your help and your API.

prrvchr commented on August 23, 2024

I'm seeing some strange behavior:

I get a parser with parser = ijson.parse_coro(events).
If I leave my iteration prematurely, with a break for example, once I have obtained the data I need, then the program stops with an error when I try to close the parser with parser.close().

The error is:

yajl2.py, line 50, in basic_parse_basecoro
    raise exception(error)
ijson.common.IncompleteJSONError: parse error: premature EOF

In fact I'm trying to perform lazy loading and quit as soon as I have the data I'm looking for...

rtobar commented on August 23, 2024

@prrvchr the error on parser.close is expected, as the underlying parser is being flushed but it realises that a full JSON document hasn't been pushed through. I think you can safely ignore the error if you know you are breaking prematurely; otherwise the error is useful for when you think you are pushing a full JSON document through, but you aren't actually.
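One way to act on this is sketched below, assuming `ijson.IncompleteJSONError` is the exception raised on close (as in the traceback above) and using `ijson.items_coro` rather than `parse_coro` for brevity. The helper `first_item` and its arguments are hypothetical names.

```python
def first_item(chunks, prefix):
    """Return the first object under `prefix` (or None), stopping early.

    ijson is imported inside the function so this sketch can be loaded
    even where ijson is not installed.
    """
    import ijson

    events = ijson.sendable_list()
    coro = ijson.items_coro(events, prefix)
    result, stopped_early = None, False
    for chunk in chunks:
        coro.send(chunk)
        if events:
            result, stopped_early = events[0], True
            break                      # stop feeding: the document is left incomplete
    try:
        coro.close()
    except ijson.IncompleteJSONError:
        # Ignore the error only when we cut the input short on purpose;
        # otherwise the input really was truncated, so let it propagate.
        if not stopped_early:
            raise
    if not stopped_early and events:
        result = events[0]             # object completed only at the final flush
    return result
```

The flag keeps the two cases apart: a deliberate early exit silences the error, while an input that ends unexpectedly still surfaces it.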

I see you've opened another issue about the relative imports, which is great, I'd rather we don't mix too many topics in one issue. Likewise, if you have other issues with the push interface let's also discuss them in a new issue.
