Comments (13)
I already implemented a new map_type option; it's on the master branch. Could you give it a try?
from ijson.
ijson.items (I'm assuming you are using this?) doesn't return a dict by default; it returns an iterator that you can use to navigate individual values. You should be able to use that to build your OrderedDict directly with the values coming out of the iteration, but without more details it would be difficult to give more advice.
from ijson.
Here is some sample code:
echo "{}{}" > test.json
import ijson.backends.yajl2_cffi as ijson

with open('test.json', 'rb') as f:
    for item in ijson.common.items(ijson.parse(f, multiple_values=True), ''):
        print(type(item))
Output is:
<class 'dict'>
<class 'dict'>
from ijson.
In your example you are selecting the top-level element, which is an object; thus you get dictionaries. Have you had a look at the examples in https://github.com/ICRAR/ijson/blob/master/README.rst? I think you are basically after the lower-level ijson.parse function, but again without realistic JSON content I can't say for sure.
from ijson.
I essentially want the option for this line to be map = OrderedDict() instead of map = {}: https://github.com/isagalaev/ijson/blob/e252a50db34b71cc2b5e0b9a77cd76dee8e95005/ijson/common.py#L116
I can re-implement ijson.common.items to use a new sub-class of ObjectBuilder that contains the change above, but that seems like a lot of effort to get a change in behaviour that is very commonly used in the standard library's json module.
from ijson.
There are a couple of gotchas with modifying the object builder directly:
- It is used by most, but not all backends (the C back-end re-implements this logic in C for efficiency), so it's not exactly a one-liner.
- It would affect all levels in your JSON structure, which is not necessarily what you want. As far as I understand it, you want to preserve the order at the highest level, not necessarily in the deeper objects.
So again, I think such a change would be a bit of an overkill.
On the other hand, I think you basically want a modified version of this: isagalaev#62 (comment)
from collections import OrderedDict
import ijson
from ijson.common import ObjectBuilder

def objects(data):
    key = '-'
    builder = None
    for prefix, event, value in ijson.parse(data):
        if not prefix and event == 'map_key':
            if builder:
                yield key, builder.value
            key = value
            builder = ObjectBuilder()
        elif prefix.startswith(key):
            builder.event(event, value)
    if builder:
        yield key, builder.value

with open('json.json', 'rb') as data:
    result = OrderedDict(objects(data))
for key, value in result.items():
    print(key, value)
from ijson.
I do want it to affect all levels :) (like with the object_pairs_hook in the standard library). Converting to an OrderedDict at the top level is easy: just OrderedDict(item) in my earlier snippet.
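For reference, the standard-library behaviour mentioned here can be sketched as follows. This uses only the json module, not ijson, and the sample document is made up for illustration.

```python
import json
from collections import OrderedDict

# Hypothetical sample document with a nested object.
text = '{"b": 1, "a": {"d": 2, "c": 3}}'

# object_pairs_hook is called with a list of (key, value) pairs for every
# JSON object in the document, at every nesting level.
data = json.loads(text, object_pairs_hook=OrderedDict)

print(type(data).__name__)       # OrderedDict
print(type(data["a"]).__name__)  # OrderedDict
print(list(data["a"].keys()))    # ['d', 'c']
```

The nested object is an OrderedDict too, which is the "all levels" behaviour being asked for.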
from ijson.
Right now, I need to do something like this (won't work with all backends):
from collections import OrderedDict

import ijson.backends.yajl2_cffi as ijson

# Copy of ijson.common.items, using different builder.
def items(prefixed_events, prefix):
    prefixed_events = iter(prefixed_events)
    try:
        while True:
            current, event, value = next(prefixed_events)
            if current == prefix:
                if event in ('start_map', 'start_array'):
                    builder = OrderedObjectBuilder()
                    end_event = event.replace('start', 'end')
                    while (current, event) != (prefix, end_event):
                        builder.event(event, value)
                        current, event, value = next(prefixed_events)
                    del builder.containers[:]
                    yield builder.value
                else:
                    yield value
    except StopIteration:
        pass

# Copy of ObjectBuilder, using OrderedDict instead of dict.
class OrderedObjectBuilder(ijson.common.ObjectBuilder):
    def event(self, event, value):
        if event == 'start_map':
            map = OrderedDict()
            self.containers[-1](map)
            def setter(value):
                map[self.key] = value
            self.containers.append(setter)
        else:
            super().event(event, value)

Later, my code calls items.
from ijson.
@jpmckinney thanks for the pointer to the mechanism used by the standard lib; I actually didn't know about it. Such a generic solution sounds good, i.e., provide an object_pairs_hook argument or similar that can be used by the object builder. Do you think you could provide a PR with the change, including a test? Otherwise I could implement it when I get some time, probably next week. Note that the C backend will need this as well.
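A rough sketch of what a configurable builder could look like, under stated assumptions: SketchBuilder and its map_type argument are hypothetical names for illustration, not ijson's actual ObjectBuilder, and the events are written by hand rather than produced by a parser.

```python
from collections import OrderedDict


class SketchBuilder:
    """Hypothetical builder; map_type chooses the class used for JSON objects."""

    def __init__(self, map_type=dict):
        self.map_type = map_type
        self.value = None
        self.key = None

        def set_root(value):
            self.value = value

        # Stack of setters; the top one stores a value in the current container.
        self.containers = [set_root]

    def event(self, event, value):
        if event == 'map_key':
            self.key = value
        elif event == 'start_map':
            mapping = self.map_type()
            self.containers[-1](mapping)

            def setter(value):
                mapping[self.key] = value

            self.containers.append(setter)
        elif event == 'start_array':
            array = []
            self.containers[-1](array)
            self.containers.append(array.append)
        elif event in ('end_map', 'end_array'):
            self.containers.pop()
        else:
            # Scalar events such as 'string', 'number', 'boolean', 'null'.
            self.containers[-1](value)


# Hand-written events for {"b": 1, "a": {"d": 2}}.
events = [
    ('start_map', None), ('map_key', 'b'), ('number', 1),
    ('map_key', 'a'), ('start_map', None), ('map_key', 'd'),
    ('number', 2), ('end_map', None), ('end_map', None),
]
builder = SketchBuilder(map_type=OrderedDict)
for event, value in events:
    builder.event(event, value)
result = builder.value
```

Because the mapping class is chosen once in the constructor, every nested object uses it, which matches the "all levels" requirement. A C backend would need the equivalent change separately.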
from ijson.
@jpmckinney while we are at this, maybe offering an option for using something other than lists could also be a possibility worth considering.
from ijson.
I'm not sure that I can get to it this week, and I'm not familiar with Python in C.
Using something other than lists sounds interesting; however, it hasn't come up as an option in the standard library. I think it's fine to start with object_pairs_hook.
The standard library does allow alternative constructors through parse_float, parse_int, parse_constant, and object_hook (which is lower priority than object_pairs_hook). Another optional parameter is cls, which allows even more customization of decoding. However, I think these can be implemented later if there is demand (for example, I prefer ijson's default behaviour of using Decimal instead of float).
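To illustrate the standard-library parameters listed above, here is a small sketch using only the json module. The sample document and the two hook functions are made up for the example.

```python
import json
from decimal import Decimal

# Hypothetical sample document.
text = '{"price": 1.10, "count": 3}'

# parse_float receives the literal text of each JSON float, so Decimal
# keeps the value exactly as written.
data = json.loads(text, parse_float=Decimal)
print(data)  # {'price': Decimal('1.10'), 'count': 3}

# When both hooks are passed, only object_pairs_hook is called.
calls = []

def pairs_hook(pairs):
    calls.append('object_pairs_hook')
    return dict(pairs)

def dict_hook(obj):
    calls.append('object_hook')
    return obj

json.loads(text, object_hook=dict_hook, object_pairs_hook=pairs_hook)
print(calls)  # ['object_pairs_hook']
```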
from ijson.
It works! Thanks
from ijson.
Great! I'll close then for now, and will also adjust the title for future reference.
from ijson.