Comments (3)
Comment by voutilad
Thursday Jul 22, 2021 at 14:16 GMT
I've been thinking through how to implement this and I see a few usability concerns. In short, I feel the actual problem exists in whatever created the JSON to begin with.
In the above example:
{
  "info_file_properties_modified_time": "18446744062065078016",
  "info_file_properties_modified_time2": 18446744062065078016
}
How do those values represent time? Is it some number of picoseconds from an epoch? It doesn't appear to be nanoseconds since Unix epoch unless I'm mistaken.
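As a quick sanity check on the question above, here is a minimal Python sketch that reads the value as nanoseconds since the Unix epoch. The variable names are illustrative only, and the choice of nanoseconds is just the hypothesis being tested, not something the JSON producer is known to use.

```python
from datetime import datetime, timedelta, timezone

raw = 18446744062065078016
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothesis: the value is nanoseconds since the Unix epoch.
# Integer-divide down to whole seconds to avoid float rounding.
as_nanoseconds = epoch + timedelta(seconds=raw // 10**9)
print(as_nanoseconds.year)

# The value needs all 64 bits, so it sits near the top of the unsigned range.
print(raw.bit_length())
```

Under that reading the timestamp lands more than five centuries in the future, which is implausible for a file modification time.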
I ask because, thinking about this, I don't think having the ability to convert all numbers to Strings makes sense. It probably creates more of a headache, because at some point a String to Integer/Long/etc. conversion will need to occur. Providing the ability to do it on a per-field basis is also challenging, as JSON can contain nested data. Think about:
{
  "map": {
    "aList": [
      { "value": 1 },
      { "value": 1111111111111111111111111111111111111111111 }
    ]
  }
}
Even if we converted to instances of Java's BigInteger, there's no support for them in Neo4j and no way to inspect the type of a property using Cypher (that I know of).
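To make the nesting problem concrete, here is a rough Python sketch (not APOC code) that walks a parsed document and reports every integer that falls outside the signed 64-bit range Neo4j uses for integers. The helper name find_oversized_ints and the `$`-style path notation are made up for illustration.

```python
import json

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

def find_oversized_ints(value, path="$"):
    """Yield (path, number) for every integer outside the signed 64-bit range."""
    if isinstance(value, bool):
        return  # bool is a subclass of int in Python, so skip it explicitly
    if isinstance(value, int):
        if not INT64_MIN <= value <= INT64_MAX:
            yield path, value
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from find_oversized_ints(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_oversized_ints(child, f"{path}[{index}]")

doc = json.loads("""
{
  "map": {
    "aList": [
      { "value": 1 },
      { "value": 1111111111111111111111111111111111111111111 }
    ]
  }
}
""")

oversized = list(find_oversized_ints(doc))
for path, number in oversized:
    print(path, number)
```

Only one of the two "value" fields is affected, and it is buried inside a list inside a map. A per-field option would have to address paths like this one rather than simple top-level keys.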
Let's say we convert it instead to Java String instances. Now you have a value that, again, you can't really inspect in Cypher and conditionally process depending on whether it's a valid Java Number instead of a String. Assuming the value ends up as a Node or Relationship property, you've got a case of mixed datatypes in the database, and the behavior of filtering/sorting/aggregation will most likely be undesired.
For example, if I have nodes with label :Node and a property called name holding various numeric, String, and even temporal data types, then the following Cypher:
MATCH (n:Node)
RETURN id(n), n.name
ORDER BY n.name DESC
produces:
╒═══════╤════════════╕
│"id(n)"│"n.name"    │
╞═══════╪════════════╡
│1      │123         │
├───────┼────────────┤
│3      │0           │
├───────┼────────────┤
│0      │"dave"      │
├───────┼────────────┤
│4      │"0"         │
├───────┼────────────┤
│2      │"2021-07-22"│
└───────┴────────────┘
Comment by voutilad
Thursday Jul 22, 2021 at 14:41 GMT
Another argument for why this seems to be a problem with the JSON itself...
If you look at what I'd consider de facto JSON-centric platforms, jq and Node, neither can handle the sample provided.
Given:
dave@neo-t490s:/tmp$ cat crap.json
{
  "info_file_properties_modified_time": "18446744062065078016",
  "info_file_properties_modified_time2": 18446744062065078016
}
jq will truncate the precision:
dave@neo-t490s:/tmp$ cat crap.json | jq
{
  "info_file_properties_modified_time": "18446744062065078016",
  "info_file_properties_modified_time2": 18446744062065078000
}
With Node loading the same file, it does the same:
dave@neo-t490s:/tmp$ node -e 'console.log(require("/tmp/crap.json"))'
{
  info_file_properties_modified_time: '18446744062065078016',
  info_file_properties_modified_time2: 18446744062065078000
}
So something is amiss here!
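The loss of precision can be reproduced without either tool. The sketch below assumes both of them store JSON numbers as IEEE 754 double-precision floats (true for JavaScript numbers; assumed here for the jq build used above) and pushes the value through Python's float, which is the same format.

```python
raw = 18446744062065078016

# Integers up to 2**53 - 1 are always exactly representable in a double.
MAX_SAFE_INTEGER = 2**53 - 1

as_double = float(raw)          # nearest representable double
round_tripped = int(as_double)  # exact integer value of that double

print(raw > MAX_SAFE_INTEGER)   # the value is far beyond the safe range
print(round_tripped == raw)     # so the round trip does not preserve it
print(round_tripped - raw)      # size of the rounding error
```

At this magnitude adjacent doubles are 2048 apart, so the low-order digits of the original integer cannot survive the conversion.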
On a whim, I took a look at the value in binary in Python:
>>> bin(18446744062065078016)
'0b1111111111111111111111111111110101001001111011110110111100000000'
Could it just be that the system producing this value goofed and didn't properly mask off the upper 32 bits? If you take the lower 32, that is 0b01001001111011110110111100000000, and parse it as a Unix timestamp, it looks (to me) like it could be the intended timestamp:
>>> import time
>>> time.gmtime(int(bin(18446744062065078016)[2:][32:], 2))
time.struct_time(tm_year=2009, tm_mon=4, tm_mday=22, tm_hour=19, tm_min=24, tm_sec=48, tm_wday=2, tm_yday=112, tm_isdst=0)
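The same extraction can be written with a bit mask instead of string slicing. The sketch below does that, and then tries a second reading that this thread does not propose: treating the 64 bits as a two's complement signed integer. That second reading is only an assumption worth checking against whatever produced the file, and the variable names are illustrative.

```python
import time
from datetime import datetime

raw = 18446744062065078016

# Reading 1 (from the comment above): keep only the lower 32 bits.
low32 = raw & 0xFFFFFFFF
print(low32)
print(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(low32)))

# Reading 2 (an assumption, not from the thread): reinterpret the unsigned
# 64-bit value as a two's complement signed integer.
as_signed = raw - 2**64 if raw >= 2**63 else raw
print(as_signed)

# For comparison: seconds between 1601-01-01 and 1970-01-01.
gap_1601_to_1970 = int((datetime(1970, 1, 1) - datetime(1601, 1, 1)).total_seconds())
print(gap_1601_to_1970)
```

The signed value is exactly the negative of the gap between 1601-01-01 and 1970-01-01. That would be consistent with a zero timestamp on a 1601-based clock being converted to Unix seconds and then written out as unsigned, though nothing in the thread confirms where the value came from. Either way, the problem points back at the producer of the JSON.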
Closing due to #222 (comment)
The JSON should be manipulated manually in cases like this.