Comments (7)
Hi @sheymann,
You'll have to prepare your data a bit beforehand.
For example, you could:
- generate a list of all distinct node names (Michael, Selina, Rana, Selma...)
- assign each one a sequential ID, starting from 1
- write your nodes.csv file with the nodes in that same order, for example:
USERNAME
Michael
Selina
Rana
Selma
...
You can add more columns (properties) to your nodes if necessary.
- write your relations.csv file with at least 3 columns: a source node, a target node, and a relation type
The first 2 columns should reference the nodes by the sequential IDs you assigned above.
For the 3rd column, I'm assuming simple friendship:
SOURCE TARGET RELTYPE
1 2 friend
3 4 friend
1 4 friend
...
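The preparation steps above can be sketched in Python. This is only a rough sketch, not part of batch-import: the function names `edges_to_batch_files` and `write_tsv` are made up for illustration, the tab delimiter is an assumption, and the header names are simply the ones used in the examples above.

```python
import csv
import io


def edges_to_batch_files(edges, rel_type="friend"):
    """Turn a list of (source_name, target_name) pairs into file rows.

    IDs are assigned in order of first appearance, starting from 1,
    so the n-th data row of the nodes file has ID n.
    """
    ids = {}  # node name -> sequential ID
    for source, target in edges:
        for name in (source, target):
            if name not in ids:
                ids[name] = len(ids) + 1

    # dicts keep insertion order, so names come out in ID order
    node_rows = [["USERNAME"]] + [[name] for name in ids]
    rel_rows = [["SOURCE", "TARGET", "RELTYPE"]]
    rel_rows += [[ids[s], ids[t], rel_type] for s, t in edges]
    return node_rows, rel_rows


def write_tsv(rows, handle):
    """Write rows as tab-separated text (the delimiter is an assumption)."""
    csv.writer(handle, delimiter="\t", lineterminator="\n").writerows(rows)


edges = [("Michael", "Selina"), ("Rana", "Selma"), ("Michael", "Selma")]
node_rows, rel_rows = edges_to_batch_files(edges)

# Write to an in-memory buffer here; pass open("nodes.csv", "w", newline="")
# instead to produce a real file.
buffer = io.StringIO()
write_tsv(rel_rows, buffer)
print(buffer.getvalue())
```

With these three sample edges, the relation rows come out as the same ID pairs as in the example above (1 2, 3 4, 1 4).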
Hope this helps
from batch-import.
Hm... looking at your profile, @sheymann, and your work on http://linkurio.us/,
I guess your point was more about batch-import accepting an edge list alone as input than about how to convert the input data (something you surely have all figured out).
Anyway, it may help others.
Hey, yes, my question was focused on pure edge lists, as many complex-network datasets are encoded this way.
@redapple Thanks for chiming in. I think it would make sense to also support edge-only CSV data, and to allow indexable keys in the start/end columns. I just thought about using https://github.com/jankotek/MapDB as an in-memory cache.
@sheymann Would you then just leave off the node file and assume that it is meant this way? That would probably also mean supporting multiple relationship files, since a single file could then carry only one property-value mapping for nodes.
Well, this is an extreme case where we only know the graph structure, and we don't care about node properties (we may have edge properties though) :)
e.g. all of these datasets:
http://snap.stanford.edu/data/
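For an edge-only dataset like these, the node list can be derived from the edges themselves. A minimal Python sketch, assuming one whitespace-separated pair per line and '#' at the start of comment lines (check each file, as layouts may differ); `read_edge_list` and `distinct_nodes` are hypothetical names, not part of batch-import, and the sample IDs are made up.

```python
def read_edge_list(lines):
    """Yield (source, target) string pairs from a plain-text edge list.

    Assumes one edge per line, whitespace-separated, with '#' starting
    a comment line; blank lines are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, target = line.split()[:2]
        yield source, target


def distinct_nodes(edges):
    """Return node keys in order of first appearance."""
    seen = {}
    for source, target in edges:
        seen.setdefault(source, None)
        seen.setdefault(target, None)
    return list(seen)


# Made-up sample in the assumed layout
sample = """# FromNodeId ToNodeId
30 1412
30 3352
3 30
"""
edges = list(read_edge_list(sample.splitlines()))
nodes = distinct_nodes(edges)
print(nodes)
```

The keys are kept as strings on purpose, so the same code works whether the dataset uses numeric IDs or names.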
By the way, I started a Python helper module to export RDB data dumps into Neo4j:
https://github.com/redapple/sql2graph
For now, it uses quite a lot of memory (when experimenting with MusicBrainz data).
I could write something similar to convert pure edge lists into nodes.csv, rels.csv, index.csv... but it'd be in Python ;)
and having that support directly in batch-import would be easier/cleaner.