
neo4j-got's Introduction

Exploring and analysing the Game Of Thrones dataset using visualisation powered by vis.js

Using data from epic fantasy series for graph exploration and analysis seems to be really popular with the data science community, with many posts on Game of Thrones, Lord of the Rings, and Star Wars. I thought it would be fun to follow some of those posts using GoT data, even if the finale of the GoT television series was disappointing, to say the least.
The visualisation capabilities of neovis.js are certainly more impactful, flexible (thanks to its configuration options), and performant than those provided in Neo4j's desktop browser.
[Image: neovis.js visualisation; HTML available.] Compare this with the same Cypher query in Neo4j browser: [image]

Some of the features listed for neovis.js used in the visualisation:

  1. Connect to Neo4j instance to get live data
  2. User-specified labels and property to be displayed
  3. User-specified Cypher query to populate the graph
  4. Specify edge property for edge thickness
  5. Specify node property for community / clustering
  6. Specify node property for node size

But first, as usual, we start by cleaning and importing the data to the database.

Step 1: Import Data

We will be using data made public by GitHub user mathbeveridge and presented by Andrew Beveridge and Jie Shan in their article Network of Thrones, which was published in Math Horizons magazine. As the data is much smaller than the Yelp dataset I explored recently, it should be relatively fast to run the LOAD CSV Cypher command directly from the browser command line.
The files are in CSV format with the header "Source,Target,Type,weight,book": [image] Since Type is always "undirected", we will ignore it. Before importing, it is good practice to create any unique constraints first: we certainly do not want multiple nodes for the same person appearing in our graph, and the index created along with the constraint will hopefully make the MERGE commands run faster.

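// Ensure one node per person; the constraint also creates an index on id.
// Run this as its own statement before the import.
CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE;

// One edge file per entry; '45' is the file used for books 4 and 5.
UNWIND ['1','2','3','45'] AS book
LOAD CSV WITH HEADERS FROM
'https://raw.githubusercontent.com/mathbeveridge/asoiaf/master/data/asoiaf-book' + book + '-edges.csv' AS value
MERGE (source:Person {id: value.Source})
MERGE (target:Person {id: value.Target})
WITH source, target, value.weight AS weight, book
// The relationship type is built per book (INTERACTS_1 ... INTERACTS_45),
// so APOC is used to merge a relationship with a dynamic type.
CALL apoc.merge.relationship(
    source,
    'INTERACTS_' + book,
    {},
    {weight: toFloat(weight)},
    target) YIELD rel
RETURN DISTINCT 'done';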
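
As a rough illustration of what the import does, the Python sketch below mirrors its logic in memory: one URL and one relationship type per book file, one node per distinct id, and one relationship per (source, type, target) whose weight is set only when it is first created. Only the URL pattern, the CSV header, and the INTERACTS_ prefix come from the statement above. The helper names (edge_url, rel_type, load_edges) and the sample rows are made up for illustration and are not data from the repository.

```python
import csv
import io

# URL pattern and book suffixes as used in the LOAD CSV statement.
BASE = "https://raw.githubusercontent.com/mathbeveridge/asoiaf/master/data/asoiaf-book"
BOOKS = ["1", "2", "3", "45"]


def edge_url(book):
    """Build the edge-file URL for one book suffix."""
    return f"{BASE}{book}-edges.csv"


def rel_type(book):
    """Relationship type created for one book suffix."""
    return "INTERACTS_" + book


# Illustrative rows only (NOT real data); the header matches the files.
SAMPLE = """Source,Target,Type,weight,book
A,B,undirected,3,1
B,C,undirected,5,1
A,B,undirected,9,1
"""


def load_edges(text, book):
    """Mimic the two MERGEs on Person and the merge on the relationship."""
    nodes = set()   # MERGE on Person: one node per distinct id
    edges = {}      # one relationship per (source, type, target)
    for row in csv.DictReader(io.StringIO(text)):
        nodes.add(row["Source"])
        nodes.add(row["Target"])
        key = (row["Source"], rel_type(book), row["Target"])
        # Weight is only set on creation, so a repeated row changes nothing.
        edges.setdefault(key, float(row["weight"]))
    return nodes, edges


nodes, edges = load_edges(SAMPLE, "1")
```

The Type column is read but never used, matching the decision to ignore it.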

Step 2: Identify persons of interest

A node is said to be pivotal if it lies on all shortest paths between two other nodes in the network. We can find the top pivotal nodes using:

MATCH (a:Person), (b:Person) WHERE id(a) > id(b)
MATCH p=allShortestPaths((a)-[:INTERACTS_1*]-(b)) WITH collect(p) AS paths, a, b
MATCH (c:Person) WHERE all(x IN paths WHERE c IN nodes(x)) AND NOT c IN [a,b]
WITH collect(c.id) AS pivotal_nodes
UNWIND pivotal_nodes as node
RETURN node, COUNT(node) AS num ORDER BY num DESC

The top results are the usual suspects in book 1 with Daenerys-Targaryen not very high on the list:

"node" "num"
"Eddard-Stark" 2731
"Tyrion-Lannister" 2549
"Jon-Snow" 2157
"Catelyn-Stark" 1628
"Robert-Baratheon" 1530
"Drogo" 731
"Daenerys-Targaryen" 560
"Walder-Frey" 549
"Benjen-Stark" 382

In books 4 and 5 (the same query run against the INTERACTS_45 relationships), the Mother of Dragons (MoD) has shot to the top of the list, alongside fan favorite Jon Snow.

"node" "num"
"Jon-Snow" 18688
"Daenerys-Targaryen" 18038
"Cersei-Lannister" 15859
"Stannis-Baratheon" 14354
"Jaime-Lannister" 12309
"Tyrion-Lannister" 9885
"Asha-Greyjoy" 7466
"Arya-Stark" 7419
"Theon-Greyjoy" 6208
"Victarion-Greyjoy" 5860

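To make the pivotal-node definition concrete, here is a minimal Python sketch on a toy graph small enough to check by hand. It is not the Cypher query above, just the same idea: for every unordered pair (a, b), a third node c counts as pivotal if every shortest path between a and b passes through c. The graph and the function names (bfs, pivotal_counts) are hypothetical and chosen for illustration.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected toy graph as an adjacency list.
# A hangs off B; B and D are joined by two equally short routes (via C or E).
GRAPH = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["B", "D"],
}


def bfs(graph, start):
    """Return (dist, count): hop distance and number of shortest paths from start."""
    dist = {start: 0}
    count = {start: 1}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]
    return dist, count


def pivotal_counts(graph):
    """For each node, count the pairs (a, b) for which it is pivotal."""
    info = {n: bfs(graph, n) for n in graph}
    totals = {n: 0 for n in graph}
    for a, b in combinations(graph, 2):
        dist_a, count_a = info[a]
        dist_b, count_b = info[b]
        if b not in dist_a:
            continue  # a and b are not connected
        for c in graph:
            if c in (a, b) or c not in dist_a or c not in dist_b:
                continue
            # c is on some shortest path if the distances add up, and on
            # ALL of them if the paths through c account for every one.
            on_shortest = dist_a[c] + dist_b[c] == dist_a[b]
            if on_shortest and count_a[c] * count_b[c] == count_a[b]:
                totals[c] += 1
    return totals


counts = pivotal_counts(GRAPH)
```

In this toy graph only B is pivotal (for the pairs A-C, A-D and A-E). C and E are not, because the pair B-D always has an alternative shortest route that avoids each of them.
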
The degree of a node is the number of connections (edges) it has in the network. In this graph, the edges are the interactions between characters. We can derive the degree using the following Cypher:

MATCH (c:Person)
RETURN c.id AS person, size( (c)-[:INTERACTS_1]-() ) AS degree ORDER BY degree DESC

"person" "degree"
"Eddard-Stark" 66
"Robert-Baratheon" 50
"Tyrion-Lannister" 46
"Catelyn-Stark" 43
"Jon-Snow" 37
"Robb-Stark" 35
"Sansa-Stark" 35
"Bran-Stark" 32
"Cersei-Lannister" 30

Dear MoD does not feature in the top 10 in book 1 and is overshadowed by the Lannisters in books 4 and 5:

"person" "degree"
"Jaime-Lannister" 67
"Cersei-Lannister" 66
"Jon-Snow" 65
"Daenerys-Targaryen" 58
"Stannis-Baratheon" 57
"Tyrion-Lannister" 52
"Theon-Greyjoy" 35
"Brienne-of-Tarth" 29
"Sansa-Stark" 26
"Barristan-Selmy" 26

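The degree calculation itself is simple enough to sketch in a few lines of Python. The example below assumes the interactions are available as a plain list of (source, target) pairs; the edge list and the function name degrees are hypothetical, not taken from the dataset. As in the Cypher pattern, direction is ignored, so each edge counts once for each of its two endpoints.

```python
from collections import Counter

# Hypothetical undirected edge list (illustrative, not real data).
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]


def degrees(edges):
    """Count edges per node, ignoring direction, highest degree first."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts.most_common()


ranking = degrees(EDGES)
```

Here A touches three edges, B and C two each, and D one, so A ranks first.
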
neo4j-got's People

Contributors

gzaifa
