neo4r's Introduction

Disclaimer: this package is still under active development. Read NEWS.md to stay informed of the latest changes.

Read complementary documentation at https://neo4j-rstats.github.io/user-guide/

neo4r

The goal of {neo4r} is to provide a modern and flexible Neo4J driver for R.

It’s modern in the sense that the results are returned as tibbles whenever possible, it relies on modern tools, and it is designed to work with pipes. Our goal is to provide a driver that can be easily integrated into a data analysis workflow, especially by providing an API that works smoothly with other data analysis packages ({dplyr} or {purrr}) and graph packages ({igraph}, {ggraph}, {visNetwork}…).

It’s flexible in the sense that it is rather unopinionated about the way it returns the results: it tries to stay as close as possible to the way Neo4J returns data. That way, you keep control over how you process the results. At the same time, the result is not too complex, so the “heavy lifting” of data wrangling is not left to the user.

The connection object is also an easy-to-control R6 object, allowing you to update and query information from the API.

Server Connection

Please note that, for now, the connection is only possible through HTTP / HTTPS.

Installation

You can install {neo4r} from GitHub with:

# install.packages("remotes")
remotes::install_github("neo4j-rstats/neo4r")

or from CRAN:

install.packages("neo4r")

Create a connection object

Start by creating a new connection object with neo4j_api$new():

library(neo4r)
con <- neo4j_api$new(
  url = "http://localhost:7474", 
  user = "neo4j", 
  password = "plop"
)

This connection object is designed to interact with the Neo4J API.

It comes with some methods to retrieve information from it. ping(), for example, tests if the endpoint is available.

# Test the endpoint; this fails because the credentials above are rejected (401):
con$ping()
#> [1] 401

Being an R6 object, con is flexible in the sense that you can change url, user and password at any time:

con$reset_user("neo4j")
con$reset_password("password") 
con$ping()
#> [1] 200
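
The values returned by ping() above are ordinary HTTP status codes. As a minimal sketch, the helper below (explain_ping() is a hypothetical name, not part of {neo4r}) turns such a code into a short diagnostic message. Only 200 and 401 appear in this README; the other branches are assumptions based on standard HTTP semantics.

```r
# Hypothetical helper, not part of {neo4r}: translate the HTTP status
# code returned by con$ping() into a short diagnostic message.
explain_ping <- function(status) {
  switch(
    as.character(status),
    "200" = "OK: endpoint reachable and credentials accepted",
    "401" = "Unauthorized: check user and password",
    "404" = "Not found: check the url",
    # Fallback for any status not listed above
    paste("Unexpected HTTP status:", status)
  )
}

explain_ping(401)
#> [1] "Unauthorized: check user and password"
```

In a script, this could be called as explain_ping(con$ping()) before sending any query.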

Other methods:

# Get Neo4J Version
con$get_version()
#> [1] "3.5.5"
# List constraints (if any)
con$get_constraints()
#> Null data.table (0 rows and 0 cols)
# Get a vector of labels (if any)
con$get_labels()
#> # A tibble: 0 x 1
#> # … with 1 variable: labels <chr>
# Get a vector of relationships (if any)
con$get_relationships()
#> # A tibble: 0 x 1
#> # … with 1 variable: labels <chr>
# Get index 
con$get_index()
#> Null data.table (0 rows and 0 cols)

Call the API

You can either create a separate query or insert it inside the call_neo4j function.

The call_neo4j() function takes several arguments:

  • query: the Cypher query
  • con: the connection object
  • type: “row” or “graph”: whether to return the results as a list of tibbles, or as a graph object (with $nodes and $relationships)
  • output: the output format (R or json)
  • include_stats: whether or not to include the stats about the call
  • meta: whether or not to include the meta arguments of the nodes when calling with “row”

The movie graph

Starting at version 0.1.3, the play_movies() function returns the full Cypher query to create the movie graph example from the Neo4J examples.

play_movies() %>%
  call_neo4j(con)
#> $a
#> # A tibble: 10 x 2
#>     born name     
#>    <int> <chr>    
#>  1  1956 Tom Hanks
#>  2  1956 Tom Hanks
#>  3  1956 Tom Hanks
#>  4  1956 Tom Hanks
#>  5  1956 Tom Hanks
#>  6  1956 Tom Hanks
#>  7  1956 Tom Hanks
#>  8  1956 Tom Hanks
#>  9  1956 Tom Hanks
#> 10  1956 Tom Hanks
#> 
#> $m
#> # A tibble: 10 x 3
#>    tagline                                         title           released
#>    <chr>                                           <chr>              <int>
#>  1 In every life there comes a time when that thi… That Thing You…     1996
#>  2 Once in a lifetime you get a chance to do some… A League of Th…     1992
#>  3 What if someone you never met, someone you nev… Sleepless in S…     1993
#>  4 A stiff drink. A little mascara. A lot of nerv… Charlie Wilson…     2007
#>  5 At the edge of the world, his journey begins.   Cast Away           2000
#>  6 Walk a mile youll never forget.                 The Green Mile      1999
#>  7 Break The Codes                                 The Da Vinci C…     2006
#>  8 This Holiday Season… Believe                    The Polar Expr…     2004
#>  9 A story of love, lava and burning desire.       Joe Versus the…     1990
#> 10 Everything is connected                         Cloud Atlas         2012
#> 
#> $d
#> # A tibble: 10 x 2
#>     born name                
#>    <int> <chr>               
#>  1  1956 Tom Hanks           
#>  2  1943 Penny Marshall      
#>  3  1941 Nora Ephron         
#>  4  1931 Mike Nichols        
#>  5  1951 Robert Zemeckis     
#>  6  1959 Frank Darabont      
#>  7  1954 Ron Howard          
#>  8  1951 Robert Zemeckis     
#>  9  1950 John Patrick Stanley
#> 10  1965 Tom Tykwer          
#> 
#> attr(,"class")
#> [1] "neo"  "list"

“rows” format

The user chooses whether or not to return a list of tibbles when calling the API. You get as many objects as specified in the RETURN cypher statement.

library(magrittr)

'MATCH (tom {name: "Tom Hanks"}) RETURN tom;' %>%
  call_neo4j(con)
#> $tom
#> # A tibble: 1 x 2
#>    born name     
#>   <int> <chr>    
#> 1  1956 Tom Hanks
#> 
#> attr(,"class")
#> [1] "neo"  "list"

'MATCH (cloudAtlas {title: "Cloud Atlas"}) RETURN cloudAtlas;' %>%
  call_neo4j(con)
#> $cloudAtlas
#> # A tibble: 1 x 3
#>   tagline                 title       released
#>   <chr>                   <chr>          <int>
#> 1 Everything is connected Cloud Atlas     2012
#> 
#> attr(,"class")
#> [1] "neo"  "list"

"MATCH (people:Person)-[relatedTo]-(:Movie {title: 'Cloud Atlas'}) RETURN people.name, Type(relatedTo), relatedTo" %>%
  call_neo4j(con, type = 'row')
#> $people.name
#> # A tibble: 10 x 1
#>    value           
#>    <chr>           
#>  1 Tom Hanks       
#>  2 Jim Broadbent   
#>  3 David Mitchell  
#>  4 Tom Tykwer      
#>  5 Lana Wachowski  
#>  6 Stefan Arndt    
#>  7 Jessica Thompson
#>  8 Halle Berry     
#>  9 Hugo Weaving    
#> 10 Lilly Wachowski 
#> 
#> $`Type(relatedTo)`
#> # A tibble: 10 x 1
#>    value   
#>    <chr>   
#>  1 ACTED_IN
#>  2 ACTED_IN
#>  3 WROTE   
#>  4 DIRECTED
#>  5 DIRECTED
#>  6 PRODUCED
#>  7 REVIEWED
#>  8 ACTED_IN
#>  9 ACTED_IN
#> 10 DIRECTED
#> 
#> $relatedTo
#> # A tibble: 18 x 3
#>    roles     summary            rating
#>    <list>    <chr>               <int>
#>  1 <chr [1]> <NA>                   NA
#>  2 <chr [1]> <NA>                   NA
#>  3 <chr [1]> <NA>                   NA
#>  4 <chr [1]> <NA>                   NA
#>  5 <chr [1]> <NA>                   NA
#>  6 <chr [1]> <NA>                   NA
#>  7 <chr [1]> <NA>                   NA
#>  8 <NULL>    An amazing journey     95
#>  9 <chr [1]> <NA>                   NA
#> 10 <chr [1]> <NA>                   NA
#> 11 <chr [1]> <NA>                   NA
#> 12 <chr [1]> <NA>                   NA
#> 13 <chr [1]> <NA>                   NA
#> 14 <chr [1]> <NA>                   NA
#> 15 <chr [1]> <NA>                   NA
#> 16 <chr [1]> <NA>                   NA
#> 17 <chr [1]> <NA>                   NA
#> 18 <chr [1]> <NA>                   NA
#> 
#> attr(,"class")
#> [1] "neo"  "list"

By default, results are returned as an R list of tibbles. For example, RETURN tom returns a one-element list whose element is named tom. We think this is the most “truthful” way to represent the output of Neo4J calls.

When you return two node types, you get two results in the form of two tibbles: the result is a two-element list, with each element named the way it was specified in the Cypher query.

'MATCH (tom:Person {name: "Tom Hanks"})-[:ACTED_IN]->(tomHanksMovies) RETURN tom,tomHanksMovies' %>%
  call_neo4j(con)
#> $tom
#> # A tibble: 12 x 2
#>     born name     
#>    <int> <chr>    
#>  1  1956 Tom Hanks
#>  2  1956 Tom Hanks
#>  3  1956 Tom Hanks
#>  4  1956 Tom Hanks
#>  5  1956 Tom Hanks
#>  6  1956 Tom Hanks
#>  7  1956 Tom Hanks
#>  8  1956 Tom Hanks
#>  9  1956 Tom Hanks
#> 10  1956 Tom Hanks
#> 11  1956 Tom Hanks
#> 12  1956 Tom Hanks
#> 
#> $tomHanksMovies
#> # A tibble: 12 x 3
#>    tagline                                         title           released
#>    <chr>                                           <chr>              <int>
#>  1 Houston, we have a problem.                     Apollo 13           1995
#>  2 At odds in life... in love on-line.             Youve Got Mail      1998
#>  3 Once in a lifetime you get a chance to do some… A League of Th…     1992
#>  4 A story of love, lava and burning desire.       Joe Versus the…     1990
#>  5 In every life there comes a time when that thi… That Thing You…     1996
#>  6 Break The Codes                                 The Da Vinci C…     2006
#>  7 Everything is connected                         Cloud Atlas         2012
#>  8 At the edge of the world, his journey begins.   Cast Away           2000
#>  9 Walk a mile youll never forget.                 The Green Mile      1999
#> 10 What if someone you never met, someone you nev… Sleepless in S…     1993
#> 11 This Holiday Season… Believe                    The Polar Expr…     2004
#> 12 A stiff drink. A little mascara. A lot of nerv… Charlie Wilson…     2007
#> 
#> attr(,"class")
#> [1] "neo"  "list"

Results can also be returned in JSON, for example for writing to a file:

tmp <- tempfile(fileext = ".json")
'MATCH (people:Person) RETURN people.name LIMIT 1' %>%
  call_neo4j(con, output = "json") %>%
  write(tmp)
jsonlite::read_json(tmp)
#> [[1]]
#> [[1]][[1]]
#> [[1]][[1]]$row
#> [[1]][[1]]$row[[1]]
#> [[1]][[1]]$row[[1]][[1]]
#> [1] "Keanu Reeves"
#> 
#> 
#> 
#> [[1]][[1]]$meta
#> [[1]][[1]]$meta[[1]]
#> named list()
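
To make the nested structure printed above easier to navigate, here is a small sketch that does not need a running server. The JSON string is an assumption: it is reconstructed by hand from the printed list, not captured from Neo4J.

```r
library(jsonlite)

# Hand-written JSON mirroring the structure printed above
# (an assumption, not an actual server response).
json <- '[[{"row":[["Keanu Reeves"]],"meta":[{}]}]]'

# parse_json() keeps the nesting as plain R lists
res <- parse_json(json)

# Walk down: first result, first record, "row" field, first value
name <- res[[1]][[1]]$row[[1]][[1]]
name
#> [1] "Keanu Reeves"
```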

If you turn the type argument to "graph", you’ll get a graph result:

'MATCH (tom:Person {name: "Tom Hanks"})-[act:ACTED_IN]->(tomHanksMovies) RETURN act,tom,tomHanksMovies' %>%
  call_neo4j(con, type = "graph")
#> $nodes
#> # A tibble: 13 x 3
#>    id    label     properties
#>    <chr> <list>    <list>    
#>  1 144   <chr [1]> <list [3]>
#>  2 71    <chr [1]> <list [2]>
#>  3 67    <chr [1]> <list [3]>
#>  4 162   <chr [1]> <list [3]>
#>  5 78    <chr [1]> <list [3]>
#>  6 85    <chr [1]> <list [3]>
#>  7 111   <chr [1]> <list [3]>
#>  8 105   <chr [1]> <list [3]>
#>  9 150   <chr [1]> <list [3]>
#> 10 130   <chr [1]> <list [3]>
#> 11 73    <chr [1]> <list [3]>
#> 12 161   <chr [1]> <list [3]>
#> 13 159   <chr [1]> <list [3]>
#> 
#> $relationships
#> # A tibble: 12 x 5
#>    id    type     startNode endNode properties
#>    <chr> <chr>    <chr>     <chr>   <list>    
#>  1 202   ACTED_IN 71        144     <list [1]>
#>  2 84    ACTED_IN 71        67      <list [1]>
#>  3 234   ACTED_IN 71        162     <list [1]>
#>  4 98    ACTED_IN 71        78      <list [1]>
#>  5 110   ACTED_IN 71        85      <list [1]>
#>  6 146   ACTED_IN 71        111     <list [1]>
#>  7 137   ACTED_IN 71        105     <list [1]>
#>  8 213   ACTED_IN 71        150     <list [1]>
#>  9 182   ACTED_IN 71        130     <list [1]>
#> 10 91    ACTED_IN 71        73      <list [1]>
#> 11 232   ACTED_IN 71        161     <list [1]>
#> 12 228   ACTED_IN 71        159     <list [1]>
#> 
#> attr(,"class")
#> [1] "neo"  "list"

The result is returned with one node or relationship per row.

Due to the specific data format of Neo4J, there can be more than one label and property per node and relationship. That’s why the result is returned, by design, as a list-dataframe.
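
To illustrate why list-columns are needed, the sketch below builds a hand-made stand-in for a $nodes table. The data is hypothetical: the ids are borrowed from the outputs in this README, and the extra "Actor" label is invented for illustration.

```r
library(tibble)

# Hand-made stand-in for a $nodes table; the "Actor" label is invented.
nodes <- tibble(
  id = c("71", "105"),
  label = list(c("Person", "Actor"), "Movie"),
  properties = list(
    list(born = 1956L, name = "Tom Hanks"),
    list(tagline = "Everything is connected", title = "Cloud Atlas", released = 2012L)
  )
)

# Each cell of a list-column can hold a different number of values
lengths(nodes$label)
#> [1] 2 1
lengths(nodes$properties)
#> [1] 2 3

# Collapsing the labels into one string per node loses that structure
vapply(nodes$label, paste, character(1), collapse = "/")
#> [1] "Person/Actor" "Movie"
```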

We have designed several functions to unnest the output:

  • unnest_nodes(), which unnests a node dataframe:

res <- 'MATCH (tom:Person {name:"Tom Hanks"})-[a:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors) RETURN m AS acted,coActors.name' %>%
  call_neo4j(con, type = "graph")
unnest_nodes(res$nodes)
#> # A tibble: 11 x 5
#>    id    value tagline                                title        released
#>    <chr> <chr> <chr>                                  <chr>           <int>
#>  1 144   Movie Houston, we have a problem.            Apollo 13        1995
#>  2 67    Movie At odds in life... in love on-line.    Youve Got M…     1998
#>  3 162   Movie Once in a lifetime you get a chance t… A League of…     1992
#>  4 78    Movie A story of love, lava and burning des… Joe Versus …     1990
#>  5 85    Movie In every life there comes a time when… That Thing …     1996
#>  6 111   Movie Break The Codes                        The Da Vinc…     2006
#>  7 105   Movie Everything is connected                Cloud Atlas      2012
#>  8 150   Movie At the edge of the world, his journey… Cast Away        2000
#>  9 130   Movie Walk a mile youll never forget.        The Green M…     1999
#> 10 73    Movie What if someone you never met, someon… Sleepless i…     1993
#> 11 159   Movie A stiff drink. A little mascara. A lo… Charlie Wil…     2007

Please note that this function returns NA for the properties that aren’t in a node.

It is also possible to unnest either the properties or the labels:

res %>%
  extract_nodes() %>%
  unnest_nodes(what = "properties")
#> # A tibble: 11 x 5
#>    id    label   tagline                              title        released
#>    <chr> <list>  <chr>                                <chr>           <int>
#>  1 144   <chr [… Houston, we have a problem.          Apollo 13        1995
#>  2 67    <chr [… At odds in life... in love on-line.  Youve Got M…     1998
#>  3 162   <chr [… Once in a lifetime you get a chance… A League of…     1992
#>  4 78    <chr [… A story of love, lava and burning d… Joe Versus …     1990
#>  5 85    <chr [… In every life there comes a time wh… That Thing …     1996
#>  6 111   <chr [… Break The Codes                      The Da Vinc…     2006
#>  7 105   <chr [… Everything is connected              Cloud Atlas      2012
#>  8 150   <chr [… At the edge of the world, his journ… Cast Away        2000
#>  9 130   <chr [… Walk a mile youll never forget.      The Green M…     1999
#> 10 73    <chr [… What if someone you never met, some… Sleepless i…     1993
#> 11 159   <chr [… A stiff drink. A little mascara. A … Charlie Wil…     2007
res %>%
  extract_nodes() %>%
  unnest_nodes(what = "label")
#> # A tibble: 11 x 3
#>    id    properties value
#>    <chr> <list>     <chr>
#>  1 144   <list [3]> Movie
#>  2 67    <list [3]> Movie
#>  3 162   <list [3]> Movie
#>  4 78    <list [3]> Movie
#>  5 85    <list [3]> Movie
#>  6 111   <list [3]> Movie
#>  7 105   <list [3]> Movie
#>  8 150   <list [3]> Movie
#>  9 130   <list [3]> Movie
#> 10 73    <list [3]> Movie
#> 11 159   <list [3]> Movie

  • unnest_relationships()

There is only one nested column in the relationship table, so the function is quite straightforward:

'MATCH (people:Person)-[relatedTo]-(:Movie {title: "Cloud Atlas"}) RETURN people.name, Type(relatedTo), relatedTo' %>%
  call_neo4j(con, type = "graph") %>%
  extract_relationships() %>%
  unnest_relationships()
#> # A tibble: 23 x 8
#>    id    type     startNode endNode roles     value summary rating
#>    <chr> <chr>    <chr>     <chr>   <list>    <lgl> <chr>    <int>
#>  1 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
#>  2 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
#>  3 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
#>  4 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
#>  5 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
#>  6 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
#>  7 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
#>  8 144   WROTE    109       105     <NULL>    NA    <NA>        NA
#>  9 141   DIRECTED 108       105     <NULL>    NA    <NA>        NA
#> 10 143   DIRECTED 6         105     <NULL>    NA    <NA>        NA
#> # … with 13 more rows

Note that unnest_relationships() only does one level of unnesting.

  • unnest_graph()

This function takes a graph result and applies both unnest_nodes() and unnest_relationships() to it.

'MATCH (people:Person)-[relatedTo]-(:Movie {title: "Cloud Atlas"}) RETURN people.name, Type(relatedTo), relatedTo' %>%
  call_neo4j(con, type = "graph") %>%
  unnest_graph()
#> $nodes
#> # A tibble: 11 x 7
#>    id    value   born name           tagline             title     released
#>    <chr> <chr>  <int> <chr>          <chr>               <chr>        <int>
#>  1 71    Person  1956 Tom Hanks      <NA>                <NA>            NA
#>  2 105   Movie     NA <NA>           Everything is conn… Cloud At…     2012
#>  3 107   Person  1949 Jim Broadbent  <NA>                <NA>            NA
#>  4 109   Person  1969 David Mitchell <NA>                <NA>            NA
#>  5 108   Person  1965 Tom Tykwer     <NA>                <NA>            NA
#>  6 6     Person  1965 Lana Wachowski <NA>                <NA>            NA
#>  7 110   Person  1961 Stefan Arndt   <NA>                <NA>            NA
#>  8 169   Person    NA Jessica Thomp… <NA>                <NA>            NA
#>  9 106   Person  1966 Halle Berry    <NA>                <NA>            NA
#> 10 4     Person  1960 Hugo Weaving   <NA>                <NA>            NA
#> 11 5     Person  1967 Lilly Wachows… <NA>                <NA>            NA
#> 
#> $relationships
#> # A tibble: 23 x 8
#>    id    type     startNode endNode roles     value summary rating
#>    <chr> <chr>    <chr>     <chr>   <list>    <lgl> <chr>    <int>
#>  1 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
#>  2 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
#>  3 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
#>  4 137   ACTED_IN 71        105     <chr [1]> NA    <NA>        NA
#>  5 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
#>  6 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
#>  7 140   ACTED_IN 107       105     <chr [1]> NA    <NA>        NA
#>  8 144   WROTE    109       105     <NULL>    NA    <NA>        NA
#>  9 141   DIRECTED 108       105     <NULL>    NA    <NA>        NA
#> 10 143   DIRECTED 6         105     <NULL>    NA    <NA>        NA
#> # … with 13 more rows
#> 
#> attr(,"class")
#> [1] "neo"  "list"

Extraction

There are two convenient functions to extract nodes and relationships:

'MATCH (bacon:Person {name:"Kevin Bacon"})-[*1..4]-(hollywood) RETURN DISTINCT hollywood' %>%
  call_neo4j(con, type = "graph") %>% 
  extract_nodes()
#> # A tibble: 135 x 3
#>    id    label     properties
#>    <chr> <list>    <list>    
#>  1 72    <chr [1]> <list [2]>
#>  2 68    <chr [1]> <list [2]>
#>  3 54    <chr [1]> <list [2]>
#>  4 34    <chr [1]> <list [2]>
#>  5 70    <chr [1]> <list [2]>
#>  6 69    <chr [1]> <list [2]>
#>  7 67    <chr [1]> <list [3]>
#>  8 163   <chr [1]> <list [2]>
#>  9 166   <chr [1]> <list [2]>
#> 10 77    <chr [1]> <list [2]>
#> # … with 125 more rows
'MATCH p=shortestPath(
  (bacon:Person {name:"Kevin Bacon"})-[*]-(meg:Person {name:"Meg Ryan"})
)
RETURN p' %>%
  call_neo4j(con, type = "graph") %>% 
  extract_relationships()
#> # A tibble: 4 x 5
#>   id    type     startNode endNode properties
#>   <chr> <chr>    <chr>     <chr>   <list>    
#> 1 202   ACTED_IN 71        144     <list [1]>
#> 2 203   ACTED_IN 19        144     <list [1]>
#> 3 91    ACTED_IN 71        73      <list [1]>
#> 4 92    ACTED_IN 34        73      <list [1]>

Convert for common graph packages

{igraph}

In order to be converted into a graph object:

  • The nodes should be a dataframe whose first column is a series of unique IDs, understood as “names” by igraph - these are the ID columns from Neo4J. Other columns are considered attributes.

  • relationships need a start and an end, i.e. startNode and endNode in the Neo4J results.
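
As a toy illustration of these two requirements that runs without a Neo4J server, the sketch below builds both tables by hand. The ids reuse values from the outputs above; the data frames themselves are made up for the example.

```r
library(igraph)

# Nodes: the first column holds unique ids, other columns are attributes
nodes <- data.frame(
  id = c("71", "144", "67"),
  label = c("Person", "Movie", "Movie"),
  stringsAsFactors = FALSE
)

# Relationships: the first two columns are the start and end node ids
relationships <- data.frame(
  startNode = c("71", "71"),
  endNode = c("144", "67"),
  type = "ACTED_IN",
  stringsAsFactors = FALSE
)

g <- graph_from_data_frame(d = relationships, directed = TRUE, vertices = nodes)

vcount(g) # number of vertices
ecount(g) # number of edges
V(g)$name # the ids become the vertex names
```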

Here is how to create a graph object from a {neo4r} result:

G <- "MATCH a=(p:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(m:Movie) RETURN a;" %>% 
  call_neo4j(con, type = "graph") 

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(purrr)
#> 
#> Attaching package: 'purrr'
#> The following object is masked from 'package:magrittr':
#> 
#>     set_names
# Create a dataframe with col 1 being the ID, 
# And columns 2 being the names
G$nodes <- G$nodes %>%
  unnest_nodes(what = "properties") %>% 
  # We're extracting the first label of each node, but 
  # this column can also be removed if not needed
  mutate(label = map_chr(label, 1))
head(G$nodes)
#> # A tibble: 6 x 7
#>   id    label  tagline                     title      released  born name  
#>   <chr> <chr>  <chr>                       <chr>         <int> <int> <chr> 
#> 1 144   Movie  Houston, we have a problem. Apollo 13      1995    NA <NA>  
#> 2 71    Person <NA>                        <NA>             NA  1956 Tom H…
#> 3 67    Movie  At odds in life... in love… Youve Got…     1998    NA <NA>  
#> 4 162   Movie  Once in a lifetime you get… A League …     1992    NA <NA>  
#> 5 78    Movie  A story of love, lava and … Joe Versu…     1990    NA <NA>  
#> 6 85    Movie  In every life there comes … That Thin…     1996    NA <NA>

We then reorder the relationship table:

G$relationships <- G$relationships %>%
  unnest_relationships() %>%
  select(startNode, endNode, type, everything()) %>%
  mutate(roles = unlist(roles))
head(G$relationships)
#> # A tibble: 6 x 5
#>   startNode endNode type     id    roles             
#>   <chr>     <chr>   <chr>    <chr> <chr>             
#> 1 71        144     ACTED_IN 202   Jim Lovell        
#> 2 71        67      ACTED_IN 84    Joe Fox           
#> 3 71        162     ACTED_IN 234   Jimmy Dugan       
#> 4 71        78      ACTED_IN 98    Joe Banks         
#> 5 71        85      ACTED_IN 110   Mr. White         
#> 6 71        111     ACTED_IN 146   Dr. Robert Langdon
graph_object <- igraph::graph_from_data_frame(
  d = G$relationships, 
  directed = TRUE, 
  vertices = G$nodes
)
plot(graph_object)

This can also be used with {ggraph}:

library(ggraph)
#> Loading required package: ggplot2
graph_object %>%
  ggraph() + 
  geom_node_label(aes(label = label)) +
  geom_edge_link() + 
  theme_graph()
#> Using `nicely` as default layout

{visNetwork}

{visNetwork} expects the following format :

nodes

  • “id” : id of the node, needed in edges information
  • “label” : label of the node
  • “group” : group of the node. Groups can be configured with visGroups
  • “value” : size of the node
  • “title” : tooltip of the node

edges

  • “from” : node id of begin of the edge
  • “to” : node id of end of the edge
  • “label” : label of the edge
  • “value” : width of the edge
  • “title” : tooltip of the edge

(from ?visNetwork::visNetwork).

visNetwork is smart enough to transform a list column into several labels, so we don’t have to worry too much about this one.

Here’s how to convert our {neo4r} result:

G <- "MATCH a=(p:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(m:Movie) RETURN a;" %>% 
  call_neo4j(con, type = "graph") 

# We'll just unnest the properties
G$nodes <- G$nodes %>%
  unnest_nodes(what = "properties")
head(G$nodes)  

# Rename the relationship columns to the names visNetwork expects:
G$relationships <- G$relationships %>%
  unnest_relationships() %>%
  select(from = startNode, to = endNode, label = type)
head(G$relationships)

visNetwork::visNetwork(G$nodes, G$relationships)

Sending data to the API

You can simply send queries as we have just seen, by writing the Cypher query and calling the API.

Transform elements to cypher queries

  • vec_to_cypher() turns a set of named values into a Cypher node pattern:

vec_to_cypher(iris[1, 1:3], "Species")
#> [1] "(:`Species` {`Sepal.Length`: '5.1', `Sepal.Width`: '3.5', `Petal.Length`: '1.4'})"

  • and vec_to_cypher_with_var() creates the same pattern starting with a variable:

vec_to_cypher_with_var(iris[1, 1:3], "Species", a)
#> [1] "(a:`Species` {`Sepal.Length`: '5.1', `Sepal.Width`: '3.5', `Petal.Length`: '1.4'})"

This can be combined inside a cypher call:

paste("MERGE", vec_to_cypher(iris[1, 1:3], "Species"))
#> [1] "MERGE (:`Species` {`Sepal.Length`: '5.1', `Sepal.Width`: '3.5', `Petal.Length`: '1.4'})"
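
To make the shape of these strings explicit, here is a base R sketch that reproduces the output format shown above. to_node_pattern() is a hypothetical function written for this illustration; it is not the {neo4r} implementation, and it does not escape quotes inside values.

```r
# Hypothetical re-creation of the output format shown above;
# not the actual {neo4r} code. Values containing quotes are not escaped.
to_node_pattern <- function(props, label) {
  # Convert every value to a string, one per property
  values <- vapply(props, as.character, character(1))
  # Format each property as `name`: 'value'
  pairs <- sprintf("`%s`: '%s'", names(props), values)
  # Wrap the properties in a node pattern carrying the label
  sprintf("(:`%s` {%s})", label, paste(pairs, collapse = ", "))
}

pattern <- to_node_pattern(iris[1, 1:3], "Species")
paste("MERGE", pattern)
#> [1] "MERGE (:`Species` {`Sepal.Length`: '5.1', `Sepal.Width`: '3.5', `Petal.Length`: '1.4'})"
```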

Reading and sending a Cypher file:

  • read_cypher() reads a Cypher file and returns a tibble of all the calls:

read_cypher("data-raw/create.cypher")
#> # A tibble: 4 x 1
#>   cypher                                                                   
#>   <chr>                                                                    
#> 1 CREATE CONSTRAINT ON (b:Band) ASSERT b.name IS UNIQUE;                   
#> 2 CREATE CONSTRAINT ON (c:City) ASSERT c.name IS UNIQUE;                   
#> 3 CREATE CONSTRAINT ON (r:record) ASSERT r.name IS UNIQUE;                 
#> 4 CREATE (ancient:Band {name: 'Ancient', formed: 1992}), (acturus:Band {na…

  • send_cypher() reads a Cypher file and sends it to the API. By default, the stats are returned.

send_cypher("data-raw/constraints.cypher", con)
#> No data returned.
#> No data returned.
#> No data returned.
#> [[1]]
#> # A tibble: 12 x 2
#>    type                  value
#>    <chr>                 <dbl>
#>  1 contains_updates          1
#>  2 nodes_created             0
#>  3 nodes_deleted             0
#>  4 properties_set            0
#>  5 relationships_created     0
#>  6 relationship_deleted      0
#>  7 labels_added              0
#>  8 labels_removed            0
#>  9 indexes_added             0
#> 10 indexes_removed           0
#> 11 constraints_added         1
#> 12 constraints_removed       0
#> 
#> [[2]]
#> # A tibble: 12 x 2
#>    type                  value
#>    <chr>                 <dbl>
#>  1 contains_updates          1
#>  2 nodes_created             0
#>  3 nodes_deleted             0
#>  4 properties_set            0
#>  5 relationships_created     0
#>  6 relationship_deleted      0
#>  7 labels_added              0
#>  8 labels_removed            0
#>  9 indexes_added             0
#> 10 indexes_removed           0
#> 11 constraints_added         1
#> 12 constraints_removed       0
#> 
#> [[3]]
#> # A tibble: 12 x 2
#>    type                  value
#>    <chr>                 <dbl>
#>  1 contains_updates          1
#>  2 nodes_created             0
#>  3 nodes_deleted             0
#>  4 properties_set            0
#>  5 relationships_created     0
#>  6 relationship_deleted      0
#>  7 labels_added              0
#>  8 labels_removed            0
#>  9 indexes_added             0
#> 10 indexes_removed           0
#> 11 constraints_added         1
#> 12 constraints_removed       0

Sending csv dataframe to Neo4J

load_csv() sends a CSV file, referenced by its URL, to Neo4J.

The arguments are:

  • on_load : the code to execute on load
  • con : the connection object
  • url : the url of the csv to send
  • header : whether or not the csv has a header
  • periodic_commit : the volume for PERIODIC COMMIT
  • as : the AS argument for LOAD CSV
  • format : the format of the result
  • include_stats : whether or not to include the stats
  • meta : whether or not to return the meta information

Let’s use Neo4J’s northwind-graph example for that.

# Create the query that will create the nodes and relationships
on_load_query <- 'CREATE (n:Product)
  SET n = row,
  n.unitPrice = toFloat(row.unitPrice),
  n.unitsInStock = toInteger(row.unitsInStock), n.unitsOnOrder = toInteger(row.unitsOnOrder),
  n.reorderLevel = toInteger(row.reorderLevel), n.discontinued = (row.discontinued <> "0");'
# Send the csv 
load_csv(url = "http://data.neo4j.com/northwind/products.csv", 
         con = con, header = TRUE, periodic_commit = 50, 
         as = "row", on_load = on_load_query)
#> No data returned.
#> # A tibble: 12 x 2
#>    type                  value
#>    <chr>                 <dbl>
#>  1 contains_updates          1
#>  2 nodes_created            77
#>  3 nodes_deleted             0
#>  4 properties_set         1155
#>  5 relationships_created     0
#>  6 relationship_deleted      0
#>  7 labels_added             77
#>  8 labels_removed            0
#>  9 indexes_added             0
#> 10 indexes_removed           0
#> 11 constraints_added         0
#> 12 constraints_removed       0
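
For orientation, the sketch below shows how the arguments listed above could map onto a Cypher LOAD CSV statement. build_load_csv() is a hypothetical helper and the mapping is an assumption based on standard Cypher syntax; the query that load_csv() actually sends may differ.

```r
# Hypothetical helper, not part of {neo4r}: assemble a LOAD CSV
# statement from arguments named like those of load_csv().
build_load_csv <- function(url, on_load, as = "row",
                           header = TRUE, periodic_commit = 1000) {
  paste0(
    "USING PERIODIC COMMIT ", periodic_commit, " ",
    "LOAD CSV ", if (header) "WITH HEADERS " else "",
    "FROM '", url, "' AS ", as, " ",
    on_load
  )
}

query <- build_load_csv(
  url = "http://data.neo4j.com/northwind/products.csv",
  on_load = "CREATE (n:Product) SET n = row;",
  periodic_commit = 50
)
query
#> [1] "USING PERIODIC COMMIT 50 LOAD CSV WITH HEADERS FROM 'http://data.neo4j.com/northwind/products.csv' AS row CREATE (n:Product) SET n = row;"
```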

Using the Connection Pane

{neo4r} comes with a Connection Pane interface for RStudio.

Once installed, you can go to the “Connections” pane and use the widget to connect to the Neo4J server.

Sandboxing in Docker

You can get an RStudio / Neo4J sandbox with Docker:

docker pull colinfay/neo4r-docker
docker run -e PASSWORD=plop -e ROOT=TRUE -d -p 8787:8787 colinfay/neo4r-docker

CoC

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

neo4r's People

Contributors

colinfay, dianebeldame, garrettmooney, statnmap

neo4r's Issues

Getting an error on the result of convert_to("igraph")

I create a graph using the Neo4j movie graph database with "graph" as the output type and then use convert_to with "igraph"

> G<-"MATCH a=(p:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(m:Movie) RETURN a;" %>%
    call_neo4j(con, type = "graph") %>% 
    convert_to("igraph")

I get the following error referencing G:

> G
IGRAPH 35b7516 DN-- 13 12 -- 
+ attr: name (v/c), label (v/c), tagline (v/c), title (v/c), released
| (v/n), born (v/n), type (e/c), id (e/c), properties (e/x)
+ edges from 35b7516 (vertex names):
Error in if (is.na(no)) no <- len : argument is of length zero

I think it might be part of a different but seemingly related error that occurs when unnesting relationships returned as a data.frame rather than a graph:

> G<-"MATCH a=(p:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(m:Movie) RETURN a;" %>%
      call_neo4j(con, type = "graph")

> unnest_nodes(G$nodes)
# A tibble: 13 x 7
   id    label  tagline                         title      released  born name  
   <chr> <chr>  <chr>                           <chr>         <int> <int> <chr> 
  1 148   Movie  Houston, we have a problem.     Apollo 13      1995    NA NA    
  2 71    Person NA                              NA               NA  1956 Tom H…
  3 67    Movie  At odds in life... in love on-… You've Go…     1998    NA NA    
  4 170   Movie  Once in a lifetime you get a c… A League …     1992    NA NA    
  5 78    Movie  A story of love, lava and burn… Joe Versu…     1990    NA NA    
  6 85    Movie  In every life there comes a ti… That Thin…     1996    NA NA    
  7 111   Movie  Break The Codes                 The Da Vi…     2006    NA NA    
  8 105   Movie  Everything is connected         Cloud Atl…     2012    NA NA    
  9 158   Movie  At the edge of the world, his … Cast Away      2000    NA NA    
 10 130   Movie  Walk a mile you'll never forge… The Green…     1999    NA NA    
 11 73    Movie  What if someone you never met,… Sleepless…     1993    NA NA    
 12 169   Movie  This Holiday Season… Believe    The Polar…     2004    NA NA    
 13 167   Movie  A stiff drink. A little mascar… Charlie W…     2007    NA NA 

> unnest_relationships(G$relationships)
Error: Can't coerce element 1 from a list to a character

get_schema delivers only a part of the schema?

The following code snippet returns only part of the actual graph schema:

library(neo4r)
con <- neo4j_api$new(url = "http://52.86.4.26:34781", user = "neo4j", password = "cheaters-garages-cardboard")
con$get_schema()
#> # A tibble: 1 x 2
#>   label   property_keys
#>   <chr>   <chr>        
#> 1 Station sid

The Station nodes contain at least one more property which is not shown in the schema. Why is that? Is it also possible to see which relationship types exist?

convert to igraph error when node has more than one label - similar to closed issue #43

Differing number of rows error if the node has more than one label on it - e.g.:

l<-"MATCH (p:Person {name: 'Tom Hanks'}) SET p:Test return p" %>% call_neo4j(con, type = "row")
G <-"MATCH a=(p:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(m:Movie) RETURN a;" %>% call_neo4j(con, type = "graph") %>% convert_to('igraph')

Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 14, 13

You can see that, in unnest_nodes, the lab and df data frames have different lengths:

[Screenshot: 2019-02-11_16-50-09]

[Bug] 'call db.schema'

I'm not sure the result of db.schema is returned / parsed as it should be.

'call db.schema' %>%
  call_api(con)

$nodes
# A tibble: 15 x 1
   row       
   <list>    
 1 <list [0]>
 2 <chr [1]> 
 3 <list [1]>
 4 <list [0]>
 5 <chr [1]> 
 6 <list [0]>
 7 <list [0]>
 8 <chr [1]> 
 9 <list [0]>
10 <list [0]>
11 <chr [1]> 
12 <list [1]>
13 <list [0]>
14 <chr [1]> 
15 <list [1]>

$relationships
# A tibble: 0 x 1
# ... with 1 variable: row <list>

'call db.schema' %>%
  call_api(con, type = "graph") 

$nodes
# A tibble: 5 x 3
  id    label     properties
  <chr> <list>    <list>    
1 -58   <chr [1]> <list [3]>
2 -60   <chr [1]> <list [3]>
3 -59   <chr [1]> <list [3]>
4 -57   <chr [1]> <list [3]>
5 -56   <chr [1]> <list [3]>

$relationships
# A tibble: 2 x 5
  id    type         startNode endNode properties
  <chr> <chr>        <chr>     <chr>   <list>    
1 -24   has_recorded -59       -60     <list [0]>
2 -23   MAINTAINS    -58       -56     <list [0]>

unnest_graph doesn't work here :

'call db.schema' %>%
  call_api(con, type = "graph") %>%
  unnest_graph()
 Error: Columns `name`, `constraints` must be length 0, not 1, 1 

[Implementation] Better JSON creator

For now, the way the JSON for the API call is built is kind of a hack; there must be a better way to code it.

I need to write something that looks exactly like :

{
  "statements" : [ {
    "statement" : "CREATE (n) RETURN id(n)"
  } ]
}

For now I'm doing :

to_json_neo <- function(query, include_stats, meta, type){
  toJSON(list(statement = query, includeStats = include_stats, meta = meta, resultDataContents = list(type)), auto_unbox = TRUE)
}

query_jsonised <- to_json_neo(query_clean, include_stats, meta, type)

body <- glue('{"statements" : [ %query_jsonised% ]}', .open = "%", .close = "%")
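One possible alternative, as a minimal sketch: let jsonlite build the whole body from a nested list, so that no string templating with glue is needed. Here build_body is a hypothetical helper name, not something that exists in neo4r, and the default argument values are assumptions.

```r
library(jsonlite)

# Hypothetical helper (not part of neo4r): build the full request body
# from a nested list instead of pasting JSON fragments together.
build_body <- function(query, include_stats = FALSE, meta = FALSE, type = "row") {
  toJSON(
    list(
      statements = list(   # unnamed list -> JSON array
        list(              # named list   -> JSON object
          statement = query,
          includeStats = include_stats,
          meta = meta,
          resultDataContents = list(type)  # stays an array, even with one element
        )
      )
    ),
    auto_unbox = TRUE
  )
}

body <- build_body("CREATE (n) RETURN id(n)")
body
```

The unnamed list around the statement is what makes jsonlite emit the surrounding array, so the result has the "statements" : [ { ... } ] shape shown above.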

query only yielding single line results?

call_neo4j("MATCH (n) RETURN distinct labels(n)", con)

yields a single-line result, while on the admin webpage of the server I get a list of multiple records.
What am I doing wrong?

Feature Request: Create a Node from a List

Instead of using vec_to_cypher, the idea would be to pass a list and create a node from it. The benefit is that we can handle columns with NAs by dropping them, which avoids null property errors.

This feature existed within RNeo4j here.

While this request is for the creation of a single node, most of my use cases in R involve bulk data creation, so bonus points for the ability to dump a dataframe to a list with one entry for each row, like the "rows" option in jsonlite::toJSON.
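To make the request concrete, here is a rough sketch in base R of what such a helper could look like. The name list_to_create is hypothetical (it is not a neo4r or RNeo4j function), and it only handles scalar string and numeric properties.

```r
# Hypothetical helper (not part of neo4r or RNeo4j): build a CREATE
# statement from a named list, dropping NULL and NA entries.
list_to_create <- function(label, props) {
  # Flag entries that are NULL or a single NA
  is_missing <- vapply(
    props,
    function(x) is.null(x) || (length(x) == 1 && is.na(x)),
    logical(1)
  )
  props <- props[!is_missing]
  # Quote (and escape) strings, leave other values as they are
  values <- vapply(
    props,
    function(x) if (is.character(x)) encodeString(x, quote = "'") else as.character(x),
    character(1)
  )
  sprintf(
    "CREATE (:%s {%s})",
    label,
    paste0(names(props), ": ", values, collapse = ", ")
  )
}

list_to_create("Person", list(name = "Colin", born = NA, age = 30))
# "CREATE (:Person {name: 'Colin', age: 30})"

# Bulk case: one statement per data frame row
df <- data.frame(name = c("Colin", "Ana"), born = c(NA, 1990), stringsAsFactors = FALSE)
queries <- vapply(
  seq_len(nrow(df)),
  function(i) list_to_create("Person", as.list(df[i, ])),
  character(1)
)
```

Each resulting string could then be sent with call_neo4j(); passing the values as query parameters instead of pasting them into the statement would likely be the safer design.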

could not find function "convert_to"

Trying to follow {neo4r} user guide chapter 6 (convert_to), and getting the following error messages:

Error in convert_to(., "igraph") : could not find function "convert_to"

Error in convert_to(., "visNetwork") : could not find function "convert_to"

Cannot connect to my Neo4j instance

Here is the output of my reprex file:

library(neo4r)
library(httr)
con <- neo4j_api$new(url = "http://107.23.190.94:33961", user = "codes", password = "codes")
con
#> <Neo4JAPI>
#>   Public:
#>     access: function () 
#>     auth: Y29kZXM6Y29kZXM=
#>     clone: function (deep = FALSE) 
#>     get_constraints: function () 
#>     get_labels: function () 
#>     get_relationships: function () 
#>     get_schema: function () 
#>     get_version: function () 
#>     initialize: function (url, user, password) 
#>     password: codes
#>     ping: function () 
#>     reset_password: function (password) 
#>     reset_url: function (url) 
#>     reset_user: function (user) 
#>     url: http:/107.23.190.94:33961
#>     user: codes

con$get_version()
#> Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: http

x <- GET("http://107.23.190.94:33961")
x
#> Response [http://107.23.190.94:33961/]
#>   Date: 2018-04-05 04:58
#>   Status: 200
#>   Content-Type: application/json; charset=UTF-8
#>   Size: 151 B
#> {
#>   "management" : "http://107.23.190.94:33961/db/manage/",
#>   "data" : "http://107.23.190.94:33961/db/data/",
#>   "bolt" : "bolt://107.23.190.94:33960"

I get the same error message, Could not resolve host: http, when I execute con$ping(). The same error also occurs with a local Neo4j instance at http:\\localhost:7474 and when connecting from my Databricks notebook.
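As a small diagnostic sketch: the url field printed above has a single slash after http:, which would explain curl trying to resolve "http" as a host. A check like the following (has_valid_scheme is a hypothetical helper, not part of neo4r) can be run on con$url to spot that kind of malformed value.

```r
# Hypothetical helper: TRUE when the string starts with http:// or https://
# followed by something other than another slash.
has_valid_scheme <- function(url) {
  grepl("^https?://[^/]", url)
}

has_valid_scheme("http://localhost:7474")       # well-formed URL
has_valid_scheme("http:/107.23.190.94:33961")   # single slash, as stored in con$url above
has_valid_scheme("http:\\\\localhost:7474")     # backslashes instead of slashes
```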

Do you need any information about my local environment? I use Linux Mint if it makes any difference.

[Question] create_nodes function?

I'm wondering if the implementation of create_nodes / create_constraint and such would make sense?

Here is why:

  • Will we really create one node/relationship at a time at some point? Given that the package will implement an as_nodes from a dataframe, it seems to me that we are more likely to create nodes in bulk than a single node

  • Is create_nodes("Person", name = "Colin", con_object) significantly quicker/less verbose than call_api("CREATE (:Person {name: 'Colin'})", con_object)?

What's your take on that?

Error with unnest_relationships - Error: Can't coerce element 1 from a list to a character

Using the movie graph database I get an error with unnest_relationships. Output followed by clean code below:

Output

library(remotes)
remotes::install_github("neo4j-rstats/neo4r")
Skipping install of 'neo4r' from a github remote, the SHA1 (0f2054e) has not changed since last install.
Use 'force = TRUE' to force installation
library(neo4r)
neo4jServerURL <- "http://localhost:7474"
neo4jUser <- "neo4j"
neo4jPassword <- "admin"
con <- neo4j_api$new(url = neo4jServerURL, user = neo4jUser, password = neo4jPassword)
res <- "MATCH (tom:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(tomHanksMovies) RETURN *;" %>% call_neo4j(con, type = "graph")
node_data <<- as.data.frame(unnest_nodes(res$nodes))
edge_data <<- as.data.frame(unnest_relationships(res$relationships))
Error: Can't coerce element 1 from a list to a character

Clean Code

library(remotes)
remotes::install_github("neo4j-rstats/neo4r")
library(neo4r)
neo4jServerURL <- "http://localhost:7474"
neo4jUser <- "neo4j"
neo4jPassword <- "admin"
con <- neo4j_api$new(url = neo4jServerURL, user = neo4jUser, password = neo4jPassword)
res<-"MATCH (tom:Person {name: 'Tom Hanks'})-[r:ACTED_IN]->(tomHanksMovies) RETURN *;" %>% call_neo4j(con, type = 'graph')
node_data<<-as.data.frame(unnest_nodes(res$nodes))
edge_data<<-as.data.frame(unnest_relationships(res$relationships))

Useable output for calculation queries

Hi,
I'm mainly using queries which output summaries, e.g. counts, averages etc. such as:

MATCH (u:User) 
RETURN u.status as status, count(u) as n_users

which produce hard-to-use outputs, a list of data frames with one column called value...

List of 2
 $ status :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	5 obs. of  1 variable:
  ..$ value: int [1:5] 0 9 8 14 3
  ..- attr(*, ".internal.selfref")=<externalptr> 
 $ n_users:Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	5 obs. of  1 variable:
  ..$ value: int [1:5] 3819684 887574 16521 2367 58606
  ..- attr(*, ".internal.selfref")=<externalptr> 
 - attr(*, "class")= chr [1:3] "neo" "neo" "list"

You can always stitch them back together afterwards...
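For reference, the stitching can be done in one line of base R. This is a sketch on a mock object that mimics the structure printed above (a list of one-column data frames, each with a value column), not on a live query result.

```r
# Mock of the structure shown above: a list of one-column data frames
res <- list(
  status  = data.frame(value = c(0L, 9L, 8L, 14L, 3L)),
  n_users = data.frame(value = c(3819684L, 887574L, 16521L, 2367L, 58606L))
)

# Take the `value` column of each element and bind them column-wise;
# the list names become the column names.
stitched <- as.data.frame(lapply(res, function(x) x$value))
stitched
```

This only works when every element has the same number of rows, which is exactly what breaks in the cases described below.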

However queries which return a list for one field fail:

MATCH (u:User) 
RETURN labels(u) as labels, count(u) as n_users

yields

Error in .x[[i]] : subscript out of bounds

(Because they add an extra layer, so removing the NULLs fails)

Anyway, the expected output would be a clean dataframe:

  • One potential way is to change the queries to output maps
MATCH (u:User) 
RETURN {labels:labels(u), n_users:count(u)}

But

  1. That fails at the same moment because it adds an extra layer like before (when getting nodes, it also adds an extra layer, but it doesn't fail because the meta part with each row is not NULL in the neo4j response).
  2. And it's somewhat unnatural to format queries like that (it should fail still)
  • Another option is to revive the format argument and reshape the output as done in my code here (which still fails when one of the fields is a list...)

Hope it helps

Driver timeout

A customer tried to test the R-driver, but got the following error.

————————————
It seems like it is getting a timeout. Do you have any input on what may be wrong?

It times out with:
devtools::install_github("neo4j-rstats/neo4r")

Another colleague tried to connect via a virtual machine and her response was:
"I tried now in VDI and I also got "Installation failed: Timeout was reached: Connection timed out after 10000 milliseconds""
————————————

[Idea] Should the connection object keep track of every call?

We could set the con object to keep track of every call we are making to the API with this object, just to keep a kind of log file inside the object.

i.e.: every time you run call_api("MY CALL", con), the con object records the time, the call, and the resulting stats.

'CREATE CONSTRAINT ON (d:Day) ASSERT d.name IS UNIQUE' %>%
  call_api(con, include_stats = TRUE)
# Will add to the con object : 
Call `CREATE CONSTRAINT ON (d:Day) ASSERT d.name IS UNIQUE` at XX
# A tibble: 12 x 2
   type                  value
   <chr>                 <dbl>
 1 contains_updates         1.
 2 nodes_created            0.
 3 nodes_deleted            0.
 4 properties_set           0.
 5 relationships_created    0.
 6 relationship_deleted     0.
 7 labels_added             0.
 8 labels_removed           0.
 9 indexes_added            0.
10 indexes_removed          0.
11 constraints_added        1.
12 constraints_removed      0.

Would it make sense?
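To illustrate the idea without touching the real R6 class, here is a minimal base R sketch. new_logged_con, record and history are hypothetical names, and the object is a plain closure rather than the actual connection object.

```r
# Hypothetical sketch (not the neo4r implementation): an object that
# keeps a log of every call recorded through it.
new_logged_con <- function() {
  calls <- list()
  list(
    # Append one entry with the time, the query and (optionally) the stats
    record = function(query, stats = NULL) {
      calls[[length(calls) + 1]] <<- list(
        time = Sys.time(),
        query = query,
        stats = stats
      )
      invisible(NULL)
    },
    # Return the full log
    history = function() calls
  )
}

con_log <- new_logged_con()
con_log$record("CREATE CONSTRAINT ON (d:Day) ASSERT d.name IS UNIQUE")
con_log$record("MATCH (n) RETURN count(n)")
length(con_log$history())
```

In the package itself, the equivalent would presumably be a list field on the R6 object that call_api() appends to after each request.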

when the result comes back with a null, the null gets ignored.

I'm using the Game of Thrones dataset and querying the number of relationships per book

MATCH ()-[r]->()
 RETURN r.book as book, count(r) ORDER BY book

The browser returns a table

╒══════╤══════════╕
│"book"│"count(r)"│
╞══════╪══════════╡
│1     │684       │
├──────┼──────────┤
│2     │775       │
├──────┼──────────┤
│3     │1008      │
├──────┼──────────┤
│45    │1329      │
├──────┼──────────┤
│null  │2823      │
└──────┴──────────┘

neo4r seems to drop the null, and because the two columns then have unequal lengths it can't return a single data frame either. Instead it returns a list with one tibble of 4 rows and one tibble of 5 rows.
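A sketch of what is presumably going on, using mock data in place of the parsed JSON: flattening a list silently drops NULL entries, so replacing NULL with NA first keeps both columns at five values. The helper null_to_na is a hypothetical name.

```r
# Mock of the two parsed columns; the last "book" value is a JSON null
book <- list(1L, 2L, 3L, 45L, NULL)
n    <- list(684L, 775L, 1008L, 1329L, 2823L)

# Flattening directly loses the NULL, hence the 4 vs 5 mismatch
length(unlist(book))

# Replace NULL by NA before flattening so the lengths stay aligned
null_to_na <- function(x) if (is.null(x)) NA_integer_ else x
res <- data.frame(
  book = vapply(book, null_to_na, integer(1)),
  n    = vapply(n, null_to_na, integer(1))
)
res
```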

Problems with connection in Neo4r

I have the same problem with neo4r as other people: I can't connect to Neo4j. I am trying to connect my R session (R 3.5.2, RStudio 1.1.4) to the Neo4j host, and this is the error:

con <- neo4j_api$new(url = "http://localhost:7474", user = "neo4j", password = "neo4j")
con$ping()
"Error: Failed to connect to localhost port 7474: Connection refused"

I tried changing the connection function neo4j_api$new:
neo4j_api <- R6::R6Class("Neo4JAPI",
public = list(
url = character(0),
user = character(0),
password = character(0),....

but this doesn't work.

@ColinFay
@statnmap

[Question] Should the include_stats argument of call_api be TRUE by default?

The include_stats argument of the call_api function is now FALSE by default.

It returns the stats about the actions performed by the call, e.g. :

# A tibble: 12 x 2
   type                  value
   <chr>                 <dbl>
 1 contains_updates         1.
 2 nodes_created         1021.
 3 nodes_deleted            0.
 4 properties_set        1021.
 5 relationships_created    0.
 6 relationship_deleted     0.
 7 labels_added          1021.
 8 labels_removed           0.
 9 indexes_added            0.
10 indexes_removed          0.
11 constraints_added        0.
12 constraints_removed      0.

It would make sense for it to always be returned when called on a CREATE / MERGE (just as above), but would it be less practical to have it set to TRUE by default when we are querying data, e.g.:

'MATCH (u:User) RETURN COUNT(u) AS Users_count' %>%
  call_api(con, include_stats = TRUE)
$Users_count
# A tibble: 1 x 1
  value
  <int>
1  1021

$stats
# A tibble: 12 x 2
   type                  value
   <chr>                 <dbl>
 1 contains_updates         0.
 2 nodes_created            0.
 3 nodes_deleted            0.
 4 properties_set           0.
 5 relationships_created    0.
 6 relationship_deleted     0.
 7 labels_added             0.
 8 labels_removed           0.
 9 indexes_added            0.
10 indexes_removed          0.
11 constraints_added        0.
12 constraints_removed      0.

unnest_relationships - Error when relationships contain attributes

I notice when I use the Norwegian band dataset where the relationships have no attributes then unnest_relationships works perfectly.

However, when I test it with some of the movie data I run into one of the two following errors:

If there is just one attribute: Error: Can't coerce element 1 from a list to a character

If there is more than one: Error: Result 1 is not a length 1 atomic vector

I find just changing from map_chr to map seems to resolve the issue:

unnest_relationships <- function(relationships_tbl){
  relationships_tbl$properties <- map(relationships_tbl$properties, na_or_self)
  unnest(relationships_tbl, properties)
}

With the Norwegian band data it will still produce NAs. However, if there is a data set where some edges have attributes and others do not, then it just shows NULL. See the example below:

## assumes active connection named con

'CREATE (matrix:Movie { title:"The Matrix",released:1997 })
CREATE (cloudAtlas:Movie { title:"Cloud Atlas",released:2012 })
CREATE (forrestGump:Movie { title:"Forrest Gump",released:1994 })
CREATE (keanu:Person { name:"Keanu Reeves", born:1964 })
CREATE (robert:Person { name:"Robert Zemeckis", born:1951 })
CREATE (tom:Person { name:"Tom Hanks", born:1956 })
CREATE (tom)-[:ACTED_IN { roles: ["Forrest","Frank"], acts: ["two","three"]}]->(forrestGump)
CREATE (tom)-[:ACTED_IN]->(cloudAtlas)
CREATE (robert)-[:DIRECTED]->(forrestGump)' %>%
  call_api(con)

res <- 'MATCH p=()-[r:ACTED_IN]->() RETURN p' %>%
  call_api(con, type = "graph")

neo4r::unnest_relationships(res$relationships)  # results in an error

na_or_self <- function(x){
  if(length(x) == 0) return(NA)
  x
}

# switch from map_chr to map
unnest_relationships <- function(relationships_tbl){
  relationships_tbl$properties <- map(relationships_tbl$properties, na_or_self)
  unnest(relationships_tbl, properties)
}

unnest_relationships(res$relationships) # I believe this produces the desired result

Unfortunately, I am not finding a way to retain the NAs in these cases.

[Implementation] A node or relationship pipeable extractor

Something that could do :

'MATCH p=()-[r:MAINTAINS]->() RETURN p' %>%
  call_api(con, type = "graph")%>%
  extract_nodes() 
# A tibble: 191 x 3
   id    label     properties
   <chr> <list>    <list>    
 1 0     <chr [1]> <list [5]>
 2 1     <chr [1]> <list [1]>
 3 2     <chr [1]> <list [5]>
 4 3     <chr [1]> <list [1]>
 5 4     <chr [1]> <list [5]>
 6 5     <chr [1]> <list [1]>
 7 6     <chr [1]> <list [5]>
 8 7     <chr [1]> <list [1]>
 9 8     <chr [1]> <list [5]>
10 9     <chr [1]> <list [5]>
# ... with 181 more rows

So you can

'MATCH p=()-[r:MAINTAINS]->() RETURN p' %>%
  call_api(con, type = "graph") %>%
  extract_nodes() %>%
  dplyr::count(names)

Feature Request: neo4j_api$new() Defaults

As a user that is developing locally, when I look at ?neo4j_api, I see this as the example:

neo4j_api$new(url = "http://localhost:7474", user = "neo4j", password = "password")

As I believe this is a common pattern and set of credentials, I expected to run:

con <- neo4j_api$new()

to authenticate and then

con$ping()

to return a 200.

Basically, unless the user supplies values as they would for something beyond local dev efforts, it might be easier to have the defaults set for quick authentication.

This is a feature request, not a bug.

Feature Request: Common Test Dataset

Not a bug, but as I am looking at the docs, README, and the guide, I think it might be easier if we all had a common dataset to work through as we "learn" the api of neo4r.

It could be neat/easy to do something like neo4j_api$create_example_graph(). I am sure the command could be more succinct, but you get the idea.

This might help with tests and debugging of issues.

Passing parameters to queries

It's useful to be able to pass data as "parameters" in the queries.
(E.g. to load json, see https://neo4j.com/blog/cypher-load-json-from-url/)

I've updated the code in my fork below. Seems to be working so far... if it helps.
(Not doing a pull request as I've not tried every potential impact...)

to_json_neo <- function(query, params, include_stats, meta, type) {
  toJSON(
    list(
      statement = query,
      parameters = params,
      includeStats = include_stats,
      meta = meta,
      resultDataContents = list(type)
    ),
    auto_unbox = TRUE
  )
}
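For illustration, this is how a parameterised statement could be serialised with that function. The function is repeated so the snippet runs on its own; the query and the name parameter are made-up examples.

```r
library(jsonlite)

# Same shape as the function above, repeated so this example is self-contained
to_json_neo <- function(query, params, include_stats, meta, type) {
  toJSON(
    list(
      statement = query,
      parameters = params,
      includeStats = include_stats,
      meta = meta,
      resultDataContents = list(type)
    ),
    auto_unbox = TRUE
  )
}

# $name in the Cypher text is filled from the `parameters` map
json <- to_json_neo(
  query = "MATCH (p:Person {name: $name}) RETURN p",
  params = list(name = "Tom Hanks"),
  include_stats = FALSE,
  meta = FALSE,
  type = "row"
)
json
```

Note that with jsonlite an empty list() serialises as [] rather than {}, which is worth checking if params can be empty.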

[Implementation] Create an import widget that imports all types of tabular data

We can use all the tools available in R to build a global importer for tabular format.

How it would work:

  • The widget has a file selector
  • File is read and previewed
  • The user enters the cypher code
  • The file is read as a csv in a temp file
  • A plumber app is launched
  • That api takes a path and generates a csv
  • The widget generates a LOAD CSV FROM {plumberapiurl} Cypher statement
  • The generated code is sent to the Neo4J instance
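The csv and query generation steps above could be sketched as follows in base R. The plumber URL, the user-supplied Cypher and the use of the iris data are all placeholders, and the plumber app itself is left out.

```r
# Write the data to a temporary csv, as the widget would after reading a file
csv_path <- tempfile(fileext = ".csv")
write.csv(head(iris), csv_path, row.names = FALSE)

# Hypothetical URL at which a local plumber API would serve that file
plumber_api_url <- "http://localhost:8000/csv"

# Cypher entered by the user, applied to each csv row
user_cypher <- "CREATE (:Flower {species: row.Species})"

# Generated statement that would be sent to the Neo4J instance
query <- sprintf(
  "LOAD CSV WITH HEADERS FROM '%s' AS row %s",
  plumber_api_url, user_cypher
)
query
```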

Subscript out of bounds with what looks like returning relationship properties using Movie Graph

This seems to be tied to relationships with properties. I have other examples of the same type of error that seem to share the same root cause.

res<-"MATCH (people:Person)-[relatedTo]-(:Movie {title: 'Cloud Atlas'}) RETURN people.name, Type(relatedTo), relatedTo" %>%
  call_neo4j(con, type = 'row')

Should return

╒══════════════════╤═════════════════╤══════════════════════════════════════════════════════════════════════╕
│"people.name"     │"type(relatedTo)"│"relatedTo"                                                           │
╞══════════════════╪═════════════════╪══════════════════════════════════════════════════════════════════════╡
│"Hugo Weaving"    │"ACTED_IN"       │{"roles":["Bill Smoke","Haskell Moore","Tadeusz Kesselring","Nurse Noa│
│                  │                 │kes","Boardman Mephi","Old Georgie"]}                                 │
├──────────────────┼─────────────────┼──────────────────────────────────────────────────────────────────────┤
│"Jessica Thompson"│"REVIEWED"       │{"summary":"An amazing journey","rating":95}                          │
├──────────────────┼─────────────────┼──────────────────────────────────────────────────────────────────────┤
│"Stefan Arndt"    │"PRODUCED"       │{}                                                                    │
├──────────────────┼─────────────────┼──────────────────────────────────────────────────────────────────────┤
│"Tom Hanks"       │"ACTED_IN"       │{"roles":["Zachry","Dr. Henry Goose","Isaac Sachs","Dermot Hoggins"]} │
├──────────────────┼─────────────────┼──────────────────────────────────────────────────────────────────────┤
│"Jim Broadbent"   │"ACTED_IN"       │{"roles":["Vyvyan Ayrs","Captain Molyneux","Timothy Cavendish"]}      │
├──────────────────┼─────────────────┼──────────────────────────────────────────────────────────────────────┤
│"Halle Berry"     │"ACTED_IN"       │{"roles":["Luisa Rey","Jocasta Ayrs","Ovid","Meronym"]}               │
├──────────────────┼─────────────────┼──────────────────────────────────────────────────────────────────────┤
│"Lilly Wachowski" │"DIRECTED"       │{}                                                                    │
├──────────────────┼─────────────────┼──────────────────────────────────────────────────────────────────────┤
│"David Mitchell"  │"WROTE"          │{}                                                                    │
├──────────────────┼─────────────────┼──────────────────────────────────────────────────────────────────────┤
│"Tom Tykwer"      │"DIRECTED"       │{}                                                                    │
├──────────────────┼─────────────────┼──────────────────────────────────────────────────────────────────────┤
│"Lana Wachowski"  │"DIRECTED"       │{}                                                                    │

Instead I get a subscript out of bounds error

Error in .x[[i]] : subscript out of bounds

last_error

> rlang::last_error()
<error>
message: Each column must either be a list of vectors or a list of data frames [roles]
class:   `rlang_error`
backtrace:
 1. neo4r::unnest_relationships(res$relationships)
 3. tidyr:::unnest.data.frame(relationships_tbl)
Call `rlang::last_trace()` to see the full backtrace

> rlang::last_trace()
    █
 1. └─neo4r::unnest_relationships(res$relationships)
 2.   ├─tidyr::unnest(relationships_tbl)
 3.   └─tidyr:::unnest.data.frame(relationships_tbl)

Traceback

30. $is_atomic(.x)
29. $modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic)
28. $.f(.x[[i]], ...)
27. $modify.default(.x, function(x) { modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic) })
26. $modify(.x, function(x) { modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic) })
25. $modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic)
24. $.f(.x[[i]], ...)
23. $modify.default(.x, function(x) { modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic) })
22. $modify(.x, function(x) { modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic) })
21. $modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic)
20. $.f(.x[[i]], ...)
19. $modify.default(.x, function(x) { modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic) })
18. $modify(.x, function(x) { modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic) })
17. $modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic)
16. $.f(.x[[i]], ...)
15. $modify.default(.x, function(x) { modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic) })
14. $modify(.x, function(x) { modify_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, .atomic = .atomic) })
13. $modify_depth_rec(.x, .depth, .f, ..., .ragged = .ragged, .atomic = FALSE)
12. $modify_depth.default(results, vec_depth(results) - 1, function(x) { if (is.null(x)) { NA } ...
11. $modify_depth(results, vec_depth(results) - 1, function(x) { if (is.null(x)) { NA } ...
10. $parse_api_results(res = res, type = type, format = format, include_stats = include_stats, meta = include_meta)
9. $call_neo4j(., con, type = "row")
8. $function_list[[k]](value)
7. $withVisible(function_list[[k]](value))
6. $freduce(value, `_function_list`)
5. $`_fseq`(`_lhs`)
4. $eval(quote(`_fseq`(`_lhs`)), env, env)
3. $eval(quote(`_fseq`(`_lhs`)), env, env)
2. $withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
1. $"MATCH (people:Person)-[relatedTo]-(:Movie {title: 'Cloud Atlas'}) RETURN people.name, Type(relatedTo), relatedTo" %>% call_neo4j(con, type = "row")

[Implementation] Better results parser

The current bottleneck is the result parsing:

library(neo4r)
con <- neo4j_api$new(url = "http://138.197.15.1:7474", user = "all", password = "readonly")
con$ping()

"match (t:Tag) return t.name as name, size((t)--()) as deg limit 1000;" %>%
    call_api(con)
  # Calling the API takes around 500 ms
  bench::mark({ 
    res <- POST(url = glue("{con$url}/db/data/transaction/commit?includeStats=true"),
                add_headers(.headers = c("Content-Type"="application/json",
                                         "accept"="application/json",
                                         #"X-Stream" = "true",
                                         "Authorization"= paste0("Basic ", con$auth))),
                body = body)
  })
# A tibble: 1 x 14
  expression   min  mean median   max `itr/sec`
  <chr>      <bch> <bch> <bch:> <bch>     <dbl>
1 {...       561ms 561ms  561ms 561ms      1.78


# while parsing the results takes around 2.5 seconds
    bench::mark({ 
      parse_api_results(res = res, type = type, format = format, include_stats = include_stats, meta = meta)
    })
# A tibble: 1 x 14
  expression   min  mean median   max `itr/sec` mem_alloc
  <chr>      <bch> <bch> <bch:> <bch>     <dbl> <bch:byt>
1 {...       2.29s 2.29s  2.29s 2.29s     0.437    10.1MB

Neo4j Connection Problem: API Error

I can build up the connection between Neo4j and R. However, I got an error message when I call the API. Below is an example of the error when I tried to run a Cypher query pulling data.

> library(neo4r)
> library(magrittr)
> con <- neo4j_api$new(url = "http://localhost:7474", 
+                      user = "xxx", password = "xxx")
> 
> con$ping()
[1] 401
> 
> 'MATCH (p:Person) -[r:ACTED_IN] -> (m:Movie) RETURN *;' %>%
+   call_neo4j(con, output = "json")
Error: API error

Thanks in advance.

[Doc] Introduction Vignette

There is no vignette in the package for now (which is a bad thing).

The "Intro to {neo4r}" could be more or less a copy of the current README.

attempt to apply non function

I wanted to play around with these projects and started with the README. However, I can't ping the server, see below:

options(stringsAsFactors = FALSE)

## load libraries
install.packages("remotes")
remotes::install_github("neo4j-rstats/neo4r")
library(neo4r)
remotes::install_github("neo4j-rstats/play4j")
library(play4j)

## connect to neo4j 3.3.0 from Neo4j Desktop
URL = "/Users/btibert/Library/Application Support/Neo4j Desktop/Application/neo4jDatabases/database-d2412486-df0a-43fb-8ec8-c578391fcaee/installation-3.3.0"
con <- neo4j_api$new(url = "http://localhost:7474",  user = "neo4j", password = "password")
con$ping()

results in the error

> con$ping()
Error: attempt to apply non-function

my session info

> sessionInfo()
R version 3.4.4 RC (2018-03-08 r74373)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2.2      tidyjson_0.2.1.9000 play4j_0.0.0.9000   neo4r_0.0.0.9000    tibble_1.4.2        purrr_0.2.4        
[7] tidyr_0.8.0         dplyr_0.7.4         jsonlite_1.5       

loaded via a namespace (and not attached):
 [1] igraph_1.2.1     Rcpp_0.12.16     knitr_1.20       bindr_0.1.1      magrittr_1.5     tidyselect_0.2.4 R6_2.2.2        
 [8] rlang_0.2.0      stringr_1.3.0    httr_1.3.1       tools_3.4.4      remotes_1.1.1    htmltools_0.3.6  rprojroot_1.2   
[15] digest_0.6.15    yaml_2.1.18      assertthat_0.2.0 attempt_0.2.0    base64enc_0.1-3  curl_3.2         evaluate_0.10.1 
[22] glue_1.2.0       rmarkdown_1.6    stringi_1.1.7    compiler_3.4.4   pillar_1.2.1     backports_1.1.1  pkgconfig_2.0.1 
