
grand-cypher's Introduction

GrandCypher

pip install grand-cypher
# Note: You will want a version of grandiso>=2.2.0 for best performance!
pip install -U 'grandiso>=2.2.0'

GrandCypher is a partial (and growing!) implementation of the Cypher graph query language written in Python, for Python data structures.

You likely already know Cypher from the Neo4j Graph Database. Use it with your favorite graph libraries in Python!

Usage

Example Usage with NetworkX:

from grandcypher import GrandCypher
import networkx as nx

GrandCypher(nx.karate_club_graph()).run("""
MATCH (A)-[]->(B)
MATCH (B)-[]->(C)
WHERE A.club == "Mr. Hi"
RETURN A.club, B.club
""")

Example Usage with SQL

Create your own "Sqlite for Neo4j"! This example uses grand-graph to run queries in SQL:

import grand
from grandcypher import GrandCypher

G = grand.Graph(
    backend=grand.backends.SQLBackend(
        db_url="my_persisted_graph.db",
        directed=True
    )
)

# use the networkx-style API for the Grand library:
G.nx.add_node("A", foo="bar")
G.nx.add_edge("A", "B")
G.nx.add_edge("B", "C")
G.nx.add_edge("C", "A")

GrandCypher(G.nx).run("""
MATCH (A)-[]->(B)-[]->(C)
MATCH (C)-[]->(A)
WHERE
    A.foo == "bar"
RETURN
    A, B, C
""")

Feature Parity

Feature                                   Support
Multiple MATCH clauses
WHERE-clause filtering on nodes
Anonymous -[]- edges
LIMIT
SKIP
Node/edge attributes with {} syntax
WHERE-clause filtering on edges
Named -[]- edges
Chained ()-[]->()-[]->() edges            ✅ Thanks @khoale88!
Backwards ()<-[]-() edges                 ✅ Thanks @khoale88!
Anonymous () nodes                        ✅ Thanks @khoale88!
Undirected ()-[]-() edges                 ✅ Thanks @khoale88!
Boolean Arithmetic (AND/OR)               ✅ Thanks @khoale88!
OPTIONAL MATCH                            🛣
(:Type) node-labels                       ✅ Thanks @khoale88!
[:Type] edge-labels                       ✅ Thanks @khoale88!
Graph mutations (e.g. DELETE, SET,...)    🛣

Legend: ✅ = Supported, 🛣 = On Roadmap, 🔴 = Not Planned

Citing

If this tool is helpful to your research, please consider citing it with:

# https://doi.org/10.1038/s41598-021-91025-5
@article{Matelsky_Motifs_2021,
    title={{DotMotif: an open-source tool for connectome subgraph isomorphism search and graph queries}},
    volume={11},
    ISSN={2045-2322},
    url={http://dx.doi.org/10.1038/s41598-021-91025-5},
    DOI={10.1038/s41598-021-91025-5},
    number={1},
    journal={Scientific Reports},
    publisher={Springer Science and Business Media LLC},
    author={Matelsky, Jordan K. and Reilly, Elizabeth P. and Johnson, Erik C. and Stiso, Jennifer and Bassett, Danielle S. and Wester, Brock A. and Gray-Roncal, William},
    year={2021},
    month={Jun}
}

grand-cypher's Issues

NetworkXDialect does not work correctly with networkx.DiGraph

Because NetworkXDialect inherits from networkx.Graph, discrepancies between networkx.Graph and networkx.DiGraph propagate back to grand.Graph. One of them is that networkx.Graph.edges returns an EdgeView, while networkx.DiGraph.edges returns an OutEdgeView.

Below is a test that reproduces the issue:

import unittest

import networkx as nx
from grand import Graph


class TestNetworkXDialect(unittest.TestCase):  # class name is illustrative
    def test_nx_edges(self):
        G = Graph(directed=True).nx  # grand's NetworkX-style dialect
        H = nx.DiGraph()             # reference behavior from NetworkX itself
        G.add_edge("1", "2")
        G.add_edge("2", "1")   # <<< this won't work with EdgeView for G
        G.add_edge("1", "3")
        H.add_edge("1", "2")
        H.add_edge("2", "1")   # <<< OutEdgeView returns this for H
        H.add_edge("1", "3")
        self.assertEqual(dict(G.edges), dict(H.edges))
        self.assertEqual(dict(G.edges()), dict(H.edges()))
        self.assertEqual(list(G.edges["1", "2"]), list(H.edges["1", "2"]))

The result is:

    def test_nx_edges(self):
        G = Graph(directed=True).nx
        H = nx.DiGraph()
        # H = nx.Graph()
        G.add_edge("1", "2")
        G.add_edge("2", "1")
        G.add_edge("1", "3")
        H.add_edge("1", "2")
        H.add_edge("2", "1")
        H.add_edge("1", "3")
>       self.assertEqual(dict(G.edges), dict(H.edges))
E       AssertionError: {('1', '2'): {}, ('1', '3'): {}} != {('1', '2'): {}, ('1', '3'): {}, ('2', '1'): {}}
E       - {('1', '2'): {}, ('1', '3'): {}}
E       + {('1', '2'): {}, ('1', '3'): {}, ('2', '1'): {}}
E       ?                              ++++++++++++++++
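
The collapse seen in the failing assertion can be reproduced without either library. The following is a minimal sketch, assuming that an undirected edge view keeps one entry per unordered node pair while a directed view keeps one entry per ordered pair. `edge_keys` is a hypothetical helper, not part of grand or NetworkX, and sorting the endpoints is a simplification of how an undirected view picks an orientation.

```python
def edge_keys(edges, directed):
    """Return a dict of edge -> attributes, mimicking how an edge view de-duplicates.

    Assumption: an undirected view treats (u, v) and (v, u) as the same edge,
    while a directed view keeps them separate.
    """
    view = {}
    for u, v in edges:
        # Normalize the endpoint order only when direction does not matter.
        key = (u, v) if directed else tuple(sorted((u, v)))
        view.setdefault(key, {})
    return view


edges = [("1", "2"), ("2", "1"), ("1", "3")]
undirected = edge_keys(edges, directed=False)
directed = edge_keys(edges, directed=True)
```

Under these assumptions, `undirected` holds two edges and `directed` holds three, which matches the two sides of the assertion error above.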

Feature request: path groups

For example, the syntax:

MATCH p=(n1 {type: "compilation_unit"})-[]->(n2 {type: "class_declaration"})-[*2]->(n3 {type: "method_declaration"})-->(n4 {text: "Main", type:"identifier"})
RETURN p

The return type for a path from neo4j has start, end, and segments, and notably includes the nodes and edges that would be traversed by the edge * operator.

I can still work around it, but I would very much like to have access to variable-length paths!
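
To make the request concrete, here is a rough sketch of what a returned path value might carry. `Path` is a hypothetical container invented for illustration; it is neither a grand-cypher class nor the Neo4j driver's type, and the node names (including the intermediate `"mid"`) are made up.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class Path:
    """Hypothetical container for one matched path (not a grand-cypher or Neo4j class)."""

    nodes: List[Any]  # every node visited, in order, including those under a `*` hop

    @property
    def start(self) -> Any:
        return self.nodes[0]

    @property
    def end(self) -> Any:
        return self.nodes[-1]

    @property
    def segments(self) -> List[Tuple[Any, Any]]:
        # One (source, target) pair per traversed edge.
        return list(zip(self.nodes, self.nodes[1:]))


# n2 reaches n3 in two hops, so the intermediate node "mid" is part of the path.
p = Path(nodes=["n1", "n2", "mid", "n3", "n4"])
```

The point of the sketch is that the nodes hidden behind `[*2]` stay visible in `p.nodes` and `p.segments`.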

Edge Hopping

My use case is a perfect fit for this feature. I do not know the exact depth of a branch in advance, so I would like to search from a node either all the way down or up to a depth limit. It would be good if edge hopping (variable-length relationships) were supported.

From what I understand, the syntax is -[*min..max]-, where min and max are positive integers. The result is the set of subgraphs in which the branch extends between min and max hops from the starting node.
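
The min..max behaviour described above can be sketched as a bounded depth-first walk. This is an illustration in plain Python over an adjacency dict, not grand-cypher's implementation; `expand_hops` and the sample graph are hypothetical, and the sketch assumes that an edge may not be reused within a single path.

```python
def expand_hops(adjacency, start, min_hops, max_hops):
    """Yield node paths from `start` whose length in edges is within [min_hops, max_hops].

    Assumption: an edge may not be reused within a single path.
    """
    def walk(path, used_edges):
        hops = len(path) - 1
        if hops >= min_hops:
            yield list(path)
        if hops == max_hops:
            return
        for neighbor in adjacency.get(path[-1], ()):
            edge = (path[-1], neighbor)
            if edge in used_edges:
                continue
            path.append(neighbor)
            used_edges.add(edge)
            yield from walk(path, used_edges)
            # Backtrack so sibling branches start from a clean state.
            used_edges.remove(edge)
            path.pop()

    yield from walk([start], set())


tree = {"root": ["a", "b"], "a": ["c"], "c": ["d"]}
paths = list(expand_hops(tree, "root", 1, 2))
```

With `min_hops=1` and `max_hops=2`, the walk stops at `c` and never reaches `d`, which sits three hops from `root`.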

Is it possible to run node related queries instead of relation related queries?

I apologize for asking this from a place of relatively little understanding. I am working with a project which has brought me further into graphs than I have ever ventured before. For my project I need a way to query my graph for nodes which match a pattern. For the moment I am using https://geronimo-iia.github.io/networkx-query/ . However, I am curious to use this library for the expansive capabilities of cypher. Is it possible to make this query work within the supported syntax of the library?

MATCH (c:City)
WHERE c.name = "London"
RETURN c

I'm reading through the source code to try to answer this for myself but there are a lot of new concepts being introduced to me all at once so I figured it wouldn't hurt to ask. I appreciate the time and effort!
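
For reference, the intent of the query above can be written as a filter over node attributes. This is a plain-Python sketch only; it does not use grand-cypher and makes no claim about whether the library accepts the query. `match_nodes` is a hypothetical helper, and storing the label under a `"type"` attribute is an assumption.

```python
def match_nodes(nodes, label=None, **where):
    """Return ids of nodes that carry `label` and whose attributes equal `where`.

    Assumption: the label is stored under the "type" attribute of each node.
    """
    return [
        node_id
        for node_id, attrs in nodes.items()
        if (label is None or attrs.get("type") == label)
        and all(attrs.get(key) == value for key, value in where.items())
    ]


nodes = {
    1: {"type": "City", "name": "London"},
    2: {"type": "City", "name": "Paris"},
    3: {"type": "Person", "name": "London"},
}
# Roughly: MATCH (c:City) WHERE c.name = "London" RETURN c
result = match_nodes(nodes, label="City", name="London")
```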

WHERE-clause boolean algebra

This involves AND/OR/NOT support (with order-of-operations to match that of Cypher) with parentheses. I think this might be pretty complicated because it will entail backtracking the entire structural match if clauses aren't met; it might make more sense to run OR operands in parallel, so that

MATCH (A)
WHERE (A.type = 1 AND B.type = 1) OR (B.type = 2)
RETURN A

becomes two queries:

MATCH (A)
WHERE (A.type = 1 AND B.type = 1)
RETURN A

MATCH (A)
WHERE (B.type = 2)
RETURN A

But I imagine this will get much more complicated for deeper nesting.
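
One way to generalize the split to deeper nesting is to rewrite the WHERE expression into disjunctive normal form, so that each AND-only clause becomes its own query and the results are unioned. The sketch below is an assumption about how this could be approached, not the project's implementation. The tuple-based expression format and `to_dnf` are hypothetical, and NOT is left out.

```python
from itertools import product


def to_dnf(expr):
    """Rewrite a boolean expression into a list of AND-only clauses.

    `expr` is either a condition string or a tuple ("and" | "or", left, right).
    Assumption: NOT is left out of this sketch.
    """
    if isinstance(expr, str):
        return [[expr]]
    op, left, right = expr
    left_clauses, right_clauses = to_dnf(left), to_dnf(right)
    if op == "or":
        # OR: either side may satisfy the query, so keep both sets of clauses.
        return left_clauses + right_clauses
    if op == "and":
        # AND distributes over OR: pair every left clause with every right clause.
        return [l + r for l, r in product(left_clauses, right_clauses)]
    raise ValueError(f"unsupported operator: {op!r}")


where = ("or", ("and", "A.type = 1", "B.type = 1"), "B.type = 2")
clauses = to_dnf(where)
nested = to_dnf(("and", ("or", "A.type = 1", "A.type = 2"), "B.type = 3"))
```

Each inner list in `clauses` corresponds to one of the two queries shown above. The number of clauses can grow exponentially with nesting depth, which is one reason deeper expressions get complicated.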

Graph mutations?

Graph mutations (updating, deleting, and creating vertices using Cypher) are a big engineering change, and will likely require a lot of corner-case tests.

I previously listed this as a "not-planned" feature, but I wonder whether users are interested in this capability. @khoale88, what do your current use cases look like? I would be interested in adding this feature back into the roadmap if it will be useful!

Entity types

In Cypher, node and edge types are represented by :ColonNotation. For example,

(A:Neuron)-[AB:Synapse]->(B:Neuron)

NetworkX has no concept of entity "types," so this will be the first time that this codebase mandates a data schema (i.e., a type attribute on the entities in the graph). I'm not sure this is something I want to enforce, but if we do decide to use vertex/edge attributes like this, I'd like to open discussion in this issue to establish what schema we want to support.
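
As a starting point for that discussion, here is a minimal sketch of the single-attribute option, in which the label is compared against one attribute on each node and edge. `match_typed_edges`, the `"type"` key, and the sample data are assumptions for illustration, not an agreed schema.

```python
def match_typed_edges(nodes, edges, src_type, edge_type, dst_type, key="type"):
    """Return (source, target) pairs matching (a:src_type)-[:edge_type]->(b:dst_type).

    `nodes` maps node id -> attribute dict; `edges` maps (source, target) -> attribute dict.
    Assumption: the label lives under the attribute named by `key`.
    """
    return [
        (u, v)
        for (u, v), attrs in edges.items()
        if attrs.get(key) == edge_type
        and nodes[u].get(key) == src_type
        and nodes[v].get(key) == dst_type
    ]


nodes = {"a": {"type": "Neuron"}, "b": {"type": "Neuron"}, "g": {"type": "Glia"}}
edges = {("a", "b"): {"type": "Synapse"}, ("a", "g"): {"type": "Contact"}}
# Roughly: (A:Neuron)-[AB:Synapse]->(B:Neuron)
matches = match_typed_edges(nodes, edges, "Neuron", "Synapse", "Neuron")
```

Making `key` a parameter keeps the attribute name configurable, which avoids forcing one schema on every graph.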

Multi-hop graph relationships

In building multi-hop queries, is there currently a method to retrieve the node ids along with the attributes?

For example, for the query:

MATCH (A{id: "Vikings"})-[R*0..3]->(B{id: "England"})
RETURN A, R, B
LIMIT 1

The source and target nodes could be added as attributes. Looking at the code, it seems straightforward to add them, either in all cases or only for multi-hop relationships. The openCypher standard also defines startNode and endNode functions.

Curious if there are any thoughts on this use case.

OPTIONAL MATCH

@j6k4m8 do you have any idea how OPTIONAL MATCH should be implemented? I'm not sure whether isomorphic search can do this. It would be helpful for queries with variable (dynamic) relationship lengths.
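
For context, OPTIONAL MATCH behaves like a left outer join: rows from the required pattern are kept even when the optional part finds nothing, with the missing variables bound to null. The sketch below illustrates only those semantics in plain Python. It is not a proposal for how grand-cypher should implement the feature, and `optional_extend` is a hypothetical helper.

```python
def optional_extend(rows, edges, source_var, target_var):
    """Extend each row with an optional outgoing edge from row[source_var].

    Rows that have no such edge are kept, with target_var bound to None,
    mirroring the null binding of OPTIONAL MATCH.
    """
    out = []
    for row in rows:
        targets = [v for u, v in edges if u == row[source_var]]
        # Fall back to a single None binding when nothing matches.
        for target in targets or [None]:
            out.append({**row, target_var: target})
    return out


edges = [("a", "b"), ("a", "c")]
# Roughly: MATCH (A) OPTIONAL MATCH (A)-[]->(B), with A already bound to "a" or "d"
rows = optional_extend([{"A": "a"}, {"A": "d"}], edges, "A", "B")
```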

Support for Equijoins

Hello, thank you for the great project :)

To what extent are equijoins supported?

Given that I have the following NetworkX graph:

G = nx.DiGraph()
G.add_node("x")
G.add_node("y")
G.add_node("z")
G.add_edge("x", "y")
G.add_edge("y", "x")
G.add_edge("x", "x")
G.add_edge("z", "x")

When I execute the following query:

MATCH (n)-->(n)
RETURN n

I get the result:

{Token('CNAME', 'n'): ['x', 'y']}

However, if I execute the same query on the equivalent graph in neo4j, I only get the node x as result - which to my understanding of Cypher would be the correct result.

Therefore, to my understanding, equijoins are currently supported only when they are distributed over multiple MATCH clauses, and self-cycles on nodes are not recognized. Is that correct?

For instance, the following query correctly recognizes the two loops in the graph:

MATCH (n)-->(m)
MATCH (m)-->(n)
RETURN n, m

Neo4j, however, additionally returns n=x and m=x as a result.

Many regards,
Felix
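
The semantics expected in this report can be checked against the same edge set in plain Python. This is a sketch of the behaviour the issue attributes to Neo4j, not of grand-cypher's matcher.

```python
edges = {("x", "y"), ("y", "x"), ("x", "x"), ("z", "x")}

# MATCH (n)-->(n): both endpoints are bound to the same variable,
# so only self-loops qualify.
self_loops = sorted(u for u, v in edges if u == v)

# MATCH (n)-->(m) MATCH (m)-->(n): an edge in each direction must exist.
# n and m are allowed to bind to the same node, so the self-loop appears here too.
mutual = sorted((u, v) for u, v in edges if (v, u) in edges)
```

Here `self_loops` contains only `x`, and `mutual` contains the x/y pair in both orders plus the `(x, x)` binding.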
