Git Product home page Git Product logo

grandiso-networkx's Introduction

Grand Isomorphisms

Codecov GitHub Workflow Status GitHub PyPI

Subgraph isomorphism is a resource-heavy (but branch-parallelizable) algorithm that is hugely impactful for large graph analysis. SotA algorithms for this (Ullmann, VF2, BB-Graph) are heavily RAM-bound, but this is due to a large number of small processes each of which hold a small portion of a traversal tree in memory.

Grand-Iso is a subgraph isomorphism algorithm that exchanges this resource-limitation for a parallelizable partial-match queue structure.

It performs favorably compared to other pure-python (and even some non-pure-python!) implementations:

image

See the wiki for more documentation.

Example Usage

from grandiso import find_motifs
import networkx as nx

host = nx.fast_gnp_random_graph(10, 0.5)

motif = nx.Graph()
motif.add_edge("A", "B")
motif.add_edge("B", "C")
motif.add_edge("C", "D")
motif.add_edge("D", "A")

len(find_motifs(motif, host))

Directed graph support:

from grandiso import find_motifs
import networkx as nx

host = nx.fast_gnp_random_graph(10, 0.5, directed=True)

motif = nx.DiGraph()
motif.add_edge("A", "B")
motif.add_edge("B", "C")
motif.add_edge("C", "D")
motif.add_edge("D", "A")

len(find_motifs(motif, host))

Counts-only

For very large graphs, you may use a good chunk of RAM not only on the queue of hypotheses, but also on the list of results. If all you care about is the NUMBER of results, you should pass count_only=True to the find_motifs function. This will dramatically reduce your RAM overhead on higher-count queries.

There are many other arguments that you can pass to the motif search algorithm. For a full list, see here.

Hacking on this repo

Running Tests

coverage run --source=grandiso -m pytest

Citing

If this tool is helpful to your research, please consider citing it with:

# https://doi.org/10.1038/s41598-021-91025-5
@article{Matelsky_Motifs_2021, 
    title={{DotMotif: an open-source tool for connectome subgraph isomorphism search and graph queries}},
    volume={11}, 
    ISSN={2045-2322}, 
    url={http://dx.doi.org/10.1038/s41598-021-91025-5}, 
    DOI={10.1038/s41598-021-91025-5}, 
    number={1}, 
    journal={Scientific Reports}, 
    publisher={Springer Science and Business Media LLC}, 
    author={Matelsky, Jordan K. and Reilly, Elizabeth P. and Johnson, Erik C. and Stiso, Jennifer and Bassett, Danielle S. and Wester, Brock A. and Gray-Roncal, William},
    year={2021}, 
    month={Jun}
}

Made with ๐Ÿ’™ at JHU APL

grandiso-networkx's People

Contributors

aleclearmind avatar davidmezzetti avatar j6k4m8 avatar melomancool avatar nobutoba avatar raphtor avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

grandiso-networkx's Issues

subgraph Isomorphism by edge and node labels

Thank you for this very useful library, we will definitely consider it for our future research. In our specific case, we would need to count subgraph isomorphisms by edge and node labels. E.g., given a source directed graph with nodes n0, n1, and n2 with labels 'class', 'sublcass' and 'subclass' respectively, and with edges, n1->n0, n2->n0 both with labels 'gen', we would like to count the exact matches in a given target graph.

Is there an option to do this? Thank you!

Finding the largest common monomorphic subgraph

Hi,

I am looking for an algorithm that can find the largest common monomorphic subgraph between two graphs. Specifically, I have two graphs G1 and G2, and I would like to identify the largest subgraph (may be non-unique) of G2 that is monomorphic to a subgraph of G1.

I am wondering whether find_motifs_iter can be useful in an iterative method to find an approximation of the largest subgraph of G2 that is a monomorphic to a subgraph of G1. I suppose that somewhere in this method, I may be able to extract the largest considered candidate subgraph that failed to be extended to a full graph monomorphism.

Any ideas on where to start with this?

Thanks!

Feature: Accept a list of hint mappings

Accept an optional list of "hint" mappings to start from, rather than beginning with the empty map.

Proposed API:

find_motifs(motif: nx.Graph, host: nx.Graph, hints=List[Dict[Hashable, Hashable]])

Find Larger Matches

Dear @j6k4m8,
I am looking for a way to find matches that are larger than the input motif (possibly given a certain threshold).

For instance, given the following motif:

motif = nx.DiGraph()
motif.add_edge("n1", "n0", label="gen")
motif.add_edge("n2", "n0", label="gen")
motif.add_node("n0", label="class")
motif.add_node("n1", label="subclass")
motif.add_node("n2", label="subclass")

Finding matches in a given target graph G like:

("n1", "n0", label="gen")
("n2", "n0", label="gen")
("n1", "n3", label="rel")
("n2", "n4", label="rel")
("n0", label="class")
("n1", label="subclass")
("n2", label="subclass")
("n3", label="class")
("n4", label="class")

or

("n1", "n0", label="gen")
("n2", "n0", label="gen")
("n1", "n3", label="rel")
("n1", "n4", label="rel")
("n1", "n5", label="rel")
("n0", label="class")
("n1", label="subclass")
("n2", label="subclass")
("n3", label="class")
("n4", label="class")
("n5", label="class")

etc.

It should be a kind of inexact match where the input motif is always a subset of the output matches and we can select, for instance, the number of neighbor nodes to be considered in the output matches.

Thanks in advance for your help!

Add `count_motifs` endpoint

For sufficiently common motifs, len(find_motifs(...)) has an unnecessarily large memory footprint. If the user is just looking for a count rather than an enumeration of each monomorphism, we can increment a count rather than saving a copy of monomorphisms in a list.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.