tksaha / fs3-graph-mining

FS3: A sampling based method for top‐k frequent subgraph mining

Home Page: https://onlinelibrary.wiley.com/doi/full/10.1002/sam.11277

License: MIT License

C++ 8.32% Makefile 0.15% Perl 59.13% HTML 29.21% CSS 2.95% Shell 0.24%
subgraph mining frequent-subgraphs mcmc-sampler mcmc-sampler-graph approximate-graph-mining sampling-and-mining-graph

fs3-graph-mining's Introduction

FSCube

Mining labeled subgraphs is a popular research task in data mining because of its potential applications in many scientific domains. All existing methods for this task explicitly or implicitly solve the subgraph isomorphism problem, which is computationally expensive, so they scale poorly when the graphs in the input database are large. In this work, we propose FS^3, a sampling-based method. It mines a small collection of subgraphs that are most frequent in the probabilistic sense. FS^3 performs Markov chain Monte Carlo (MCMC) sampling over the space of fixed-size subgraphs such that potentially frequent subgraphs are sampled more often. In addition, FS^3 is equipped with an innovative queue manager. It stores the sampled subgraphs in a finite queue over the course of mining in such a manner that the top-k positions in the queue contain the most frequent subgraphs. Our experiments on databases of large graphs show that FS^3 is efficient, and that it obtains subgraphs that are the most frequent among the subgraphs of a given size.
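The two ideas in the paragraph above, score-biased MCMC sampling and a finite queue whose best entries are the top-k, can be illustrated with a minimal C++ sketch. This is not the code in this repository. The names `acceptance_probability` and `BoundedQueue`, the use of a string key for a subgraph, the symmetric-proposal simplification, and the "evict the lowest score" policy are all assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Metropolis-Hastings acceptance probability for a move from the current
// subgraph to a proposed one, ASSUMING a symmetric proposal distribution:
// min(1, score_proposed / score_current). The score is a placeholder for
// whatever "likely to be frequent" measure the sampler targets.
double acceptance_probability(double score_current, double score_proposed) {
    assert(score_current >= 0.0 && score_proposed >= 0.0);
    if (score_current == 0.0) return 1.0;  // always move away from a zero-score state
    return std::min(1.0, score_proposed / score_current);
}

// Hypothetical finite queue: maps a subgraph key (for example a canonical
// code) to its score and evicts the lowest-scoring entry once the capacity
// is exceeded, so the best-scoring subgraphs survive.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Insert a sampled subgraph, or refresh its score if already present.
    void offer(const std::string& key, double score) {
        entries_[key] = score;
        if (entries_.size() > capacity_) {
            auto worst = std::min_element(
                entries_.begin(), entries_.end(),
                [](const auto& a, const auto& b) { return a.second < b.second; });
            entries_.erase(worst);
        }
    }

    // Return up to k entries, ordered by descending score.
    std::vector<std::pair<std::string, double>> top_k(std::size_t k) const {
        std::vector<std::pair<std::string, double>> v(entries_.begin(), entries_.end());
        k = std::min(k, v.size());
        std::partial_sort(
            v.begin(), v.begin() + k, v.end(),
            [](const auto& a, const auto& b) { return a.second > b.second; });
        v.resize(k);
        return v;
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::size_t capacity_;
    std::map<std::string, double> entries_;
};
```

In a sampling loop, each accepted subgraph would be passed to `offer`, and `top_k` would be read once the iterations finish. The linear scan on eviction keeps the sketch short; a real queue of 100000 entries would likely want a heap or an ordered index instead.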

Installation

I use MPC (Makefile, Project, and Workspace Creator) to generate the makefile. Please read the article in the link to learn more.

If you have added new files, update the mpc file accordingly and then run the following commands to generate a new makefile.

chmod +x mwc.pl
./mwc.pl -type make codes/randomminer.mwc

Then run the following command in the randommining/codes folder:

make

A Sample Run:

./randomminer -d mutagen_2.interactive -i 100 -s 6 -q 100000

Options:

  -d  data set (here mutagen_2.interactive)
  -i  number of iterations
  -s  subgraph size
  -q  queue size

Reference

If you are using the code for research purposes, please consider citing the following paper:

@article{saha.hasan:15,
  title={FS3: A sampling based method for top-k frequent subgraph mining},
  author={Saha, Tanay Kumar and Al Hasan, Mohammad},
  journal={Statistical Analysis and Data Mining: The ASA Data Science Journal},
  volume={8},
  number={4},
  pages={245--261},
  year={2015},
  publisher={Wiley Online Library}
}
