SubDisc: Subgroup Discovery

SubDisc is a Data Mining tool for discovering local patterns in data. SubDisc features a generic Subgroup Discovery algorithm that can be configured in many ways, in order to implement various forms of local pattern discovery. The tool can deal with a range of data types, including nominal, numeric and binary, for both the input attributes and the target attributes.

A unique feature of SubDisc is its ability to deal with a range of Subgroup Discovery settings, determined by the type and number of target attributes. Where regular Subgroup Discovery (SD) algorithms only consider a single target attribute, nominal or sometimes numeric, SubDisc is able to deal with targets consisting of multiple attributes, in a setting called Exceptional Model Mining.

SubDisc was previously developed under the name Cortana.

Features

  • Generic parameterized Subgroup Discovery algorithm.
  • Multiple data types supported.
  • Implemented in Java, so it runs on all major platforms, including Windows, Linux and Mac OS.
  • Works on propositional (tabular) data from flat files, .TXT or .ARFF.
  • Includes Exceptional Model Mining settings.
  • Statistical validation of mining results.
  • Graphical presentation of results, such as ROC curves, scatter plots, and exceptional models.
  • Additional bioinformatics module for literature-based gene set enrichment (see bioinformatics below).
  • Free binary version and open-source access.
  • Wrappers available for R (https://github.com/SubDisc/rSubDisc) and, soon, Python.

The code is compatible with Java 15.

To use

  1. Either download the .jar file of the latest release (https://github.com/SubDisc/SubDisc/releases/) or build it yourself (see How to build).
  2. Double-click the .jar file, or start it from the command line (e.g., java -jar target/subdisc-gui.jar).

The interface should appear, and you are ready to open a data file and discover subgroups!

How to build

  1. Clone the repository: git clone https://github.com/SubDisc/SubDisc.git
  2. Use Maven to assemble the .jar file: mvn package
  3. The .jar file is created in ./target and named something like subdisc-gui-2.1094.jar.
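
Because the version suffix in the .jar name changes between builds, it can be convenient to look the file up rather than type its name. The following is a minimal sketch, not part of SubDisc: the function names latest_subdisc_jar and run_subdisc are made up for illustration, and it assumes a POSIX shell with a Java runtime on the PATH.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not shipped with SubDisc).

# Print the path of the most recently modified subdisc-gui*.jar in a
# directory (default: ./target). Returns non-zero if none is found.
latest_subdisc_jar() {
    dir="${1:-target}"
    # The version suffix varies per build, so match on the name prefix only.
    jar=$(ls -t "$dir"/subdisc-gui*.jar 2>/dev/null | head -n 1)
    if [ -z "$jar" ]; then
        echo "no subdisc-gui*.jar found in $dir; run 'mvn package' first" >&2
        return 1
    fi
    printf '%s\n' "$jar"
}

# Start the SubDisc GUI from the newest jar in the given directory.
run_subdisc() {
    jar=$(latest_subdisc_jar "${1:-target}") || return 1
    java -jar "$jar"
}
```

After cloning the repository and running mvn package in its root directory, calling run_subdisc from that directory should start the interface. Passing a different directory as the first argument makes the helper look there instead, for example the folder where a downloaded release was saved.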

Scientific Publications

Technical details concerning the algorithms behind Cortana can be found in various scientific publications:

Contributors

The following people have contributed in various ways to the development of SubDisc/Cortana:
