MDFS (MultiDimensional Feature Selection) for Python

MDFS is a library to assist in MultiDimensional Feature Selection (MDFS), i.e. feature selection that accounts for multidimensional interactions in the dataset. To learn more about MDFS, please visit the MDFS website.

This project is the implementation of the MDFS library for Python. Functionality-wise, it is aligned with the R version of the MDFS library, but the interface differs to make it more native to the Python ecosystem (i.e. pythonic) and to free it from early assumptions carried over for backward compatibility in R.

License

This software is released under the same license as the R MDFS library: the GNU General Public License (GPL) v3.

Copyright

The copyrights are held by Radosław Piliszek (the package maintainer and author), Abraham Kaczmarski (major contributor to the new interface), Krzysztof Mnich and Witold Rudnicki (authors of the MDFS method).

Changelog

See the common changelog.

Library structure

The library consists of a single package module: mdfs, which exports all the user-facing functionality.

Introduction for beginners

First, import the mdfs package module. The main function is the aptly named run. It accepts a numpy data matrix (data) and the corresponding decision vector (decision), and returns a dictionary with the details of the analysis. The relevant_variables entry of that dictionary gives the indices of the variables deemed relevant under the chosen conditions.
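A minimal sketch of that workflow follows. The toy dataset and the helper names make_toy_dataset and select_relevant are made up for illustration, and the call relies on run accepting data and decision positionally with its default settings, which is an assumption based on the description above rather than a verified signature.

```python
import numpy as np


def make_toy_dataset(n_objects=200, n_variables=10, seed=0):
    """Build a random data matrix whose decision depends on column 0 only."""
    rng = np.random.default_rng(seed)
    data = rng.random((n_objects, n_variables))
    # Binary decision derived from the first variable; the rest are noise.
    decision = (data[:, 0] > 0.5).astype(np.int32)
    return data, decision


def select_relevant(data, decision):
    """Hypothetical helper: run MDFS and return the indices deemed relevant."""
    import mdfs  # imported lazily so the dataset helper works without the package

    # Assumption: the default settings of run are acceptable for this data.
    result = mdfs.run(data, decision)
    return result["relevant_variables"]


data, decision = make_toy_dataset()
print(data.shape, decision.shape)
```

With the package installed, select_relevant(data, decision) would return the indices reported under relevant_variables; the other entries of the returned dictionary are not shown here.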

Interface differences between R and Python

Function names

The following list gives the translation between R functions and their Python counterparts.

  • MDFS = run
  • ComputeMaxInfoGains = compute_max_ig
  • ComputeInterestingTuples = compute_tuples
  • ComputePValue = fit_p_value
  • Discretize = discretize
  • GetRange = get_suggested_range
  • GenContrastVariables = gen_contrast_variables
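When porting an R script, the list above can be kept as a small lookup table. This is only a convenience sketch; R_TO_PYTHON and python_name are hypothetical names and not part of the library.

```python
# R function name -> Python function name in the mdfs module (from the list above).
R_TO_PYTHON = {
    "MDFS": "run",
    "ComputeMaxInfoGains": "compute_max_ig",
    "ComputeInterestingTuples": "compute_tuples",
    "ComputePValue": "fit_p_value",
    "Discretize": "discretize",
    "GetRange": "get_suggested_range",
    "GenContrastVariables": "gen_contrast_variables",
}


def python_name(r_name):
    """Return the Python counterpart of an R MDFS function name."""
    try:
        return R_TO_PYTHON[r_name]
    except KeyError:
        raise KeyError(f"no known Python counterpart for R function {r_name!r}") from None


print(python_name("ComputeMaxInfoGains"))  # compute_max_ig
```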

Function parameter names

Function parameter names have been adjusted to avoid the dot (.), replacing it with an underscore (_).
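The renaming rule can be sketched as a one-line helper. The dotted names below are examples of R-style parameter names used for illustration only; check the actual Python function signatures before relying on any particular name.

```python
def to_python_parameter(r_parameter):
    """Apply the renaming rule: replace every dot with an underscore."""
    return r_parameter.replace(".", "_")


# Illustrative dotted names; not a verified list of the library's parameters.
for name in ("pc.xi", "p.adjust.method", "n.contrast"):
    print(name, "->", to_python_parameter(name))
```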

No global seed in Python

There is no global seed in use. Every function that depends on a pseudorandom number generator (PRNG) takes a seed parameter.
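The explicit-seed pattern can be illustrated with plain numpy. The function permute_column is a hypothetical stand-in, not one of the library's functions; it only shows why passing the seed per call makes results reproducible without any global state.

```python
import numpy as np


def permute_column(column, seed):
    """Stand-in for a PRNG-dependent function that takes its seed explicitly."""
    rng = np.random.default_rng(seed)  # generator local to this call
    return rng.permutation(column)


column = np.arange(10)
first = permute_column(column, seed=42)
second = permute_column(column, seed=42)

# The same seed gives the same permutation, regardless of other calls in between.
print(np.array_equal(first, second))  # True
```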

Quirks

Due to the way the Python-C interface is implemented in this library with numpy views, there is one quirk to be aware of. Functions returning a Structure subclass object do so without incurring a copy. Properties present on such objects return views, not copies. These views do not protect the result from being garbage collected (i.e., think of them as weak references to the underlying data). Thus, to avoid freed memory reads, keep the original structures around when using these views or copy data elsewhere as necessary. This quirk might be lifted in the future.
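The defensive pattern is to copy a view before the structure it came from can go away. The sketch below uses ordinary numpy arrays only to show the difference between a view and a copy; note that an ordinary numpy view keeps its base array alive, whereas the views described above do not, which is exactly why the copy matters there.

```python
import numpy as np

buffer = np.arange(6, dtype=np.float64)  # stands in for the underlying result data
view = buffer[2:5]                       # shares memory with buffer
snapshot = view.copy()                   # owns its own memory

buffer[2] = -1.0                         # a change in the underlying data...
print(view[0])                           # -1.0 (visible through the view)
print(snapshot[0])                       # 2.0 (the copy is unaffected)

print(np.shares_memory(buffer, view))      # True
print(np.shares_memory(buffer, snapshot))  # False
```

Applied to this library, that means either keeping the returned structure referenced for as long as its views are in use, or calling copy() on each view that needs to outlive it.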
