

LOFS: A Library of Online Streaming Feature Selection

As an emerging research direction, online streaming feature selection deals with feature spaces whose dimensions are added sequentially while the number of data instances stays fixed. It provides a new, complementary algorithmic methodology that enriches online feature selection and especially targets the high dimensionality found in big data analytics.

LOFS is a software toolbox for online streaming feature selection. It is the first open-source library for MATLAB and Octave that implements state-of-the-art online streaming feature selection algorithms. The library is designed to facilitate the development of new algorithms in this research direction and to make it easy to compare new methods with existing ones. Two versions of the LOFS library, one for MATLAB and one for Octave, are available from https://github.com/kuiy/LOFS.

The LOFS library comes with detailed documentation. The documentation is available from https://github.com/kuiy/LOFS/tree/master/LOFS_Matlab/manual and https://github.com/kuiy/LOFS/tree/master/LOFS_Octave/manual.

This documentation describes the setup and usage of LOFS. All functions and related data structures are explained in detail.

Copyright © 2015 Kui Yu, Wei Ding, and Xindong Wu

License: GNU GENERAL PUBLIC LICENSE Version 3

The LOFS architecture separates the library into three modules: CM (Correlation Measure), Learning, and SC (Statistical Comparison). All of the online streaming feature selection algorithms in the LOFS architecture are designed independently of one another, and all code follows MATLAB/Octave conventions. As a result, the LOFS library is simple, easy to use, and easy to extend. For example, one can add a new algorithm to the LOFS library and share it through the LOFS framework without modifying the other functions in the library.

In the CM module, LOFS provides four measures for calculating correlations between features: the Chi-square test, the G2 test, Fisher's Z test, and mutual information. The Chi-square test, the G2 test, and mutual information handle discrete data, while Fisher's Z test handles continuous data.
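To make two of these measures concrete, here is a minimal Python sketch of empirical mutual information for discrete data and an unconditional Fisher's Z test for continuous data. It is an illustration only and not LOFS code: the function names `mutual_information` and `fisher_z_pvalue` are hypothetical, and the conditional variants of these tests (conditioning on other features) are left out.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py = Counter(x), Counter(y)
    pxy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), count in pxy.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (px[a] * py[b]))
    return mi


def fisher_z_pvalue(x, y):
    """Two-sided p-value for 'Pearson correlation is zero' via Fisher's Z.

    Unconditional case only; assumes len(x) > 3 and non-constant inputs.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # Clamp so the transform stays finite for (near-)perfect correlation.
    r = max(min(r, 0.999999), -0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r))
    statistic = math.sqrt(n - 3) * abs(z)
    # 2 * (1 - Phi(statistic)) for a standard normal equals erfc(statistic / sqrt(2)).
    return math.erfc(statistic / math.sqrt(2))
```

A small p-value from `fisher_z_pvalue` suggests that the two continuous features are correlated, while a mutual information close to zero suggests that two discrete features are close to independent.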

With the measures above, the learning module consists of two submodules: LFI (Learning Features added Individually) and LGF (Learning Grouped Features added sequentially). The LFI module includes Alpha-investing, OSFS, Fast-OSFS, and SAOLA, which learn from features added individually over time, while the LGF module provides the group-SAOLA algorithm, which mines grouped features online as the groups arrive sequentially.
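The general shape of learning from individually arriving features can be sketched as a loop that checks each new feature for relevance to the class labels and then for redundancy against the features kept so far. The Python sketch below is a deliberately simplified pairwise scheme based on mutual information. It is not the LOFS implementation of Alpha-investing, OSFS, Fast-OSFS, or SAOLA, and the names `stream_select` and `min_relevance` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py = Counter(x), Counter(y)
    pxy = Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items()
    )


def stream_select(feature_stream, labels, min_relevance=0.01):
    """Process (name, values) pairs one at a time and return the kept names.

    Assumed, simplified rules (not taken from LOFS):
      1. Drop a new feature whose relevance to the labels is at most min_relevance.
      2. Drop it if a kept feature is at least as relevant and shares at least
         as much information with it as it shares with the labels.
      3. Otherwise, evict kept features that the new feature dominates the same way.
    """
    selected = {}  # name -> (values, relevance to labels)
    for name, values in feature_stream:
        relevance = mutual_information(values, labels)
        if relevance <= min_relevance:
            continue  # irrelevant feature
        redundant = False
        for other, (other_values, other_relevance) in list(selected.items()):
            shared = mutual_information(values, other_values)
            if other_relevance >= relevance and shared >= relevance:
                redundant = True  # the kept feature already covers the new one
                break
            if relevance > other_relevance and shared >= other_relevance:
                del selected[other]  # the new feature covers the kept one
        if not redundant:
            selected[name] = (values, relevance)
    return list(selected)


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    stream = [
        ("exact", [0, 0, 0, 0, 1, 1, 1, 1]),
        ("noise", [0, 1, 0, 1, 0, 1, 0, 1]),
        ("duplicate", [0, 0, 0, 0, 1, 1, 1, 1]),
    ]
    print(stream_select(stream, labels))
```

Because each feature is examined once, when it arrives, the loop never needs the full feature space in memory, which is the property that distinguishes streaming feature selection from batch selection.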

Based on the learning module, the SC module provides a series of performance evaluation metrics, such as prediction accuracy, AUC, the kappa statistic, and compactness. To conduct statistical comparisons of algorithms, the SC module further provides the Friedman test and the Nemenyi test.
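As a rough illustration of the statistical comparison step, the Python sketch below computes average ranks, the Friedman chi-square statistic, and the Nemenyi critical difference from a table of scores (one row per data set, one column per algorithm). It is not LOFS code and the function names are hypothetical. It assumes that higher scores are better, applies no tie correction to the Friedman statistic, and expects the caller to supply the critical value `q_alpha` from a studentized range table.

```python
import math


def average_ranks(scores):
    """Mean rank of each algorithm (rank 1 = best); tied scores share a rank.

    scores[i][j] is the score of algorithm j on data set i, higher is better.
    """
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared_rank = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1..end+1
            for j in order[pos:end + 1]:
                totals[j] += shared_rank
            pos = end + 1
    return [t / n for t in totals]


def friedman_statistic(scores):
    """Friedman chi-square statistic (without tie correction)."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


def nemenyi_critical_difference(q_alpha, k, n):
    """Average-rank gap above which two of k algorithms differ over n data sets."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n))
```

In this scheme, the Friedman statistic is used first to test whether the algorithms differ at all. If they do, two algorithms are judged significantly different when their average ranks differ by more than the Nemenyi critical difference.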
