feast's Introduction

FEAST

A FEAture Selection Toolbox for C/C++ & MATLAB/OCTAVE, v2.0.0.

FEAST provides implementations of common mutual information based filter feature selection algorithms, and an implementation of RELIEF. All functions expect discrete inputs (except RELIEF, which does not depend on the MIToolbox), and they return the selected feature indices. These implementations were developed to help our research into the similarities between these algorithms, and our results are presented in the following paper:

 Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection
 G. Brown, A. Pocock, M.-J. Zhao, M. Lujan
 Journal of Machine Learning Research, 13:27-66 (2012)

The weighted feature selection algorithms are described in Chapter 7 of:

 Feature Selection via Joint Likelihood
 A. Pocock
 PhD Thesis, University of Manchester, 2012

If you use these implementations for academic research please cite the relevant paper above. All FEAST code is licensed under the BSD 3-Clause License.

Contains implementations of: mim, mrmr, mifs, cmim, jmi, disr, cife, icap, condred, cmi, relief, fcbf, betagamma

And weighted implementations of: mim, cmim, jmi, disr, cmi

References for these algorithms are provided in the accompanying feast.bib file (in BibTeX format).

FEAST works on discrete inputs, and all continuous values must be discretised before use with FEAST. In our experiments we've found that using 10 equal width bins is suitable for many problems, though this is data set size dependent. When given continuous inputs, FEAST produces unreliable results, runs slowly and uses much more memory than usual. The discrete inputs should have small cardinality: FEAST will treat values {1,10,100} the same way it treats {1,2,3}, and the latter will be both faster and use less memory.
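The preprocessing described above can be sketched in C as follows. This is a minimal illustration and not part of FEAST or MIToolbox: the function names discretiseEqualWidth and compactLabels, the unsigned output type, and the choice to start labels at 0 are all assumptions made for the example. The first function bins one continuous column into equal width bins, and the second relabels sparse discrete values such as {1,10,100} into a compact range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a FEAST function): map each value of one
 * continuous column to an equal-width bin index in [0, numBins - 1].
 * A constant column is mapped entirely to bin 0. */
void discretiseEqualWidth(const double *values, size_t n, unsigned numBins,
                          unsigned *out)
{
    assert(values != NULL && out != NULL && n > 0 && numBins > 0);

    /* Find the range of the column. */
    double min = values[0], max = values[0];
    for (size_t i = 1; i < n; i++) {
        if (values[i] < min) min = values[i];
        if (values[i] > max) max = values[i];
    }

    double width = (max - min) / numBins;
    for (size_t i = 0; i < n; i++) {
        unsigned bin = 0;
        if (width > 0.0) {
            bin = (unsigned)((values[i] - min) / width);
            if (bin >= numBins) bin = numBins - 1; /* the maximum goes in the last bin */
        }
        out[i] = bin;
    }
}

/* Hypothetical helper (not a FEAST function): relabel discrete values as
 * 0, 1, 2, ... in order of first appearance, so {1, 10, 100} becomes
 * {0, 1, 2}. Returns the number of distinct values. The search is
 * quadratic in the worst case, which is acceptable for small
 * cardinalities. in and out must not overlap. */
unsigned compactLabels(const unsigned *in, size_t n, unsigned *out)
{
    assert(in != NULL && out != NULL && in != out);

    unsigned numStates = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        /* Look for an earlier occurrence of the same value. */
        while (j < i && in[j] != in[i]) j++;
        out[i] = (j < i) ? out[j] : numStates++;
    }
    return numStates;
}
```

Calling discretiseEqualWidth with numBins set to 10 on each continuous column gives the 10 equal width bins mentioned above; whether 10 is appropriate still depends on the size of the data set.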

MATLAB Example (using "data" as our feature matrix, and "labels" as the class label vector):

>> size(data)
ans = 
     (569,30)                                     %% denoting 569 examples, and 30 features
>> selectedIndices = feast('jmi',5,data,labels) %% selecting the top 5 features using the jmi algorithm
selectedIndices =

    28
    21
     8
    27
    23
>> selectedIndices = feast('mrmr',10,data,labels) %% selecting the top 10 features using the mrmr algorithm
selectedIndices =

    28
    24
    22
     8
    27
    21
    29
     4
     7
    25
>> selectedIndices = feast('mifs',5,data,labels,0.7) %% selecting the top 5 features using the mifs algorithm with beta = 0.7
selectedIndices =

    28
    24
    22
    20
    29

The library is written in ANSI C for compatibility with the MATLAB mex compiler, except for MIM, FCBF and RELIEF, which are written in MATLAB/OCTAVE script. There is a different implementation of MIM available for use in the C library.

MIToolbox v3.0.0 is required to compile these algorithms, and these implementations supersede the example implementations given in that package (they have more robust behaviour when used with unexpected inputs).

MIToolbox can be found at: http://www.github.com/Craigacp/MIToolbox/

The C library expects all matrices in column-major format (i.e. Fortran style). This is for two reasons: a) MATLAB generates Fortran-style arrays, and b) feature selection iterates over columns rather than rows, unlike most other ML processes.
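As an illustration of the layout, the sketch below converts a row-major matrix (one sample after another, as is usual in C) into column-major order (one feature after another). The helper names columnMajorIndex and rowMajorToColumnMajor and the unsigned element type are assumptions made for this example; they are not FEAST functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a FEAST function): position of the element
 * (sample, feature) inside a flat column-major array. All samples of
 * feature 0 come first, then all samples of feature 1, and so on. */
size_t columnMajorIndex(size_t sample, size_t feature, size_t noOfSamples)
{
    assert(sample < noOfSamples);
    return feature * noOfSamples + sample;
}

/* Hypothetical helper (not a FEAST function): copy a row-major matrix
 * into column-major layout. The two buffers must not overlap and must
 * each hold noOfSamples * noOfFeatures elements. */
void rowMajorToColumnMajor(const unsigned *rowMajor, size_t noOfSamples,
                           size_t noOfFeatures, unsigned *columnMajor)
{
    assert(rowMajor != NULL && columnMajor != NULL);

    for (size_t s = 0; s < noOfSamples; s++) {
        for (size_t f = 0; f < noOfFeatures; f++) {
            columnMajor[columnMajorIndex(s, f, noOfSamples)] =
                rowMajor[s * noOfFeatures + f];
        }
    }
}
```

With this layout each feature occupies one contiguous run of noOfSamples values, which suits an algorithm that scans one feature at a time.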

Compilation instructions:

  • MATLAB/OCTAVE
    • run CompileFEAST.m in the matlab folder.
  • Linux C shared library
    • run make x86 or make x64 for a 32-bit or 64-bit library.
  • Windows C dll (expects pre built libMIToolbox.dll)
  • Java (requires Java 8)
    • run make x64, sudo make install to build and install the C library.
    • then make java to build the JNI wrapper.
    • then run mvn package in the java directory to build the jar file.
    • Note: the Java code should work on all platforms and future versions of Java, but the included Makefile only works on Ubuntu & Java 8.

Update History

  • 07/01/2017 - v2.0.0 - Added weighted feature selection, major refactoring of the code to improve speed and portability. FEAST functions now return the internal scores assigned by each criterion as well. Added a Java API via JNI. FEAST v2 is approximately 30% faster when called from Matlab.
  • 12/03/2016 - v1.1.4 - Fixed an issue where Matlab would segfault if all features had zero MI with the label.
  • 12/10/2014 - v1.1.2 - Updated documentation to note that FEAST expects column-major matrices.
  • 11/06/2014 - v1.1.1 - Fixed an issue where MIM wasn't compiled into libFSToolbox.
  • 22/02/2014 - v1.1.0 - Bug fixes in memory allocation, added a C implementation of MIM, moved the selected feature increment into the mex code.
  • 12/02/2013 - v1.0.1 - Bug fix for 32-bit Windows MATLAB's lcc.
  • 08/11/2011 - v1.0.0 - Public Release to complement the JMLR publication.

feast's People

Contributors

bieito98, craigacp, evgenidubov, mellorjc

feast's Issues

Mex file crash

Hi, I tried to use the Matlab toolbox, but I got a crash with the mex file.
I attached the log file created by matlab. Thank you.


      Access violation detected at Sun Aug 07 20:39:05 2016

Configuration:
Crash Decoding : Disabled
Crash Mode : continue (default)
Current Graphics Driver: Unknown hardware
Default Encoding : windows-1252
Graphics card 1 : NVIDIA ( 0x10de ) NVIDIA GeForce GTX 560 Ti Version 9.18.13.4788
Graphics card 2 : NVIDIA ( 0x10de ) NVIDIA Tesla M2090 Version 10.18.13.6256
Host Name : ChuongWin8
MATLAB Architecture : win64
MATLAB Root : C:\Program Files\MATLAB\R2016a
MATLAB Version : 9.0.0.341360 (R2016a)
OpenGL : hardware
Operating System : Microsoft Windows 8.1 Pro
Processor ID : x86 Family 6 Model 45 Stepping 6, GenuineIntel
Virtual Machine : Java 1.7.0_60-b19 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
Window System : Version 6.3 (Build 9600)

Fault Count: 1

Abnormal termination:
Access violation

Feature Score For CMIM

Dear Adam

I need to get a score for each feature, as I need to know the difference between them. For example, how much more important is feature 1 than feature 2?

Can you please provide the code for that?

Thanks

Elina

segmentation fault with pyfeast

I compiled this code to use with PyFeast. I ran make x64 and sudo make install. When I tried to run the code that uses FEAST through PyFeast I got a segmentation fault. I determined the segmentation fault was due to the FEAST code.

I didn't find instructions on proper installation with the Makefile. Could the segmentation fault be due to incorrect installation, or are there any suggestions for another reason I got a segmentation fault?

JMI and DISR generate the same score for selectable and non-selectable cases.

Hello!

I'm testing the latest feast toolbox with this simple code:

y=[zeros(5,1);ones(5,1)];

X=ones(1,10); %Constant
X=[X' rand(10,1)]; %Random
X=[X y]; %Just a copy of target Y

[sel,score]=feast('jmi',3,X,y)
sel =
3
1
2

score =
1
1
1

The scores of all the variables are the same. Is that the desired result? I was expecting different scores.

Cannot find the referenced paper

Hey,

I am looking for the following paper you mentioned:

 Information Theoretic Feature Selection for Cost-Sensitive Problems
 A. Pocock, N. Edakunni, M.-J. Zhao, M. Lujan, G. Brown.
 ArXiv

I didn't find it on ArXiv. Can you provide me a link?

Best,

Andreas

FEAST Portability

@alecuba16 and @mbq have worked on different R interfaces for FEAST (see #3). This issue is to discuss merging in any changes to the core C library or build system to improve usability for R and other possible ports.

Add pre-given features to CMIM.

Hello!

As I told you on mloss, I would like to modify CMIM to accept a set of predefined features as an input, so as a first approach I will work on this code from CMIM.c:

I think that the first loop on lines 80-94 is mandatory, because you first have to calculate the max CMIM over all the features (including the pre-selected ones). Is that right?

Imagine that I have already received preSelectedFeatures as an array of the positions of the features present in feature2D, and that an auxiliary function isApreSelectedFeature simply searches that array and checks whether a feature is in it.

CMIM.c lines 104 ->129

for (i = 1; i < k; i++)
{
    score = 0.0;
    iMinus = i-1;
    for (j = 0; j < noOfFeatures; j++)
    {
        while ((classMI[j] > score) && (lastUsedFeature[j] < i))
        {
            /* ----------- my code ----------- */
            if (isApreSelectedFeature(j,preSelectedFeatures)) {
                outputFeatures[i] = j;
            } else {
            /* ---------- finish my code ---------- */
                /*double calculateConditionalMutualInformation(double *firstVector, double *targetVector, double *conditionVector, int vectorLength);*/
                currentFeature = (int) outputFeatures[lastUsedFeature[j]];
                conditionalInfo = calculateConditionalMutualInformation(feature2D[j],classColumn,feature2D[currentFeature],noOfSamples);
                if (classMI[j] > conditionalInfo)
                {
                    classMI[j] = conditionalInfo;
                } /* reset classMI */
            }
            /* moved due to C indexing from 0 rather than 1 */
            lastUsedFeature[j] += 1;
        } /* while partial score greater than score & not reached last feature */
        if (classMI[j] > score)
        {
            score = classMI[j];
            outputFeatures[i] = j;
        } /* if partial score still greater than score */
    } /* for number of features */
} /* for the number of feature */

What do you think?
