minirank's Introduction

Some ranking and ordinal regression algorithms in Python.

WARNING: THIS SOFTWARE HAS SOME BUGS AND IS PROBABLY BROKEN. I'M WORKING ON A FRESH IMPLEMENTATION OF ORDINAL REGRESSION METHODS IN https://github.com/fabianp/mord

Dependencies

  • numpy
  • scipy

Methods

minirank.ordinal_logistic_fit

train an ordinal logistic model

minirank.ordinal_logistic_predict

predict using a model trained by minirank.ordinal_logistic_fit

minirank's People

Contributors

agramfort, fabianp

minirank's Issues

build_ext fails

Trying to run python setup.py build_ext fails on OS X 10.8.3 (Python 2.7.2) with Cython 0.17.

$ python setup.py build_ext
running build_ext
Traceback (most recent call last):
  File "setup.py", line 15, in <module>
    requires = ['numpy', 'scipy'],
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 152, in setup
    dist.run_commands()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/Library/Python/2.7/site-packages/Cython/Distutils/build_ext.py", line 159, in run
    if self.cython_gdb or [1 for ext in self.extensions
TypeError: 'NoneType' object is not iterable

Problem with prediction code / size of the weight vector

This is an interesting project! However, the code seems to be doing different things from what you describe in your blog post.

  1. Shouldn't the sentence "We will then assign the class j if the prediction w^T X lies in the interval [θ_{j−1}, θ_j[" translate into:
import numpy as np

def ordinal_logistic_predict(w, theta, X):
    """
    Parameters
    ----------
    w : coefficients obtained by ordinal_logistic
    theta : thresholds
    """
    # np.unique returns a sorted copy, so the caller's theta is not modified
    unique_theta = np.unique(theta)
    out = X.dot(w)
    unique_theta[-1] = np.inf  # p(y <= max_level) = 1
    # first threshold strictly above each prediction gives the class index
    return np.argmax(out[:, None] < unique_theta, axis=1)

If I make this change, I obtain much better performance, but it seems almost too perfect:

MEAN ABSOLUTE ERROR (ORDINAL LOGISTIC):    2.88859180036
MEAN ABSOLUTE ERROR (LOGISTIC REGRESSION): 3.83957219251
MEAN ABSOLUTE ERROR (RIDGE REGRESSION):    3.5623885918
  2. Why is the threshold vector of the same size as the number of levels K? I believe it should have size K-1, but I am not sure how to modify the gradient to achieve this...
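
For illustration only, here is a minimal sketch of the interval rule quoted in point 1 when the threshold vector has size K-1, as point 2 suggests. The function name predict_from_thresholds is hypothetical and is not part of minirank; the sketch assumes classes are numbered 0 to K-1 and covers prediction only, not the gradient.

```python
import numpy as np

def predict_from_thresholds(w, theta, X):
    """Hypothetical helper: class j when theta[j-1] <= X.dot(w) < theta[j].

    theta holds K-1 finite thresholds; the outer intervals are unbounded.
    """
    theta = np.sort(np.asarray(theta, dtype=float))
    scores = np.asarray(X).dot(w)
    # side="right" counts how many thresholds are <= each score,
    # which is exactly the index of the half-open interval it falls in
    return np.searchsorted(theta, scores, side="right")

# Toy data: one feature, two thresholds, hence K = 3 classes
X = np.array([[-1.0], [0.0], [0.5], [1.0], [2.0]])
w = np.array([1.0])
theta = np.array([0.0, 1.0])
labels = predict_from_thresholds(w, theta, X)
```

With K-1 thresholds there is no need to overwrite the last entry with infinity, since every score above the largest threshold is counted into the top class automatically.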

"parallel" multinomial logit

Hi Fabian,

I'm working on a dense problem (n_samples=1000, n_features=32000)
for which the classical formulation of ordinal logistic regression is somewhat
too constrained.

As an alternative I was considering standard multinomial logit
with the constraint that the sigmoid functions should be parallel.

This would be less restricted than ordinal logistic regression since
the "slopes" of the sigmoids would be allowed to be different
while avoiding any crossing of the curves.

The questions are:

  1. Do you think such an optimization target loss would be useful?
    Would it be in scope with minirank? Are you aware of any
    paper related to this approach?

  2. Since manual derivation of gradient and hessian is quite heavy and
    error prone, is there any automatic differentiation framework that you
    have tested or you would suggest? Have you done any experiment
    with automatic differentiation?

  3. While definitely out of scope for minirank, it would be nice to have
    an l2-penalized logistic package for dense/sparse problems based on
    pytron solving multinomial logit (with/without parallel constraint) and
    ordinal regression based on pytron ...

Thanks!
Paolo
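
Regarding question 2, one low-tech safeguard for hand-derived gradients is a finite-difference check. This is not automatic differentiation, and the sketch below is not taken from minirank: it uses a plain binary l2-penalized logistic loss as a stand-in, and all function names are hypothetical.

```python
import numpy as np

def logistic_loss(w, X, y, alpha=1.0):
    """L2-penalized binary logistic loss; y takes values in {-1, +1}."""
    z = y * X.dot(w)
    return np.sum(np.logaddexp(0.0, -z)) + 0.5 * alpha * w.dot(w)

def logistic_grad(w, X, y, alpha=1.0):
    """Hand-derived gradient of logistic_loss with respect to w."""
    z = y * X.dot(w)
    # d/dz log(1 + exp(-z)) = -1 / (1 + exp(z))
    coef = -y / (1.0 + np.exp(z))
    return X.T.dot(coef) + alpha * w

def numerical_grad(f, w, eps=1e-6):
    """Central finite differences, one coordinate at a time."""
    g = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        g[i] = (f(w + step) - f(w - step)) / (2.0 * eps)
    return g

# Small random problem, purely for checking the derivation
rng = np.random.RandomState(0)
X = rng.randn(20, 5)
y = np.where(rng.randn(20) > 0, 1.0, -1.0)
w = rng.randn(5)

approx = numerical_grad(lambda v: logistic_loss(v, X, y), w)
err = np.max(np.abs(logistic_grad(w, X, y) - approx))
```

A small err suggests the analytic gradient matches the loss. The same check can be applied to any new loss before handing it to an optimizer, at a cost of two loss evaluations per coordinate, so it is best run on a small subset of features.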
