
mc4's Introduction

Markov Chain Type 4 Rank Aggregation

An implementation of the MC4 and MCT rank aggregation algorithms in Python.

Description

This project implements two popular rank aggregation algorithms: Markov Chain Type 4 (MC4) and MCT. In machine learning and many other scientific problems, a set of items often has to be ranked according to some criterion. Different ranking schemes, however, order the items by different preference criteria, so the rankings they produce may differ greatly.

A rank aggregation technique is therefore often used to combine the individual rank lists into a single aggregated ranking. There are many rank aggregation algorithms; MC4 and MCT are two of the best known.
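To make the idea concrete, here is a minimal sketch of the MC4 transition rule as it is usually described: from the current item, pick another item uniformly at random and move to it only if a majority of the rank lists place it ahead of the current item. The function name mc4_transition_matrix and the input layout (one row per rank list, one column per item, lower rank is better) are assumptions made for illustration, not the package's internal code.

```python
import numpy as np

def mc4_transition_matrix(ranks):
    """Build an MC4-style transition matrix (illustrative sketch).

    ranks[k][i] is the rank that list k gives to item i (lower is better).
    """
    ranks = np.asarray(ranks, dtype=float)
    n_lists, n_items = ranks.shape
    P = np.zeros((n_items, n_items))
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                continue
            # Number of lists that rank item j strictly ahead of item i.
            wins = np.sum(ranks[:, j] < ranks[:, i])
            if wins > n_lists / 2:
                # Item j is proposed with probability 1/n and accepted.
                P[i, j] = 1.0 / n_items
        # All remaining probability mass stays on the current item.
        P[i, i] = 1.0 - P[i].sum()
    return P

# Three hypothetical rank lists over three items.
ranks = [[1, 2, 3],
         [1, 3, 2],
         [2, 1, 3]]
P = mc4_transition_matrix(ranks)
print(P)
```

Items that most lists prefer accumulate probability mass in the chain's stationary distribution, which is what yields the aggregated ranking.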

Resource

Links to the original contents

Installation

For the latest release, run pip install mc4

For a specific release, run pip install mc4=={version}, for example pip install mc4==1.0.0

General Usage

Using this package takes two steps.

  1. Prepare a dataset containing the ranks of all the items as provided by the different algorithms. See here for sample datasets and more info.

  2. Use the following lines of code to run the aggregation. Make sure the arguments match the layout of your dataset; otherwise the results will be incorrect.

from mc4.algorithm import mc4_aggregator
import pandas as pd

# Method 1: pass the path to the CSV file
aggregated_ranks = mc4_aggregator('test_dataset_1.csv', header_row=0, index_col=0)

# Method 2: load the CSV into a pandas DataFrame first, then pass the DataFrame
df = pd.read_csv('test_dataset_1.csv', header=0, index_col=0)
aggregated_ranks = mc4_aggregator(df, header_row=0, index_col=0)

print(aggregated_ranks)

Here, test_dataset_1.csv is a sample dataset containing the ranks of different items as provided by different algorithms.
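If you do not have such a file yet, a small one can be built with pandas. The sketch below is an assumption about the layout: it puts one ranking algorithm per row and one item per column, with a header row and an index column. The item names, algorithm names, and the file name my_ranks.csv are hypothetical. The sample datasets linked above are the authoritative description of which layout the order argument expects.

```python
import os
import tempfile

import pandas as pd

# Hypothetical ranks: three ranking algorithms (rows) ranking four items (columns).
ranks = pd.DataFrame(
    {
        "item_a": [1, 2, 1],
        "item_b": [2, 1, 3],
        "item_c": [3, 4, 2],
        "item_d": [4, 3, 4],
    },
    index=["algo_1", "algo_2", "algo_3"],
)

# Write the table with a header row and an index column.
csv_path = os.path.join(tempfile.gettempdir(), "my_ranks.csv")
ranks.to_csv(csv_path)

# Reading it back needs header=0 and index_col=0, which correspond to the
# header_row=0 and index_col=0 arguments used in the example above.
loaded = pd.read_csv(csv_path, header=0, index_col=0)
print(loaded)
```

If your data is laid out the other way round, transposing the DataFrame with ranks.T before saving is one way to switch orientation.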

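mc4_aggregator takes the dataset (a CSV file path or a pandas DataFrame, as in the example above) as its mandatory argument, followed by these optional arguments: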

  • algo (string): algorithm for rank aggregation, mc4 or mct, default is mc4
  • order (string): order of the dataset, row or column, default is row. More on this, here.
  • header_row (int or None): row number of the dataset containing the header, default is None
  • index_col (int or None): column number of the dataset containing the index, default is None
  • precision (float): acceptable error margin for convergence, default is 1e-07
  • iterations (int): number of iterations to reach stationary distribution, default is 200
  • erg_number (float): small, positive number used to calculate ergodic transition matrix, default is 0.15
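The last three arguments control how the stationary distribution of the Markov chain is computed. The sketch below shows one common way such parameters interact, assuming a PageRank-style mixing step and a plain power iteration; the function name stationary_distribution and the exact formulas are assumptions for illustration, and the package's own implementation may differ.

```python
import numpy as np

def stationary_distribution(P, erg_number=0.15, precision=1e-07, iterations=200):
    """Approximate the stationary distribution of a row-stochastic matrix P."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Mix with a uniform jump so every state can reach every other state,
    # which makes the chain ergodic (assumed use of erg_number).
    ergodic = (1.0 - erg_number) * P + erg_number / n
    pi = np.full(n, 1.0 / n)  # start from the uniform distribution
    for _ in range(iterations):
        nxt = pi @ ergodic
        # Stop once successive distributions differ by less than precision.
        converged = np.abs(nxt - pi).sum() < precision
        pi = nxt
        if converged:
            break
    return pi

# A small hypothetical transition matrix over three items.
P = np.array([[1.0, 0.0, 0.0],
              [1 / 3, 2 / 3, 0.0],
              [1 / 3, 1 / 3, 1 / 3]])
pi = stationary_distribution(P)

# Items with a higher stationary probability are ranked first.
order = np.argsort(-pi)
print(pi, order)
```

A larger erg_number pulls the result toward the uniform distribution, while a smaller precision or a larger iterations value trades running time for accuracy.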

Command Line Usage

You can use this package directly from the command line if you already have the dataset prepared.

  • To get help and usage details, use -h or --help

    ~$ mc4_aggregator -h
  • Use with default settings,

    ~$ mc4_aggregator dataset.csv
  • Specify the algorithm for rank aggregation using -a or --algo, options: mc4 or mct, default is mc4

    ~$ mc4_aggregator dataset.csv -a mct
  • Specify order using -o or --order, options: row or column, default is row

    ~$ mc4_aggregator dataset.csv -o column
  • Specify header row using -hr or --header_row, default is None

    ~$ mc4_aggregator dataset.csv -hr 0
  • Specify index column using -ic or --index_col, default is None

    ~$ mc4_aggregator dataset.csv -ic 0
  • Specify precision using -p or --precision, default is 1e-07

    ~$ mc4_aggregator dataset.csv -p 0.000001
  • Specify iterations using -i or --iterations, default is 200

    ~$ mc4_aggregator dataset.csv -i 300
  • Specify ergodic number using -e or --erg_number, default is 0.15

    ~$ mc4_aggregator dataset.csv -e 0.20
  • All together,

    ~$ mc4_aggregator dataset.csv -a mct -o column -hr 0 -ic 0 -p 0.000001 -i 300 -e 0.20

Output

The output of mc4_aggregator is a dictionary containing the rank of each item. If the dataset has no item names, the items are represented by integers.
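As a small illustration of working with such a dictionary, the sketch below lists items from best to worst. The dictionary contents are hypothetical stand-ins for a real result, and the sketch assumes the values are ranks where a lower number means a better position.

```python
# Hypothetical result; in practice this dictionary comes from mc4_aggregator.
aggregated_ranks = {"item_a": 1, "item_b": 3, "item_c": 2}

# Sort the item names by their aggregated rank (assumed: lower is better).
ordered_items = sorted(aggregated_ranks, key=aggregated_ranks.get)

for item in ordered_items:
    print(item, aggregated_ranks[item])
```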

Contributors

ayan-kumar-saha


mc4's Issues

Does this handle partially ordered lists?

I have a list of rankings: test_ranks = [['Basil', 'Carrot', 'Apple', 'Dandelion'], ['Apple', 'Basil', 'Dandelion', 'Eggs'], ['Apple', 'Fenugreek', 'Eggs', 'Dandelion']]

Each list inside test_ranks is a ranked list. Not all the items are in all the lists. Will this package be able to handle this? If yes, how do I create a dataset with these?
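For the dataset part of the question, one possible way to turn such lists into a rank table is sketched below with pandas. It assigns each item its position within a list and leaves items missing from a list as NaN. Whether mc4_aggregator accepts missing values is not stated in this document, so treat the result as a starting point rather than a confirmed input format.

```python
import pandas as pd

test_ranks = [
    ["Basil", "Carrot", "Apple", "Dandelion"],
    ["Apple", "Basil", "Dandelion", "Eggs"],
    ["Apple", "Fenugreek", "Eggs", "Dandelion"],
]

# Every item that appears in at least one list, in a stable order.
items = sorted({item for ranking in test_ranks for item in ranking})

# One row per ranked list; each cell holds the 1-based position of the item
# in that list, or NaN when the list does not contain the item.
table = pd.DataFrame(
    [{item: pos for pos, item in enumerate(ranking, start=1)} for ranking in test_ranks],
    columns=items,
)
print(table)
```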
