
mlr3fairness

Machine Learning Fairness Extension for mlr3.


Installation

Install the development version from the mlr-org GitHub repository:

remotes::install_github("mlr-org/mlr3fairness")
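
Assuming a release is available on CRAN, the stable version can be installed the usual way:

install.packages("mlr3fairness")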

Why should I care about fairness in machine learning?

Machine learning model predictions can be skewed by a range of factors and thus might be considered unfair towards certain groups or individuals. An example is the COMPAS algorithm, a popular commercial algorithm used by judges and parole officers to score a criminal defendant's likelihood of reoffending (recidivism). Studies have shown that the algorithm might be biased in favor of white defendants. Biases can occur in a large variety of situations where algorithms automate or support human decision making, e.g. credit checks and automated HR tools, among many other domains.

The goal of mlr3fairness is to enable the auditing of mlr3 learners for fairness, the visualization of fairness problems, and the subsequent mitigation of such problems using debiasing strategies.

Feature Overview

  • Fairness Measures: Audit algorithms for fairness using a variety of fairness criteria. This also allows for designing custom criteria.

  • Fairness Visualizations: Diagnose fairness problems through visualizations.

  • Debiasing Methods: Correct fairness problems in three lines of code.

  • Fairness Report: Obtain a report regarding an algorithm’s fairness. (Under development)

More Information

Protected Attribute

mlr3fairness requires information about the protected attribute with respect to which we want to assess fairness. This can be set via the col_role "pta" (protected attribute). Currently, mlr3fairness allows only a single column as "pta".

task$col_roles$pta = "variable_name"

In case a non-categorical or more complex protected attribute is required, it can be manually computed and added to the task. mlr3fairness does not require specific types for pta, but will compute one metric for every unique value in the pta column.
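
As a minimal sketch, a combined protected attribute could be derived from two columns and registered as the "pta"; the compas task and its sex and race columns are used purely for illustration here:

library(mlr3)
library(mlr3fairness)

task = tsk("compas")
# Combine two columns into a single intersectional group variable
pta = interaction(task$data(cols = c("sex", "race")))
task$cbind(data.frame(sex_race = pta))
# Register the new column as the protected attribute
task$col_roles$pta = "sex_race"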

Fairness Metrics

mlr3fairness offers a variety of fairness metrics. Metrics are prefixed with fairness. and can be found in the msr() dictionary. Most fairness metrics are based on the difference between two protected groups (e.g. male and female) for a given base metric (e.g. the false positive rate: fpr). See the package documentation for a more in-depth introduction to fairness metrics and how to choose them.

library(mlr3)
library(mlr3fairness)
key           description
fairness.eod  Equalized odds: sum of the absolute differences between true positive and false positive rates across groups
fairness.fpr  Absolute difference in false positive rates across groups
fairness.acc  Absolute difference in accuracy across groups (overall accuracy equality)
fairness.tpr  Absolute difference in true positive rates across groups
fairness.tnr  Absolute difference in true negative rates across groups
fairness.ppv  Absolute difference in positive predictive values across groups
fairness.npv  Absolute difference in negative predictive values across groups
fairness.fp   Absolute difference in false positives across groups
fairness.fn   Absolute difference in false negatives across groups
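
Custom fairness measures can be composed from any base measure. The following sketch assumes the MeasureFairness constructor exposed via the "fairness" key and the exported groupdiff_absdiff operation; check the package documentation for the exact interface:

# Absolute difference in false positive rates across groups, built manually
m = msr("fairness", base_measure = msr("classif.fpr"), operation = groupdiff_absdiff)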

The fairness_tensor() function can be used with a Prediction to print a confusion matrix for each protected attribute group.
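
A minimal sketch, assuming the compas task with its preset protected attribute (the exact signature may differ; see ?fairness_tensor):

library(mlr3)
library(mlr3fairness)

task = tsk("compas")
prediction = lrn("classif.rpart")$train(task)$predict(task)
# One confusion matrix per protected group
fairness_tensor(prediction, task = task)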

Fairness Visualizations

Visualizations can be used with either a Prediction, a ResampleResult or a BenchmarkResult; a short sketch follows the list below. For more information regarding these objects, refer to the mlr3 book.

  • fairness_accuracy_tradeoff: Plot available trade-offs between fairness and model performance.

  • compare_metrics: Compare fairness across models and cross-validation folds.

  • fairness_prediction_density: Density plots of predicted probabilities for each protected attribute group.
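
A minimal sketch of the fairness-accuracy trade-off plot, assuming the compas task and default measures (argument names are taken from the package documentation and may differ):

library(mlr3)
library(mlr3fairness)

rs = resample(lrn("classif.rpart", predict_type = "prob"), tsk("compas"), rsmp("cv", folds = 3))
fairness_accuracy_tradeoff(rs, msr("fairness.fpr"))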

Debiasing Methods

Debiasing methods can be used to improve the fairness of a given model. mlr3fairness includes several methods that can be used together with mlr3pipelines to obtain fair(er) models:

library(mlr3)
library(mlr3fairness)
library(mlr3pipelines)
# Reweighing pre-processing combined with a decision tree learner:
lrn = as_learner(po("reweighing_wts") %>>% lrn("classif.rpart"))
rs = resample(lrn, task = tsk("compas")$filter(1:500), rsmp("cv"))
rs$score(msr("fairness.acc"))

Overview:

key             input.type.train  input.type.predict  output.type.train  output.type.predict
EOd             TaskClassif       TaskClassif         NULL               PredictionClassif
reweighing_os   TaskClassif       TaskClassif         TaskClassif        TaskClassif
reweighing_wts  TaskClassif       TaskClassif         TaskClassif        TaskClassif
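
As the table shows, EOd consumes a task during training and emits predictions at predict time, so it is typically placed after a po("learner_cv") step whose cross-validated predictions it post-processes. A hedged sketch (the exact composition may differ; see the PipeOp's documentation):

library(mlr3)
library(mlr3fairness)
library(mlr3pipelines)

# Cross-validated learner predictions, post-processed for equalized odds
graph = po("learner_cv", lrn("classif.rpart")) %>>% po("EOd")
lrn_eod = as_learner(graph)
rs = resample(lrn_eod, tsk("compas"), rsmp("cv", folds = 3))
rs$score(msr("fairness.eod"))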

Datasets

mlr3fairness includes two fairness datasets: adult and compas. See ?adult and ?compas for additional information regarding columns.

You can load them using tsk(<key>).
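
For example, assuming the shipped tasks come with a preset protected attribute:

library(mlr3)
library(mlr3fairness)

task = tsk("adult_train")
task$col_roles$pta  # protected attribute preset by the package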

Demo for Adult Dataset

We provide a short example detailing how mlr3fairness integrates with the mlr3 ecosystem.

library(mlr3)
library(mlr3fairness)

# Initialize the fairness measure
fairness_measure = msr("fairness.fpr")
# Initialize train and test tasks
task_train = tsk("adult_train")
task_test = tsk("adult_test")
# Initialize the model
learner = lrn("classif.rpart", predict_type = "prob")

# Train, predict, and compute the fairness metric
learner$train(task_train)
predictions = learner$predict(task_test)
predictions$score(fairness_measure, task = task_test)

# Visualize the predicted probability density split by the protected attribute
fairness_prediction_density(predictions, task_test)

Extensions

  • The mcboost package integrates with mlr3 and offers additional debiasing post-processing functionality for classification, regression and survival.

Other Fairness Toolkits in R

  • The AI Fairness 360 toolkit offers an R extension that allows for bias auditing, visualization and mitigation.
  • The fairmodels package integrates with the DALEX R package and similarly allows for bias auditing, visualization and mitigation.
  • The fairness package allows for bias auditing in R.

Future Development

Several future developments are currently planned. Contributions are highly welcome!

Visualizations

  1. Improvements to visualizations, e.g. anchor points. See the issue tracker.

Metrics

  1. Add support for non-binary target attributes and non-binary protected attributes.

Debiasing Methods

  1. More debiasing methods, both post-processing and in-processing.

Fairness Report

  1. Complete the fairness report functionality (currently under development; see Feature Overview).

Bugs, Feedback and Questions

mlr3fairness is a free and open-source software project that encourages participation and feedback. If you have any issues, questions, suggestions or feedback, please do not hesitate to open an issue on the GitHub page! In case of problems or bugs, it is often helpful if you provide a minimal working example that showcases the behaviour.
