In this project, we will explore unfairness in ML models using the diabetes dataset.
Fairlearn is an open-source, community-driven project to help data scientists improve fairness of AI systems. It includes:
- A Python library for fairness assessment and improvement (fairness metrics, mitigation algorithms, plotting, etc.)
- Educational resources covering organizational and technical processes for unfairness mitigation (user guide, case studies, Jupyter notebooks, etc.)
The project was started in 2018 at Microsoft Research. In 2021 it adopted a neutral governance structure, and it has been completely community-driven since then.
This tutorial uses the diabetes dataset to explore racial disparities in how health care resources are allocated in the U.S. We build an automated system for recommending patients for high-risk care management programs. This is based on the paper by Obermeyer et al., 2019: https://www.science.org/doi/10.1126/science.aax2342
In detail, we will:
- Examine the dataset for potential fairness issues.
- Train a classification model and assess its performance and fairness.
- Apply unfairness mitigation methods.
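To make the "assess fairness" step concrete, here is a minimal sketch of one common check: comparing how often a model selects members of each group. It uses only the Python standard library and made-up toy predictions, not the diabetes dataset; the helper names and the group labels "A" and "B" are illustrative assumptions. Fairlearn's `MetricFrame` provides this kind of per-group breakdown for real models and data.

```python
from collections import defaultdict


def selection_rate_by_group(y_pred, groups):
    """Return the fraction of positive predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def selection_rate_gap(y_pred, groups):
    """Largest difference in selection rate between any two groups."""
    rates = selection_rate_by_group(y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Toy, made-up predictions (1 = recommended for the program) and
# hypothetical group labels; not drawn from the tutorial's dataset.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = selection_rate_by_group(y_pred, groups)
gap = selection_rate_gap(y_pred, groups)
print(rates)  # selection rate per group
print(gap)    # difference between the highest and lowest rate
```

A large gap does not by itself show that a model is unfair, but it flags a disparity worth investigating before applying any mitigation method.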