FairTest

FairTest enables developers or auditing entities to discover and test for unwarranted associations between an algorithm's outputs and certain user subpopulations identified by protected features.

FairTest works by learning a special decision tree that splits a user population into smaller subgroups in which the association between protected features and algorithm outputs is maximized. FairTest supports a variety of fairness metrics, each appropriate in a particular situation. After finding these so-called contexts of association, FairTest uses statistical methods to assess their validity and strength. Finally, it retains all statistically significant associations, ranks them by strength, and reports them to the user as association bugs.

Local Installation

FairTest is a Python application, developed and tested with Python 2.7. It uses the rpy2 package, which provides a Python interface to the R programming language, and therefore requires R (version > 3.1) to be installed. First, add the latest version of R (for Ubuntu 12.04 and 14.04):

sudo sh -c 'echo "deb http://cran.rstudio.com/bin/linux/ubuntu trusty/" >> /etc/apt/sources.list'
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
sudo apt-get update
sudo apt-get -y install r-base r-base-dev liblzma-dev libfreetype6-dev

Then, make sure Python is properly installed:

sudo apt-get -y install python python-dev python-pip

Install MongoDB and Redis if you plan to use FairTest as an online service:

sudo apt-get install -y mongodb-server redis-server

Now, create a Python virtual environment and install the required pip package dependencies:

sudo apt-get install python-virtualenv
virtualenv venv
source venv/bin/activate
pip2 install numpy sklearn statsmodels scipy prettytable pydot ete2 rpy2 eve redis rq requests matplotlib pyyaml
python2.7 setup.py install
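
To check that the package installed correctly, you can try importing it (the top-level module name fairtest matches the imports used in the examples below):

python2.7 -c 'import fairtest'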

Preconfigured Virtual Machine

Alternatively, you can download an Ubuntu virtual machine with a complete, up-to-date FairTest installation, available here. Launch the VM with either VMware Workstation or VirtualBox and activate the preconfigured Python virtual environment as follows:

cd ~/fairtest
source venv/bin/activate

Quick Start

Several benchmark datasets in CSV format are located in fairtest/data. You can use the utils.prepare_data.data_from_csv() function to load a dataset as a Pandas DataFrame, the format expected by FairTest investigations. The first line of the CSV file should list the names of the features.

from fairtest.utils.prepare_data import data_from_csv

data = data_from_csv('fairtest/data/adult/adult.csv', to_drop=['fnlwgt'])

The data is then pre-processed and split into training and testing sets by encapsulating it in a DataSource object.

from fairtest.holdout import DataSource

data = DataSource(data, budget=1, conf=0.95)

This creates a training set and holdout set that can be used to perform a single batch of investigations with an overall testing confidence of 95%. Budgets larger than 1 allow for adaptive data analysis, where new investigations may be performed based on previous results, and validated over an independent testing set.
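
For example, the following sketch (reusing the same DataSource constructor) reserves holdout data for two rounds of investigations, so that the second round can be chosen after inspecting the first round's reports:

from fairtest.holdout import DataSource

# budget=2 reserves two independent holdout sets: the results of the first
# batch of investigations can guide a second batch, validated on fresh data
data = DataSource(data, budget=2, conf=0.95)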

Testing

To test for associations between user income and race or gender, first create the appropriate FairTest Investigation:

from fairtest.testing import Testing

SENS = ['sex', 'race']    # Protected features
TARGET = 'income'         # Output
EXPL = ''                 # Explanatory feature

inv = Testing(data, SENS, TARGET, EXPL)

After you have instantiated all the Investigations you wish to perform, you can train the guided decision tree, test the discovered association bugs (correcting for multiple testing), and report the results:

from fairtest.investigation import train, test, report

all_investigations = [inv]

train(all_investigations)
test(all_investigations)
report(all_investigations, 'adult', output_dir='/tmp/')

Discovery

Discovery investigations enable the search for potential associations over a large output space, with no prior knowledge of which outputs to focus on. An additional parameter, topk, specifies the maximum number of outputs (those exhibiting the strongest associations) to consider:

from fairtest.discovery import Discovery

SENS = [...]        # Protected features
TARGET = [...]      # List of output labels
EXPL = ''           # Explanatory feature

inv = Discovery(data, SENS, TARGET, EXPL, topk=10)

Error Profiling

ErrorProfiling investigations let you search for user subpopulations for which an algorithm exhibits abnormally high error rates. The investigation expects an additional input specifying the ground truth for the algorithm's predictions. An appropriate error measure is then computed:

from fairtest.error_profiling import ErrorProfiling

SENS = [...]            # Protected features
TARGET = '...'          # Predicted output
GROUND_TRUTH = '...'    # Ground truth feature
EXPL = ''               # Explanatory feature

inv = ErrorProfiling(data, SENS, TARGET, GROUND_TRUTH, EXPL)

Explanatory Attribute

It is possible to specify a user attribute as explanatory, meaning that FairTest will only look for associations among users that are equal with respect to this attribute. We currently support a single, categorical attribute as explanatory for investigations with categorical protected features and outputs. Support for more general explanatory attributes can be enabled by defining further Fairness Metrics (see Extensions section below).
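
For example, a Testing investigation on the adult dataset could condition on a categorical column such as 'occupation' (used here purely as an illustration):

from fairtest.testing import Testing

SENS = ['sex', 'race']    # Protected features (categorical)
TARGET = 'income'         # Output (categorical)
EXPL = 'occupation'       # Associations are only measured among users
                          # with the same occupation

inv = Testing(data, SENS, TARGET, EXPL)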

Other Examples

Additional examples demonstrating how to use FairTest are available in src/fairtest/examples.

FairTest as an Online Service

In addition to being used as a standalone library, FairTest can also be deployed as an online service. Our prototype supports multiple users asynchronously conducting FairTest investigations. Users can post investigations through a web interface and access the respective bug reports once the experiments are completed. Our implementation is based on Python job queues: each FairTest investigation is abstracted into a job that is dispatched for asynchronous execution to a pool of workers.
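
A minimal sketch of this job-queue pattern, using rq and redis (both among the installed dependencies); run_investigation and the job parameters are hypothetical stand-ins, not the service's actual API:

from redis import Redis
from rq import Queue

def run_investigation(params):
    # Hypothetical job body: train, test and report one FairTest investigation
    pass

queue = Queue(connection=Redis())   # backed by the local redis-server
queue.enqueue(run_investigation, {'dataset': 'adult', 'sens': ['sex']})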

To launch the online FairTest service locally, use the following commands:

cd ~/fairtest/src/fairtest/service
source venv/bin/activate
python2.7 launch_server.py

This will launch the front-end server, which is accessible on your local interface at port 5000 (http://0.0.0.0:5000/fairtest). Then, in a new terminal, launch the back-end workers (the default number of workers is five):

python2.7 launch_workers.py

At that point you can navigate the web interface and post FairTest investigations.

Extensions

Metrics

FairTest currently supports the following metrics:

  • Normalized Mutual Information (NMI)
    • For categorical protected feature and output
  • Normalized Conditional Mutual Information (CondNMI)
    • For categorical protected feature and output with explanatory feature
  • Binary Ratio (RATIO)
    • For binary protected feature and output
  • Binary Difference (DIFF)
    • For binary protected feature and output
  • Conditional Binary Difference (CondDIFF)
    • For binary protected feature and output with explanatory feature
  • Pearson Correlation (CORR)
    • For ordinal protected feature and output
  • Logistic Regression (REGRESSION)
    • For binary protected feature and multi-labeled output

By default, FairTest selects an appropriate metric depending on the type of investigation and on the types of the protected and output features provided. You can specify a particular metric to use (as long as that metric is applicable to the data at hand) with the metrics parameter passed to an Investigation:

from fairtest.testing import Testing

SENS = ['gender', 'race']   # Protected features
TARGET = 'income'           # Output
EXPL = ''                   # Explanatory feature

metrics = {'gender': 'DIFF'}  # Specify a metric for 'gender' and let FairTest
                              # select a default metric for 'race'

inv = Testing(data, SENS, TARGET, EXPL, metrics=metrics)

FairTest can be extended with custom metrics to handle situations where the above metrics are not applicable. The class fairtest.modules.metrics.metric.Metric defines an abstract metric. Metrics can expect data in three forms: a contingency table (categorical features), aggregate statistics (ordinal features), or non-aggregated data (for regression). The main method called on a Metric is compute, which calculates a confidence interval and p-value and stores these in the attribute stats. The abstract Metric class provides a default compute method that calls instance-specific methods for computing either exact or approximate statistics. Subclasses of Metric can either implement these specific methods (see fairtest.modules.metrics.mutual_info.NMI for instance) or redefine the compute method entirely (see for example fairtest.modules.metrics.regression.REGRESSION).
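
As a rough illustration, a custom metric skeleton might look as follows; the compute signature and the exact shape of stats shown here are assumptions, not FairTest's documented API:

from fairtest.modules.metrics.metric import Metric

class MyMetric(Metric):
    # Hypothetical skeleton; the real compute signature may differ
    def compute(self, data, conf, exact=True):
        # Derive an association statistic from `data` (e.g. a contingency
        # table), then store a confidence interval and p-value in self.stats,
        # as described above
        ci_low, ci_high, p_value = 0.0, 1.0, 1.0   # placeholder computation
        self.stats = (ci_low, ci_high, p_value)
        return self.stats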

Logging

FairTest uses Python's standard logging module to log simple information about ongoing investigations, as well as more fine-grained debug information (mainly for the guided tree learning algorithm).
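
Since the standard logging module is used, the usual configuration applies. For example, to surface FairTest's log output on the console:

import logging

# INFO shows progress of ongoing investigations; DEBUG also shows the
# fine-grained guided-tree learning output
logging.basicConfig(level=logging.INFO)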

Code Organisation

Directory or File                          Description
data                                       Demo datasets
src/apps                                   Demo apps
src/fairtest/tests                         Benchmarks
src/fairtest/examples                      Examples
src/fairtest/modules/bug_report            Bug filter, rank and report module
src/fairtest/modules/context_discovery     Guided tree construction module
src/fairtest/modules/metrics               Fairness metrics module
src/fairtest/modules/statistics            Statistical tests module
src/fairtest/discovery.py                  Discovery Investigations
src/fairtest/error_profiling.py            ErrorProfiling Investigations
src/fairtest/investigation.py              Train, Test, Report for arbitrary Investigations
src/fairtest/testing.py                    Testing Investigations
src/fairtest/service                       Online service module

Reproducing Results

To reproduce the results from our paper (see below), you can run the IPython notebooks medical.ipynb, recommender.ipynb and test.ipynb. Make sure to restart the notebooks between experiments, so that each run starts from a freshly fixed random seed.

Reading Our Paper

FairTest: Discovering Unwarranted Associations in Data-Driven Applications

Citing This Work

If you use FairTest for academic research, you are highly encouraged to cite the following paper:

@article{tramer2015fairtest,
  title={FairTest: Discovering Unwarranted Associations in Data-Driven Applications},
  author={Tramer, Florian and Atlidakis, Vaggelis and Geambasu, Roxana and Hsu, Daniel
          and Hubaux, Jean-Pierre and Humbert, Mathias and Juels, Ari and Lin, Huang},
  journal={arXiv preprint arXiv:1510.02377},
  year={2015}
}

Known Issues

Error in p-value ordering

In multiple_testing.py, in the compute_investigation_stats function, the order of the p-values does not match that of all_stats, due to dictionary reordering in Python 2.
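
A minimal illustration of the underlying pitfall (the keys below are hypothetical):

import collections

# Python 2 dicts do not preserve insertion order, so p-values read back from
# a plain dict may not line up with the order in which contexts were added
all_stats = {'context_a': 0.01, 'context_b': 0.04}
pvals = [all_stats[k] for k in all_stats]    # arbitrary order

# An order-preserving mapping keeps the pairing stable
all_stats = collections.OrderedDict([('context_a', 0.01), ('context_b', 0.04)])
pvals = [all_stats[k] for k in all_stats]    # insertion order guaranteed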

CSV parse error while generating a report

I encountered the following exception while processing an input from the Taintmark experiment.

  • Error message (Exception trace)
Traceback (most recent call last):
  File "fairtest_driver.py", line 93, in main
    driver(conf, fpath, sens, target)
  File "fairtest_driver.py", line 119, in driver
    report([inv], "testing", conf.OUTPUT_DIR)
  File "build/bdist.linux-x86_64/egg/fairtest/investigation.py", line 402, in report
    plot_dir=sub_plot_dir)
  File "build/bdist.linux-x86_64/egg/fairtest/modules/bug_report/report.py", line 240, in bug_report
    output_stream)
  File "build/bdist.linux-x86_64/egg/fairtest/modules/bug_report/report.py", line 333, in print_context_ct
    print >> output_stream, pretty_ct(ct)
  File "build/bdist.linux-x86_64/egg/fairtest/modules/bug_report/report.py", line 584, in pretty_ct
    pretty_table = prettytable.from_csv(output)
  File "build/bdist.linux-x86_64/egg/prettytable.py", line 1337, in from_csv
    dialect = csv.Sniffer().sniff(fp.read(1024))
  File "/usr/lib/python2.7/csv.py", line 188, in sniff
    raise Error, "Could not determine delimiter"

To reproduce the problem, you can use the attached file (test0.csv) as an input to FairTest with the following settings:

SENS = ['dest', 'msg_type', 'url', 'app_version', 'cc', 'device_type', 'efs', 'lang', 'nonce', 'signature', 'time', 'ywsid', 'application/x-www-form-urlencoded', 'email', 'password']
TARGET = 'input_gps'
