fairlearn / fairlearn

A Python package to assess and improve fairness of machine learning models.

Home Page: https://fairlearn.org

License: MIT License

Languages: Python 99.56%, PowerShell 0.09%, Makefile 0.07%, Shell 0.28%
Topics: ai, ai-systems, artificial-intelligence, fairness, fairness-ai, fairness-assessment, fairness-ml, group-fairness, harms, machine-learning, responsible-ai, unfairness-mitigation

fairlearn's Introduction

Fairlearn

Fairlearn is a Python package that empowers developers of artificial intelligence (AI) systems to assess their system's fairness and mitigate any observed unfairness issues. Fairlearn contains mitigation algorithms as well as metrics for model assessment. Besides the source code, this repository also contains Jupyter notebooks with examples of Fairlearn usage.

Website: https://fairlearn.org/

Current release

  • The current stable release is available on PyPI.
  • Our current version may differ substantially from earlier versions. Users of earlier versions should visit our version guide to navigate significant changes and find information on how to migrate.

What we mean by fairness

An AI system can behave unfairly for a variety of reasons. In Fairlearn, we define whether an AI system is behaving unfairly in terms of its impact on people, i.e., in terms of harms. We focus on two kinds of harms:

  • Allocation harms. These harms can occur when AI systems extend or withhold opportunities, resources, or information. Some of the key applications are in hiring, school admissions, and lending.
  • Quality-of-service harms. Quality of service refers to whether a system works as well for one person as it does for another, even if no opportunities, resources, or information are extended or withheld.

We follow the approach known as group fairness, which asks: Which groups of individuals are at risk for experiencing harms? The relevant groups need to be specified by the data scientist and are application specific.

Group fairness is formalized by a set of constraints, which require that some aspect (or aspects) of the AI system's behavior be comparable across the groups. The Fairlearn package enables assessment and mitigation of unfairness under several common definitions. To learn more about our definitions of fairness, please visit our user guide on Fairness of AI Systems.

Note: Fairness is fundamentally a sociotechnical challenge. Many aspects of fairness, such as justice and due process, are not captured by quantitative fairness metrics. Furthermore, there are many quantitative fairness metrics which cannot all be satisfied simultaneously. Our goal is to enable humans to assess different mitigation strategies and then make trade-offs appropriate to their scenario.

Overview of Fairlearn

The Fairlearn Python package has two components:

  • Metrics for assessing which groups are negatively impacted by a model, and for comparing multiple models in terms of various fairness and accuracy metrics.
  • Algorithms for mitigating unfairness in a variety of AI tasks and along a variety of fairness definitions.

Fairlearn metrics

Check out our in-depth guide on the Fairlearn metrics.
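To give a flavor of what the metrics component computes, here is a hand-rolled, stdlib-only sketch of the core idea behind Fairlearn's MetricFrame: evaluate a metric per group and summarize the disparity. The helper name group_accuracy is hypothetical; the real API is far richer.

```python
from collections import defaultdict

def group_accuracy(y_true, y_pred, sensitive_features):
    """Per-group accuracy: the pattern that fairlearn's MetricFrame
    generalizes to arbitrary metrics. Hypothetical helper, illustration only."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, sensitive_features):
        totals[g] += 1
        hits[g] += int(t == p)
    return {g: hits[g] / totals[g] for g in totals}

by_group = group_accuracy(
    y_true=[1, 0, 1, 1, 0, 1, 0, 1],
    y_pred=[1, 0, 1, 1, 1, 0, 1, 1],
    sensitive_features=["a", "a", "a", "a", "b", "b", "b", "b"],
)
# Gap between the best- and worst-served group, a common disparity summary.
gap = max(by_group.values()) - min(by_group.values())
```

With Fairlearn installed, MetricFrame provides this pattern for any metric, plus aggregations such as difference() and group_min(); see the metrics guide above.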

Fairlearn algorithms

For an overview of our algorithms please refer to our website.
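For intuition only, here is a toy sketch of the postprocessing idea behind algorithms such as ThresholdOptimizer: pick a separate score threshold per group so that selection rates match across groups. The helper below is hypothetical and ignores the accuracy trade-offs that the real algorithm optimizes.

```python
from collections import defaultdict

def per_group_thresholds(scores, groups, target_rate):
    """Choose one score threshold per group so each group selects roughly
    the same fraction of samples (toy version; not the real fairlearn API)."""
    by_group = defaultdict(list)
    for s, g in zip(scores, groups):
        by_group[g].append(s)
    thresholds = {}
    for g, ss in by_group.items():
        ss = sorted(ss, reverse=True)
        k = max(1, round(target_rate * len(ss)))  # how many to select
        thresholds[g] = ss[k - 1]  # lowest score that is still selected
    return thresholds

thresholds = per_group_thresholds(
    scores=[0.9, 0.8, 0.2, 0.1, 0.6, 0.5, 0.4, 0.3],
    groups=["a", "a", "a", "a", "b", "b", "b", "b"],
    target_rate=0.5,
)
# Both groups now select 2 of their 4 samples, at different score cut-offs.
```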

Install Fairlearn

For instructions on how to install Fairlearn, check out our Quickstart guide.

Usage

For common usage refer to the Jupyter notebooks and our user guide. Please note that our APIs are subject to change, so notebooks downloaded from main may not be compatible with Fairlearn installed with pip. In this case, please navigate the tags in the repository (e.g. v0.7.0) to locate the appropriate version of the notebook.

Contributing

To contribute please check our contributor guide.

Maintainers

A list of current maintainers is on our website.

Issues

Usage Questions

Pose questions and help answer them on Stack Overflow with the tag fairlearn or on Discord.

Regular (non-security) issues

Issues are meant for bugs, feature requests, and documentation improvements. Please submit a report through GitHub issues. A maintainer will respond promptly, as appropriate.

Maintainers will try to link duplicate issues when possible.

Reporting security issues

To report security issues please send an email to [email protected].

fairlearn's People

Contributors

adrinjalali, alexquach, alliesaizan, bethz, carlosbogo, chisingh, eskayml, gaugup, gregorybchris, hannanabdul55, hildeweerts, imatiach-msft, iofall, janhavi13, kevinrobinson, koaning, lejit, mesameki, michaelamoako, mirodudik, parul100495, rensoostenbach, riedgar-ms, rihorn2, romanlutz, seanmccarren, shimst3r, tamaraatanasoska, tizpaz, ytoast


fairlearn's Issues

Create Makefile and replace PowerShell

All popular repos provide Makefiles, so we should, too.

Apart from that, we should move away from PowerShell, since we want to be inclusive of all kinds of users and contributors.

Add CDF curves to report

For regression models, add a view with CDF curves and the minimum distance between sub-group metrics.

Allow grouping by multiple sensitive features

Multiple sensitive features are a common scenario. Currently, the only way to capture intersectionality is to create a one-dimensional grouping vector manually and provide it to Fairlearn. We want to make this easier and allow users to provide an m-dimensional matrix. This needs to be enabled on three levels:

  • mitigation techniques
  • metrics
  • dashboard

The current implementation for mitigation techniques simply considers all combinations from the sensitive_features matrix; e.g., if column A has 4 distinct values, column B has 3 distinct values, and all combinations occur, we have 4*3=12 groups.

It is possible that users want more fine-grained control, but we'll wait for that feedback. This could mean specifying multiple constraints (demographic parity w.r.t. column A and demographic parity w.r.t. column B, but not intersections of A and B, for example).
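Until matrix-valued sensitive_features lands, the manual workaround described above looks roughly like this (illustrative only):

```python
# Build a 1-dimensional intersectional grouping vector from two sensitive
# columns by concatenating the values pairwise.
col_a = ["F", "F", "M", "M"]
col_b = ["young", "old", "young", "old"]
combined = [f"{a}&{b}" for a, b in zip(col_a, col_b)]
# With 4 distinct values in column A and 3 in column B, up to 4*3 = 12
# combined groups can occur in the resulting vector.
```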

Unify string-metric mappings

Currently the GroupMetricSet and FairlearnDashboard implement the same mapping of strings to metric functions, but do so separately. This is obviously a source of potential bugs and heartache in the future, so some common mapping would be desirable.

Postprocessing - Separate plotting

Having the decision on plotting made in the constructor of the ThresholdOptimizer object is rather odd. The plotting should be moved to a separate routine.

How should the probabilities generated by best_classifier be turned into predictions?

The returned best_classifier gives the probability of a sample belonging to class 1. However, I am conflicted in how I should turn this probability into a prediction.

Intuitively, it would make sense to binarize this probability at a 0.5 threshold. Not doing so would result in probability matching, which leads to suboptimal accuracy.

However, in practice, I find that simply sampling from the probability (predicting class 0 if a random number in [0,1] exceeds the probability) results in better fairness.

I am wondering what the correct method is.
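The two options being contrasted can be sketched as follows (stdlib-only, with a seeded RNG so the randomized variant is reproducible across runs):

```python
import random

probs = [0.9, 0.2, 0.55, 0.48]  # P(class 1) from a probabilistic classifier

# Option 1 -- deterministic: binarize at a 0.5 threshold.
hard = [int(p >= 0.5) for p in probs]

# Option 2 -- randomized ("probability matching"): predict 1 with
# probability p. This can satisfy fairness constraints better, at some
# cost in accuracy; seeded here so the run is reproducible.
rng = random.Random(0)
soft = [int(rng.random() < p) for p in probs]
```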

The returned best_classifier

When I use the returned best_classifier to make predictions, the prediction scores are binary, not probability-based. How could I generate probability-based predictions? Thanks

python tests instead of notebooks

Would it be okay to replace the notebooks tests with .py tests? It'd be much easier to work with, and it'll also remove a bunch of test dependencies we have right now.

What if I have my own train sample weights?

For Fairlearn, we can simply regard it as changing the sample weights and targets (Y) over iterations to make our classifier fair. What if I have my own training sample weights and want the returned algorithm to also consider them? Thanks.
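One possible workaround, stated as an assumption rather than a documented Fairlearn feature: fold your own per-sample weights into whatever weights the reduction computes, by multiplying the two vectors before fitting the inner estimator.

```python
# ASSUMPTION: combining user-supplied base weights with the algorithm's
# fairness reweighting multiplicatively; this is not an official
# Fairlearn API, just one plausible way to honor both.
my_weights = [1.0, 2.0, 0.5, 1.0]    # domain-specific sample weights
algo_weights = [0.8, 0.8, 1.2, 1.2]  # e.g. per-iteration fairness weights
combined = [m * a for m, a in zip(my_weights, algo_weights)]
# combined would then be passed as sample_weight to the inner estimator.
```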

test_bgl_unfair fails consistently on Windows with Python3.6

Example: https://dev.azure.com/responsibleai/fairlearn/_build/results?buildId=3657&view=logs&jobId=a846d25a-e32c-5640-1b53-e815fab94407&j=f2ae15cd-30d2-5fb2-07f7-558e9be6b891&s=778b61a2-25e7-5600-2cb0-dd2dc8e23627

The affected code:

assert logging_all_close([[3.2, 11.2],
                          [-3.47346939, 10.64897959],
                          [-2.68, 10.12],
                          [-1.91764706, 9.61176471],
                          [-1.18461538, 9.12307692],
                          [-0.47924528, 8.65283019],
                          [0.2, 0.7]],
                         all_predict)

and only the first entry fails.

E        +  where False = logging_all_close([[3.2, 11.2], [-3.47346939, 10.64897959], [-2.68, 10.12], [-1.91764706, 9.61176471], [-1.18461538, 9.12307692], [-0.47924528, 8.65283019], ...], [array([ 3.03010885, 11.2       ]), array([-3.47346939, 10.64897959]), array([-2.68, 10.12]), array([-1.91764706,  9.61176471]), array([-1.18461538,  9.12307692]), array([-0.47924528,  8.65283019]), ...])

test\unit\reductions\grid_search\test_grid_search_regression.py:72: AssertionError
---------------------------- Captured stdout call -----------------------------
a mismatches:  [3.2]
b mismatches:  [3.03010885]
mismatch indices:  (array([0], dtype=int64), array([0], dtype=int64))

simplex method failed

Sometimes I get this error, and sometimes it's fine. It seems the error comes from scipy/optimize/_linprog.py:

ValueError: Phase 1 of the simplex method failed to find a feasible solution. The pseudo-objective function evaluates to 1.7e-12 which exceeds the required tolerance of 1e-12 for a solution to be considered 'close enough' to zero to be a basic solution. Consider increasing the tolerance to be greater than 1.7e-12. If this tolerance is unacceptably large the problem may be infeasible.

Add tests with DNNs

We don't currently have tests with PyTorch or TensorFlow (or other popular packages, for that matter). We should figure out a standard way to test all of them with all mitigation techniques to ensure everything always works and we catch breaking changes.

Specific cutoff for Fair Regression

I just read the new paper (Fair Regression). Is it possible to optimize for some specific cutoff in Fair Regression (e.g. a 0.65 approval cutoff in the lending business)?

Mapping unweighted estimators to weighted estimators

I am wondering if there are any underlying prerequisites for applying an arbitrary sklearn model. When I applied sklearn.neural_network.MLPClassifier() it raised an error:
[screenshot of the error, omitted]
But when I applied sklearn.svm.SVC(), this issue does not arise.
It seems to me that the input object should have a class_weight attribute, which MLPClassifier() does not have. Is this true?
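The reductions reweight the training data, so the inner estimator's fit method must accept a sample_weight argument (which MLPClassifier's fit does not). A minimal sketch of such a check, using hypothetical toy stand-ins for the two estimator styles:

```python
import inspect

class WithWeights:
    """Toy stand-in for an SVC-style estimator."""
    def fit(self, X, y, sample_weight=None):
        return self

class WithoutWeights:
    """Toy stand-in for an MLPClassifier-style estimator (no sample_weight)."""
    def fit(self, X, y):
        return self

def supports_sample_weight(estimator):
    """Heuristic check (illustrative): does fit() accept sample_weight?"""
    return "sample_weight" in inspect.signature(estimator.fit).parameters
```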

DOC How is randomization handled

The docs say:

Randomization. In contrast with scikit-learn, estimators in fairlearn can produce randomized predictors. Randomization of predictions is required to satisfy many definitions of fairness. Because of randomization, it is possible to get different outputs from the predictor's predict method on identical data. For each of our methods, we provide explicit access to the probability distribution used for randomization.

sklearn does have randomization in many estimators (RandomForests as an example :P). But randomness is always controlled by a random_state parameter. Reproducibility requires setting this parameter to be able to go back and reproduce the results.

It is understandable if in the context of fairness the RNG shouldn't be fixed, but shouldn't the user be able to feed in a seed and have reproducible results?

Also, the user can set the RNG, and still get probabilistic output given the same input. I could have:

clf = MyClassifier(random_seed=42)
clf.fit(X, y)
clf.predict(x0)  # -> returns 0
clf.predict(x0)  # -> returns 1

but if the user runs the same script again, they'll get the same output as before.
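The behavior being asked for can be sketched like this (toy predictor, stdlib-only; random_seed is the hypothetical parameter name from the example above):

```python
import random

class RandomizedPredictor:
    """Toy randomized predictor: two predict() calls on the same input may
    differ, but a whole run is reproducible given the same seed."""

    def __init__(self, p_positive, random_seed=None):
        self.p = p_positive
        self._rng = random.Random(random_seed)

    def predict(self, x):
        # Randomized prediction: 1 with probability p, regardless of x.
        return int(self._rng.random() < self.p)

clf = RandomizedPredictor(p_positive=0.5, random_seed=42)
run1 = [clf.predict(0) for _ in range(5)]

clf2 = RandomizedPredictor(p_positive=0.5, random_seed=42)
run2 = [clf2.predict(0) for _ in range(5)]
assert run1 == run2  # same seed -> same sequence of randomized predictions
```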

Dashboard has no tests

We need at least basic validation to make sure the dashboard shows up at all. This could be as simple as checking that there's some HTML field in the browser, or maybe this can be checked with papermill/nbformat/scrapbook. Any direction that seems promising is appreciated.

Release versioning

We currently push a version to test pypi with dev0 appended, and then another one to pypi. See

https://test.pypi.org/project/fairlearn/#history
and
https://pypi.org/project/fairlearn/#history
for examples.

@adrinjalali has proposed to simply do an rc release to PyPI, and once it's stable, release without the rc suffix. If there are still issues, then release a patch. Let's use this thread as a discussion forum.

At the end we need to update the documentation (CONTRIBUTING.md) accordingly.

cannot run tests w/o azure authentication

Running the tests fails with:

______________________ ERROR at setup of test_perf[[dataset:adult_uci,predictor:rbm_svm,mitigator:ThresholdOptimizer,disparity_metric:equalized_odds]] ______________________

self = <msrestazure.azure_active_directory.AdalAuthentication object at 0x7fd9e7c793a0>, session = <requests.sessions.Session object at 0x7fd9e7c791c0>

    def signed_session(self, session=None):
        """Create requests session with any required auth headers applied.
    
        If a session object is provided, configure it directly. Otherwise,
        create a new session and return it.
    
        :param session: The session to configure for authentication
        :type session: requests.Session
        :rtype: requests.Session
        """
        session = super(AdalAuthentication, self).signed_session(session)
    
        try:
>           raw_token = self._adal_method(*self._args, **self._kwargs)

../.venv/lib/python3.8/site-packages/msrestazure/azure_active_directory.py:448: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <adal.authentication_context.AuthenticationContext object at 0x7fd9e563ed30>, resource = 'https://management.core.windows.net/', client_id = None
client_secret = None

    def acquire_token_with_client_credentials(self, resource, client_id, client_secret):
        '''Gets a token for a given resource via client credentials.
    
        :param str resource: A URI that identifies the resource for which the
            token is valid.
        :param str client_id: The OAuth client id of the calling application.
        :param str client_secret: The OAuth client secret of the calling application.
        :returns: dict with several keys, include "accessToken".
        '''
        def token_func(self):
            token_request = TokenRequest(self._call_context, self, client_id, resource)
            return token_request.get_token_with_client_credentials(client_secret)
    
>       return self._acquire_token(token_func)

../.venv/lib/python3.8/site-packages/adal/authentication_context.py:179: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <adal.authentication_context.AuthenticationContext object at 0x7fd9e563ed30>
token_func = <function AuthenticationContext.acquire_token_with_client_credentials.<locals>.token_func at 0x7fd9e45d6790>, correlation_id = None

    def _acquire_token(self, token_func, correlation_id=None):
        self._call_context['log_context'] = log.create_log_context(
            correlation_id or self.correlation_id, self._call_context.get('enable_pii', False))
        self.authority.validate(self._call_context)
>       return token_func(self)

../.venv/lib/python3.8/site-packages/adal/authentication_context.py:128: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <adal.authentication_context.AuthenticationContext object at 0x7fd9e563ed30>

    def token_func(self):
        token_request = TokenRequest(self._call_context, self, client_id, resource)
>       return token_request.get_token_with_client_credentials(client_secret)

../.venv/lib/python3.8/site-packages/adal/authentication_context.py:177: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <adal.token_request.TokenRequest object at 0x7fd9e45cad00>, client_secret = None

    def get_token_with_client_credentials(self, client_secret):
        self._log.debug("Getting token with client credentials.")
        try:
            token = self._find_token_from_cache()
            if token:
                return token
        except AdalError:
            self._log.exception('Attempt to look for token in cache resulted in Error')
    
        oauth_parameters = self._create_oauth_parameters(OAUTH2_GRANT_TYPE.CLIENT_CREDENTIALS)
        oauth_parameters[OAUTH2_PARAMETERS.CLIENT_SECRET] = client_secret
    
>       token = self._oauth_get_token(oauth_parameters)

../.venv/lib/python3.8/site-packages/adal/token_request.py:310: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <adal.token_request.TokenRequest object at 0x7fd9e45cad00>
oauth_parameters = {'client_secret': None, 'grant_type': 'client_credentials', 'resource': 'https://management.core.windows.net/'}

    def _oauth_get_token(self, oauth_parameters):
        client = self._create_oauth2_client()
>       return client.get_token(oauth_parameters)

../.venv/lib/python3.8/site-packages/adal/token_request.py:112: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <adal.oauth2_client.OAuth2Client object at 0x7fd9e45ca550>
oauth_parameters = {'client_secret': None, 'grant_type': 'client_credentials', 'resource': 'https://management.core.windows.net/'}

    def get_token(self, oauth_parameters):
        token_url = self._create_token_url()
        url_encoded_token_request = urlencode(oauth_parameters)
        post_options = util.create_request_options(self, _REQ_OPTION)
    
        operation = "Get Token"
    
        try:
            resp = requests.post(token_url.geturl(),
                                 data=url_encoded_token_request,
                                 headers=post_options['headers'],
                                 verify=self._call_context.get('verify_ssl', None),
                                 proxies=self._call_context.get('proxies', None),
                                 timeout=self._call_context.get('timeout', None))
    
            util.log_return_correlation_id(self._log, operation, resp)
        except Exception:
            self._log.exception("%(operation)s request failed", {"operation": operation})
            raise
    
        if util.is_http_success(resp.status_code):
            return self._handle_get_token_response(resp.text)
        else:
            if resp.status_code == 429:
                resp.raise_for_status()  # Will raise requests.exceptions.HTTPError
            return_error_string = _ERROR_TEMPLATE.format(operation, resp.status_code)
            error_response = ""
            if resp.text:
                return_error_string = u"{} and server response: {}".format(return_error_string,
                                                                           resp.text)
                try:
                    error_response = resp.json()
                except ValueError:
                    pass
>           raise AdalError(return_error_string, error_response)
E           adal.adal_error.AdalError: Get Token request returned http error: 400 and server response: {"error":"invalid_request","error_description":"AADSTS900144: The request body must contain the following parameter: 'client_id'.\r\nTrace ID: 9ff1d754-5f79-403a-9ae7-31cd01d02800\r\nCorrelation ID: 8eeb21da-25ab-4993-8ed7-76b06b6837b4\r\nTimestamp: 2020-01-16 16:47:32Z","error_codes":[900144],"timestamp":"2020-01-16 16:47:32Z","trace_id":"9ff1d754-5f79-403a-9ae7-31cd01d02800","correlation_id":"8eeb21da-25ab-4993-8ed7-76b06b6837b4","error_uri":"https://login.microsoftonline.com/error?code=900144"}

../.venv/lib/python3.8/site-packages/adal/oauth2_client.py:289: AdalError

During handling of the above exception, another exception occurred:

    @pytest.fixture(scope="session")
    def workspace():
>       return get_workspace()

test/perf/conftest.py:71: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../.venv/lib/python3.8/site-packages/tempeh/execution/azureml/workspace.py:48: in get_workspace
    resource_management_client = resource_client_factory(auth, SUBSCRIPTION_ID)
../.venv/lib/python3.8/site-packages/azureml/_base_sdk_common/common.py:354: in resource_client_factory
    return auth._get_service_client(ResourceManagementClient, subscription_id)
../.venv/lib/python3.8/site-packages/azureml/core/authentication.py:146: in _get_service_client
    return _get_service_client_using_arm_token(self, client_class, subscription_id,
../.venv/lib/python3.8/site-packages/azureml/core/authentication.py:1590: in _get_service_client_using_arm_token
    adal_auth_object = auth._get_adal_auth_object()
../.venv/lib/python3.8/site-packages/azureml/core/authentication.py:191: in _get_adal_auth_object
    token = self.get_authentication_header()["Authorization"].split(" ")[1]
../.venv/lib/python3.8/site-packages/azureml/core/authentication.py:88: in get_authentication_header
    auth_header = {"Authorization": "Bearer " + self._get_arm_token()}
../.venv/lib/python3.8/site-packages/azureml/core/authentication.py:863: in wrapper
    new_token = actual_function(self, *args, **kwargs)
../.venv/lib/python3.8/site-packages/azureml/core/authentication.py:962: in _get_arm_token
    header = execute_func(self._get_sp_credential_object().signed_session).headers['Authorization']
../.venv/lib/python3.8/site-packages/azureml/_restclient/clientbase.py:51: in execute_func
    return ClientBase._execute_func_internal(
../.venv/lib/python3.8/site-packages/azureml/_restclient/clientbase.py:294: in _execute_func_internal
    left_retry = cls._handle_retry(back_off, left_retry, total_retry, error, logger, func)
../.venv/lib/python3.8/site-packages/azureml/_restclient/clientbase.py:345: in _handle_retry
    raise error
../.venv/lib/python3.8/site-packages/azureml/_restclient/clientbase.py:292: in _execute_func_internal
    return func(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <msrestazure.azure_active_directory.AdalAuthentication object at 0x7fd9e7c793a0>, session = <requests.sessions.Session object at 0x7fd9e7c791c0>

    def signed_session(self, session=None):
        """Create requests session with any required auth headers applied.
    
        If a session object is provided, configure it directly. Otherwise,
        create a new session and return it.
    
        :param session: The session to configure for authentication
        :type session: requests.Session
        :rtype: requests.Session
        """
        session = super(AdalAuthentication, self).signed_session(session)
    
        try:
            raw_token = self._adal_method(*self._args, **self._kwargs)
        except adal.AdalError as err:
            # pylint: disable=no-member
            if 'AADSTS70008:' in ((getattr(err, 'error_response', None) or {}).get('error_description') or ''):
                raise Expired("Credentials have expired due to inactivity.")
            else:
>               raise AuthenticationError(err)
E               msrest.exceptions.AuthenticationError: Get Token request returned http error: 400 and server response: {"error":"invalid_request","error_description":"AADSTS900144: The request body must contain the following parameter: 'client_id'.\r\nTrace ID: 9ff1d754-5f79-403a-9ae7-31cd01d02800\r\nCorrelation ID: 8eeb21da-25ab-4993-8ed7-76b06b6837b4\r\nTimestamp: 2020-01-16 16:47:32Z","error_codes":[900144],"timestamp":"2020-01-16 16:47:32Z","trace_id":"9ff1d754-5f79-403a-9ae7-31cd01d02800","correlation_id":"8eeb21da-25ab-4993-8ed7-76b06b6837b4","error_uri":"https://login.microsoftonline.com/error?code=900144"}

../.venv/lib/python3.8/site-packages/msrestazure/azure_active_directory.py:454: AuthenticationError

Question: are other types of fairness in scope?

The terminology doc says:

There are many approaches to conceptualizing fairness. In fairlearn, we follow the approach known as group fairness, which asks: Which groups of individuals are at risk for experiencing harms?

Does that mean other definitions of fairness are out of scope, or is that they're not included yet?

Also, are the mitigation methods included here inherently incapable of working with other forms of fairness definitions?

Shouldn't it be up to ethics counsel, lawyers, and the local laws to define which types of fairness should be enforced? I'm not sure why this library should take a stand on that.

Test dependencies

conftest.py under test/perf checks for azureml, but would also give the same error if the user doesn't have tempeh installed.

I encountered this error since I avoid installing everything in requirements.txt, so as not to bloat my env unnecessarily.

Also, we could make the performance tests optional and skip them if the user doesn't have the required dependencies. They're not an integral part of the functionality tests anyway, are they?

Remove NotFittedException

Our NotFittedException should be retired in favour of the NotFittedError from sklearn; having our own type doesn't achieve anything.

It would also be good to make sure that all our Estimators implement this check (and are tested for it).
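A minimal sketch of the proposed convention, assuming sklearn's NotFittedError (with a stdlib fallback so the snippet runs standalone; Mitigator is a hypothetical class name):

```python
try:
    from sklearn.exceptions import NotFittedError
except ImportError:  # fallback so this sketch runs without sklearn
    class NotFittedError(ValueError, AttributeError):
        pass

class Mitigator:
    """Sketch: raise sklearn's NotFittedError from predict() when fit()
    has not been called, instead of a library-specific exception type."""

    def fit(self, X, y):
        self.fitted_ = True
        return self

    def predict(self, X):
        if not getattr(self, "fitted_", False):
            raise NotFittedError("This Mitigator instance is not fitted yet.")
        return [0 for _ in X]
```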

ModuleNotFoundError: No module named 'azureml.contrib.explain'

Hi guys,

I am testing the tool using the Threshold Optimization Post-Processing for Binary notebook and I continuously get the following error:

[screenshot of the error, omitted]

I read about a similar issue, and the recommendation was to run:
pip install 'azureml-sdk[notebooks]'
But I get the same error.
Could you let me know what I could do, please?

How to handle unknown/missing value in sensitive features

In real-life applications, we sometimes have unknown/missing values in the sensitive attribute.
For example, dataA = pd.Series([np.nan, 0, 0, 1, np.nan]).

What if I only care about the disparity on non-missing values, but still want to optimize on the whole data? Is that equivalent to changing the loss function to:
min ERROR (using the whole data) subject to fairness constraints (using the non-missing data)?
Could you provide some suggestions?
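One way to sketch the split being asked about, as an assumption rather than an official recommendation: fit on all rows, but compute the group disparity only over rows whose sensitive value is known.

```python
# ASSUMPTION: evaluate fairness only on rows with a known sensitive value,
# while the model itself may be trained on all rows.
sensitive = [None, "a", "a", "b", None, "b"]  # None marks a missing value
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

known = [i for i, s in enumerate(sensitive) if s is not None]
groups = {}
for i in known:
    groups.setdefault(sensitive[i], []).append(int(y_true[i] == y_pred[i]))
accuracy_by_group = {g: sum(v) / len(v) for g, v in groups.items()}
```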

Pin all pip packages for release pipelines

We were just burned by a bad version of the colorama package coming out during a release. Since each job does its own pip install, this bad package got picked up mid-release and killed it.

We should have the "Prevalidation" stage run pip freeze to get a complete package list, and share this as a build artifact for subsequent steps.

Release Pipeline Improvements

In addition to the pip freeze proposal, the 0.4.0 release has highlighted a weakness in our release automation, around:

  • Required updates to ReadMe.md (not helped by GitHub and PyPI having different concepts of Markdown syntax)
  • Branching and tagging of the release

We need to get all these automated so they are done and done consistently.
