alibi's People

Contributors

abs428, ahousley, alexcoca, arnaudvl, ascillitoe, christophergs, daavoo, dependabot[bot], gipster, jesse-c, jimbudarz, jklaise, lakshmankishore, majolo, marcogorelli, mauicv, oscarfco, robertsamoilescu, sanjass, vinceshieh


alibi's Issues

Interpretation of the empty anchor

It is possible for the anchor algorithms to return an empty anchor. The interpretation is that any feature/word/superpixel could act as an anchor because the sampling procedure could not produce examples of a different class, so there is no particularly important subset of features needed to produce the same prediction. We should document this in detail and decide whether returning the "empty anchor" makes the most sense in such cases.

Feature attribution method (e.g. SHAP)

For 0.1 we should include a feature attribution method. Currently SHAP seems to be the state of the art. The package itself looks production-ready, so as a first step we could include it in Alibi behind a thin wrapper conforming to a common explainer API. Post-0.1 we can revisit and consider our own implementation as well as improvements relating to categorical variables.
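
A minimal sketch of what such a thin wrapper could look like, assuming a hypothetical KernelShap class and explain method that are illustrative only and not the actual alibi API:

import shap

class KernelShap:
    """Hypothetical thin wrapper around shap.KernelExplainer (sketch only)."""

    def __init__(self, predict_fn, background_data):
        # KernelExplainer is model-agnostic: it only needs a prediction function and
        # a background dataset used to integrate out missing features
        self._explainer = shap.KernelExplainer(predict_fn, background_data)

    def explain(self, X):
        # return per-feature attributions plus the base value in a plain dict,
        # so the return shape can later conform to a common explainer API
        return {'shap_values': self._explainer.shap_values(X),
                'expected_value': self._explainer.expected_value}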

Custom return types for explainers

There is some consensus that we should return a custom object for each explainer method, as the return values are quite different. This is further motivated by eventual integration with Seldon Deploy.
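
A minimal sketch of what such a return object could look like (the Explanation name and its fields are hypothetical, not a settled design):

from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class Explanation:
    """Hypothetical common return type: shared metadata plus a method-specific payload."""
    meta: Dict[str, Any] = field(default_factory=dict)   # e.g. explainer name, hyperparameters
    data: Dict[str, Any] = field(default_factory=dict)   # e.g. anchor, precision, shap values, ...

# usage sketch
exp = Explanation(meta={'name': 'AnchorTabular'},
                  data={'anchor': ['Age > 40'], 'precision': 0.97})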

Allow skipping predict_fn setup in Explainer init

For AnchorImage the model needs to be available at the time of explainer creation because predict_fn is called:

if np.argmax(predict_fn(np.zeros((1,) + image_shape)).shape) == 0:
    self.predict_fn = predict_fn
else:
    self.predict_fn = lambda x: np.argmax(predict_fn(x), axis=1)

For use in KFServing it would be preferable to skip this step, as predict_fn will be replaced later at inference time. Could we have a flag to skip it and rerun the setup on the first explain call if it was skipped?

This will allow an explainer to be saved without needing access to the model and better separate the explainer from the model.

Could this be considered for all explainers?

At present it also means that when we replace predict_fn at inference time we need to rerun the above code, which is copied into the KFServing library. This is error-prone and duplicates code.
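
A minimal sketch of what such a flag could look like (the skip_predict_setup argument and the surrounding class are hypothetical, not the alibi API):

import numpy as np

class AnchorImage:
    """Sketch of deferring predict_fn setup until the first explain call."""

    def __init__(self, predict_fn=None, image_shape=None, skip_predict_setup=False):
        self.image_shape = image_shape
        self.predict_fn = None
        if not skip_predict_setup:
            self._setup_predict_fn(predict_fn)

    def _setup_predict_fn(self, predict_fn):
        # wrap predict_fn so it always returns class labels rather than probabilities
        if np.argmax(predict_fn(np.zeros((1,) + self.image_shape)).shape) == 0:
            self.predict_fn = predict_fn
        else:
            self.predict_fn = lambda x: np.argmax(predict_fn(x), axis=1)

    def explain(self, image, predict_fn=None):
        if self.predict_fn is None:
            # setup was skipped at init time; run it now with the model supplied at inference time
            self._setup_predict_fn(predict_fn)
        raise NotImplementedError  # anchor search omitted in this sketch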

Add notebooks showing pros/cons of different techniques

Highlight the pros and cons of different explanation techniques by comparing them on examples. Compare open-source libraries (e.g. LIME, SHAP, PDP, etc.) with the techniques in alibi and illustrate, for instance, cases where approaches that do not respect the data distribution when perturbing instances, or that rely on local linearization, generate misleading explanations.

Data transformations in the 'income prediction' example are performed on the entire dataset, rather than just on the training set

The anchor_text_movie example correctly fits CountVectorizer(min_df=1) to the train set and then uses it to transform both train and test sets.

In the anchor_tabular_adult example, on the other hand, we have

preprocessor = ColumnTransformer(transformers=[('num', ordinal_transformer, ordinal_features),
                                               ('cat', categorical_transformer, categorical_features)])
preprocessor.fit(data)

and then

clf.fit(preprocessor.transform(X_train), Y_train)
predict_fn = lambda x: clf.predict(preprocessor.transform(x))


In Hands-On Machine Learning with Scikit-Learn & TensorFlow, it says

As with all the transformations, it is important to fit the scalers to the training data only, not to the full dataset (including the test set). Only then can you use them to transform the training set and the test set (and new data)

Although it hardly makes any difference here, it's probably a better practice to fit the preprocessing to the train set rather than to the entire dataset.
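
Concretely, the suggested change is to restrict the fit to the training split, e.g. (assuming X_train is the training split used elsewhere in the notebook):

preprocessor.fit(X_train)   # fit scalers/encoders on the training data only
clf.fit(preprocessor.transform(X_train), Y_train)
predict_fn = lambda x: clf.predict(preprocessor.transform(x))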

Ambiguous tokenization for AnchorText

We use spaCy to produce AnchorText explanations; however, sometimes the tokenization does not match the metadata we return (e.g. the positions of words in the anchor under exp['raw']['features']) in cases where spaCy splits contracted words into multiple tokens (e.g. doesn't -> [does, n't], which results in an off-by-one error). We need to provide an unambiguous representation of which tokens are in the anchor, e.g. a list of index tuples (ix1, ix2) denoting the start and end of each word in the anchor with respect to the original text.

Also, commas currently seem to be tokenized too.
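
A small illustration of the proposed representation, using character-level (start, end) spans with respect to the original text (purely illustrative):

text = "This film doesn't work"
anchor_words = ["doesn't"]
# (start, end) character indices of each anchor word in the original text
spans = [(text.index(w), text.index(w) + len(w)) for w in anchor_words]
assert text[spans[0][0]:spans[0][1]] == "doesn't"   # spans == [(10, 17)]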

Top nav bar colour

The docs website is responsive, and on mobile/narrow windows the default theme colour shows up on the top navigation bar; this should be changed to Seldon blue (#5159ff).

Add hard target class constraint for prototypical counterfactuals

Currently, adding specific target classes for prototypical counterfactuals ensures that the prototype used to guide the counterfactual belongs to a class in the target class list, but it does not guarantee that the final counterfactual will belong to the specified target class (although it usually does). This hard constraint could be an optional argument and also requires adjusting the prediction loss term.

Add warning when no anchor is found

Currently the Anchor explainer returns an anchor of maximum precision when the maximum number of elements in the anchor is reached. We should warn the user that the desired precision threshold has not been reached, e.g. via a logger.warning message, as well as set a metadata flag in the returned explanation.
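
A minimal sketch of what this could look like (the variable names and the metadata field are hypothetical):

import logging

logger = logging.getLogger(__name__)

# after the anchor search has finished (sketch)
threshold, best_precision = 0.95, 0.88
explanation = {'anchor': ['Age > 40'], 'precision': best_precision, 'meta': {}}
if best_precision < threshold:
    logger.warning('No anchor found satisfying the %.2f precision threshold; '
                   'returning the best anchor found (precision %.2f).',
                   threshold, best_precision)
    explanation['meta']['threshold_reached'] = False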

Lexeme.similarity and AnchorText

Hi!
I'm using the AnchorText movie reviews example as a starting point for a blog post on explainable AI. I've run into two minor issues but I'd be interested in understanding them / maybe improving on them.

  1. When I am in synonym / use_proba=True mode, my CLI gets 1000s of lines of this warning - at least once for every movie review:
    <stdin>:1: UserWarning: [W008] Evaluating Lexeme.similarity based on empty vectors.

  2. When I made a super-easy classifier (sentences beginning with Apples, Oranges, or neither), the neither category is more of an absence-of-anchors, so predictions for it return an almost empty object. Could there be a better way to represent the 'null' category here?

{'names': [], 'precision': 1.0, 'coverage': 1, 'raw': {'feature': [], 'mean': [], 'precision': [], 'coverage': [], 'examples': [], 'all_precision': 1.0, 'num_preds': 101, 'names': [], 'positions': [], 'instance': 'This is a good book .', 'prediction': 2}}

Expose TensorBoard for internal optimization algorithms

We use TensorFlow for internal optimization problems (e.g. CEM and the upcoming Counterfactual instances) and exposing metrics to TensorBoard is crucial for developing and tuning the algorithms. It would be helpful to expose TensorBoard access within the library so that advanced users can deep-dive into the methods. This could also be useful for data scientists wanting to tune the hyperparameters of explainer methods specific to their model and dataset.
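
A minimal TF 1.x sketch of writing an optimization metric so that it shows up in TensorBoard (names and values are illustrative only):

import tensorflow as tf

loss = tf.placeholder(tf.float32, name='loss')
summary_op = tf.summary.scalar('loss', loss)
writer = tf.summary.FileWriter('./tb_logs')   # view with: tensorboard --logdir ./tb_logs

with tf.Session() as sess:
    # pretend these are the per-step loss values produced by the internal optimizer
    for step, loss_val in enumerate([1.0, 0.5, 0.25]):
        writer.add_summary(sess.run(summary_op, feed_dict={loss: loss_val}), step)
writer.close()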

Allow explain parameters in init call to AnchorImage and other explainers

To allow users to completely specify their explainer before explain is called, we would like all hyperparameters to be specifiable in the class's init function.

This will simplify integrations with inference engines such as KFServing.

For example, for AnchorImage:

anchor_exp = self.anchors_image.explain(arr[0], threshold=.95, p_sample=.5, tau=0.15)

We need to be able to specify threshold, p_sample and tau at class instance creation.
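
A sketch of what this could look like from the user's side; the constructor signature below is hypothetical (not the current alibi API), and predict_fn and arr are assumed to come from the surrounding notebook:

# hyperparameters supplied once, at explainer creation time
anchors_image = AnchorImage(predict_fn, image_shape=(299, 299, 3),
                            threshold=.95, p_sample=.5, tau=0.15)

# explain then only needs the instance to be explained
anchor_exp = anchors_image.explain(arr[0])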

Extend fetch_imagenet to handle arbitrary categories

Currently the fetch_imagenet data loading utility only has a few hard-coded categories (https://github.com/SeldonIO/alibi/blob/master/alibi/datasets.py#L50); it would be good to extend it to handle arbitrary categories. This may involve passing the category identifier to the function; alternatively, we could look into producing a complete mapping between categories and identifiers (image-net.org doesn't seem to have one) and hosting it on the alibi repo.

Unified explainer API

We should settle on a unified API. Currently the thinking is that an explainer has to implement an explain method and optionally a fit method (for cases where there is computation on, or a dependency on, the training set). We could write this as an abstract BaseExplainer from which every other explainer inherits. To support multiple inheritance, we could use mixin classes (cf. scikit-learn's BaseEstimator).
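
A minimal sketch of that structure (class and method bodies are illustrative, not a settled design):

from abc import ABC, abstractmethod

class BaseExplainer(ABC):
    """Every explainer must implement explain."""

    @abstractmethod
    def explain(self, X):
        raise NotImplementedError

class FitMixin:
    """Optional mixin for explainers that depend on the training set."""

    def fit(self, X_train):
        return self

class AnchorTabular(BaseExplainer, FitMixin):
    def explain(self, X):
        return {'anchor': [], 'precision': 1.0}   # placeholder return value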

Factor out FISTA as a stand-alone solver

The FISTA optimization algorithm is currently used to optimize the CEM loss, but it could be useful for other methods too (e.g. the counterfactual loss); it would be nice to eventually have a generic TensorFlow implementation.
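
For reference, a minimal NumPy sketch of a generic FISTA solver for minimising f(x) + g(x) (smooth f, proximable g), with soft-thresholding as the example prox; a TensorFlow version would follow the same update rule:

import numpy as np

def fista(grad_f, prox_g, x0, step, n_iter=200):
    """Generic FISTA: proximal gradient step on y_k followed by Nesterov momentum."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iter):
        x_next = prox_g(y - step * grad_f(y), step)
        t_next = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
        y = x_next + ((t - 1) / t_next) * (x_next - x)
        x, t = x_next, t_next
    return x

# example: L1-regularised least squares, min_x 0.5 * ||Ax - b||^2 + beta * ||x||_1
rng = np.random.RandomState(0)
A, b, beta = rng.randn(50, 20), rng.randn(50), 0.1
grad_f = lambda x: A.T @ (A @ x - b)
prox_g = lambda v, s: np.sign(v) * np.maximum(np.abs(v) - beta * s, 0.)   # soft-thresholding
x_opt = fista(grad_f, prox_g, np.zeros(20), step=1. / np.linalg.norm(A, 2) ** 2)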

Manage TF sessions internally

We don't want the user to have to manage TF sessions externally from the library, so the logic should be internal to the algorithms (currently counterfactuals, cfproto and cem). However, this is not straightforward due to the way sessions are kept alive. On first investigation there seem to be two choices:

  1. Use custom decorators (e.g. autoflow in GPFlow)
  2. Rewrite everything using eager mode (could align with #108)
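
Whichever route we take, the user-facing pattern could look something like the following sketch (the class is hypothetical), where the explainer owns a TF 1.x session if none is supplied:

import tensorflow as tf

class CounterfactualExplainer:
    """Hypothetical explainer that manages its own TF session."""

    def __init__(self, sess=None):
        # create and own the session if the caller did not supply one
        self._owns_sess = sess is None
        self.sess = sess if sess is not None else tf.Session()

    def close(self):
        # only close sessions we created ourselves
        if self._owns_sess:
            self.sess.close()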

Influence functions

This method requires taking Hessians of the loss function, so it is not model-agnostic. Implementations will rely on ML frameworks with autograd. We can do TensorFlow for 0.1.0 and PyTorch afterwards.
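
As a reminder of the primitive involved, here is a toy TF 1.x sketch computing the Hessian of a loss with respect to the model parameters (an actual influence-function implementation would approximate inverse-Hessian-vector products rather than form the Hessian explicitly):

import tensorflow as tf

# toy logistic-regression loss over two parameters
w = tf.Variable([1.0, -2.0])
x = tf.constant([[0.5, 1.5], [2.0, -1.0]])
y = tf.constant([1.0, 0.0])
logits = tf.reduce_sum(x * w, axis=1)
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))
hessian = tf.hessians(loss, w)[0]   # 2x2 matrix of second derivatives

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(hessian))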

Remove dependency on LIME

Currently we use methods from the lime package to discretize ordinal features; we should include this functionality in core alibi to avoid the dependency on lime.
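
The functionality itself is small; a minimal sketch of a quartile discretizer (roughly what the lime discretizer provides), assuming a 1-D numerical feature array:

import numpy as np

values = np.random.RandomState(0).normal(size=1000)   # stand-in for a training-set feature
bins = np.percentile(values, [25, 50, 75])             # bin edges at the quartiles
discretized = np.digitize(values, bins)                # maps every value to a bin id in {0, 1, 2, 3}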

Don't require Spacy for NLP model explanations

The "ELI5" people who implement LIME in the form of TextExplainer offer a great blueprint on how to use the scikit learn tool-chain (which itself is extremely easy to work with) for allowing LIME explanations of any NLP classifier.

I've successfully used it for a multitude of tasks, including explaining word-embedding-powered Keras NLP models and multilabel model explanations (after a few hacks).

I'd love to try out this anchor-based method, but the API expecting a spaCy object is a deal-breaker for me.

Could the CEM explanations contain extra colours?

The CEM paper shows an example image which is made very clear via the use of different colours (cyan for the pertinent positive, pink for the pertinent negative):
[screenshot from the CEM paper showing coloured pertinent positive/negative examples]

On the other hand, the cem_mnist example from the alibi docs contains black-and-white images. I think this makes the pertinent negative example a little hard to understand, as it's not obvious which extra pixels are minimally but critically absent. Could they be coloured in?

I can work on a PR if you're interested in this.
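
One way to do this would be a coloured overlay of the pertinent negative on top of the original digit; a sketch using matplotlib, with stand-in arrays in place of the original image and the PN delta returned by CEM:

import numpy as np
import matplotlib.pyplot as plt

orig = np.random.RandomState(0).rand(28, 28)                  # stand-in for the original image
pn = (np.random.RandomState(1).rand(28, 28) > 0.97) * 1.0     # stand-in for the PN delta

plt.imshow(orig, cmap='gray')
# overlay only the non-zero PN pixels in a distinct colour, leaving the rest transparent
plt.imshow(np.ma.masked_where(pn == 0, pn), cmap='spring', alpha=0.9)
plt.axis('off')
plt.show()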

pytest collection could be much faster

Currently, just collecting the tests for the Travis CI job takes about 4 minutes. This could be sped up, e.g. by only searching for tests in the right folders.

I'll verify what's going on exactly and aim to submit a pull request tomorrow.

Create custom docker image for CI

Our CI requires a lot of custom dependencies and even datasets (e.g. the spaCy corpus), which get downloaded and installed every time on top of the base Python CI image. Creating our own image with all the dependencies pre-installed would make the process significantly faster.

Migrate Keras dependency to tf.keras

Since we are using TensorFlow for some optimization algorithms, and with TF 2.0 on the horizon, it probably makes sense to use tf.keras exclusively rather than the standalone Keras library.
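
In most places this should amount to swapping the import path, e.g.:

# standalone Keras (current)
from keras.models import Model
from keras.layers import Dense

# tf.keras (proposed)
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense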

Update examples to reflect positional Session parameter

Currently the CEM and CF example notebooks are missing the Session parameter. K.get_session() could be added to the examples so they align with the requirements.

Follow-up question: would it not be worth having the session parameter as a kwarg?

Document the performance of Anchors on imbalanced datasets

Currently the main example of anchor explanations on tabular datasets will often fail to find a valid anchor for the positive (>$50k) class. This is due to the imbalanced dataset (roughly 25:75 high:low earner proportion), so during the sampling stage feature ranges corresponding to low earners will be oversampled. This is a feature in the sense that it can point out an imbalanced dataset, but it can also be addressed by producing balanced datasets so that anchors can be found for either class.
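
A minimal sketch of producing such a balanced dataset by downsampling the majority class (toy data, illustrative only):

import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(1000, 5)
y = (rng.rand(1000) < 0.25).astype(int)   # roughly 25:75 class imbalance

idx_pos, idx_neg = np.where(y == 1)[0], np.where(y == 0)[0]
n = min(len(idx_pos), len(idx_neg))
idx = np.concatenate([rng.choice(idx_pos, n, replace=False),
                      rng.choice(idx_neg, n, replace=False)])
X_bal, y_bal = X[idx], y[idx]             # balanced sample to draw anchor candidates from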

Roadmap for dealing with categorical variables

Categorical variables present a lot of difficulties for algorithms which rely on perturbations or sampling around a test instance. We should create a roadmap for a principled treatment of categorical variables in our explainer algorithms.
