
lime-experiments's Introduction

This repository contains the code to run the experiments presented in this paper. The code here is frozen to what it was when we originally wrote the paper. If you're interested in using LIME, check out this repository, where we have packaged it up, improved the code quality, and added visualizations and other improvements.

Running the commands below should be enough to get all of the results. You need specific versions of Python, sklearn, numpy, and scipy. Install the requirements in a virtualenv using:

pip install -r requirements.txt

If we forgot something, please email the first author.

Experiment in section 5.2:

  • DATASET -> 'multi_polarity_books', 'multi_polarity_kitchen', 'multi_polarity_dvd' or 'multi_polarity_electronics'

  • ALGORITHM -> 'l1logreg', 'tree'

  • EXPLAINER -> 'lime', 'parzen', 'greedy' or 'random'

      python evaluate_explanations.py --dataset DATASET --algorithm ALGORITHM --explainer EXPLAINER 
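To run every combination above in one go, a small sweep like the following works; the `echo` only prints each command (drop it to actually launch the runs), and the `multi_polarity_electronics` name for the fourth dataset is my assumption:

```shell
runs=0
for DATASET in multi_polarity_books multi_polarity_kitchen \
               multi_polarity_dvd multi_polarity_electronics; do
  for ALGORITHM in l1logreg tree; do
    for EXPLAINER in lime parzen greedy random; do
      # Drop the echo to actually launch each experiment.
      echo python evaluate_explanations.py --dataset "$DATASET" \
        --algorithm "$ALGORITHM" --explainer "$EXPLAINER"
      runs=$((runs + 1))
    done
  done
done
echo "$runs combinations"
```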
    

Experiment in section 5.3:

  • DATASET -> 'multi_polarity_books', 'multi_polarity_kitchen', 'multi_polarity_dvd' or 'multi_polarity_electronics'

  • ALGORITHM -> 'logreg', 'random_forest', 'svm', 'tree' or 'embforest', although you would need to set up word2vec for embforest

  • NUM_ROUNDS -> Desired number of rounds

      python data_trusting.py -d DATASET -a ALGORITHM -k 10 -u .25 -r NUM_ROUNDS
    

Experiment in section 5.4:

  • NUM_ROUNDS -> Desired number of rounds

  • DATASET -> 'multi_polarity_books', 'multi_polarity_kitchen', 'multi_polarity_dvd' or 'multi_polarity_electronics'

  • PICK -> 'submodular' or 'random'

Run the following with the desired number of rounds:

      mkdir out_comparing
    
      python generate_data_for_compare_classifiers.py -d DATASET -o out_comparing/ -k 10 -r NUM_ROUNDS
    
      python compare_classifiers.py -d DATASET -o out_comparing/ -k 10 -n 10 -p PICK
    

Religion dataset:

Available here

Multi-polarity datasets:

I got them from here

lime-experiments's People

Contributors: marcotcr

lime-experiments's Issues

Questions for Experiment 5.2

Hello,

Thank you for sharing this great work. I have some questions about experiment 5.2; I'm not clear on what it is trying to achieve. As far as I understand, you train a base model that is naturally interpretable (logistic regression or a decision tree) to compare against explainers like LIME, Parzen, etc. For each instance, you take at most 10 features from this base model as "gold features" and check how many of these gold features are recovered by the explainers. If my understanding is correct, I have this question: if the dataset is complex enough that logistic regression or a decision tree performs poorly on it, are the selected gold features still reliable to compare the explainers against?
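If that reading is right, the metric would amount to something like this sketch (the function name and toy feature ids are mine, not the repo's code):

```python
def gold_feature_recall(gold_features, explainer_features):
    """Fraction of the interpretable base model's gold features
    that the explainer recovered."""
    gold = set(gold_features)
    return len(gold & set(explainer_features)) / len(gold)

# Toy numbers: the base model used features {3, 7, 9} and the
# explainer returned [7, 9, 12], so recall is 2/3.
recall = gold_feature_recall([3, 7, 9], [7, 9, 12])
```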

ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (1600,19902) and requested shape (1,19902)

File "/home/marianne/lime-experiments/parzen_windows.py", line 67, in explain_instance
    minus = self.X - x
File "/home/marianne/.local/lib/python2.7/site-packages/scipy/sparse/base.py", line 408, in __rsub__
    other = broadcast_to(other, self.shape)
File "/home/marianne/.local/lib/python2.7/site-packages/numpy/lib/stride_tricks.py", line 173, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
File "/home/marianne/.local/lib/python2.7/site-packages/numpy/lib/stride_tricks.py", line 128, in _broadcast_to
    op_flags=[op_flag], itershape=shape, order='C').itviews[0]

To reproduce the error: python evaluate_explanations.py --dataset multi_polarity_books --algorithm l1logreg --explainer parzen
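The failure seems to come from subtracting a sparse row from a dense matrix: the subtraction dispatches to the sparse matrix's `__rsub__`, which tries to broadcast the dense operand down to the sparse operand's shape and fails. A minimal reproduction and a possible workaround (densify the sparse operand first), under the assumption that `x` is a sparse `(1, n)` row and `self.X` a dense matrix:

```python
import numpy as np
import scipy.sparse as sp

X = np.ones((4, 3))                 # stand-in for the dense (1600, 19902) matrix
x = sp.csr_matrix(np.ones((1, 3)))  # a single sparse instance row

try:
    minus = X - x                   # dispatches to the sparse __rsub__, which
except ValueError:                  # refuses to broadcast (4, 3) to (1, 3)
    minus = X - x.toarray()         # densify first: plain NumPy broadcasting
```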

Code freeze

Hi,

I'm very surprised to see you use GitHub in this way, to publish a single, apparently frozen version of research code. It is somewhat inadequate: it offers no guarantee of preservation or frozen-ness, and no permanent URL (should you someday decide to change your GitHub handle, this URL will die).

Have you considered something like Zenodo, which preserves artifacts on CERN's repository, is free, mints DOIs, and integrates with GitHub to import code automatically when you tag?

Another solution is the OSF, which is also free, mints DOIs and integrates with GitHub, though it is meant for more than snapshotting.

I guess it is too late now that your paper refers to this repository -- I just thought I'd offer some unsolicited advice. Nevertheless, thank you for your concern for reproducibility and your efforts toward it.

Cheers!

Experiment in section 5.2 dataset

There are three versions of the multi_polarity dataset at the link provided:

  • unprocessed.tar.gz
  • processed_acl.tar.gz
  • processed_star

Which one did you use in your experiment in section 5.2?

Datasets?

Hi Marco,

I'm wondering, in order to actually run the experiments on our end, where we can get the "multi_polarity_" datasets?

Thank you!

Error when using scikit learn version 0.19

AttributeError: 'LogisticRegression' object has no attribute 'transform'

How to reproduce the error:

  1. Install scikit-learn 0.19: pip install 'scikit-learn==0.19' --force-reinstall
  2. Run: python evaluate_explanations.py --dataset multi_polarity_books --algorithm l1logreg --explainer random

How to fix it?

Install scikit-learn 0.16 (not sure which version is best in this case, but 0.16 still has the transform method) together with cross_val_predict.
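In newer scikit-learn, the same feature selection can be done without the removed `estimator.transform` method, e.g. via `SelectFromModel`. A sketch on toy data, not the repo's exact code path:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectFromModel

rng = np.random.RandomState(0)
X = rng.rand(40, 6)
y = (X[:, 0] > 0.5).astype(int)  # label driven by feature 0 only

clf = LogisticRegression(penalty='l1', solver='liblinear', C=1.0).fit(X, y)

# Replacement for the removed clf.transform(X): keep the columns whose
# L1-regularized coefficient magnitude clears the (default) threshold.
selector = SelectFromModel(clf, prefit=True)
X_reduced = selector.transform(X)
```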

Data for the "Husky vs Wolf"

Hi! Thanks for the great work!
I was wondering if you published the data used for the "Husky vs Wolf" or could share it?

Thanks and best regards
Verena

Building SP LIME from experiment 5.4

Hello,

I'm trying to implement SP-LIME using the code you have in compare_classifiers.py. I was planning on re-using the functions submodular_fn, submodular_pick and greedy, and then just calling pick = pick_function(pickled_map, 'lime', B) as you have it (but with the explanations instead of pickled_map). But I have noticed that you have re-implemented LIME in explainers.py, and the GeneralizedLocalExplainer you use does things differently from LIME (shifts the labels by the mean, uses Lasso instead of Ridge, etc.). I was planning on using LIME from the lime repo to pass the explanations (I guess not the object but the list of (feature_id, weight) tuples) to pick = pick_function(pickled_map, 'lime', B).

My question is: is there anything in the GeneralizedLocalExplainer that is important for SP-LIME and that I should take into account when building it? I cannot figure out exactly why the mean shift and the other details are there, since there aren't many comments. Thanks!
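For what it's worth, the pick step itself is independent of which local explainer produced the weights. A bare-bones greedy submodular pick over a matrix of explanation weights might look like this (my own sketch of the coverage objective from the paper, not the repo's code):

```python
import numpy as np

def submodular_pick(W, budget):
    """Greedily choose `budget` rows of W (one row of explanation
    weights per instance), maximizing the total importance of the
    features the chosen explanations cover."""
    W = np.abs(np.asarray(W, dtype=float))
    importance = np.sqrt(W.sum(axis=0))  # global feature importance I_j
    chosen = []
    for _ in range(budget):
        best_gain, best_row = -1.0, None
        for i in range(W.shape[0]):
            if i in chosen:
                continue
            # Features covered if we add instance i to the chosen set.
            covered = W[chosen + [i]].sum(axis=0) > 0
            gain = importance[covered].sum()
            if gain > best_gain:
                best_gain, best_row = gain, i
        chosen.append(best_row)
    return chosen
```

On a toy 4x3 weight matrix, the pick favors the instance whose explanation covers the most important features first, then the one adding the most new coverage.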
