Comments (3)
Good questions!
- All the background examples are evaluated for each sample. In the case you describe there are `n * k` evaluations of the predictor (passed as a batch to the provided predictor function). I should also note that if `k >= 2^d` then the entire space is evaluated directly (rather than randomly sampling subsets from the kernel distribution), and this leads to an exact computation.
- Very close except for step 2. As mentioned above, in step 2 we produce `n` copies, with each copy using a different `x_i` from the background dataset. The expected value of the prediction is taken over these `n` copies.
- The paper presents two approximations to conditional expectations that make them easier to compute. The first is feature independence, which allows us to integrate out each feature independently of the others, and the second is model linearity (as an approximation), which allows us to provide only a single reference value for the background. Tree SHAP does not make either of these assumptions (it just assumes the tree has captured any feature dependence), but Kernel SHAP as implemented here always assumes feature independence, and may assume linearity depending on how you use it. If you pass just one reference value for the background then you are assuming linearity; if you pass multiple samples then it will integrate over them to provide a better approximation.
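To make the counting and the linearity point concrete, here is a minimal NumPy sketch of the masked expectation described above. It is not the library's implementation: the function name `masked_expectation`, the toy linear `predict`, and the random data are all assumptions made for illustration. For each coalition it builds `n` copies of the background, overwrites the "present" features with the values from the sample being explained, and averages the predictions.

```python
import numpy as np

def masked_expectation(predict, x, mask, background):
    """Average prediction with the masked-in features fixed to x's values
    and the remaining features taken from each background row."""
    copies = np.array(background, dtype=float, copy=True)  # n copies, one per background row
    copies[:, mask] = x[mask]                               # "present" features come from x
    return predict(copies).mean()                           # expectation over the n copies

def predict(X):
    # Hypothetical toy model: linear, so a single mean reference is exact.
    return X @ np.array([2.0, -1.0, 0.5, 0.0]) + 1.0

rng = np.random.default_rng(0)
background = rng.normal(size=(50, 4))   # n = 50 background rows, d = 4 features
x = np.array([1.0, -2.0, 0.5, 3.0])     # the sample being explained
d = x.size

# Enumerate all 2^d coalitions (the exact case, k >= 2^d).
masks = [np.array([(i >> j) & 1 for j in range(d)], dtype=bool)
         for i in range(2 ** d)]

# Integrating over the full background costs n * k model evaluations.
values = [masked_expectation(predict, x, m, background) for m in masks]
n_evals = len(background) * len(masks)

# Passing a single reference row (here the background mean) costs only k.
single = background.mean(axis=0, keepdims=True)
values_single = [masked_expectation(predict, x, m, single) for m in masks]

# The two agree only because this toy model is linear.
same_for_linear_model = np.allclose(values, values_single)
```

With a non-linear `predict` the two lists would generally differ, which is where integrating over several background samples gives the better approximation.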
I should make this clearer in the docs, but I recommend using either a single background data point, a small random subset of the true background, or for the best performance a set of k-medians (weighted by how many training points they each represent) designed to represent the background succinctly. Passing a large background dataset wastes lots of effort. Perhaps I'll add a warning about that to the code.
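As a rough illustration of the weighted-summary idea, the sketch below clusters a background set into a handful of medians and weights each by the number of rows it represents. It is a plain NumPy k-medians written for this example (the names `summarize_background` and `predict` are hypothetical), not the routine the library uses; in practice the library's own helpers, such as `shap.kmeans` and `shap.sample`, are probably the more convenient route.

```python
import numpy as np

def summarize_background(X, k, n_iter=20, seed=0):
    """Simple k-medians: returns (centers, weights), where weights[j] is the
    number of rows of X assigned to centers[j]."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each row to its nearest center under L1 distance.
        dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):                      # keep the old center if a cluster empties
                centers[j] = np.median(members, axis=0)
    # Final assignment against the final centers, then count rows per center.
    dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
    weights = np.bincount(dists.argmin(axis=1), minlength=k)
    return centers, weights

def predict(X):
    # Hypothetical non-linear toy model.
    return np.tanh(X).sum(axis=1)

rng = np.random.default_rng(1)
X_train = rng.normal(size=(200, 3))               # stand-in for the true background

centers, weights = summarize_background(X_train, k=5)

# The weighted average over 5 medians stands in for the mean over all 200 rows,
# so each coalition costs 5 model evaluations instead of 200.
approx_expected = np.average(predict(centers), weights=weights)
full_expected = predict(X_train).mean()
```

How close `approx_expected` comes to `full_expected` depends on the model and the data, so it is worth checking on a few coalitions before relying on a small summary.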
I see. It is much clearer now. Thanks!
I should also mention that the choice to use a single background reference or multiple samples depends on the type of input data. For images, integrating over a few instances is unlikely to be any better an approximation than just using a single reference. But for structured data, using 20 weighted k-medians or a small random subsample might help.