
Comments (8)

slundberg commented on May 22, 2024

@jayden526 SHAP values work well for regression; in fact, the Boston housing example in the README is a least-squares regression. The SHAP values are in the same units as the model output (for Tree SHAP in XGBoost, this is the output before the link function, such as a logistic, is applied). So if you are predicting dollars, the SHAP values will be in dollars and, together with the base value, will sum to the output of the model.
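The units and additivity claim above can be checked on a toy problem. The following is a minimal sketch, not the shap library's implementation: it computes exact Shapley values by brute-force subset enumeration for a made-up price model. The `model` function, the feature meanings, and the `background` rows are all hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def model(x):
    """Hypothetical regression model: predicted price in dollars."""
    area, rooms, distance = x
    return 50_000 + 120 * area + 8_000 * rooms - 1_500 * distance + 10 * area * rooms


def value(subset, x, background):
    """Mean model output when features in `subset` come from x
    and all other features come from each background row in turn."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(x, background):
    """Exact Shapley values by enumerating every feature subset (fine for tiny n)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                phi[i] += weight * (value(s | {i}, x, background) - value(s, x, background))
    return phi


# Hypothetical background data and the instance to explain: [area, rooms, distance]
background = [[80, 2, 10], [120, 3, 5], [200, 5, 2]]
x = [150, 4, 3]

base_value = value(set(), x, background)  # average prediction over the background
phi = shapley_values(x, background)       # per-feature contributions, in dollars
prediction = model(x)

print("base value:", base_value)
print("contributions:", phi)
print("base + sum(contributions):", base_value + sum(phi))
print("model prediction:", prediction)
```

Each contribution is a dollar amount, and the base value plus the contributions should reproduce the model's prediction up to floating-point error.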

As for the error, if there is a simple example of how you got it, please post it and I'll fix it.

FYI: if you are using a tree model, I would suggest using XGBoost and getting the exact SHAP values rather than using the model-agnostic Kernel SHAP on a scikit-learn model.

from shap.

JuanCorp commented on May 22, 2024

I've used shap and the summary plot for a house list price problem before, which is a regression, and the explanations work just fine and match what I would expect from a logical standpoint. For example, construction area, distance to certain places of interest, and the house's geographical sector were all top features. I don't have the plot at hand, but a mini app that uses an XGBoost model for house list price prediction (at least in my city) is available on my profile, although it still needs some fixes.

From what I've understood, the Shapley value for each feature acts like a weight or coefficient in a regression. There's also the bias or intercept. This bias is the base value for the predictions of the model, for example, the average price of all houses in the dataset. For a single data point, each coefficient represents the impact of the feature on the final prediction. These coefficients and the intercept are added together, and then the sigmoid function is applied to the result of the sum. The result of the sigmoid function is the prediction that the original model gave, which is a probability between 0 and 1. For regression models, the process is the same, except that the sigmoid step is skipped, since the output isn't between 0 and 1, but continuous.
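The arithmetic described above can be sketched in a few lines. This is only an illustration with made-up numbers: the feature names, base values, and contributions are hypothetical, and unlike fixed regression coefficients, real SHAP contributions are computed separately for each prediction.

```python
import math


def sigmoid(z):
    """Logistic function mapping a log-odds margin to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Classification case: base value and contributions are in log-odds units.
# All numbers below are hypothetical.
base_value = -0.4
contributions = {
    "construction_area": 1.1,
    "distance_to_center": -0.3,
    "sector": 0.5,
}
margin = base_value + sum(contributions.values())  # model output before the link
probability = sigmoid(margin)                      # model output after the link
print("margin (log odds):", margin)
print("probability:", probability)

# Regression case: same sum, but no sigmoid step; units are those of the target.
base_price = 250_000  # e.g. average predicted price, in dollars (hypothetical)
price_contributions = {
    "construction_area": 40_000,
    "distance_to_center": -15_000,
    "sector": 5_000,
}
predicted_price = base_price + sum(price_contributions.values())
print("predicted price:", predicted_price)
```

In the classification case the contributions add up in log-odds space and only the final sum is passed through the sigmoid, so the individual contributions are not themselves probabilities.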

@slundberg can give you better details, though, so you should wait for his input.


jayden526 commented on May 22, 2024

Thank you @JuanCorp, I think you are right. Even for classification, the log odds need to be computed in order to find the probability. The syntax I tried is based on the classification example:

shap_values = shap.KernelExplainer(randomforest.predict, X_train).shap_values(X_test)
shap.summary_plot(shap_values, X_test)

Is this the same as yours? At least now I can get the SHAP values.
@slundberg Would you mind clarifying how to interpret shap_values in regressions? If it is already covered in your paper, please let me know and I can check there. Thank you.


jayden526 commented on May 22, 2024

Sorry for asking again. I sometimes get a runtime error when I use a different number of samples in X_test (sometimes it is OK, but sometimes, if I only use 100 samples from the test set, this error occurs):

Exception in thread Thread-15
RuntimeError: Set changed size during iteration

Could you help me with this? Thank you!


jayden526 commented on May 22, 2024

@slundberg Thank you so much! I will definitely try XGBoost to see whether it works for me.


slundberg commented on May 22, 2024

sounds good


andymancodes commented on May 22, 2024

@slundberg Hi, thanks for the great package! I don't understand how to use my own dataset with shap. What is the purpose of *shap.dataset, and how can I use my own datasets in the form of (X, y) with SHAP? Thanks :)


slundberg commented on May 22, 2024

Do you have a model and a dataset or just a dataset representing the output of the model? Perhaps clarifying what doesn't make sense about the examples in the README would be helpful.
