dylan-slack / modeling-uncertainty-local-explainability
Local explanations with uncertainty!
License: MIT License
Hi Dylan!
I was running your code on a number of datapoints to calculate SHAP estimates using KernelSHAP, and on one particular datapoint I got a numpy "negative dimensions are not allowed" error, after which the code crashed. The error was raised at line 521 in explanations.py. I believe it occurs because the value of n_needed for that datapoint came out lower than ptg_initial_points, which defaults to 200. This does not happen for all datapoints because the calculated n_needed differs each time, but adding an if-else statement to handle this edge case and bypass the second call to _shap_tabular_perturb_n_samples fixed it.
Regards
Suchismita
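The guard described above could look roughly like this. This is a minimal sketch, not the repository's actual code: samples_to_draw is a hypothetical helper, while n_needed and ptg_initial_points mirror the names in explanations.py.

```python
def samples_to_draw(n_needed, ptg_initial_points=200):
    """Return how many additional perturbations to draw after the
    initial PTG batch. If the initial batch already covers n_needed,
    skip the second sampling call entirely."""
    extra = n_needed - ptg_initial_points
    if extra <= 0:
        # Edge case: n_needed < ptg_initial_points would otherwise
        # request a negative number of samples, which makes numpy
        # raise "negative dimensions are not allowed".
        return 0
    return extra
```

The second call to _shap_tabular_perturb_n_samples would then only be made when this returns a positive count.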
Hi Dylan!
The value of z for the standard normal distribution used in the calculation of the PTG estimate is hardcoded to 1.96 on line 105 of the get_ptg method in regression.py. This gives an erroneous PTG estimate for confidence levels other than 95%, so the value needs to be calculated dynamically from self.percent.
Regards
Suchismita
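A sketch of the dynamic calculation, assuming self.percent holds the confidence level as a percentage (e.g. 95). The helper name z_for_confidence is illustrative; the standard way to get the two-sided critical value is the normal quantile function:

```python
from scipy.stats import norm

def z_for_confidence(percent):
    """Two-sided critical value of the standard normal for a given
    confidence level, e.g. percent=95 -> 1.9599...  Using this in
    get_ptg instead of the hardcoded 1.96 makes the PTG estimate
    respect the requested confidence level."""
    alpha = 1.0 - percent / 100.0
    return norm.ppf(1.0 - alpha / 2.0)
```

For percent=95 this recovers the familiar 1.96; for percent=99 it yields roughly 2.576.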
Hi Dylan,
When I run BayesSHAP on my regression task, I encounter the convergence error below at the self.enumerate_initial_shap function. Any idea why this happens?
I saw enumerate_initial is set to True by default, which leads to this self.enumerate_initial_shap call. Is it necessary to do so when performing BayesSHAP?
Also is BayesSHAP scalable to high input dimensions?
Best
Robin
Traceback (most recent call last):
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-15-414d493a31bd>", line 14, in <module>
e_i = exp.explain(X_bayesshap[i], **exp_kwargs)
File "/home/Robin/BNN_Torch/bayesshap/explanations.py", line 651, in explain
l2=l2)
File "/home/Robin/BNN_Torch/bayesshap/explanations.py", line 457, in _explain_bayes_shap
data_init, inverse_init = self._enumerate_initial_shap(data, max_coefs)
File "/home/Robin/BNN_Torch/bayesshap/explanations.py", line 417, in _enumerate_initial_shap
inverse = self.shap_info.discretizer.undiscretize(data)
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/lime/discretize.py", line 145, in undiscretize
feature, ret[:, feature].astype(int)
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/lime/discretize.py", line 132, in get_undiscretize_values
random_state=self.random_state
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/scipy/stats/_distn_infrastructure.py", line 980, in rvs
vals = self._rvs(*args)
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/scipy/stats/_distn_infrastructure.py", line 913, in _rvs
Y = self._ppf(U, *args)
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/scipy/stats/_continuous_distns.py", line 7163, in _ppf
return _truncnorm_ppf(q, a, b)
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/scipy/stats/_continuous_distns.py", line 6933, in vf_wrapper
return vf(*args)
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/numpy/lib/function_base.py", line 2163, in __call__
return self._vectorize_call(func=func, args=vargs)
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/numpy/lib/function_base.py", line 2246, in _vectorize_call
outputs = ufunc(*inputs)
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/scipy/stats/_continuous_distns.py", line 7113, in _truncnorm_ppf
maxiter=TRUNCNORM_MAX_BRENT_ITERS)
File "/home/miniconda3/envs/hpmodel/lib/python3.7/site-packages/scipy/optimize/zeros.py", line 780, in brentq
r = _zeros._brentq(f, a, b, xtol, rtol, maxiter, args, full_output, disp)
RuntimeError: Failed to converge after 40 iterations.
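One possible workaround, assuming enumerate_initial is in fact forwarded through the explainer's keyword arguments as the discussion above suggests (this call shape is hypothetical, not confirmed against the repository):

```python
# Hypothetical: if enumerate_initial is accepted as a keyword
# (it defaults to True per the discussion above), setting it to
# False would bypass _enumerate_initial_shap and the lime
# discretizer call whose brentq root-finding fails to converge here.
exp_kwargs = {"enumerate_initial": False}
# e_i = exp.explain(X_bayesshap[i], **exp_kwargs)
```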
Hi Dylan,
Thanks a lot for releasing the code! I'm looking into applying your BayesSHAP to explain the input feature importance of a high-dimensional regression model. Based on my understanding of your method, the Bayesian inference is directly applicable to regression tasks. I've made the following minor edits to explanations.py in order to perform BayesSHAP on regression tasks.
- categorical_features is set to None, as there are no categorical inputs in my case
- assert mode in ["classification"] -> assert mode in ["classification", "regression"]
- label is set to 0, as there is only a 1-d output for regression
- classifier_f with output dimension [n_test, 1]
I'd deeply appreciate it if you could double-check whether those changes are correct/sufficient. Thanks a lot.
Robin
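The last edit in the list above can be sketched as a small wrapper. This is an illustrative adapter under Robin's stated assumption that the explainer indexes the prediction with label=0; make_regression_f is a hypothetical helper name, not part of the repository:

```python
import numpy as np

def make_regression_f(model_predict):
    """Wrap a regression predictor so its output has shape
    [n_test, 1], matching an explainer that indexes predictions
    with label=0. `model_predict` is assumed to map an [n_test, d]
    array to an [n_test] array of scalar predictions."""
    def classifier_f(x):
        preds = np.asarray(model_predict(x), dtype=float)
        return preds.reshape(-1, 1)  # [n_test] -> [n_test, 1]
    return classifier_f
```

The wrapped function can then be passed wherever classifier_f is expected, with no change needed on the model side.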
Hello, will regression versions of Bayes LIME and SHAP be supported in the future? Thank you.