Thank you for your question!
If you are generating counterfactuals for class 1 (i.e., in `orig_pd`) and some counterfactuals still belong to class 1 (i.e., in `cf_pd`), then it means the method couldn't find a counterfactual for your given input. This is measured by the validity score, and all counterfactual methods can struggle with it. For more details on the topic, see the paper here.
If you look in the paper, you will see that the objective function is a mixture of a divergence loss (i.e., the one that flips the class), a sparsity loss (i.e., the one that keeps the counterfactual close to the original input), and a consistency loss. There might be some difficult cases in which the method favours sparsity over divergence, which leads to what you've described.
That being said, to increase the validity of your counterfactuals I recommend the following:
- Increase `TRAIN_STEPS`. In the example notebook it is set to a very low value for demonstration purposes. You can set it to `TRAIN_STEPS=100000`, or even `TRAIN_STEPS=150000`; the more the better.
- Decrease `COEFF_SPARSITY` to a lower value. Currently it is set to 0.5; the lower it is, the better your chances of getting a higher validity score. In the beginning you might also set `COEFF_CONSISTENCY=0` for ease of hyper-parameter tuning.
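Concretely, the suggested settings might look like this (the constant names follow the example notebook; the exact values below are illustrative starting points, not a recipe):

```python
# Illustrative hyper-parameter settings following the advice above.
TRAIN_STEPS = 100_000       # far more steps than the demo value; more is generally better
COEFF_SPARSITY = 0.1        # lower than the default 0.5 to favour class-flipping over sparsity
COEFF_CONSISTENCY = 0.0     # disable the consistency loss at first for easier tuning
```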
I also recommend using the logging functionality through callbacks, presented at the very end of the notebook. This will give you a much better sense of how the training evolves, and you can actually see how the validity increases during training.
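To illustrate what such logging tracks: validity is just the fraction of candidate counterfactuals that the model actually classifies as the intended target. A small sketch of that computation (the helper and toy predictor below are hypothetical, not part of alibi's API):

```python
import numpy as np

def validity(predictor, cf, target):
    """Fraction of counterfactuals classified as their intended target class.

    predictor: callable returning class probabilities of shape (n, n_classes)
    cf:        candidate counterfactuals, shape (n, n_features)
    target:    intended target classes, shape (n,)
    """
    pred = np.argmax(predictor(cf), axis=1)
    return float(np.mean(pred == target))

# Toy predictor: class 1 iff the first feature is positive.
toy_predictor = lambda x: np.stack([(x[:, 0] <= 0).astype(float),
                                    (x[:, 0] > 0).astype(float)], axis=1)

cf = np.array([[0.5, 1.0], [-0.2, 3.0], [2.0, 0.0]])
print(validity(toy_predictor, cf, np.array([1, 1, 1])))  # 2 of 3 flipped -> 0.666...
```

Computing this inside a logging callback at regular intervals lets you see whether longer training or a lower sparsity coefficient is actually paying off.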
There might be other parameters to try if none of the above works, such as the action noise (`act_noise`), `replay_buffer_size`, `batch_size`, etc., but I would probably not go there yet.
You can also try to train it without an autoencoder (see example here).
If none of the above works, and you really need a counterfactual, then probably the easiest way to get one is to search your training set for an instance that is classified as your intended target and is close to the input instance w.r.t. some metric of your choice (probably a combination of L1 and L0). This should work every time if you didn't impose any constraints on the feature values of the counterfactual.
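A minimal sketch of that fallback search, assuming the features are already standardized and using an L1 term plus a weighted L0 term (the function name, weight, and toy arrays below are made up for illustration):

```python
import numpy as np

def nearest_counterfactual(x, X_train, y_pred, target, l0_weight=1.0):
    """Return the training instance predicted as `target` that is closest
    to `x` under a combined L1 + weighted-L0 distance."""
    candidates = X_train[y_pred == target]      # instances already classified as the target
    if len(candidates) == 0:
        raise ValueError("no training instance is classified as the target class")
    diff = candidates - x
    l1 = np.abs(diff).sum(axis=1)               # total magnitude of change
    l0 = (diff != 0).sum(axis=1)                # number of features changed
    return candidates[np.argmin(l1 + l0_weight * l0)]

X_train = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
y_pred = np.array([0, 1, 1])                    # model predictions on the training set
cf = nearest_counterfactual(np.array([0.1, 0.0]), X_train, y_pred, target=1)
print(cf)  # -> [1. 0.]
```

Using the model's predictions (`y_pred`) rather than the ground-truth labels matters here: you want an instance the model actually classifies as the target.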
Although this might not be the case for you, it may also be worth looking at this issue, just to avoid an error when computing the validity.
Hope this helps. If you still encounter difficulties, please let me know.
from alibi.
Thanks for your response, I will try to change the configuration.
Meanwhile, regarding your comment:
"If none of the above works, and you really need a counterfactual, then probably the easiest way to get one is to search in your training set an instance that is classified as your intended target and it is close to the input instance w.r.t. some metric of your choice (probably a combination of L1 and L0). This should work all the time if you didn't impose any constraints on the feature values of the counterfactual."
To measure closeness in data that combines both categorical and numerical values across 30 features, I tried cosine similarity earlier, but the results are not good (the similarity is above 0.95 for almost all data).
Is there any metric I should look at in this case?
I would suggest trying the metric proposed in this paper, at the bottom of page 5. The metric is as follows:
- Numeric features: Absolute value of the difference between the two data points divided by the standard deviation of that feature across the entire dataset.
- Categorical features: Data points with the same value have a distance of 0, and other distances are set to the probability that any two examples across the entire dataset would share the same value for that feature.
For categorical features, even a simpler distance that is 0 if the data points have the same value and 1 otherwise should work fine.
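A sketch of that mixed-type distance, assuming you know which columns are categorical (the function name, column indices, and toy dataset below are made up for illustration):

```python
import numpy as np

def mixed_distance(a, b, X, cat_idx):
    """Distance between rows a and b, given the full dataset X.

    Numeric features: |a_j - b_j| / std of feature j over the whole dataset.
    Categorical features: 0 if the values match, otherwise the probability
    that two random examples share the same value for that feature
    (the sum of squared value frequencies).
    """
    d = 0.0
    for j in range(X.shape[1]):
        col = X[:, j]
        if j in cat_idx:
            if a[j] != b[j]:
                _, counts = np.unique(col, return_counts=True)
                freqs = counts / len(col)
                d += float(np.sum(freqs ** 2))
        else:
            d += abs(a[j] - b[j]) / col.std()
    return d

X = np.array([[1.0, 0], [2.0, 0], [3.0, 1], [4.0, 0]])  # feature 1 is categorical
cat_idx = {1}
print(mixed_distance(X[0], X[2], X, cat_idx))
```

Dividing numeric differences by the per-feature standard deviation keeps one wide-ranging feature from dominating the 30-feature sum, which is exactly the failure mode you saw with raw cosine similarity.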
I don't think cosine similarity is suited for your problem. One reason is that two data points can have a cosine distance of 0 but be perceptually very different. For example, the cosine distance between `[0, 1]` and `[0, 1000]` is 0, but the cosine distance between `[0, 1]` and `[0.2, 1.2]` is about 0.014. In most cases (though of course it depends on your problem) one would say that `[0.2, 1.2]` is much closer to `[0, 1]` than `[0, 1000]` is; note that the cosine distance is not able to capture that. Anyway, if you think for some reason that the cosine distance is what you need, I would first apply some simple preprocessing to the dataset: standardize/normalize the numerical features and transform the categorical ones into their one-hot-encoded representation. After that I would try to apply the cosine distance.
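The comparison is easy to check directly (a small numpy sketch; note the exact distance for `[0.2, 1.2]` comes out to roughly 0.014):

```python
import numpy as np

def cosine_distance(u, v):
    """1 minus the cosine similarity of two vectors."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

a    = np.array([0.0, 1.0])
far  = np.array([0.0, 1000.0])
near = np.array([0.2, 1.2])

print(cosine_distance(a, far))   # 0.0 -- same direction, wildly different magnitude
print(cosine_distance(a, near))  # ~0.014, although `near` is perceptually much closer
```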