
omnixai's Introduction



OmniXAI: A Library for Explainable AI

Table of Contents

  1. Introduction
  2. Installation
  3. Getting Started
  4. Documentation
  5. Tutorials
  6. Deployment
  7. Dashboard Demo
  8. How to Contribute
  9. Technical Report and Citing OmniXAI

What's New

The latest version includes an experimental GPT explainer. This explainer leverages the outcomes produced by SHAP and MACE to formulate the input prompt for ChatGPT. Subsequently, ChatGPT analyzes these results and generates the corresponding explanations that provide developers with a clearer understanding of the rationale behind the model's predictions.

Introduction

OmniXAI (short for Omni eXplainable AI) is a Python machine-learning library for explainable AI (XAI). It offers omni-way explainable AI and interpretable machine learning capabilities to address many of the pain points in explaining decisions made by machine learning models in practice. OmniXAI aims to be a one-stop comprehensive library that makes explainable AI easy for data scientists, ML researchers and practitioners who need explanations for various types of data, models and explanation methods at different stages of the ML process.

OmniXAI includes a rich family of explanation methods integrated into a unified interface, which supports multiple data types (tabular data, images, texts, time-series), multiple types of ML models (traditional ML in Scikit-learn and deep learning models in PyTorch/TensorFlow), and a range of diverse explanation methods, both "model-specific" and "model-agnostic" (such as feature-attribution explanation, counterfactual explanation, gradient-based explanation, feature visualization, etc.). For practitioners, OmniXAI provides an easy-to-use unified interface to generate explanations for their applications by writing only a few lines of code, as well as a GUI dashboard for visualization to gain further insights into model decisions.

The following table shows the explanation methods supported in our library, the types of models they can explain, and whether they produce global or local explanations; each method applies to one or more data types (EDA, tabular, image, text, time-series). We will continue improving this library to make it more comprehensive in the future.

Method                      Model Type      Explanation Type
Feature analysis            NA              Global
Feature selection           NA              Global
Prediction metrics          Black box       Global
Bias metrics                Black box       Global
Partial dependence plots    Black box       Global
Accumulated local effects   Black box       Global
Sensitivity analysis        Black box       Global
Permutation explanation     Black box       Global
Feature visualization       Torch or TF     Global
Feature maps                Torch or TF     Local
GPT explainer               Black box       Local
LIME                        Black box       Local
SHAP                        Black box*      Local
What-if                     Black box       Local
Integrated gradient         Torch or TF     Local
Counterfactual              Black box*      Local
Contrastive explanation     Torch or TF     Local
Grad-CAM, Grad-CAM++        Torch or TF     Local
Score-CAM                   Torch or TF     Local
Layer-CAM                   Torch or TF     Local
Smooth gradient             Torch or TF     Local
Guided backpropagation      Torch or TF     Local
Learning to explain         Black box       Local
Linear models               Linear models   Global and Local
Tree models                 Tree models     Global and Local

Note (*): SHAP accepts black-box models for tabular data, PyTorch/TensorFlow models for image data, and transformer models for text data. Counterfactual accepts black-box models for tabular, text and time-series data, and PyTorch/TensorFlow models for image data.

We also compare our library with other existing XAI toolkits/libraries in the literature; the comparison table can be found in our technical report.

OmniXAI also integrates ChatGPT for generating plain-text explanations given a classification/regression model on tabular datasets. The generated results may not be 100% accurate, but this explainer is worth trying (we will continue improving the input prompts).

Installation

You can install omnixai from PyPI by calling pip install omnixai. You may install from source by cloning the OmniXAI repo, navigating to the root directory, and calling pip install ., or pip install -e . to install in editable mode. You may install additional dependencies:

  • For plotting & visualization: Calling pip install omnixai[plot], or pip install .[plot] from the root directory of the repo.
  • For vision tasks: Calling pip install omnixai[vision], or pip install .[vision] from the root directory of the repo.
  • For NLP tasks: Calling pip install omnixai[nlp], or pip install .[nlp] from the root directory of the repo.
  • Install all the dependencies: Calling pip install omnixai[all], or pip install .[all] from the root directory of the repo.
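
To verify the installation, you can check the installed version with the Python standard library (shown here with importlib.metadata rather than any OmniXAI-specific API):

from importlib.metadata import version

# Print the installed OmniXAI version to confirm that the installation succeeded.
print(version("omnixai"))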

Getting Started

For example code and an introduction to the library, see the Jupyter notebooks in tutorials, and the guided walkthrough here.

Some examples:

  1. Tabular classification
  2. Tabular regression
  3. Image classification
  4. Text classification
  5. Time-series anomaly detection
  6. Vision-language tasks
  7. Ranking tasks
  8. Feature visualization
  9. Check feature maps
  10. GPT explainer for tabular

To get started, we recommend the linked tutorials in tutorials. In general, we recommend using TabularExplainer, VisionExplainer, NLPExplainer and TimeseriesExplainer for tabular, vision, NLP and time-series tasks, respectively, and using DataAnalyzer and PredictionAnalyzer for feature analysis and prediction-result analysis. These classes act as factories for the individual explainers supported in OmniXAI, providing a simpler interface for generating multiple explanations at once. To generate explanations, you only need to specify

  • The ML model to explain: e.g., a scikit-learn model, a tensorflow model, a pytorch model or a black-box prediction function.
  • The pre-processing function: i.e., converting raw input features into the model inputs.
  • The post-processing function (optional): e.g., converting the model outputs into class probabilities.
  • The explainers to apply: e.g., SHAP, MACE, Grad-CAM.

Besides using these classes, you can also create a single explainer defined in the omnixai.explainers package, e.g., ShapTabular, GradCAM, IntegratedGradient or FeatureVisualizer.
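
For instance, a standalone SHAP explainer for tabular data can be created directly. The snippet below is a rough sketch reusing the variables from the income-prediction walkthrough that follows; the exact constructor arguments are assumptions if your OmniXAI version differs:

from omnixai.explainers.tabular import ShapTabular

# `train_data`, `transformer`, `model` and `test_data` are the objects
# built in the walkthrough below.
shap_explainer = ShapTabular(
    training_data=train_data,
    predict_function=lambda z: model.predict_proba(transformer.transform(z)),
    mode="classification"
)
shap_explanations = shap_explainer.explain(test_data[:5])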

Let's take the income prediction task as an example. We recommend using the data class Tabular to represent a tabular dataset. To create a Tabular instance given a pandas dataframe, you need to specify the dataframe, the categorical feature names (if any) and the target/label column name (if any).

import numpy as np
import pandas as pd

from omnixai.data.tabular import Tabular

# Load the dataset
feature_names = [
   "Age", "Workclass", "fnlwgt", "Education",
   "Education-Num", "Marital Status", "Occupation",
   "Relationship", "Race", "Sex", "Capital Gain",
   "Capital Loss", "Hours per week", "Country", "label"
]
df = pd.DataFrame(
  np.genfromtxt('adult.data', delimiter=', ', dtype=str),
  columns=feature_names
)
tabular_data = Tabular(
   df,
   categorical_columns=[feature_names[i] for i in [1, 3, 5, 6, 7, 8, 9, 13]],
   target_column='label'
)

The package omnixai.preprocessing provides several useful preprocessing functions for a Tabular instance. TabularTransform is a special transform designed for processing tabular data. By default, it converts categorical features into one-hot encoding and keeps continuous-valued features unchanged. The transform method of TabularTransform converts a Tabular instance into a numpy array. If the Tabular instance has a target/label column, the last column of the numpy array will be the target/label. You can apply any customized preprocessing functions instead of using TabularTransform. After data preprocessing, let's train an XGBoost classifier for this task.

import sklearn.model_selection
import xgboost

from omnixai.preprocessing.tabular import TabularTransform

# Data preprocessing
transformer = TabularTransform().fit(tabular_data)
class_names = transformer.class_names
x = transformer.transform(tabular_data)
# Split into training and test datasets
train, test, train_labels, test_labels = \
    sklearn.model_selection.train_test_split(x[:, :-1], x[:, -1], train_size=0.80)
# Train an XGBoost model (the last column of `x` is the label column after transformation)
model = xgboost.XGBClassifier(n_estimators=300, max_depth=5)
model.fit(train, train_labels)
# Convert the transformed data back to Tabular instances
train_data = transformer.invert(train)
test_data = transformer.invert(test)

To initialize TabularExplainer, the following parameters need to be set:

  • explainers: The names of the explainers to apply, e.g., ["lime", "shap", "mace", "pdp"].
  • data: The data used to initialize the explainers. data is usually the training dataset used to train the ML model. If the training dataset is too large, data can be a subset of it obtained by applying omnixai.sampler.tabular.Sampler.subsample.
  • model: The ML model to explain, e.g., a scikit-learn model, a tensorflow model or a pytorch model.
  • preprocess: The preprocessing function converting the raw inputs (a Tabular instance) into the inputs of model.
  • postprocess (optional): The postprocessing function transforming the outputs of model into a user-specific form, e.g., the predicted probability for each class. The output of postprocess should be a numpy array.
  • mode: The task type, e.g., "classification" or "regression".

The preprocessing function takes a Tabular instance as its input and outputs the processed features that the ML model consumes. In this example, we simply call transformer.transform. If you use some customized transforms on pandas dataframes, the preprocess function has this format: lambda z: some_transform(z.to_pd()). If the output of model is not a numpy array, postprocess needs to be set to convert it into a numpy array.
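
As a hypothetical illustration (not needed for the XGBoost example here, whose predict_proba already returns a numpy array), a postprocess function for a PyTorch classifier that outputs logits might look like this:

import torch

# Convert a batch of logits into a numpy array of class probabilities.
postprocess = lambda logits: torch.nn.functional.softmax(logits, dim=1).detach().cpu().numpy()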

from omnixai.explainers.tabular import TabularExplainer
# Initialize a TabularExplainer
explainer = TabularExplainer(
  explainers=["lime", "shap", "mace", "pdp", "ale"], # The explainers to apply
  mode="classification",                             # The task type
  data=train_data,                                   # The data for initializing the explainers
  model=model,                                       # The ML model to explain
  preprocess=lambda z: transformer.transform(z),     # Converts raw features into the model inputs
  params={
     "mace": {"ignored_features": ["Sex", "Race", "Relationship", "Capital Loss"]}
  }                                                  # Additional parameters
)

In this example, LIME, SHAP and MACE generate local explanations, while PDP (partial dependence plots) and ALE (accumulated local effects) generate global explanations. explainer.explain returns the local explanations for the given test instances, and explainer.explain_global returns the global explanations. TabularExplainer hides all the details behind the explainers, so we can simply call these two methods to generate explanations.

# Generate explanations
test_instances = test_data[:5]
local_explanations = explainer.explain(X=test_instances)
global_explanations = explainer.explain_global(
    params={"pdp": {"features": ["Age", "Education-Num", "Capital Gain",
                                 "Capital Loss", "Hours per week", "Education",
                                 "Marital Status", "Occupation"]}}
)

Similarly, we create a PredictionAnalyzer for computing performance metrics for this classification task. To initialize PredictionAnalyzer, the following parameters need to be set:

  • mode: The task type, e.g., "classification" or "regression".
  • test_data: The test dataset, which should be a Tabular instance.
  • test_targets: The test labels or targets. For classification, test_targets should be integers (processed by a LabelEncoder) and match the class probabilities returned by the ML model.
  • preprocess: The preprocessing function converting the raw data (a Tabular instance) into the inputs of model.
  • postprocess (optional): The postprocessing function transforming the outputs of model to a user-specific form, e.g., the predicted probability for each class. The output of postprocess should be a numpy array.
from omnixai.explainers.prediction import PredictionAnalyzer

analyzer = PredictionAnalyzer(
    mode="classification",
    test_data=test_data,                           # The test dataset (a `Tabular` instance)
    test_targets=test_labels,                      # The test labels (a numpy array)
    model=model,                                   # The ML model
    preprocess=lambda z: transformer.transform(z)  # Converts raw features into the model inputs
)
prediction_explanations = analyzer.explain()

Given the generated explanations, we can launch a dashboard (a Dash app) for visualization by setting the test instances, the local explanations, the global explanations, the prediction metrics, the class names, and additional parameters for visualization (optional). If you want "what-if" analysis, set the explainer parameter when initializing the dashboard; OmniXAI also allows you to set a second explainer if you want to compare two different models.

from omnixai.visualization.dashboard import Dashboard
# Launch a dashboard for visualization
dashboard = Dashboard(
   instances=test_instances,                        # The instances to explain
   local_explanations=local_explanations,           # Set the local explanations
   global_explanations=global_explanations,         # Set the global explanations
   prediction_explanations=prediction_explanations, # Set the prediction metrics
   class_names=class_names,                         # Set class names
   explainer=explainer                              # The created TabularExplainer for what-if analysis
)
dashboard.show()                                    # Launch the dashboard

After opening the Dash app in the browser, we will see a dashboard showing the explanations.

You can also use the GPT explainer to generate explanations in text for tabular models:

explainer = TabularExplainer(
  explainers=["gpt"],                                # The GPT explainer to apply
  mode="classification",                             # The task type
  data=train_data,                                   # The data for initializing the explainers
  model=model,                                       # The ML model to explain
  preprocess=lambda z: transformer.transform(z),     # Converts raw features into the model inputs
  params={
     "gpt": {"apikey": "xxxx"}
  }                                                  # Set the OpenAI API KEY
)
local_explanations = explainer.explain(X=test_instances)

For vision tasks, the same interface is used to create explainers and generate explanations. Let's take an image classification model as an example.

from omnixai.explainers.vision import VisionExplainer
from omnixai.visualization.dashboard import Dashboard

explainer = VisionExplainer(
    explainers=["gradcam", "lime", "ig", "ce", "feature_visualization"],
    mode="classification",
    model=model,                   # An image classification model, e.g., ResNet50
    preprocess=preprocess,         # The preprocessing function
    postprocess=postprocess,       # The postprocessing function
    params={
        # Set the target layer for GradCAM
        "gradcam": {"target_layer": model.layer4[-1]},
        # Set the objective for feature visualization
        "feature_visualization": 
          {"objectives": [{"layer": model.layer4[-3], "type": "channel", "index": list(range(6))}]}
    },
)
# Generate explanations of GradCAM, LIME, IG and CE
local_explanations = explainer.explain(test_img)
# Generate explanations of feature visualization
global_explanations = explainer.explain_global()
# Launch the dashboard
dashboard = Dashboard(
    instances=test_img,
    local_explanations=local_explanations,
    global_explanations=global_explanations
)
dashboard.show()

The following figure shows the dashboard for these explanations.

For NLP tasks and time-series forecasting/anomaly detection, OmniXAI also provides the same interface to generate and visualize explanations. This figure shows a dashboard example of text classification and time-series anomaly detection.
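
For reference, a text classification model can be wrapped in the same way (a minimal sketch, assuming NLPExplainer follows the same constructor pattern as TabularExplainer and VisionExplainer; model and postprocess stand in for your own objects):

from omnixai.data.text import Text
from omnixai.explainers.nlp import NLPExplainer

# `model` is assumed to be a text classifier, e.g., a transformers
# sentiment-analysis pipeline; `postprocess` (optional) converts its
# outputs into a numpy array of class probabilities.
explainer = NLPExplainer(
    explainers=["lime"],
    mode="classification",
    model=model,
    postprocess=postprocess
)
x = Text(["What a great movie!", "The plot was dull and predictable."])
local_explanations = explainer.explain(x)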

Deployment

The explainers in OmniXAI can easily be deployed via BentoML. BentoML is a popular open-source unified model serving framework that supports multiple platforms including AWS, GCP, Heroku, etc. We implemented BentoML-format interfaces for OmniXAI so that users need only a few lines of code to deploy their selected explainers.

Let's take the income prediction task as an example. Given the trained model and the initialized explainer, you only need to save the explainer in the BentoML local model store:

from omnixai.explainers.tabular import TabularExplainer
from omnixai.deployment.bentoml.omnixai import save_model

explainer = TabularExplainer(
  explainers=["lime", "shap", "mace", "pdp", "ale"],
  mode="classification",
  data=train_data,
  model=model,
  preprocess=lambda z: transformer.transform(z),
  params={
     "mace": {"ignored_features": ["Sex", "Race", "Relationship", "Capital Loss"]}
  }
)
save_model("tabular_explainer", explainer)

And then create a file (e.g., service.py) for the ML service code:

from omnixai.deployment.bentoml.omnixai import init_service

svc = init_service(
    model_tag="tabular_explainer:latest",
    task_type="tabular",
    service_name="tabular_explainer"
)

The init_service function defines two API endpoints, i.e., /predict for model predictions and /explain for generating explanations. You can start an API server locally to test the service code above:

bentoml serve service:svc --reload

The endpoints can be accessed locally:

import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

data = '["39", "State-gov", "77516", "Bachelors", "13", "Never-married", ' \
       '"Adm-clerical", "Not-in-family", "White", "Male", "2174", "0", "40", "United-States"]'

# Test the prediction endpoint
prediction = requests.post(
    "http://0.0.0.0:3000/predict",
    headers={"content-type": "application/json"},
    data=data
).text

# Test the explanation endpoint
m = MultipartEncoder(
    fields={
        "data": data,
        "params": '{"lime": {"y": [0]}}',
    }
)
result = requests.post(
    "http://0.0.0.0:3000/explain",
    headers={"Content-Type": m.content_type},
    data=m
).text

# Parse the results
from omnixai.explainers.base import AutoExplainerBase
exp = AutoExplainerBase.parse_explanations_from_json(result)
for name, explanation in exp.items():
    explanation.ipython_plot()

You can build Bento for deployment by following the steps shown in the BentoML repo. For more examples, please check Tabular, Vision, NLP.

How to Contribute

We welcome contributions from the open-source community to improve the library!

To add a new explanation method/feature into the library, please follow the template and steps demonstrated in this documentation.

Technical Report and Citing OmniXAI

You can find more details in our technical report: https://arxiv.org/abs/2206.01612

If you're using OmniXAI in your research or applications, please cite using this BibTeX:

@article{wenzhuo2022-omnixai,
  author    = {Wenzhuo Yang and Hung Le and Silvio Savarese and Steven Hoi},
  title     = {OmniXAI: A Library for Explainable AI},
  year      = {2022},
  doi       = {10.48550/ARXIV.2206.01612},
  url       = {https://arxiv.org/abs/2206.01612},
  archivePrefix = {arXiv},
  eprint    = {2206.01612},
}

Contact Us

If you have any questions, comments or suggestions, please do not hesitate to contact us at [email protected].

License

BSD 3-Clause License

omnixai's People

Contributors

jimjag, rmarquis, schhoi, tanmaylaud, vinlucero, yangwenz, yangwenzhuo08, yudhiesh


omnixai's Issues

A pair of images as input

Dear Author;

Thanks for sharing the nice tool.

I would like to ask whether this tool can accept a pair of input images, i.e., my model takes a pair of images as its input. How can I extend your tool to take a pair of images?

Thanks in advance.

Plot limitation

Hi. My question is simple, but I can't figure out how to solve it. How can I plot something that I saved for later with the method get_explanations()?

In the documentation it's written that "Each dict has the following format: {“images”: the input images, “labels”: the predicted labels, “masks”: the masks returned by the explainer, e.g., LIME, “boundary”: the boundaries extracted from “masks”}."

So how can I plot the original image with and without the mask in this case?

For example:

a = explanations.get_explanations(index=None)  # now I want to plot a[0]['images'] later

Question about target column

I found the following code in TabularExplainer (lines 122-123):

if training_data.target_column is not None:
    self.data = self.data[:, :-1]

Does that mean the target column must be the last column?

Missing features in LIME and SHAP

I followed the example to run the tabular classification program, but I found that some features are discarded in LIME and SHAP. I did not set any ignored features. The original features of the data are Age, Workclass, fnlwgt, Education, Education-Num, Marital Status, Occupation, Relationship, Race, Sex, Capital Gain, Capital Loss, Hours per week, Country. The features in the LIME results are Capital Gain, Marital Status, Education-Num, Hours per week, Capital Loss, Age, Occupation, Sex, Race, Relationship. The features in the SHAP results are Capital Gain, Age, Education-Num, Marital Status, Occupation, Hours per week, Workclass, fnlwgt, Sex, Capital Loss, Education, Country, Relationship.

Create multiple dashboards in one

Is it possible to use different data or models and display them in a single dashboard? I would like to create multiple dashboards in one "show()".

Thank you so much!

Compatibility with Python 2

Hi! I would like to use OmniXAI with a project that requires Python 2. Would it be possible to run OmniXAI with this Python version?

AssertionError: Explainer lime -- training_data should be an instance of Tabular.

I'm trying to run OmniXAI in the ML Workflow tutorial, in this cell:

explainers = TabularExplainer(
    explainers=["lime", "shap", "mace", "pdp", "ale"],
    mode="classification",
    data=train_data,
    model=gbtree,
    preprocess=lambda z: transformer.transform(z),
    params={
        "lime": {"kernel_width": 3},
        "shap": {"nsamples": 100},
        "mace": {"ignored_features": ["Marital Status"]}
    }
)

I got this error:

AssertionError: Explainer lime -- training_data should be an instance of Tabular.

How can I solve it?

What if analysis tool?

Hello everyone and congratulations on the work you have done!

I was wondering whether it is possible to create a "what-if analysis", i.e., when a user views the graphs (e.g., tabular), can the user manually enter or change the feature values?

Thank you!

Not able to run Timeseries Explainer in case of anomaly detection

I am trying to perform explainability for time-series anomaly detection (a DBSCAN model). I have loaded the model, and the prediction function for DBSCAN is model.fit_predict.
Here is the code snippet (attached as an image in the original issue).

And the final error: ValueError: Unable to convert array of bytes/strings into decimal numbers with dtype='numeric'
Although my dataframe contains only numeric values.
Here is the view of my data:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 300 entries, 0 to 299
Data columns (total 1 columns):
Column Non-Null Count Dtype
sample 300 non-null float64
dtypes: float64(1)

About the code efficiency

Hi, Thanks for this useful package.

I want to check how long it takes you to run 'nlp.ipynb' in the tutorials folder.
It seems to take more than 10 minutes on my server. Is there anything wrong?

The Kernel appears to have died

I have tried many times to run the OmniXAI vision tutorial, but when I try to import VisionExplainer, the message "The kernel appears to have died" appears before the import finishes. Is there any solution to this problem?


Error in counterfactual for NLP model

I need to generate counterfactual results for my sentiment analysis model and I tried to use OmniXAI by replicating this example: https://opensource.salesforce.com/OmniXAI/latest/tutorials/nlp/ce_classification.html. However, I am getting the error below while generating the counterfactual. I am using Python 3.9 and the latest version of OmniXAI (1.2.3).

INFO:polyjuice.polyjuice_wrapper:Setup Polyjuice.
INFO:polyjuice.polyjuice_wrapper:Setup SpaCy processor.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[5], line 9
      1 x = Text([
      2     "What a great movie! if you have no taste.",
      3     "it was a fantastic performance!",
   (...)
      7     "i've never watched something as bad"
      8 ])
----> 9 explanations = explainer.explain(x)

File ~\Anaconda3\envs\omnixai\lib\site-packages\omnixai\explainers\nlp\counterfactual\polyjuice.py:173, in Polyjuice.explain(self, X, max_number_examples, **kwargs)
    160 """
    161 Generates the counterfactual explanations for the input instances.
    162 
   (...)
    170 :return: The explanations for all the input instances.
    171 """
    172 if self.mode == "classification":
--> 173     return self._explain_classification(X=X, max_number_examples=max_number_examples, **kwargs)
    174 else:
    175     return self._explain_question_answering(X=X, max_number_examples=max_number_examples, **kwargs)

File ~\Anaconda3\envs\omnixai\lib\site-packages\omnixai\explainers\nlp\counterfactual\polyjuice.py:96, in Polyjuice._explain_classification(self, X, max_number_examples, **kwargs)
     94 for idx, text in enumerate(X.values):
     95     original_label = labels[idx]
---> 96     perturb_texts = self._perturb(text.lower(), **kwargs)
     97     perturb_texts = list(set([t.lower() for t in perturb_texts]))
     98     perturb_predictions, perturb_labels = self._predict(
     99         Text(perturb_texts, tokenizer=tokenizer))

File ~\Anaconda3\envs\omnixai\lib\site-packages\omnixai\explainers\nlp\counterfactual\polyjuice.py:73, in Polyjuice._perturb(self, text, **kwargs)
     71 ce_type = kwargs.get("ce_type", "perturb")
     72 if ce_type == "perturb":
---> 73     perturb_texts = self.explainer.perturb(
     74         text,
     75         num_perturbations=kwargs.get("num_perturbations", 10),
     76         perplex_thred=kwargs.get("perplex_thred", 10)
     77     )
     78 elif ce_type == "blank":
     79     perturb_texts = self.explainer.get_random_blanked_sentences(
     80         sentence=text,
     81         max_blank_sent_count=kwargs.get("num_perturbations", 10),
     82         is_token_only=True,
     83         max_blank_block=1
     84     )

File ~\Anaconda3\envs\omnixai\lib\site-packages\polyjuice\polyjuice_wrapper.py:247, in Polyjuice.perturb(self, orig_sent, blanked_sent, is_complete_blank, ctrl_code, perplex_thred, num_perturbations, verbose, **kwargs)
    245     logger.info("Generating on these prompts:")
    246     for p in prompts: logger.info(f" | {p}")
--> 247 generated = generate_on_prompts(
    248     generator=self.generator, prompts=prompts, **kwargs)
    249 merged = list(np.concatenate(generated))
    251 validated_set = []

File ~\Anaconda3\envs\omnixai\lib\site-packages\polyjuice\generations\generator_helpers.py:59, in generate_on_prompts(generator, prompts, temperature, num_beams, n, do_sample, batch_size, **kwargs)
     57 def generate_on_prompts(generator, prompts, temperature=1, 
     58     num_beams=None, n=3, do_sample=True, batch_size=128, **kwargs):
---> 59     preds_list = batched_generate(generator, prompts,
     60         temperature=temperature, n=n, 
     61         num_beams=num_beams, 
     62         do_sample=do_sample, batch_size=batch_size, **kwargs)
     63     if len(prompts) == 1:
     64         preds_list = [preds_list]

File ~\Anaconda3\envs\omnixai\lib\site-packages\polyjuice\generations\generator_helpers.py:44, in batched_generate(generator, examples, temperature, num_beams, num_return_sequences, do_sample, batch_size, **kwargs)
     42 with torch.no_grad():
     43     for e in (range(0, len(examples), batch_size)):
---> 44         preds_list += generator(
     45             examples[e:e+batch_size],
     46             temperature=temperature,
     47             return_tensors=True,
     48             num_beams=num_beams,
     49             max_length=1000,
     50             early_stopping=None if num_beams is None else True,
     51             do_sample=num_beams is None and do_sample,
     52             num_return_sequences=num_return_sequences, 
     53             **kwargs
     54         )
     55 return preds_list

File ~\Anaconda3\envs\omnixai\lib\site-packages\transformers\pipelines\text_generation.py:202, in TextGenerationPipeline.__call__(self, text_inputs, **kwargs)
    163 def __call__(self, text_inputs, **kwargs):
    164     """
    165     Complete the prompt(s) given as inputs.
    166 
   (...)
    200           ids of the generated text.
    201     """
--> 202     return super().__call__(text_inputs, **kwargs)

File ~\Anaconda3\envs\omnixai\lib\site-packages\transformers\pipelines\base.py:1063, in Pipeline.__call__(self, inputs, num_workers, batch_size, *args, **kwargs)
   1059 if can_use_iterator:
   1060     final_iterator = self.get_iterator(
   1061         inputs, num_workers, batch_size, preprocess_params, forward_params, postprocess_params
   1062     )
-> 1063     outputs = [output for output in final_iterator]
   1064     return outputs
   1065 else:

File ~\Anaconda3\envs\omnixai\lib\site-packages\transformers\pipelines\base.py:1063, in <listcomp>(.0)
   1059 if can_use_iterator:
   1060     final_iterator = self.get_iterator(
   1061         inputs, num_workers, batch_size, preprocess_params, forward_params, postprocess_params
   1062     )
-> 1063     outputs = [output for output in final_iterator]
   1064     return outputs
   1065 else:

File ~\Anaconda3\envs\omnixai\lib\site-packages\transformers\pipelines\pt_utils.py:124, in PipelineIterator.__next__(self)
    121     return self.loader_batch_item()
    123 # We're out of items within a batch
--> 124 item = next(self.iterator)
    125 processed = self.infer(item, **self.params)
    126 # We now have a batch of "inferred things".

File ~\Anaconda3\envs\omnixai\lib\site-packages\transformers\pipelines\pt_utils.py:125, in PipelineIterator.__next__(self)
    123 # We're out of items within a batch
    124 item = next(self.iterator)
--> 125 processed = self.infer(item, **self.params)
    126 # We now have a batch of "inferred things".
    127 if self.loader_batch_size is not None:
    128     # Try to infer the size of the batch

File ~\Anaconda3\envs\omnixai\lib\site-packages\transformers\pipelines\base.py:990, in Pipeline.forward(self, model_inputs, **forward_params)
    988     with inference_context():
    989         model_inputs = self._ensure_tensor_on_device(model_inputs, device=self.device)
--> 990         model_outputs = self._forward(model_inputs, **forward_params)
    991         model_outputs = self._ensure_tensor_on_device(model_outputs, device=torch.device("cpu"))
    992 else:

File ~\Anaconda3\envs\omnixai\lib\site-packages\transformers\pipelines\text_generation.py:244, in TextGenerationPipeline._forward(self, model_inputs, **generate_kwargs)
    242 prompt_text = model_inputs.pop("prompt_text")
    243 # BS x SL
--> 244 generated_sequence = self.model.generate(input_ids=input_ids, attention_mask=attention_mask, **generate_kwargs)
    245 out_b = generated_sequence.shape[0]
    246 if self.framework == "pt":

File ~\Anaconda3\envs\omnixai\lib\site-packages\torch\autograd\grad_mode.py:27, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
     24 @functools.wraps(func)
     25 def decorate_context(*args, **kwargs):
     26     with self.clone():
---> 27         return func(*args, **kwargs)

File ~\Anaconda3\envs\omnixai\lib\site-packages\transformers\generation\utils.py:1296, in GenerationMixin.generate(self, inputs, max_length, min_length, do_sample, early_stopping, num_beams, temperature, penalty_alpha, top_k, top_p, typical_p, repetition_penalty, bad_words_ids, force_words_ids, bos_token_id, pad_token_id, eos_token_id, length_penalty, no_repeat_ngram_size, encoder_no_repeat_ngram_size, num_return_sequences, max_time, max_new_tokens, decoder_start_token_id, use_cache, num_beam_groups, diversity_penalty, prefix_allowed_tokens_fn, logits_processor, renormalize_logits, stopping_criteria, constraints, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, forced_bos_token_id, forced_eos_token_id, remove_invalid_values, synced_gpus, exponential_decay_length_penalty, suppress_tokens, begin_suppress_tokens, forced_decoder_ids, **model_kwargs)
   1294 # 0. Validate the `.generate()` call
   1295 self._validate_model_class()
-> 1296 self._validate_model_kwargs(model_kwargs.copy())
   1298 # 1. Set generation parameters if not already defined
   1299 bos_token_id = bos_token_id if bos_token_id is not None else self.config.bos_token_id

File ~\Anaconda3\envs\omnixai\lib\site-packages\transformers\generation\utils.py:993, in GenerationMixin._validate_model_kwargs(self, model_kwargs)
    990         unused_model_args.append(key)
    992 if unused_model_args:
--> 993     raise ValueError(
    994         f"The following `model_kwargs` are not used by the model: {unused_model_args} (note: typos in the"
    995         " generate arguments will also show up in this list)"
    996     )

ValueError: The following `model_kwargs` are not used by the model: ['n'] (note: typos in the generate arguments will also show up in this list)

Ipython plot error

Hello,
I am running OmniXAI on tabular data for a binary classification task. All the functions are working well except the IPython plot ones, which all throw the same error (picture attached). I couldn't identify what this error corresponds to. Could you please help me fix this issue?

Thank you


GradCam explanations plot error

Hi,

I'm using OmniXAI for classification purposes in my project. I've used the method get_explanations to get a dict of Grad-CAM explanations for one specific image.

Now I'm trying to plot the original image with the explanations on it using PIL (instead of the plots presented in the docs, since I need this as an object later) in this way:

...
exp=explanations.get_explanations(index=None)
gradcam_img=exp[0]['image']
Imag1 = PilImage.fromarray(gradcam_img)
plt.imshow(Imag1)

but I obtain the following error: TypeError: Cannot handle this data type: (1, 1, 3), <i8. If I try to use astype(np.uint8), it plots the original image without the explanations on it.

This problem does not show up when doing the same procedure with LIME explanations, since I correctly obtain the image with the explanations on it. How can I solve this?

Timeseries data type issue

I am using Merlion to build a forecasting model and OmniXAI to explain it. I got an error that my training data is not an instance of Timeseries, although it is the same dataset I used to train the forecasting model with Merlion:

import pandas as pd
from datetime import datetime
from merlion.utils.time_series import TimeSeries
from merlion.models.defaults import DefaultForecasterConfig, DefaultForecaster
from omnixai.explainers.timeseries.agnostic.shap import ShapTimeseries

df = pd.read_csv('test.csv')
time = df['Month of Datetime'].tolist()
target = df['#Passengers'].tolist() 

time = [datetime.strptime(date[:10],'%Y-%m-%d').date() for date in time]
df = pd.DataFrame(
        {'Datetime': time,
         'Target': target
        })
df['Datetime'] = pd.to_datetime(df['Datetime'])
df = df.set_index('Datetime')
df.index = pd.to_datetime(df.index)
df.index.freq = 'MS'

train_data = TimeSeries.from_pd(df)

explainers = ShapTimeseries(
        
    training_data =train_data,
    predict_function =my_function,
    mode ="forecasting"    
)

train_data is built with the TimeSeries class from Merlion; however, this is the error I got:


---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
~\AppData\Local\Temp\1/ipykernel_16344/1198959584.py in <module>
----> 1 explainers = ShapTimeseries(
      2 
      3     training_data =train_data,
      4     predict_function =model,
      5     mode ="forecasting"

~\Anaconda3\lib\site-packages\omnixai\explainers\timeseries\agnostic\shap.py in __init__(self, training_data, predict_function, mode, **kwargs)
     44         """
     45         super().__init__()
---> 46         assert isinstance(training_data, Timeseries), \
     47             "`training_data` should be an instance of Timeseries."
     48         assert mode in ["anomaly_detection", "forecasting"], \

AssertionError: `training_data` should be an instance of Timeseries.

TypeError: Explainer lime-- __init__() got an unexpected keyword argument 'handle_unknown'

Trying to run https://github.com/salesforce/OmniXAI/blob/main/tutorials/tabular_regression.ipynb and I get:

TypeError: Explainer shap -- __init__() got an unexpected keyword argument 'handle_unknown'

in cell:

# Initialize a TabularExplainer
explainers = TabularExplainer(
    explainers=["lime", "shap", "sensitivity", "pdp", "ale"],
    mode="regression",
    data=train_data,
    model=rf,
    preprocess=preprocess,
    params={
        "lime": {"kernel_width": 3},
        "shap": {"nsamples": 100}
    }
)
# Generate explanations
test_instances = test_data[0:5]
local_explanations = explainers.explain(X=test_instances)
global_explanations = explainers.explain_global(
    params={"pdp": {"features": ["MedInc", "HouseAge", "AveRooms",
                                 "AveBedrms", "Population", "AveOccup",
                                 "Latitude", "Longitude"]}}
)

Error in NLP example

I used the NLP example in https://github.com/salesforce/OmniXAI/blob/main/tutorials/nlp.ipynb, but I got the following error:

Traceback (most recent call last):
  File "/export/home/x/python_test/omnixai/text/nlp.py", line 42, in <module>
    local_explanations = explainer.explain(x)
  File "/export/home/x/miniconda3/envs/kf/lib/python3.9/site-packages/omnixai/explainers/base.py", line 276, in explain
    explanations = OrderedDict({"predict": self.predict(X)})
  File "/export/home/x/miniconda3/envs/kf/lib/python3.9/site-packages/omnixai/explainers/base.py", line 252, in predict
    predictions = self.predict_function(self._convert_data(X))
  File "/export/home/x/miniconda3/envs/kf/lib/python3.9/site-packages/omnixai/utils/misc.py", line 219, in _predict
    return tensor_to_numpy(postprocess(predict_func(*inputs)))
  File "/export/home/x/miniconda3/envs/kf/lib/python3.9/site-packages/transformers/pipelines/text_classification.py", line 125, in __call__
    result = super().__call__(*args, **kwargs)
  File "/export/home/x/miniconda3/envs/kf/lib/python3.9/site-packages/transformers/pipelines/base.py", line 1026, in __call__
    return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
  File "/export/home/x/miniconda3/envs/kf/lib/python3.9/site-packages/transformers/pipelines/base.py", line 1032, in run_single
    model_inputs = self.preprocess(inputs, **preprocess_params)
  File "/export/home/x/miniconda3/envs/kf/lib/python3.9/site-packages/transformers/pipelines/text_classification.py", line 134, in preprocess
    return self.tokenizer(inputs, return_tensors=return_tensors, **tokenizer_kwargs)
  File "/export/home/x/miniconda3/envs/kf/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2451, in __call__
    raise ValueError(
ValueError: text input must of type `str` (single example), `List[str]` (batch or single pretokenized example) or `List[List[str]]` (batch of pretokenized examples).

I found that the values property of the Text class is used in many places in the program, yet the Text object is still returned:

    @property
    def values(self):
        """
        Returns the raw text data.

        :return: A list of the sentences/texts.
        :rtype: List
        """
        return self.data

Some bugs in MACE

I often get the following two errors when using MACE on tabular data

Traceback (most recent call last):
  File "/export/home/xxx/miniconda3/envs/kf3/lib/python3.8/site-packages/omnixai/explainers/base.py", line 284, in explain
    explanations[name] = self.explainers[name].explain(X=X, **param)
  File "/export/home/xxx/miniconda3/envs/kf3/lib/python3.8/site-packages/omnixai/explainers/tabular/counterfactual/mace/mace.py", line 119, in explain
    candidates, indices = self.recall.get_cf_features(x, desired_label)
  File "/export/home/xxx/miniconda3/envs/kf3/lib/python3.8/site-packages/omnixai/explainers/tabular/counterfactual/mace/retrieval.py", line 187, in get_cf_features
    y, indices = self.get_nn_samples(instance, desired_label)
  File "/export/home/xxx/miniconda3/envs/kf3/lib/python3.8/site-packages/omnixai/explainers/tabular/counterfactual/mace/retrieval.py", line 174, in get_nn_samples
    indices = self._knn_query(query, desired_label, self.num_neighbors)[0]
  File "/export/home/xxx/miniconda3/envs/kf3/lib/python3.8/site-packages/omnixai/explainers/tabular/counterfactual/mace/retrieval.py", line 122, in _knn_query
    indices, distances = self.knn_models[label].knn_query(x, k=k)
RuntimeError: Cannot return the results in a contigious 2D array. Probably ef or M is too small
Traceback (most recent call last):
  File "/export/home/xxx/miniconda3/envs/kf3/lib/python3.8/site-packages/omnixai/explainers/base.py", line 284, in explain
    explanations[name] = self.explainers[name].explain(X=X, **param)
  File "/export/home/xxx/miniconda3/envs/kf3/lib/python3.8/site-packages/omnixai/explainers/tabular/counterfactual/mace/mace.py", line 139, in explain
    cfs_df = cfs.to_pd()
AttributeError: 'NoneType' object has no attribute 'to_pd'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "z/mace.py", line 99, in <module>
    explanations = explainers.explain(test_instances)
  File "/export/home/xxx/miniconda3/envs/kf3/lib/python3.8/site-packages/omnixai/explainers/base.py", line 286, in explain
    raise type(e)(f"Explainer {name} -- {str(e)}")
AttributeError: Explainer mace -- 'NoneType' object has no attribute 'to_pd'

The symptom is that I run the same code but get these two different, seemingly unreasonable errors. The earlier CPU error in #68 seems to occur because these tracebacks are not shown in the container.

transformer.invert() does not perform as expected when inverting labels

Hi, wonderful tool you're building here!
I was following the tabular classification example, and seem to run into an issue with the transformer inverting the labels:

using as reference https://github.com/salesforce/OmniXAI/blob/main/tutorials/tabular_classification.ipynb

After training the xgboost model, you can run transformer.invert(test), and it returns the original (train-split) dataset with 6513 rows.
But transformer.invert(test_labels) returns a Dataframe (?) with the first row populated with features, and the label as a column.
Also, the length of this DF is 6405 rows (< 6513 rows of the test DF).

Apologies for the poor formatting, hopefully with this you can recreate the issue.
Any ideas what may have caused this?
Thanks, Scott

OmniXAI shows a "StagingError" when using SHAP with a BatchNormalization layer

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))

model.add(tf.keras.layers.BatchNormalization())  # after removing this BatchNormalization layer it works

model.add(tf.keras.layers.Dropout(0.1))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dropout(0.1))
model.add(tf.keras.layers.Dense(num_classes))

I am trying to test this model with SHAP, but it gives a "StagingError"; if I remove the BatchNormalization layer, it works.

It returns the error shown in the screenshot attached to the original issue.

Grad-CAM for visual-language tasks does not work on the LAVIS BLIP model!

I have installed LAVIS and OmniXAI from your official repositories and then followed your example for Grad-CAM on visual-language tasks. By the way, the example seems to be out of date; when I tried to load the model, it gave me this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [15], in <cell line: 3>()
      1 pretrained_path = \
      2     "https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_retrieval_coco.pth"
----> 3 model = BlipITM(pretrained=pretrained_path, vit="base")
      4 image_processor = load_processor("blip_image_eval").build(image_size=384)
      5 text_processor = load_processor("blip_caption")

TypeError: __init__() got an unexpected keyword argument 'pretrained'

So I came up with a way to load the model as LAVIS documents it:

model = load_model(
        name="blip_image_text_matching", model_type="base", is_eval=True, device=device, checkpoint=ckp_path
    )

but when creating the explainer, I got this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [13], in <cell line: 1>()
      1 explainer = GradCAM(
      2     model=model,
----> 3     target_layer=model.text_encoder.base_model.base_model.encoder.layer[6].
      4         crossattention.self.attention_probs_layer,
      5     preprocess_function=preprocess,
      6     tokenizer=tokenizer,
      7     loss_function=lambda outputs: outputs[:, 1].sum()
      8 )

File ~/.conda/envs/lavis/lib/python3.8/site-packages/torch/nn/modules/module.py:1177, in Module.__getattr__(self, name)
   1175     if name in modules:
   1176         return modules[name]
-> 1177 raise AttributeError("'{}' object has no attribute '{}'".format(
   1178     type(self).__name__, name))

AttributeError: 'BertSelfAttention' object has no attribute 'attention_probs_layer'

Could you please help me to solve this issue?

import issue

I am using omnixai 1.3.1 and get the following error when importing TabularExplainer:

from omnixai.explainers.tabular import TabularExplainer

Error:


---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
File ~/opt/anaconda3/envs/xai/lib/python3.9/site-packages/aiohttp/client_reqrep.py:70
     69 try:
---> 70     import cchardet as chardet
     71 except ImportError:  # pragma: no cover

ModuleNotFoundError: No module named 'cchardet'

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
Cell In[1], line 31
     28 from scipy.stats import chi2_contingency
     29 import xgboost as xgb
---> 31 from omnixai.explainers.tabular import TabularExplainer
     32 from omnixai.data.tabular import Tabular

File ~/opt/anaconda3/envs/xai/lib/python3.9/site-packages/omnixai/explainers/tabular/__init__.py:17
     15 from .agnostic.shap_global import GlobalShapTabular
     16 from .agnostic.bias import BiasAnalyzer
---> 17 from .agnostic.gpt import GPTExplainer
     18 from .counterfactual.mace.mace import MACEExplainer
     19 from .counterfactual.ce import CounterfactualExplainer

File ~/opt/anaconda3/envs/xai/lib/python3.9/site-packages/omnixai/explainers/tabular/agnostic/gpt.py:11
      7 """
      8 The explainer based ChatGPT.
      9 """
     10 import os
---> 11 import openai
     12 from typing import Callable, List
     13 from omnixai.data.tabular import Tabular

Facing problem with explaining test instances

RuntimeError Traceback (most recent call last)
~/anaconda3/envs/ntnu_meticos/lib/python3.7/site-packages/omnixai/explainers/base.py in explain(self, X, params, run_predict)
283 param = params.get(name, {})
--> 284 explanations[name] = self.explainers[name].explain(X=X, **param)
285 except Exception as e:

~/anaconda3/envs/ntnu_meticos/lib/python3.7/site-packages/omnixai/explainers/tabular/counterfactual/mace/mace.py in explain(self, X, y, max_number_examples, **kwargs)
124 # Get candidate features
--> 125 candidates, indices = self.recall.get_cf_features(x, desired_label)
126

~/anaconda3/envs/ntnu_meticos/lib/python3.7/site-packages/omnixai/explainers/tabular/counterfactual/mace/retrieval.py in get_cf_features(self, instance, desired_label)
186 x = instance.to_pd(copy=False)
--> 187 y, indices = self.get_nn_samples(instance, desired_label)
188 cate_candidates, cont_candidates = {}, {}

~/anaconda3/envs/ntnu_meticos/lib/python3.7/site-packages/omnixai/explainers/tabular/counterfactual/mace/retrieval.py in get_nn_samples(self, instance, desired_label)
173 )
--> 174 indices = self._knn_query(query, desired_label, self.num_neighbors)[0]
175 y = self.subset.iloc(indices).to_pd(copy=False)

~/anaconda3/envs/ntnu_meticos/lib/python3.7/site-packages/omnixai/explainers/tabular/counterfactual/mace/retrieval.py in _knn_query(self, x, label, k)
121 """
--> 122 indices, distances = self.knn_models[label].knn_query(x, k=k)
123 neighbors = [[idx[i] for i in range(len(idx)) if dists[i] > 0] for idx, dists in zip(indices, distances)]

RuntimeError: Cannot return the results in a contigious 2D array. Probably ef or M is too small

During handling of the above exception, another exception occurred:

RuntimeError Traceback (most recent call last)
/tmp/ipykernel_3818713/1430780571.py in
14 # Generate explanations
15 test_instances = test_data[0:7]
---> 16 local_explanations = explainers.explain(X=test_instances)
17 global_explanations = explainers.explain_global(
18 params={"pdp": {"features": []}}

~/anaconda3/envs/ntnu_meticos/lib/python3.7/site-packages/omnixai/explainers/base.py in explain(self, X, params, run_predict)
284 explanations[name] = self.explainers[name].explain(X=X, **param)
285 except Exception as e:
--> 286 raise type(e)(f"Explainer {name} -- {str(e)}")
287 return explanations
288

RuntimeError: Explainer mace -- Cannot return the results in a contigious 2D array. Probably ef or M is too small

About Integrated-gradient methods for transformer models

Hi, I see that your implementation passes id2token into the IntegratedGradientText explainer. However, for transformers the attribution is actually computed on input_ids (the tokenization results from the tokenizer), so we may need to aggregate the attribution values for each word. You may want to check your code and consider implementing this feature for transformer models.
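
As a generic illustration of the aggregation the issue describes (not OmniXAI code; the "##" word-piece prefix is a BERT-style assumption), summing token-level attributions into word-level scores could look like this:

def aggregate_word_attributions(tokens, attributions):
    # Merge BERT-style '##' word pieces and sum their attribution scores
    # so that each word gets a single score.
    words, scores = [], []
    for token, score in zip(tokens, attributions):
        if token.startswith("##") and words:
            words[-1] += token[2:]
            scores[-1] += score
        else:
            words.append(token)
            scores.append(score)
    return list(zip(words, scores))

print(aggregate_word_attributions(["ex", "##plain", "##able", "AI"], [0.1, 0.2, 0.3, 0.4]))
# [('explainable', 0.600...), ('AI', 0.4)]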

Persistence of `TabularTransform` constructor default arguments

Hello! I think there is a potential issue when one tries to create several TabularTransform instances (omnixai.preprocessing.tabular.TabularTransform).

Specifically, the __init__ function of TabularTransform has default values for its 3 arguments which are instantiations of transformation classes. Because Python evaluates default arguments once, when the function is defined, these instantiations are never re-evaluated. This means that the default values are objects shared across all instances of TabularTransform.

This leads to highly unintuitive and misleading results in some cases. As an example:

import pandas as pd
from omnixai.data.tabular import Tabular
from omnixai.preprocessing.tabular import TabularTransform

df1 = pd.DataFrame({"cat_feat": ["A", "B", "B", "A"], "num_feat": [1, 2, 3, 4]})
print(df1)
# output:
#   cat_feat  num_feat
# 0        A         1
# 1        B         2
# 2        B         3
# 3        A         4

tabular1 = Tabular(df1, categorical_columns=["cat_feat"])
transform1 = TabularTransform().fit(tabular1)
print(transform1.cate_shape)
# output: 2
print(transform1.cate_transform.get_feature_names())
# output: ['x0_A' 'x0_B']

df2 = pd.DataFrame({"cat_feat_1": ["A", "B", "B", "A"], "cat_feat_2": ["h", "h", "h", "l"], "num_feat": [1, 2, 3, 4]})
print(df2)
#output:
#   cat_feat_1 cat_feat_2  num_feat
# 0          A          h         1
# 1          B          h         2
# 2          B          h         3
# 3          A          l         4

tabular2 = Tabular(df2, categorical_columns=["cat_feat_1", "cat_feat_2"])
transform2 = TabularTransform().fit(tabular2)
print(transform2.cate_shape)
# output: 4
print(transform2.cate_transform.get_feature_names())
# output: ['x0_A' 'x0_B' 'x1_h' 'x1_l']

print(transform1.cate_shape)
# output: 2
print(transform1.cate_transform.get_feature_names())
# output: ['x0_A' 'x0_B' 'x1_h' 'x1_l']

It can be observed that despite not touching the transform1 object, its get_feature_names method returns the same array as transform2 in the end.
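
A common remedy for shared mutable defaults in Python (a generic sketch of the pattern, not OmniXAI's actual implementation) is to default the arguments to None and instantiate the transforms inside __init__:

class Encoder:
    """A stand-in for a stateful transform (hypothetical, for illustration only)."""
    def __init__(self):
        self.feature_names = []

    def fit(self, columns):
        self.feature_names = list(columns)
        return self

class SafeTransform:
    def __init__(self, cate_transform=None):
        # Create a fresh Encoder per instance instead of sharing a single
        # default object evaluated at function-definition time.
        self.cate_transform = Encoder() if cate_transform is None else cate_transform

t1 = SafeTransform().cate_transform.fit(["cat_feat"])
t2 = SafeTransform().cate_transform.fit(["cat_feat_1", "cat_feat_2"])
print(t1.feature_names)  # ['cat_feat'] -- unaffected by t2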

Task Type

Does the task type currently only support "classification" and "regression"?
Will detection be supported soon?

Question on the Image class

Hi,

I have a question about the Image class (from omnixai.data.image). I'm working on the WM811K dataset. After some preprocessing, my images are 2-D arrays (128x128) with values between 0 and 1.

import matplotlib.pyplot as plt
data1 = falsePositiveDF["WAFER_MAP"].iloc[0]
image = tf.expand_dims(input=data1, axis=2)
plt.axis("off")
plt.imshow(image, origin="lower", cmap="gray")
plt.show()

The above code produces the image shown in the screenshot attached to the original issue.

But when I convert it to the Image object to use GradCAM (from omnixai.explainers.vision.specific.gradcam), I see the image is distorted (the circular wafer is gone). What am I doing wrong? Do I need to scale the image pixels to 0-255?

import plotly.express as pltex
from omnixai.data.image import Image
from PIL import Image as PilImage
test_img1 = Image(PilImage.fromarray(obj=data1, mode="L"))
fig = pltex.imshow(test_img1.to_pil(), origin="lower", binary_string=True)
fig.show()

The above code produces the distorted image shown in the screenshot attached to the original issue.

As a result, when I try to use GradCAM, it does not work.

Add a `conda` install option for `omnixai`

A conda installation option could be very helpful. I have already started working on this, to add omnixai to conda-forge.

Conda-forge PR:

Once the conda-forge PR is merged, you will be able to install the library with conda as follows:

conda install -c conda-forge omnixai

💡 I will push a PR to update the docs once the package is available on conda-forge.

OmniXAI install error

While installing OmniXAI with pip install, I get an error: "Building wheel for hnswlib (pyproject.toml) ... error
error: subprocess-exited-with-error"

Here is the error log:
Building wheel for hnswlib (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for hnswlib (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [57 lines of output]
running bdist_wheel
running build
running build_ext
creating tmp
gcc -pthread -B /opt/conda/envs/omnixai/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /opt/conda/envs/omnixai/include -fPIC -O2 -isystem /opt/conda/envs/omnixai/include -fPIC -I/opt/conda/envs/omnixai/include/python3.10 -c /tmp/tmpy7q_qfxa.cpp -o tmp/tmpy7q_qfxa.o -std=c++14
gcc -pthread -B /opt/conda/envs/omnixai/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /opt/conda/envs/omnixai/include -fPIC -O2 -isystem /opt/conda/envs/omnixai/include -fPIC -I/opt/conda/envs/omnixai/include/python3.10 -c /tmp/tmpdgx0wt9f.cpp -o tmp/tmpdgx0wt9f.o -std=c++11
Traceback (most recent call last):
File "/opt/conda/envs/omnixai/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 363, in
main()
File "/opt/conda/envs/omnixai/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 345, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/opt/conda/envs/omnixai/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 261, in build_wheel
return _build_backend().build_wheel(wheel_directory, config_settings,
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 244, in build_wheel
return self._build_with_temp_dir(['bdist_wheel'], '.whl',
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 229, in _build_with_temp_dir
self.run_setup()
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 174, in run_setup
exec(code, locals())
File "", line 116, in
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/init.py", line 87, in setup
return distutils.core.setup(**attrs)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 177, in setup
return run_commands(dist)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 193, in run_commands
dist.run_commands()
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 968, in run_commands
self.run_command(cmd)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
super().run_command(command)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 299, in run
self.run_command('build')
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 317, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
super().run_command(command)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/command/build.py", line 24, in run
super().run()
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 317, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
super().run_command(command)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/tmp/pip-build-env-vkcm82xu/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 339, in run
self.build_extensions()
File "", line 103, in build_extensions
File "", line 70, in cpp_flag
RuntimeError: Unsupported compiler -- at least C++11 support is needed!
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for hnswlib
Failed to build hnswlib
ERROR: Could not build wheels for hnswlib, which is required to install pyproject.toml-based projects
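The decisive line in the log is "Unsupported compiler -- at least C++11 support is needed!", i.e. hnswlib's build could not find a usable C++ toolchain in this conda environment. A commonly suggested workaround (an assumption here, not verified against this exact setup) is to install a modern C++ compiler into the environment before retrying the pip install:

conda install -c conda-forge cxx-compiler
pip install omnixai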

omnixai.preprocessing.tabular.TabularTransform changes data dimension

Hello,

Thank you for your amazing repo of XAI. And I am trying to apply it on my data analysis task.
However, I found that omnixai.preprocessing.tabular.TabularTransform.transform() changes the dimension of my data.

Details

I followed your LIME tutorial with a few modifications:
Instead of using your example dataset, I used my own dataset with a shape of (81, 13), where column [:, -1] is the label and the remaining 12 columns are numerical data. Here are the segments of my code that raise my concern:

from omnixai.data.tabular import Tabular
from omnixai.preprocessing.tabular import TabularTransform

print(data.shape) # (81, 13)
tab_dat = Tabular(
    data,
    feature_columns=["F1","F2", ..., "F12", "label"],
    target_column="label"
)

print(tab_dat.shape) # (81, 13)
print(TabularTransform().fit(tab_dat).transform(tab_dat).shape) # (81, 25)

So, could you help me determine why TabularTransform changes the dimension of my data?

Thank you!

Cytosine
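A hedged guess at the cause: TabularTransform one-hot encodes every column that the Tabular object treats as categorical, so a jump from 13 to 25 columns usually means some features (or a multi-class label) were expanded rather than kept as single numerical columns. One way to check, reusing the cate_shape and cate_transform attributes shown earlier in this thread (assuming they apply the same way here):

# Sketch: inspect how many one-hot columns the fitted transform produced and
# which original category values they correspond to.
transform = TabularTransform().fit(tab_dat)
print(transform.cate_shape)                          # number of one-hot encoded columns
print(transform.cate_transform.get_feature_names())  # names such as ['x0_A', 'x0_B', ...]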

AttributeError: 'BertSelfAttention' object has no attribute 'attention_probs_layer'

Hello !!

I tried to run “tutorials/vision/gradcam_vlm.ipynb”.

But, I got the following ERROR.
“AttributeError: 'BertSelfAttention' object has no attribute 'attention_probs_layer'”

I could not find the attention_probs_layer in the BertSelfAttention code "lavis/models/med.py".

Where does attention_probs_layer come from, and how is it supposed to be accessed?

Any solutions would be very helpful.

Thank you !!

I ran the following code.

# This default renderer is used for sphinx docs only. Please delete this cell in IPython.
import plotly.io as pio
pio.renderers.default = "png"
import os
import torch
import unittest
import numpy as np
from PIL import Image as PilImage
from omnixai.data.text import Text
from omnixai.data.image import Image
from omnixai.data.multi_inputs import MultiInputs
from omnixai.preprocessing.image import Resize
from omnixai.explainers.vision_language.specific.gradcam import GradCAM

from lavis.models import BlipITM
from lavis.models import load_model
from lavis.processors import load_processor

image = Resize(size=480).transform(
    Image(PilImage.open("./demo.jpg").convert("RGB")))
text = Text("A girl playing with her dog on the beach")
inputs = MultiInputs(image=image, text=text)


pretrained_path = \
    "https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_retrieval_coco.pth"
model = load_model("blip_image_text_matching", "base", checkpoint=pretrained_path)
image_processor = load_processor("blip_image_eval").build(image_size=384)
text_processor = load_processor("blip_caption")
tokenizer = BlipITM.init_tokenizer()

def preprocess(x: MultiInputs):
    images = torch.stack([image_processor(z.to_pil()) for z in x.image])
    texts = [text_processor(z) for z in x.text.values]
    return images, texts

explainer = GradCAM(
    model=model,
    target_layer=model.text_encoder.base_model.base_model.encoder.layer[6].
        crossattention.self.attention_probs_layer,
    preprocess_function=preprocess,
    tokenizer=tokenizer,
    loss_function=lambda outputs: outputs[:, 1].sum()
)


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[24], line 3
      1 explainer = GradCAM(
      2     model=model,
----> 3     target_layer=model.text_encoder.base_model.base_model.encoder.layer[6].
      4         crossattention.self.attention_probs_layer,
      5     preprocess_function=preprocess,
      6     tokenizer=tokenizer,
      7     loss_function=lambda outputs: outputs[:, 1].sum()
      8 )

File ~/.pyenv/versions/3.8.10/lib/python3.8/site-packages/torch/nn/modules/module.py:1614, in Module.__getattr__(self, name)
   1612     if name in modules:
   1613         return modules[name]
-> 1614 raise AttributeError("'{}' object has no attribute '{}'".format(
   1615     type(self).__name__, name))

AttributeError: 'BertSelfAttention' object has no attribute 'attention_probs_layer'
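Since the installed LAVIS version evidently has no attention_probs_layer submodule on BertSelfAttention, a quick diagnostic (only a sketch; it lists what is available rather than restoring the attribute the tutorial expects) is to print the submodules of the cross-attention block the tutorial points at and pick a target layer that actually exists:

# Sketch: enumerate the submodules of the cross-attention block used above,
# to see which layer names the installed lavis version actually provides.
layer = model.text_encoder.base_model.base_model.encoder.layer[6].crossattention.self
for name, module in layer.named_modules():
    print(name or "<self>", type(module).__name__)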

Gridded data

Hi,

I've been using OmniXAI recently for investigative work and have not yet managed to crack the issue I'm having.

I work with a lot of meteorological data which is often in gridded format (3D).

My question is, does OmniXAI support 3D gridded data?

Thanks

Problem using "omnixai" via pip in Colab

Hi,

First, I installed the library as follows (in Colab):

!pip install omnixai

But, when I try to do something as simple as:

from omnixai.explainers.vision import VisionExplainer

I get: ContextualVersionConflict: (scikit-learn 1.2.2 (/usr/local/lib/python3.10/dist-packages), Requirement.parse('scikit-learn<1.2,>=0.24'), {'omnixai'})

How should I proceed?
Thanks!
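The exception message itself names the conflict: the installed omnixai build pins scikit-learn<1.2,>=0.24, while Colab ships scikit-learn 1.2.2. A workaround that usually resolves this kind of ContextualVersionConflict (treat the exact pin as an assumption read off the message above) is to downgrade scikit-learn into the required range and then restart the Colab runtime before importing omnixai again:

!pip install "scikit-learn>=0.24,<1.2"
# then restart the Colab runtime before importing omnixai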
