fzi-forschungszentrum-informatik / tsinterpret

An Open-Source Library for the interpretability of time series classifiers

License: BSD 3-Clause "New" or "Revised" License

Python 97.20% Makefile 0.14% TeX 2.65%
counterfactual-explanations explainable-ai explainable-artificial-intelligence explainable-ml feature-attribution interpretability interpretable-machine-learning interpretable-ml time-series

tsinterpret's People

Contributors: belaboe97, britta-wstnr, github-actions[bot], jhoelli, kulbachcedric, pyaf


tsinterpret's Issues

LEFTIST Output Is Independent of Data/Model

Describe the bug
Regardless of the model and data, the output of LEFTIST is either the first few time steps or the last few time steps. In a few experiments I ran, it seems to depend purely on the class label, not on the model or the data. Even in the documentation (https://fzi-forschungszentrum-informatik.github.io/TSInterpret/Notebooks/Leftist_torch/) the same issue is present.
Here are a few examples where the distinguishing feature is the sine-wave segment of the time series:

[example plots for classes 0–3]


SETS compatibility for PyTorch

Describe the bug
For PyTorch models in SETS, there is an issue in set.py: model.predict does not work.
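A minimal workaround sketch, assuming the SETS code calls model.predict(x) on a NumPy array (torch.nn.Module has no predict method; the wrapper below is hypothetical, not part of TSInterpret):

import numpy as np
import torch

class PredictWrapper:
    """Hypothetical adapter that gives a PyTorch model a predict() method."""
    def __init__(self, model: torch.nn.Module):
        self.model = model.eval()

    def predict(self, x: np.ndarray) -> np.ndarray:
        # run a forward pass without gradients and return class probabilities
        with torch.no_grad():
            out = self.model(torch.from_numpy(x).float())
        return torch.softmax(out, dim=-1).numpy()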


General Issues

Hello,
Thanks for making this framework available. It's really easy and nice to use!

I faced a few issues in my own usage, which I'd like to bring to your attention:

  1. When installing, TSInterpret pins particular package versions. This has the side effect of modifying my Python environment, e.g. by downgrading some of my own packages and upgrading others. This might be troublesome for setups where certain libraries are fragile or sensitive. Case in point: I had a particular matplotlib version, which TSInterpret uninstalled and downgraded. (One way to isolate this is sketched after this list.)

  2. I was trying to use some of the TF explainability methods, e.g. GS with TSR, and kept getting an error. It turns out that in the SaliencyMethods_TF.py module the import for shap is commented out, which causes these errors. I'm also not sure whether shap is automatically installed during the installation process.

  3. Using the built-in plotter, e.g. int_mod.plot(...), I get a single line plot which is perfectly flat. This was for a univariate dataset, so I'm not sure whether that is the reason or something else. This is using pure TSR with IG in TSR mode (i.e. TSR=True).
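Regarding item 1, a standard way to keep TSInterpret's pinned dependencies from touching an existing environment is to install it into a dedicated virtual environment, for example:

python -m venv tsinterpret-env
source tsinterpret-env/bin/activate   # on Windows: tsinterpret-env\Scripts\activate
pip install TSInterpret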

Once again, thank you for the effort of making such a wonderful package.

[Q] Feature importance in TSInterpret

Hi, I wonder whether I can get the feature importance (independent of the time steps) from TSInterpret. For example, in TSR or LEFTIST we obtain 'explanations', an array recording the per-time-step scores of an instance before visualization. May I average the scores over the time steps to get a feature importance score?
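For an attribution array of shape (n_features, n_time_steps), averaging over the time axis is one plausible aggregation. A sketch with made-up data (whether the mean of absolute scores is the right summary depends on the explanation method):

import numpy as np

# exp: hypothetical attribution array of shape (n_features, n_time_steps),
# standing in for the 'explanations' array obtained before visualization
exp = np.random.randn(3, 100)

# mean absolute score per feature, independent of the time step
feature_importance = np.abs(exp).mean(axis=-1)
print(feature_importance.shape)  # (3,)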

Installation Error due to "sklearn" package name

Describe the bug

Collecting sklearn
  Downloading sklearn-0.0.post1.tar.gz (3.6 kB)
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

To Reproduce

Run in a Colab notebook:

!pip install TSInterpret
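The root cause is that the dependency list requests the deprecated PyPI package sklearn rather than scikit-learn; the sklearn placeholder package deliberately fails at build time. Assuming that pin is the only blocker, a possible workaround until the requirement is renamed (the environment variable is the escape hatch documented in the sklearn package's deprecation notice):

!pip install scikit-learn
!SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True pip install TSInterpret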


Sample and label not aligned in Documentation

Describe the bug
In the how-to documentation: https://fzi-forschungszentrum-informatik.github.io/TSInterpret/Notebooks/TSEvo_torch/

test_loader = torch.utils.data.DataLoader(test_dataset,batch_size=1,shuffle=True)
observation_01, label_01 = test_dataset[0]
model.eval()
y_pred,labels= get_all_preds(model,test_loader)
label_01=np.array([y_pred[0]])
print(observation_01.shape)
print(label_01.shape)

Because the test loader is created with shuffle=True, the first batch it yields does not correspond to the first sample in the dataset.
In the sample code, observation_01 comes from the first element of test_dataset, while label_01 is set to y_pred[0], the prediction for the first shuffled batch, so I think the two are not aligned.
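A minimal fix sketch: either build the loader with shuffle=False so predictions stay in dataset order, or take the label from the same dataset index as the observation:

test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=1, shuffle=False)
model.eval()
y_pred, labels = get_all_preds(model, test_loader)
observation_01, label_01 = test_dataset[0]
label_01 = np.array([y_pred[0]])  # now y_pred[0] really belongs to test_dataset[0]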

Trying to use COMTE and TSR but facing issues when plotting features #43

Hello,

I am doing multivariate time series classification with 2 features.
When using the plot function with either COMTE or TSR, only a portion of the features is plotted for each instance.

When I debugged, I found what looks like an issue with the reshape done in the plot.

BEFORE reshape: in the screenshot below are the item values; we can see 2 features, both starting from the value 0:

[screenshot]

AFTER reshape: here we see that the reshape did not preserve all the values of the features (we can see that the two features no longer both start at 0):

[screenshot]

This is happening inside both plot functions: plot_in_one in CF.py at line 181, where it reshapes the item, and plot in Saliency_Base.py at line 61, where it reshapes the item.

Note: I am using TensorFlow.

Can you please help? Am I missing something, or is there a bug?
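For context, a small demonstration of why a plain reshape can scramble channel data where swapping axes preserves it (a generic NumPy illustration, not the library's actual code):

import numpy as np

item = np.arange(6).reshape(2, 3)     # 2 features x 3 time steps
print(item.reshape(3, 2)[:, 0])       # [0 2 4] -- reshape interleaves the features
print(np.swapaxes(item, 0, 1)[:, 0])  # [0 1 2] -- feature 0's series stays intact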

Improve Error Handling and Add Warnings

  • Add error handling for incorrectly formatted inputs
  • Warning for Method Pitfalls (e.g., choosing the basis of a Perturbation Based Explainer) #34
  • COMTE: add a dataset warning if all labels are identical / the desired class is not provided

[Q] Implementation of TSEvo

It seems that Evoutils.py line 100 should be

for item in items[0]:

rather than

for item in items

because np.where returns a tuple. Also, uniform crossover operates on the segmented time series, not on the whole feature.
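A quick NumPy illustration of the point that np.where returns a tuple of index arrays:

import numpy as np

mask = np.array([True, False, True])
items = np.where(mask)
print(items)     # (array([0, 2]),) -- a 1-tuple of index arrays
print(items[0])  # array([0, 2])    -- the actual indices to iterate over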

Update Docs

  • Update Starting Page
  • Update Algorithm Overview

NativeGuideCF(method=NG) can return a CF of the same class.

Hello,
As mentioned in this issue's title, I have tried the current CF generation in this configuration many times, and it often generates a CF with the same label as the one already predicted by the model.

Hope the issue is clear enough!
Thanks

[Q] TSEvo on Multivariate Time Series

Hello, thanks for making this framework.
I'm having some difficulty applying TSEvo to multivariate time series.
Everything works fine on a single-channel dataset, but on multi-channel data the console always prints "Items are identical".
I found that TSEvo.explain returns a 1-d series as the counterfactual:

return np.array(ep)[0][0], output,

and when CF.plot_in_one is called,

res = (item != exp).any(-1)
ind = np.where(res[0])

res becomes a 1-d array, which makes len(ind) == 0 and leads to the problem above.
I'm not sure whether there is a problem with my understanding of the algorithm (it does print "Items are identical") or an implementation problem, so I hope you can help me.
sorry for my poor English /(ㄒoㄒ)/~~

[Q] Why does COMTE sometimes give a cf_label the same as the predicted one?

Hello,

I am doing multivariate time series classification with 2 features.
I am trying to use COMTE explain function.

However, I am facing cases where the label returned by the explain function is the same as the predicted one (even though I am passing orig_class=np.argmax(prob_item) and trage_class=np.argmin(prob_item)).

I tried debugging the explain code and found that the deduced "other" has a target equal to orig_class and different from the target passed:

[screenshot]

Is this normal? Is the error on my side, or do you think it might be a bug? Can you please explain what I might have misunderstood?

Thank you

Native Guide always generates the counterfactual at the beginning of the series

Describe the bug
When using the Native Guide method:
exp_model = NativeGuideCF(model, (train_x, train_y), backend='PYT', mode='feat', method='NUN_CF', max_iter=10000)

The generated counterfactuals always swap in segments starting at the beginning of the series:

ChinaTown: [plot]

Coffee: [plot]

GunPoint: [plot]


Do you have any tutorials on how to run a sklearn model with LEFTIST? [Q]

First of all, I would like to thank the developers and researchers for building this framework. I have successfully used the LEFTIST explanation method with a TensorFlow model, and that is going to be of good use in my research. However, when I implement a random forest classifier I am not able to run the model; I always get this error:

IndexError Traceback (most recent call last)
in <cell line: 1>()
----> 1 explanations = leftist.explain(explained_instance,label)

/usr/local/lib/python3.10/dist-packages/TSInterpret/InterpretabilityModels/leftist/leftist.py in explain(self, instance, idx_label, random_state)
117
118 if self.mode == "feat":
--> 119 instance = instance.reshape(instance.shape[-1], instance.shape[-2])
120 else:
121 instance = instance.reshape(instance.shape[-2], instance.shape[-1])

IndexError: tuple index out of range

train_x has shape (150, 150) and test_y has shape (150,); I changed train_x's shape from (150, 150, 1) to fit the random forest classifier. I can give more details on my code if need be.
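A hedged reading of the traceback, as an assumption rather than a confirmed fix: after dropping the third dimension, explained_instance is presumably 1-d (shape (150,)), so instance.shape[-2] does not exist and the reshape in leftist.py fails. Keeping an explicit channel dimension on the instance may help:

# keep a (channels, time) layout so instance.shape[-2] exists
explained_instance = explained_instance.reshape(1, -1)  # (1, 150) instead of (150,)
explanations = leftist.explain(explained_instance, label)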

Native Guide explainer - issue generating the CF graph

Resource / Tutorial for reference
https://fzi-forschungszentrum-informatik.github.io/TSInterpret/Notebooks/NunCF_torch/

Context :

  • X (2400 time series of 1488 values) - shape=(2400, 1488)

  • We reshaped X according to what is done in the tutorial to have X.shape=(2400, 1, 1488).

  • y (labels) - shape=(2400,1)

  • both of them split in train/validation sets

All went well, including the NativeGuide model declaration.

exp_model = NativeGuideCF(
    model,
    (X_train, y_train),
    backend='PYT', 
    mode='feat',
    method='NUN_CF'
)

However, the issue starts at this part:

cf_ts, cf_label = exp_model.explain(
    input_x.numpy(),
    np.argmax(model(input_x).detach(), axis=1)
)

where

  • input_x.shape=(1, 1, 1488) is the instance that we want to explain.

The code generates the following error:

    229 while prob_target > 0.5 and counter < max_iter:
    230     subarray_length += 1
--> 231     most_influencial_array = self._findSubarray(
    232         (training_weights), subarray_length
    233     )
    234     starting_point = np.where(training_weights == most_influencial_array[0])[0][
    235         0
    236     ]
    237     X_example = instance.copy().reshape(1, -1)
...
---> 43 result = getattr(asarray(obj), method)(*args, **kwds)
     44 if wrap:
     45     if not isinstance(result, mu.ndarray):

ValueError: attempt to get argmax of an empty sequence 

The only way to make this cell work is to replace np.argmax(...) with one of the labels that is not the argmax, which doesn't really make sense IMO.

Do you have an idea of what is happening?
Thanks for your help.

Attributions returned by TSInterpret do not match the Captum output

Describe the bug
The attribution returned by Saliency_PTY did not make any sense to me, so I checked against the original Captum package and found that the Captum output is different from the attribution returned by TSInterpret.
Here is the output of TSInterpret: [screenshot]
Here is the output of Captum: [screenshot]

The dataset is synthetic: just a portion of a sine wave (the distinguishing feature) added to random noise. So it is clear that TSInterpret's attribution is incorrect. In this example I used IG, but this is also the case for the other methods I tried.

I hope it helps you fix the issue.

kaggle seems to be an orphaned dependency

Describe the bug
The kaggle package isn't actually used in the Python code, i.e. it's not needed for the installation process (e.g. setup.py):
https://github.com/search?q=repo%3Afzi-forschungszentrum-informatik%2FTSInterpret%20kaggle&type=code

If kaggle is only used as a CLI downloader, let end users install it separately and add a note to the docs on how to do so.


TSR generates error

Hello,
Some errors occur during TSR execution. We tried different configurations (the different saliency methods available: FO, IG, SVS, ...).

Context:

The following cell worked.

int_mod = TSR(
    model, 
    train_x.shape[-1],
    train_x.shape[-2], 
    method='IG',
    mode='feat'
)

But this one didn't:

exp=int_mod.explain(
    input_x,
    labels=class_target,
    TSR=True
)

Error:

    147 elif self.method == "IG":
    148     base = baseline_single
--> 149     attributions = self.Grad.attribute(
    150         input, baselines=baseline_single, target=labels
    151     )
    152 elif self.method == "DL":
    153     base = baseline_single

     41 def wrapper(*args, **kwargs):
---> 42     return func(*args, **kwargs)
...
    527         )
    528 else:
--> 529     raise AssertionError(f"Target type {type(target)} is not valid.")

AssertionError: Target type  is not valid.

Any idea what the problem could be here?
The model, train_x, labels, and so on are not the origin of the problem, given that they follow the expected format and worked in the NativeGuide execution.

Thanks in advance!
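One hedged hypothesis: Captum's attribute accepts targets as plain Python ints, tensors, or lists thereof, so a NumPy scalar (e.g. a numpy.int64 coming from np.argmax) can trip exactly this assertion. Casting the label before the call may help:

# assumption: class_target is a numpy scalar, e.g. obtained via np.argmax
exp = int_mod.explain(
    input_x,
    labels=int(class_target),  # plain Python int for Captum's target check
    TSR=True
)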

[Q] Deprecation of Pandas forced (1.3.5)

Hi,
I was trying to use this package for XAI purposes, but there is a version-management conflict: TSInterpret forces pandas <= 1.3.5, while I want/need to use pandas 2.

Is there a way to make this work with your package?
Thanks a lot for your help.

The naming issue of NativeGuideCF

Thanks for providing this package.

Motivation.
For the counterfactual-based method NativeGuideCF, there are several modes, and I think the naming is quite confusing.
With method=='NUN_CF' it performs counterfactual_generator_swap, and with 'NG' it performs finding the nearest unlike neighbor (NUN).
According to the original paper, NUN_CF is their comparison method that finds the NUN, while their proposed method NG performs counterfactual_generator_swap.

Describe the solution you'd like
Swapping the names of those methods should solve the problem.


SETS method: mode = 'feat'

Describe the bug
In the SETS counterfactual code, when mode == 'feat', self.train_x is never defined. Also, when mode == "time", it appears that self.ts_len here should be train_x.shape[-1], since the channels are swapped first:

if mode == "time":
    # Parse test data into (1, feat, time):
    change = True
    self.train_x = np.swapaxes(train_x, 2, 1)
    self.ts_len = train_x.shape[1]
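A sketch of what the reporter seems to propose, as an assumption about the intended fix rather than the maintainers' patch:

if mode == "time":
    # Parse test data into (1, feat, time):
    change = True
    self.train_x = np.swapaxes(train_x, 2, 1)
    # read the series length from the swapped array's last (time) axis
    self.ts_len = self.train_x.shape[-1]
else:
    # mode == "feat": data is assumed to already be (1, feat, time)
    self.train_x = train_x
    self.ts_len = train_x.shape[-1]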


SETS

  • Make SETS available on PyTorch
  • Recheck PYT Implementation with better model
  • Write Tests
  • New Release

Deprecation Warning

Describe the bug

  DEPRECATION: kaggle is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/8559
  Running setup.py install for kaggle ... done
  DEPRECATION: lime is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/8559

To Reproduce

pip install TSInterpret

Additional context

This might become a problem if the lime and kaggle projects don't fix it before Python users start using pip 23.1 by default. For example, https://github.com/marcotcr/lime/commits/master looks very undermaintained.

[Q] Temporal Saliency Rescaling (TSR) - unintuitive results

I am using TSR with IG on my multivariate time series dataset, which contains padding since the instances are of unequal length. The model I used is XceptionTime from the tsai library. Testing on a sample from the test set, there is a very big discrepancy between the results with and without TSR (I just changed the TSR parameter in the Python interface). What could be a hypothesis for this behaviour? Is this a result of the implementation, or just how TSR works? Surely it cannot be correct that almost all of the attribution goes to the padding only.

IG with TSR: [plot]

IG without TSR: [plot]

My current version of TSInterpret is 0.3.2. The calls are:

int_mod = TSR(xceptiontime_model, X.shape[-1], X.shape[-2], method='IG', mode='feat', device='cuda')
exp = int_mod.explain(item, labels=label, TSR=True)   # or TSR=False

[Q] Error when applying TSR on MTS

Hello again! Thanks for your attention!
I'm encountering some errors during the execution of TSR. I think I have put the input data in the correct data type and shape, but the code isn't running successfully. Here are the code and the error message:

import h5py
import tensorflow as tf
# Load model
model_to_explain = tf.keras.models.load_model("best_model_weights.h5")

from TSInterpret.InterpretabilityModels.Saliency.TSR import TSR

#int_mod=TSR(model_to_explain, test_x.shape[-1], test_x.shape[-2], method='GRAD', mode='feat')
int_mod=TSR(model_to_explain, test_x.shape[-2],test_x.shape[-1], method='IG',mode='time', device='cuda')

item= test_x[0:1,:,:]
print(item.shape)

label=int(test_y[0])
print(label)

exp=int_mod.explain(item,labels=label,TSR = True)

%matplotlib inline  
int_mod.plot(np.array([test_x[0:1,:,:]]),exp)
Mode in TF Saliency time
(1, 10, 91)
1
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[30], line 18
     15 label=int(test_y[0])
     16 print(label)
---> 18 exp=int_mod.explain(item,labels=label,TSR = True)
     20 get_ipython().run_line_magic('matplotlib', 'inline')
     21 int_mod.plot(np.array([test_x[0:1,:,:]]),exp)

File ~\AppData\Local\anaconda3\envs\TSInterpret\lib\site-packages\TSInterpret\InterpretabilityModels\Saliency\SaliencyMethods_TF.py:107, in Saliency_TF.explain(self, item, labels, TSR)
    105 if self.method == "IG" or self.method == "GRAD" or self.method == "SG":
    106     input = input.reshape(-1, self.NumTimeSteps, self.NumFeatures, 1)
--> 107     attributions = self.Grad.explain(
    108         (input, None), self.model, class_index=labels
    109     )
    110 elif self.method == "DLS" or self.method == "GS":
    111     self.Grad = self.Grad(self.model, input)

File ~\AppData\Local\anaconda3\envs\TSInterpret\lib\site-packages\tf_explain\core\integrated_gradients.py:40, in IntegratedGradients.explain(self, validation_data, model, class_index, n_steps)
     34 images, _ = validation_data
     36 interpolated_images = IntegratedGradients.generate_interpolations(
     37     np.array(images), n_steps
     38 )
---> 40 integrated_gradients = IntegratedGradients.get_integrated_gradients(
     41     interpolated_images, model, class_index, n_steps
     42 )
     44 grayscale_integrated_gradients = transform_to_normalized_grayscale(
     45     tf.abs(integrated_gradients)
     46 ).numpy()
     48 grid = grid_display(grayscale_integrated_gradients)

File ~\AppData\Local\anaconda3\envs\TSInterpret\lib\site-packages\tensorflow\python\util\traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    151 except Exception as e:
    152   filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153   raise e.with_traceback(filtered_tb) from None
    154 finally:
    155   del filtered_tb

File ~\AppData\Local\Temp\__autograph_generated_file1omsyq8b.py:26, in outer_factory.<locals>.inner_factory.<locals>.tf__get_integrated_gradients(interpolated_images, model, class_index, n_steps)
     24     ag__.converted_call(ag__.ld(tape).watch, (ag__.ld(inputs),), None, fscope)
     25     predictions = ag__.converted_call(ag__.ld(model), (ag__.ld(inputs),), None, fscope)
---> 26     loss = ag__.ld(predictions)[:, ag__.ld(class_index)]
     27 grads = ag__.converted_call(ag__.ld(tape).gradient, (ag__.ld(loss), ag__.ld(inputs)), None, fscope)
     28 grads_per_image = ag__.converted_call(ag__.ld(tf).reshape, (ag__.ld(grads), (-1, ag__.ld(n_steps), *ag__.ld(grads).shape[1:])), None, fscope)

ValueError: in user code:

    File "C:\Users\rz124\AppData\Local\anaconda3\envs\TSInterpret\lib\site-packages\tf_explain\core\integrated_gradients.py", line 71, in get_integrated_gradients  *
        loss = predictions[:, class_index]

    ValueError: slice index 1 of dimension 1 out of bounds. for '{{node strided_slice}} = StridedSlice[Index=DT_INT32, T=DT_FLOAT, begin_mask=1, ellipsis_mask=0, end_mask=1, new_axis_mask=0, shrink_axis_mask=2](sequential/dense/Sigmoid, strided_slice/stack, strided_slice/stack_1, strided_slice/stack_2)' with input shapes: [10,1], [2], [2], [2] and with computed input tensors: input[1] = <0 1>, input[2] = <0 2>, input[3] = <1 1>.

Do you have any insights into what might be causing this issue? Any suggestions on how to resolve it would be greatly appreciated. Thank you in advance!
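A hedged reading of the final ValueError: the model's output has shape [10, 1], i.e. a single sigmoid unit, so predictions[:, class_index] with class_index=1 is out of bounds (tf-explain's IntegratedGradients indexes columns of the model output). Under that assumption, a head with one unit per class would make label 1 addressable; a minimal hypothetical sketch:

import tensorflow as tf

# assumption: the original model ends in Dense(1, activation='sigmoid');
# a two-unit softmax head keeps predictions[:, class_index] valid for labels 0 and 1
model_to_explain = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 91)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation="softmax"),
])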
