kdd-opensource / deepadots Goto Github PK

Repository of the paper "A Systematic Evaluation of Deep Anomaly Detection Methods for Time Series".

License: MIT License

Python 99.25% Dockerfile 0.56% Shell 0.19%

anomaly-detection time-series timeseries deep-learning tensorflow pytorch

deepadots's Introduction

Anomaly Detection on Time Series: An Evaluation of Deep Learning Methods.

The goal of this repository is to provide a benchmarking pipeline for anomaly detection on time series data for multiple state-of-the-art deep learning methods.

Implemented Algorithms

Name	Paper
LSTM-AD	Long short term memory networks for anomaly detection in time series, ESANN 2015
LSTM-ED	LSTM-based encoder-decoder for multi-sensor anomaly detection, ICML 2016
Autoencoder	Outlier detection using replicator neural networks, DaWaK 2002
Donut	Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications, WWW 2018
REBM	Deep structured energy based models for anomaly detection, ICML 2016
DAGMM	Deep autoencoding gaussian mixture model for unsupervised anomaly detection, ICLR 2018
LSTM-DAGMM	Extension of DAGMM using an LSTM-Autoencoder instead of a Neural Network Autoencoder

Usage

git clone git://github.com/KDD-OpenSource/DeepADoTS.git  
virtualenv venv -p /usr/bin/python3  
source venv/bin/activate  
pip install -r requirements.txt  
python3 main.py

Example

We follow the scikit-learn API by offering the interface methods fit(X) and predict(X). The former estimates the data distribution in an unsupervised way while the latter returns an anomaly score for each instance - the higher, the more certain is the model that the instance is an anomaly. To compare the performance of methods, we use the ROC AUC value.

We use MNIST to demonstrate the usage of a model since it is already available in TensorFlow and does not require downloading external data (even though the data has no temporal aspect).

import pandas as pd
import tensorflow as tf
from sklearn.metrics import roc_auc_score

from src.algorithms import AutoEncoder
from src.datasets import Dataset


class MNIST(Dataset):
    """0 is the outlier class. The training set is free of outliers."""

    def __init__(self, seed):
        super().__init__(name="MNIST", file_name='')  # We do not need to load data from a file
        self.seed = seed

    def load(self):
        # 0 is the outlier, all other digits are normal
        OUTLIER_CLASS = 0
        mnist = tf.keras.datasets.mnist
        (x_train, y_train), (x_test, y_test) = mnist.load_data()
        # Label outliers with 1 and normal digits with 0
        y_train, y_test = (y_train == OUTLIER_CLASS), (y_test == OUTLIER_CLASS)
        x_train = x_train[~y_train]  # Remove outliers from the training set
        x_train, x_test = x_train / 255, x_test / 255
        x_train, x_test = x_train.reshape(-1, 784), x_test.reshape(-1, 784)
        self._data = tuple(pd.DataFrame(data=data) for data in [x_train, y_train, x_test, y_test])


x_train, y_train, x_test, y_test = MNIST(seed=0).data()
# Use fewer instances for demonstration purposes
x_train, y_train = x_train[:1000], y_train[:1000]
x_test, y_test = x_test[:100], y_test[:100]

model = AutoEncoder(sequence_length=1, num_epochs=40, hidden_size=10, lr=1e-4)
model.fit(x_train)

error = model.predict(x_test)
print(roc_auc_score(y_test, error))  # e.g. 0.8614

We can visualize the samples with respective error values as follows

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import offsetbox

"""Borrowed from https://github.com/scikit-learn/scikit-learn/blob/master/examples/manifold/plot_lle_digits.py#L44"""
error = (error - error.min()) / (error.max() - error.min())  # Normalize error
x_test = x_test.values
y_random = np.random.rand(len(x_test)) * 2 - 1
plt.figure(figsize=(20, 10))
ax = plt.subplot(111)
if hasattr(offsetbox, 'AnnotationBbox'):
    shown_images = np.array([[1., 1.]])
    for i in range(len(x_test)):
        X_instance = [error[i], y_random[i]]
        dist = np.sum((X_instance - shown_images) ** 2, 1)
        if np.min(dist) < 4e-5:
            # don't show points that are too close
            continue
        shown_images = np.r_[shown_images, [X_instance]]
        imagebox = offsetbox.AnnotationBbox(offsetbox.OffsetImage(x_test[i].reshape(28, 28), cmap=plt.cm.gray_r), X_instance)
        ax.add_artist(imagebox)
plt.xlim((0, 1.1))
plt.ylim((-1.2, 1.2))
plt.xlabel("Anomaly Score")
plt.title("Predicted Anomaly Score for the Test Set")
plt.show()

Which creates a plot like this We can see that global outliers (zeros) and local outliers (strangely written digits) receive high anomaly scores.

Deployment

docker build -t deep-adots .
docker run -ti deep-adots /bin/bash -c "python3.6 /repo/main.py"

Authors/Contributors

Team:

Supervisors:

Credits

Base implementation for DAGMM
Base implementation for Donut
Base implementation for Recurrent EBM
Downloader for real-world datasets

deepadots's People

Contributors

Stargazers

Watchers

Forkers

zhuyiche muyiben valeman quarkx33 lisha1992 deephao fintrek snowisland1 alvarozornoza kidluo jiangzhongkai cdawei xiaochenzhou zc08174024 mariushegele rob-med aliazg bsaghafi yli96 nadcharin pratyushdeka astha-chem tungk zhaojiachen1994 songaiguo ssangman kalman-dot droiter ml-thinkers varad2305 dherath lukegs7 molly260 songjiao opaquezxd bigxiaofeng lv-tuan ishitwaviranchi tinanguyengc shouwangbuqi raminmrd adityav95 obinnaobeleagu aa3007-hub datakalp hongminwu chrinide xinguozju shuncask innerproduct genana modi975 xujinglin gaodawn dontcryme engdeep kashyapm94 nirvananimbusa homairs beuthbenjamin arunbaruah nlebang mesereta ipa-rar jimmy-inl hell-to-heaven acveah changhwan-joe gzlz michaelyma12 fredtoby yzmshjd gyyixr cupid4 mohcinemadkour amenehforouz yiyg510 hunanhd fancy1573 aichimashen annalixx waitalone viewfuture yaoxy2010 mohammadforouhesh pengkangzaia o7s8r6 agnes-yang fusion-research vivek1240 hwankam overfittingstudyroom montshasta2020 13301338176 doytsujin clairethkim jgqysu chinahappyking yuanjianrui lyzl2010

deepadots's Issues

DAGMM_LSTMAutoEncoder_withWindow: RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20

2018-06-21 07:18:44 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_LSTMAutoEncoder_withWindow on Synthetic Variance Outliers: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20
2018-06-21 07:18:44 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
    det.fit(X_train, y_train)
  File "/repo/src/algorithms/dagmm.py", line 192, in fit
    self.dagmm.cuda()
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 249, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 176, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 176, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 111, in _apply
    ret = super(RNNBase, self)._apply(fn)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 182, in _apply
    param.data = fn(param.data)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 249, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20

In Lstm-Enc-Eec plt.show() in plot_scores doesn't terminate

Training LSTM-Enc-Dec on Missing ouliers (100%) throws RuntimeError

2018-06-13 06:56:46 [ERROR] root: Couldn't take the inverse of cov. Maybe singular?
2018-06-13 06:56:46 [ERROR] src.evaluation.evaluator: An exception occured while training LSTM-Enc-Dec on Syn Extreme Outliers (mis=1.0): Lapack Error getrf : U(5,5) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514
2018-06-13 06:56:46 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "../third_party/lstm_enc_dec/anomalyDetector.py", line 84, in anomalyScore
    mult2 = torch.inverse(cov)  # [ prediction_window_size * prediction_window_size ]
RuntimeError: Lapack Error getrf : U(2,2) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "../src/evaluation/evaluator.py", line 68, in evaluate
    score = det.predict(X_test)
  File "../src/algorithms/lstm_enc_dec.py", line 109, in predict
    channels_scores = self.predict_channel_scores(X_test)
  File "../src/algorithms/lstm_enc_dec.py", line 105, in predict_channel_scores
    channels_scores, _ = self._predict(test_timeseries_dataset)
  File "../src/algorithms/lstm_enc_dec.py", line 215, in _predict
    self.data, self.filename)
  File "../third_party/lstm_enc_dec/anomaly_detection.py", line 88, in calc_anomalies
    score_predictor=score_predictor, channel_idx=channel_idx,
  File "../third_party/lstm_enc_dec/anomalyDetector.py", line 91, in anomalyScore
    mult2 = torch.inverse(cov)  # [ prediction_window_size * prediction_window_size ]
RuntimeError: Lapack Error getrf : U(5,5) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514

LSTMAD sometimes crashes without error message

Evaluate DAGMM × {Sliding Window, LSTM} × Outlier × {Pollution, Missing}

We want to find out what addition lets DAGMM perform better/worse on which dataset setting. This should help us to explain in the end what exact contribution makes us better.

Consistent logging

Initialize logger with file and stdout handler
Create a logger for each class/module/file
Evaluate if we can forbid print() via flake8

Support Multivariate Datasets: Donut

Since the algorithm only supports univariate datasets, apply it independently to each feature and aggregate the anomaly scores using the maximum.

Training Donut on Missing Values Outlier (100%) throws ValueError

2018-06-13 06:56:47 [ERROR] src.evaluation.evaluator: An exception occured while training Donut on Syn Extreme Outliers (mis=1.0): `std` must be positive
2018-06-13 06:56:47 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "../src/evaluation/evaluator.py", line 67, in evaluate
    det.fit(X_train, y_train)
  File "../src/algorithms/donut.py", line 183, in fit
    trainer.fit(features, labels, missing, mean, std)
  File "../src/algorithms/donut.py", line 73, in fit
    aug = MissingDataInjection(mean, std, self._missing_data_injection_rate)
  File "/home/maxi/.local/lib/python3.6/site-packages/donut/augmentation.py", line 81, in __init__
    super(MissingDataInjection, self).__init__(mean, std)
  File "/home/maxi/.local/lib/python3.6/site-packages/donut/augmentation.py", line 19, in __init__
    raise ValueError('`std` must be positive')
ValueError: `std` must be positive

All models should support CUDA

Go go Pytorch magician @xasetl

KeyError: 'timestamps'

See current master

  File "main.py", line 101, in <module>
    main()
  File "main.py", line 12, in main
    run_pipeline()
  File "main.py", line 41, in run_pipeline
    evaluator.evaluate()
  File "/home/circleci/repo/src/evaluation/evaluator.py", line 42, in evaluate
    (X_train, y_train, X_test, y_test) = ds.data()
  File "/home/circleci/repo/src/datasets/dataset.py", line 29, in data
    self.load()
  File "/home/circleci/repo/src/datasets/synthetic_dataset.py", line 42, in load
    y_test = self._label_outliers(self.outlier_config)[train_split_point:]
  File "/home/circleci/repo/src/datasets/synthetic_dataset.py", line 50, in _label_outliers
    for ts in outlier['timestamps']:
KeyError: 'timestamps'

Validate DAGMM

Verify that the implemented model works like the implementation from the paper.

Benchmarking Pipeline

A specified set of algorithms should be evaluated on given data sets.

More experiments

High-Dimensional Data on agots Types
Missing on other agots Types
High-Dimensional Multivariate

Tune DAGMM with Ensemble

DAGMM_NNAutoEncoder_withWindow: RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'mat1'

2018-06-21 07:58:28 [INFO] src.evaluation.evaluator: Training DAGMM_NNAutoEncoder_withWindow on Syn Extreme Outliers (pol=0.25)
2018-06-21 07:58:28 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_NNAutoEncoder_withWindow on Syn Extreme Outliers (pol=0.25): Expected object of type torch.FloatTensor but found t
ype torch.cuda.FloatTensor for argument #4 'mat1'
2018-06-21 07:58:28 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
    det.fit(X_train, y_train)
  File "/repo/src/algorithms/dagmm.py", line 199, in fit
    self.dagmm_step(input_data.float())
  File "/repo/src/algorithms/dagmm.py", line 169, in dagmm_step
    enc, dec, z, gamma = self.dagmm(input_data)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/repo/src/algorithms/dagmm.py", line 48, in forward
    dec, enc = self.autoencoder(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/repo/src/algorithms/autoencoder.py", line 41, in forward
    enc = self._encoder(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/linear.py", line 55, in forward
    return F.linear(input, self.weight, self.bias)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 992, in linear
    return torch.addmm(bias, input, weight.t())
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'mat1'

Training Donut raises "Tensor had NaN values"

While training Donut on Synthetic Shift Outliers an exception occured:gradient for model/donut/p_x_given_z/mean/dense/bias:0 has numeric issue : Tensor had NaN values [[Node: quiet_donut_trainer_9/CheckNumerics_13 = CheckNumerics[T=DT_FLOAT, message="gradient for model/donut/p_x_given_z/mean/dense/bias:0 has numeric issue", _device="/job:localhost/replica:0/task:0/device:CPU:0"](quiet_donut_trainer_9/clip_by_norm_13/truediv)]]

Tune LSTM-Enc-Dec-Ensemble

Training missing outliers (75%) on DAGMM throws Exception

/home/maxi/.local/lib/python3.6/site-packages/numpy/linalg/linalg.py:1874: RuntimeWarning: invalid value encountered in det
  r = _umath_linalg.det(a, signature=signature)
2018-06-13 06:23:52 [ERROR] src.evaluation.evaluator: An exception occured while training DAGMM on Syn Extreme Outliers (mis=0.75): Threshold is NaN
2018-06-13 06:23:52 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "../src/evaluation/evaluator.py", line 68, in evaluate
    score = det.predict(X_test)
  File "../src/algorithms/dagmm.py", line 262, in predict
    raise Exception("Threshold is NaN")
Exception: Threshold is NaN

/home/maxi/.local/lib/python3.6/site-packages/numpy/lib/function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile
  interpolation=interpolation)

Add Sliding Window to DAGMM

Validate LSTM-AD

Verify that the implemented model works like the implementation from the paper.

remove usage of .cpu() in LSTMED and LSTMAD

https://pytorch.org/docs/stable/_modules/torch/distributions/multivariate_normal.html

LSTMAD: RuntimeError: Creating MTGP constants failed. at /pytorch/aten/src/THC/THCTensorRandom.cu:34

Traceback (most recent call last):
  File "main.py", line 123, in <module>
    main()
  File "main.py", line 17, in main
    run_experiments()
  File "main.py", line 93, in run_experiments
    detectors = [RecurrentEBM(num_epochs=15), LSTMAD(), Donut(), LSTM_Enc_Dec(num_epochs=15),
  File "/repo/src/algorithms/lstm_ad.py", line 53, in __init__
    torch.manual_seed(0)
  File "/usr/local/lib/python3.6/dist-packages/torch/random.py", line 33, in manual_seed
    torch.cuda.manual_seed_all(seed)
  File "/usr/local/lib/python3.6/dist-packages/torch/cuda/random.py", line 86, in manual_seed_all
    _lazy_call(lambda: _C._cuda_manualSeedAll(seed))
  File "/usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py", line 121, in _lazy_call
    callable()
  File "/usr/local/lib/python3.6/dist-packages/torch/cuda/random.py", line 86, in <lambda>
    _lazy_call(lambda: _C._cuda_manualSeedAll(seed))
RuntimeError: Creating MTGP constants failed. at /pytorch/aten/src/THC/THCTensorRandom.cu:34

Tune REBM with Entropy Filters

Add real datasets from chickenbestlover

Related code from his repo

Adapt Donut to Missing Data

Timestamp of saved plots should be an actual date

Having something like
reports/figures/roc_Synthetic\ Extreme\ Outliers-1-1-2018-06-06-081700.pdf
would be easier to interpret and keep track of than
reports/figures/roc_Synthetic\ Extreme\ Outliers-1-1-1528265710.pdf

DAGMM_LSTMAutoEncoder_withoutWindow: RuntimeError: Lapack Error getrf : U(3,3) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514

On current master

2018-06-22 10:41:54 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_LSTMAutoEncoder_withoutWindow on Synthetic Combined Outliers: Lapack Error getrf : U(3,3) is 0, U is singular at /
pytorch/aten/src/TH/generic/THTensorLapack.c:514
2018-06-22 10:41:54 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "/home/willi/Documents/MP-2018/src/evaluation/evaluator.py", line 71, in evaluate
    det.fit(X_train, y_train)
  File "/home/willi/Documents/MP-2018/src/algorithms/dagmm.py", line 196, in fit
    self.dagmm_step(input_data.float())
  File "/home/willi/Documents/MP-2018/src/algorithms/dagmm.py", line 172, in dagmm_step
    self.lambda_cov_diag)
  File "/home/willi/Documents/MP-2018/src/algorithms/dagmm.py", line 141, in loss_function
    sample_energy, cov_diag = self.compute_energy(z, phi, mu, cov)
  File "/home/willi/Documents/MP-2018/src/algorithms/dagmm.py", line 107, in compute_energy
    cov_inverse.append(torch.inverse(cov_k).unsqueeze(0))
RuntimeError: Lapack Error getrf : U(3,3) is 0, U is singular at /pytorch/aten/src/TH/generic/THTensorLapack.c:514

Experiment with Entropy-Based Ideas

As discussed in the previous meeting, it might be possible to discard data from training based on certain entropy values to increase the robustness of the algorithm to noise in the training data.

LSTM-AD von Syn Extreme Outliers (mis=0.025)

2018-06-28 11:03:43 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "/repo/src/evaluation/evaluator.py", line 72, in evaluate
    score = det.predict(X_test)
  File "/repo/src/algorithms/lstm_ad.py", line 99, in predict
    scores = -multivariate_normal.logpdf(norm, mean=self.mean, cov=self.cov, allow_singular=True)
  File "/usr/local/lib/python3.6/dist-packages/scipy/stats/_multivariate.py", line 487, in logpdf
    psd = _PSD(cov, allow_singular=allow_singular)
  File "/usr/local/lib/python3.6/dist-packages/scipy/stats/_multivariate.py", line 152, in __init__
    s, u = scipy.linalg.eigh(M, lower=lower, check_finite=check_finite)
  File "/usr/local/lib/python3.6/dist-packages/scipy/linalg/decomp.py", line 374, in eigh
    a1 = _asarray_validated(a, check_finite=check_finite)
  File "/usr/local/lib/python3.6/dist-packages/scipy/_lib/_util.py", line 238, in _asarray_validated
    a = toarray(a)
  File "/usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py", line 1233, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

DAGMM_LSTMAutoEncoder_withWindow: RuntimeError: parameter types mismatch

2018-06-21 07:50:55 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_LSTMAutoEncoder_withWindow on Syn Extreme Outliers (pol=0.0): parameter types mismatch
2018-06-21 07:50:55 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
    det.fit(X_train, y_train)
  File "/repo/src/algorithms/dagmm.py", line 199, in fit
    self.dagmm_step(input_data.float())
  File "/repo/src/algorithms/dagmm.py", line 169, in dagmm_step
    enc, dec, z, gamma = self.dagmm(input_data)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/repo/src/algorithms/dagmm.py", line 48, in forward
    dec, enc = self.autoencoder(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/repo/src/algorithms/autoencoder.py", line 77, in forward
    _, enc_hidden = self.encoder(ts_batch.float(), enc_hidden)  # .float() here or .double() for the model
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 192, in forward
    output, hidden = func(input, self.all_weights, hx, batch_sizes)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py", line 323, in forward
    return func(input, *fargs, **fkwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py", line 287, in forward
    dropout_ts)
RuntimeError: parameter types mismatch

add log file for used algorithms and their set parameters and the selected datasets

AttributeError: 'LSTM_Enc_Dec' object has no attribute 'seed'

See curret master:

2018-06-09 18:24:26 [ERROR] src.evaluation.evaluator: An exception occured while training LSTM-Enc-Dec on Synthetic Extreme Outliers: 'LSTM_Enc_Dec' object has no attribute 'seed'
2018-06-09 18:24:26 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "/home/circleci/repo/src/evaluation/evaluator.py", line 46, in evaluate
    det.fit(X_train, y_train)
  File "/home/circleci/repo/src/algorithms/lstm_enc_dec.py", line 100, in fit
    self._fit(train_timeseries_dataset)
  File "/home/circleci/repo/src/algorithms/lstm_enc_dec.py", line 193, in _fit
    self._save_checkpoint(epoch, self.best_val_loss, means=means, covs=covs)
  File "/home/circleci/repo/src/algorithms/lstm_enc_dec.py", line 218, in _save_checkpoint
    'seed': self.seed,
AttributeError: 'LSTM_Enc_Dec' object has no attribute 'seed'

Fix type of returned data in `air_quality` dataset

Currently sets _data to a pd.DataFrame, needs to be tuple of (pd.Dataframe, pd.Series, pd.DataFrame, pd.Series) for X_train, y_train, X_test, y_train.

Running main.ipynb raises SystemExit: 2

Can anyone reproduce it?

Validate Donut

Verify that the implemented model works like the implementation from the paper.

Fix third_party code PEP8 violations

Plot tables and figures that are paper-ready

main.py throws "ImportError: cannot import name 'SyntheticData'"

(venv3) ➜  MP-2018 git:(master) python main.py
Traceback (most recent call last):
  File "main.py", line 5, in <module>
    from src.datasets.synthetic_data_generator import SyntheticData
ImportError: cannot import name 'SyntheticData'

Tables & Plots

Generate proper tables and plots from the evaluation results.

ValueError in Donut

This is the output of running Donut on the new multivariate outliers (PR #72 ):

2018-06-13 01:05:21 [ERROR] src.evaluation.evaluator: An exception occured while training Donut on Synthetic Multivariate Outliers: The shape of ``arrays[1]`` does not agree with the shape of `timestamp` ((1000, 1) vs (1000,))
2018-06-13 01:05:21 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "../src/evaluation/evaluator.py", line 67, in evaluate
    det.fit(X_train, y_train)
  File "../src/algorithms/donut.py", line 160, in fit
    timestamps, missing, (features, labels) = complete_timestamp(timestamps, (features, labels))
  File "/home/maxi/.local/lib/python3.6/site-packages/donut/preprocessing.py", line 36, in complete_timestamp
    format(i, array.shape, timestamp.shape))
ValueError: The shape of ``arrays[1]`` does not agree with the shape of `timestamp` ((1000, 1) vs (1000,))

18-06-21 07:12:00 [ERROR] src.evaluation.evaluator: An exception occurred while training Recurrent EBM on Synthetic Variance Outliers: Failed to create session.
2018-06-21 07:12:00 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
    det.fit(X_train, y_train)
  File "/repo/src/algorithms/rnn_ebm.py", line 43, in fit
    self.tf_session = tf.Session()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1560, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 633, in __init__
    self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

plot_auroc: TypeError: 'NoneType' object is not subscriptable

On the current master:

Traceback (most recent call last):
  File "main.py", line 118, in <module>
    main()
  File "main.py", line 17, in main
    run_experiments()
  File "main.py", line 88, in run_experiments
    steps=1)
  File "/repo/experiments.py", line 49, in run_extremes_experiment
    evaluator.plot_auroc(title='Area under the curve for differing outlier heights')
  File "/repo/src/evaluation/evaluator.py", line 220, in plot_auroc
    aurocs = self.benchmark_results[self.benchmark_results['algorithm'] == det.name]['auroc']
TypeError: 'NoneType' object is not subscriptable

missing data
polluted data
seasonality

DAGMM_LSTMAutoEncoder_withWindow: RuntimeError: CUDNN_STATUS_EXECUTION_FAILED

2018-06-21 07:11:59 [ERROR] src.evaluation.evaluator: An exception occurred while training DAGMM_LSTMAutoEncoder_withWindow on Synthetic Extreme Outliers: CUDNN_STATUS_EXECUTION_FAILED
2018-06-21 07:11:59 [ERROR] src.evaluation.evaluator: Traceback (most recent call last):
  File "/repo/src/evaluation/evaluator.py", line 71, in evaluate
    det.fit(X_train, y_train)
  File "/repo/src/algorithms/dagmm.py", line 199, in fit
    self.dagmm_step(input_data.float())
  File "/repo/src/algorithms/dagmm.py", line 169, in dagmm_step
    enc, dec, z, gamma = self.dagmm(input_data)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/repo/src/algorithms/dagmm.py", line 48, in forward
    dec, enc = self.autoencoder(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/repo/src/algorithms/autoencoder.py", line 77, in forward
    _, enc_hidden = self.encoder(ts_batch.float(), enc_hidden)  # .float() here or .double() for the model
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 192, in forward
    output, hidden = func(input, self.all_weights, hx, batch_sizes)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py", line 323, in forward
    return func(input, *fargs, **fkwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/_functions/rnn.py", line 287, in forward
    dropout_ts)
RuntimeError: CUDNN_STATUS_EXECUTION_FAILED

kdd-opensource / deepadots Goto Github PK

deepadots's Introduction

Anomaly Detection on Time Series: An Evaluation of Deep Learning Methods.

Implemented Algorithms

Usage

Example

Deployment

Authors/Contributors

Credits

deepadots's People

Contributors

Stargazers

Watchers

Forkers

deepadots's Issues

Recommend Projects

Recommend Topics

Recommend Org