RNN based Time-series Anomaly detector model implemented in Pytorch.
This is an implementation of an RNN-based time-series anomaly detector that uses a two-stage strategy: time-series prediction followed by anomaly score calculation.
- Ubuntu 16.04+ (Errors have been reported on Windows 10; see issue. Suggestions are welcome.)
- Python 3.5+
- PyTorch 0.4.0+
- Numpy
- Matplotlib
- Scikit-learn
1. NYC taxi passenger count
- The New York City taxi passenger data stream, provided by the New York City Transportation Authority
- Preprocessed (aggregated at 30-minute intervals) by Cui, Yuwei, et al. in "A comparative study of HTM and other neural network models for online sequence learning with streaming data." 2016 International Joint Conference on Neural Networks (IJCNN). IEEE, 2016, code
2. Electrocardiograms (ECGs)
- An ECG dataset containing a single anomaly corresponding to a pre-ventricular contraction
3. 2D gesture (video surveillance)
- X-Y coordinates of a hand gesture in a video
4. Respiration
- A patient's respiration (measured by thorax extension, sampling rate 10 Hz)
5. Space shuttle
- Space Shuttle Marotta Valve time-series
6. Power demand
- One year's power demand at a Dutch research facility
Time series 2 to 6 are provided by E. Keogh et al. in "HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence." In The Fifth IEEE International Conference on Data Mining (2005), dataset
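The 30-minute aggregation mentioned for the NYC taxi data can be illustrated with a small sketch. It assumes the raw data is a list of (timestamp, count) pairs and that aggregation means summing counts per bucket. The function name `aggregate_30min` and the sample values are hypothetical and are not taken from the original preprocessing code.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_30min(records):
    """Sum (timestamp, count) records into 30-minute buckets.

    `records` is a hypothetical list of (datetime, int) pairs.
    """
    buckets = defaultdict(int)
    for ts, count in records:
        # Floor the timestamp to the start of its 30-minute bucket.
        bucket_start = ts.replace(
            minute=(ts.minute // 30) * 30, second=0, microsecond=0
        )
        buckets[bucket_start] += count
    return sorted(buckets.items())


# Made-up sample records, for illustration only.
records = [
    (datetime(2014, 7, 1, 0, 5), 3),
    (datetime(2014, 7, 1, 0, 25), 2),
    (datetime(2014, 7, 1, 0, 40), 4),
    (datetime(2014, 7, 1, 1, 10), 1),
]
series = aggregate_30min(records)
for bucket_start, total in series:
    print(bucket_start, total)
```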
- RNN based Multi-step predictor
- Multivariate Gaussian distribution based anomaly detector
- Anomaly score predictor
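As a rough illustration of the Gaussian-based detector, the sketch below fits a mean and covariance to prediction-error vectors from normal data and scores new error vectors by their squared Mahalanobis distance. This is an assumption about how such a detector is commonly built, not a copy of this repository's code. The helper names `fit_gaussian` and `anomaly_scores` and the synthetic error data are hypothetical.

```python
import numpy as np


def fit_gaussian(errors):
    """Estimate the mean and covariance of error vectors, shape (n_samples, dim)."""
    mean = errors.mean(axis=0)
    cov = np.cov(errors, rowvar=False)
    return mean, cov


def anomaly_scores(errors, mean, cov):
    """Squared Mahalanobis distance of each error vector from the fitted Gaussian."""
    diff = errors - mean
    # The pseudo-inverse guards against a singular covariance matrix.
    inv_cov = np.linalg.pinv(cov)
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)


# Synthetic error vectors standing in for errors on anomaly-free data.
rng = np.random.default_rng(0)
normal_errors = rng.normal(0.0, 1.0, size=(500, 3))
mean, cov = fit_gaussian(normal_errors)

# One typical error vector and one far from the fitted distribution.
test_errors = np.array([[0.0, 0.0, 0.0], [8.0, -8.0, 8.0]])
scores = anomaly_scores(test_errors, mean, cov)
print(scores)
```

A larger score means the error vector is less likely under the fitted Gaussian. A threshold on the score would then flag anomalies. How that threshold is chosen is left open here.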
0. Download the dataset: Download the five multivariate time-series datasets (ecg, gesture, power_demand, respiration, space_shuttle) and label all the anomalous points in them.
python 0_download_dataset.py
1. Time-series prediction: Train and save an RNN-based time-series prediction model on a single time-series training set
python 1_train_predictor.py --data ecg --filename chfdb_chf14_45590.pkl
python 1_train_predictor.py --data nyc_taxi --filename nyc_taxi.pkl
Train multiple models using the bash script
./1_train_predictor_all.sh
2. Anomaly detection: Fit a multivariate Gaussian distribution and calculate anomaly scores on a single time-series test set
python 2_anomaly_detection.py --data ecg --filename chfdb_chf14_45590.pkl --prediction_window 10
python 2_anomaly_detection.py --data nyc_taxi --filename nyc_taxi.pkl --prediction_window 10
Test multiple models using the bash script
./2_anomaly_detection_all.sh
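To show how a prediction window can turn multi-step predictions into error vectors for the Gaussian fit, here is a minimal sketch. It assumes that `--prediction_window` is the number of steps predicted ahead and that multi-step predictions come from feeding each prediction back as input. Neither assumption is confirmed by the commands above. The functions `rollout`, `error_vectors`, and `persistence` are hypothetical, and a trivial "repeat the last value" predictor stands in for the trained RNN.

```python
def rollout(one_step, history, window):
    """Predict `window` future values by feeding each prediction back as input."""
    context = list(history)
    preds = []
    for _ in range(window):
        nxt = one_step(context)
        preds.append(nxt)
        context.append(nxt)
    return preds


def error_vectors(series, one_step, window):
    """For each target time s, collect the errors of the predictions made
    for series[s] from 1..window steps earlier."""
    n = len(series)
    # forecasts[t][h - 1] is the prediction for series[t + h] made from series[:t + 1].
    forecasts = [rollout(one_step, series[: t + 1], window) for t in range(n)]
    vectors = {}
    for s in range(window, n):
        vectors[s] = [
            forecasts[s - h][h - 1] - series[s] for h in range(1, window + 1)
        ]
    return vectors


def persistence(context):
    """Toy stand-in for a trained model: predict the last observed value."""
    return context[-1]


# Made-up series with a spike at index 4.
series = [0.0, 1.0, 2.0, 3.0, 10.0, 5.0]
vecs = error_vectors(series, persistence, window=2)
for s, vec in sorted(vecs.items()):
    print(s, vec)
```

Each time step then has one error vector with `window` entries, which is the kind of input a multivariate Gaussian can be fitted to. In this toy series the spike at index 4 produces the largest errors.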
1. Time-series prediction: Predictions from the stacked RNN model
2. Anomaly detection: Anomaly scores from the multivariate Gaussian distribution model
- NYC taxi passenger count
- Electrocardiograms (ECGs) (filename: chfdb_chf14_45590)
If you have any questions, please open an issue.