d-ailin / gdn
Implementation code for the paper "Graph Neural Network-Based Anomaly Detection in Multivariate Time Series" (AAAI 2021)
License: MIT License
I tried to run the code on an Ubuntu 16.04 machine without a GPU.
I ran the CPU version of the code, but it does not work.
~/gitlab/GDN-main$ bash run.sh cpu msl
Traceback (most recent call last):
File "main.py", line 18, in <module>
from models.GDN import GDN
File "/home/giorgi/gitlab/GDN-main/models/GDN.py", line 12, in <module>
from .graph_layer import GraphLayer
File "/home/giorgi/gitlab/GDN-main/models/graph_layer.py", line 4, in <module>
from torch_geometric.nn.conv import MessagePassing
File "/home/giorgi/gitlab/GDN-main/gnn_venv/lib/python3.6/site-packages/torch_geometric/__init__.py", line 2, in <module>
import torch_geometric.nn
File "/home/giorgi/gitlab/GDN-main/gnn_venv/lib/python3.6/site-packages/torch_geometric/nn/__init__.py", line 2, in <module>
from .data_parallel import DataParallel
File "/home/giorgi/gitlab/GDN-main/gnn_venv/lib/python3.6/site-packages/torch_geometric/nn/data_parallel.py", line 5, in <module>
from torch_geometric.data import Batch
File "/home/giorgi/gitlab/GDN-main/gnn_venv/lib/python3.6/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
from .data import Data
File "/home/giorgi/gitlab/GDN-main/gnn_venv/lib/python3.6/site-packages/torch_geometric/data/data.py", line 7, in <module>
from torch_sparse import coalesce
File "/home/giorgi/gitlab/GDN-main/gnn_venv/lib/python3.6/site-packages/torch_sparse/__init__.py", line 13, in <module>
library, [osp.dirname(__file__)]).origin)
File "/home/giorgi/gitlab/GDN-main/gnn_venv/lib/python3.6/site-packages/torch/_ops.py", line 105, in load_library
ctypes.CDLL(path)
File "/opt/anaconda3/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.10: cannot open shared object file: No such file or directory
Hi, thank you so much for your wonderful work!
I'm working on anomaly detection too, but I'm a beginner with graph networks. I have 2 questions:
Sorry if these are basic questions. I'm looking forward to your reply!
Thank you so much!
Hello, friend. Thank you for your wonderful open source work. I'm a novice programmer. I'd like to ask how the code in the eval_scores function works. Some of the variables, such as th_steps and fmeas, are hard to find in the paper, and I am very confused. I look forward to your help. Thank you!
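For readers with the same question, here is a minimal sketch of a threshold sweep over anomaly scores. It is not the repository's eval_scores function. It assumes that th_steps is the number of candidate thresholds and that fmeas holds the F1 score (F-measure) obtained at each candidate; the helper names are hypothetical.

```python
def precision_recall_f1(scores, labels, threshold):
    """Flag scores above the threshold and compare them with binary labels."""
    preds = [1 if s > threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def sweep_thresholds(scores, labels, th_steps=400):
    """Try th_steps evenly spaced thresholds; return (fmeas, thresholds)."""
    lo, hi = min(scores), max(scores)
    thresholds = [lo + (hi - lo) * i / th_steps for i in range(th_steps)]
    fmeas = [precision_recall_f1(scores, labels, th)[2] for th in thresholds]
    return fmeas, thresholds


# Toy data: two normal points followed by two anomalous points.
scores = [0.1, 0.2, 0.9, 0.8]
labels = [0, 0, 1, 1]
fmeas, thresholds = sweep_thresholds(scores, labels, th_steps=10)
best_f1 = max(fmeas)
best_threshold = thresholds[fmeas.index(best_f1)]
print(best_f1, best_threshold)
```

Picking the threshold that maximizes F1 on the test labels gives an optimistic, best-case number, which is worth keeping in mind when comparing it with a threshold chosen on validation data.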
Thank you for sharing the code.
One little question, does the CUDA version have to be 10.2?
Does the OSError have anything to do with this?
Thanks again
Hi, thanks for your excellent work and for releasing your code.
When I process the SWaT dataset with process_swat.py, I run into this error:
ValueError: could not convert string to float: '124,3135'
I don't know what the ',' means. Is it equivalent to '.'?
I would appreciate any advice!
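If the comma is a locale-specific decimal separator (an assumption; the export would need to be checked), the value can be normalized before conversion. The sketch below uses a hypothetical helper, parse_decimal, and would give wrong results for values that use commas as thousands separators.

```python
def parse_decimal(text: str) -> float:
    """Parse a number that may use ',' instead of '.' as the decimal separator.

    Assumes the comma is never a thousands separator.
    """
    cleaned = text.strip().replace(",", ".")
    return float(cleaned)


values = [parse_decimal(v) for v in ["124,3135", "2.5", " 7 "]]
print(values)
```

When loading the file with pandas, the decimal=',' argument of read_csv is another option under the same assumption.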
Hello, thank you for sharing your work, it is interesting. While trying to reproduce your results on the SWaT dataset, I got confused about which subset you used to get the results in your paper. Is it only A1 & A2? I checked the original SWaT_Dataset_Normal_v0.xlsx file and it has 496801 samples, not the 47515 mentioned in your paper. Thanks a lot and looking forward to your reply.
If I get a very large total_error_score in one dimension, what does that mean?
Hi,
I encounter the following error when running the command "bash run.sh 0 msl". Any idea what causes it? Thanks!
Traceback (most recent call last):
File "main.py", line 256, in <module>
main.run()
File "main.py", line 108, in run
self.train_log = train(self.model, model_save_path,
File "/media/home/jzhu/projects/GDN-main/train.py", line 70, in train
out = model(x, edge_index).float().to(device)
File "/media/home/jzhu/anaconda3/envs/py3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/media/home/jzhu/projects/GDN-main/models/GDN.py", line 167, in forward
gcn_out = self.gnn_layers[i](x, batch_gated_edge_index, node_num=node_num*batch_num, embedding=all_embeddings)
File "/media/home/jzhu/anaconda3/envs/py3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/media/home/jzhu/projects/GDN-main/models/GDN.py", line 73, in forward
out, (new_edge_index, att_weight) = self.gnn(x, edge_index, embedding, return_attention_weights=True)
File "/media/home/jzhu/anaconda3/envs/py3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/media/home/jzhu/projects/GDN-main/models/graph_layer.py", line 72, in forward
out = self.propagate(edge_index, x=x, embedding=embedding, edges=edge_index,
File "/media/home/jzhu/anaconda3/envs/py3/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 253, in propagate
out = self.aggregate(out, **aggr_kwargs)
File "/media/home/jzhu/anaconda3/envs/py3/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 287, in aggregate
return scatter(inputs, index, dim=self.node_dim, dim_size=dim_size,
File "/media/home/jzhu/anaconda3/envs/py3/lib/python3.8/site-packages/torch_scatter/scatter.py", line 153, in scatter
return scatter_sum(src, index, dim, out, dim_size)
File "/media/home/jzhu/anaconda3/envs/py3/lib/python3.8/site-packages/torch_scatter/scatter.py", line 11, in scatter_sum
index = broadcast(index, src, dim)
File "/media/home/jzhu/anaconda3/envs/py3/lib/python3.8/site-packages/torch_scatter/utils.py", line 13, in broadcast
src = src.expand_as(other)
RuntimeError: The expanded size of the tensor (1) must match the existing size (4320) at non-singleton dimension 1. Target sizes: [4320, 1, 64]. Tensor sizes: [1, 4320, 1]
Hi, I'm very interested in your research. Thank you for your great paper.
Could you explain how to visualize the learned graph layout?
I want to output a directed graph with attention weights, like Fig. 3 in your paper.
I extracted the learned graph, attention weights and edge index from the best trained model with the following code.
learnd_graph = best_model.learned_graph
att_weight = best_model.gnn_layers[0].att_weight_1
edge_index = best_model.gnn_layers[0].edge_index_1
But I don't know how to match the learned graph to the attention weights, because their dimensions are quite different.
I mean, att_weight and edge_index have more edges than expected.
Am I referencing the wrong variables?
It would help me a lot if you provided the code to visualize the graph.
I'm looking forward to your reply.
Hi, thanks for your excellent work and for releasing your code.
When I process the SWaT dataset with process_swat.py, I run into this problem:
My SWaT file, SWaT_Dataset_Normal_v0.xlsx, has 51 dimensions excluding Timestamp and the attack label, but your dataset has 27 dimensions.
I also ran into the same problem when using process_wadi.py to process the WADI dataset.
Hi, thanks for sharing your excellent work!
I have questions about the output visualization in Figure 3 of your paper.
1) For the right figure, I also checked issue #30.
Could you share sample visualization code (for example, which plotting package you used)?
2) For the left figure, which data do I have to use?
In main.py, I guess I have to use test_result for plotting the predicted results.
The prediction plot is test_result[0][:, each sensor]
The observation plot is test_result[1][:, each sensor]
Are these right?
Thanks for your help
Hello.
We are recently studying the problem of anomaly detection based on time-series data, and in the process of our research, we found your paper, and we are very interested in your GDN model, thank you for sharing it.
We are currently reproducing your GDN model, and the results are very good on the SWaT dataset, but not so good on the WADI dataset (f1:0.45, pre:0.886, recall:0.303), so I would like to ask you what is wrong with my reproduction experiments, and I hope I can get your help. Again, thank you.
Hello,
I am currently working in anomaly detection and I am very interested in your research. Thank you for sharing your work.
I am using your code to perform anomaly detection on the Server Machine Dataset (SMD). However while performing the experiments I observed that the training on GPU is not deterministic (even after setting the random seed and other measures for reproducibility). Have you also observed the same behavior? Are graph neural networks in general non deterministic or is this something specific to your model?
Thank you, once again. I am looking forward to your reply as this will help me further understand the issue and go about with the experimenting.
Best regards,
Rashmi
Hi, thanks so much for your sharing.
I am now trying to reproduce your results on WADI dataset.
I use the following hyperparameters:
But the results are around 4% below the results in your paper.
May I ask whether there are other hyperparameters to set, or something I need to change? I am using the OCT data to train and test (if I use the NOV data to train, the results are even lower). I actually achieved the same results on SWaT with the settings you mentioned in another issue, but the WADI results are confusing. Thanks a lot and looking forward to your reply!
Hello, I recently started working on time series anomaly detection, and I would like to ask you two basic questions.
During training, there is an edge between two nodes, but after one optimization, the edge between the two nodes disappears. Then how does GDN deal with the attention weight of the new edge?
Does every sliding window produce a graph? Or does GDN optimize only one graph structure from beginning to end until the graph structure becomes stable?
Sorry to bother you!
Looking forward to your reply!
Hi, my name is Alessio and I'm writing my thesis on methods for anomaly detection in time series. I found your paper and would like to implement it in TensorFlow, but I have trouble understanding the exact steps in this function. Could you briefly explain it?
Line 122 in 47d2663
Thanks in advance for your answer.
Have a nice day!
Hello, I am very interested in your research. When I was trying to understand the code, I had some questions, which are summarized as follows:
Hello, your work is interesting and impressive, which is very useful to me.
I reproduced the SWaT and WADI results, but there were some problems.
Here are the results of the SWaT dataset:
F1 score: 0.7717
Precision: 0.9717
Recall: 0.6398
Auc: 0.8517
Here are the results of the WADI dataset:
F1 score: 0.2442
Precision: 0.2052
Recall: 0.3019
Auc: 0.7083
There is a big gap between my WADI results and the paper. I tried multiple random seeds, but the results were similar. In addition, the AUC is higher than the precision and recall. Is this normal?
Any suggestions would be appreciated! Thank you very much!
Hi, thanks for your excellent work and for releasing your code.
The following is based on experiments on the WADI dataset.
I have been confused by the recall and precision in my experiments. The recall was always much higher than the precision, even when I tried other combinations of parameters.
I was unsure whether the cause was my hand-crafted labeling of the test data, so I also used the processed data you kindly provided. But the result remained the same.
By the way, I am also unsure about the way I processed the dataset: I read the attack document you provided and simply set "attack" to 1 for every row that falls within a time span in which an anomaly happened according to the document.
I don't know whether this is the correct way to do it :(
I look forward to your reply!
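The labeling approach described above can be sketched as follows. This is only an illustration: the timestamp format, the inclusive window endpoints, and the example window are assumptions, not values taken from the attack document.

```python
from datetime import datetime

# Assumed day/month/year format; adjust to the dataset's actual format.
FMT = "%d/%m/%y %H:%M:%S"


def label_rows(timestamps, attack_windows):
    """Return 1 for each timestamp inside any (start, end) window, else 0."""
    windows = [
        (datetime.strptime(start, FMT), datetime.strptime(end, FMT))
        for start, end in attack_windows
    ]
    labels = []
    for ts in timestamps:
        t = datetime.strptime(ts, FMT)
        inside = any(start <= t <= end for start, end in windows)
        labels.append(1 if inside else 0)
    return labels


# Made-up window and timestamps, for illustration only.
windows = [("01/01/20 10:00:00", "01/01/20 10:05:00")]
rows = ["01/01/20 09:59:59", "01/01/20 10:00:00",
        "01/01/20 10:05:00", "01/01/20 10:05:01"]
print(label_rows(rows, windows))
```

A day/month mix-up in the format string silently shifts every window, so it is worth checking the total number of rows labeled 1 against the attack durations in the document.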
Hi, first of all, thank you very much for sharing the code. I am trying it out with the environment configured according to your requirements (PyG=1.5.0, pytorch=1.5.1). It runs normally with the msl data, but when I train on the wadi dataset, the values of Loss and ACU_loss are both NaN after each epoch (at least for the first 15 epochs), which seems unusual. The wadi data has been modified to be consistent with the msl format (for example, 0 means normal and 1 means attack), it contains 127 sequences, and the code has not been changed. Does the dataset need some preprocessing?
Best Regards
Why are the GNN layers created per edge set rather than per node?
self.gnn_layers = nn.ModuleList([ GNNLayer(input_dim, dim, inter_dim=dim+embed_dim, heads=1) for i in range(edge_set_num) ])
Your work is amazing and impressive. When I was trying to reproduce your results on the wadi dataset, there were some problems that I was not clear about.
And I got these results on wadi data:
F1 score: 0.3994959042218021
precision: 0.5382003395585738
recall: 0.31795386158475425
So I was wondering how did you process the dataset? Any suggestions would be appreciated! Thank you so much! :)
During the experiment on SWaT and WADI data, did you use normalization techniques such as min-max normalization?
Hi, do you have the code for latest pytorch version? Currently it works for only old versions of pytorch libraries.
It would be very helpful if you shared the latest version.
According to your code in TimeDataset.py you import a Scikit-learn module to do the normalization of the data, but it is never used. Don't you think it could be harmful for the model if there is a big difference in scale between the different measurements?
Thanks !
Kind regards,
Sébastien de Blois
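For reference, a minimal sketch of min-max normalization that fits the per-sensor range on the training data and reuses it for the test data. It does not reproduce the repository's preprocessing; the function names are hypothetical and constant columns are mapped to 0.0 by assumption.

```python
def fit_min_max(train_columns):
    """Compute (min, max) per sensor column from training data only."""
    return [(min(col), max(col)) for col in train_columns]


def apply_min_max(columns, stats):
    """Scale each column with the training statistics; constant columns map to 0.0."""
    scaled = []
    for col, (lo, hi) in zip(columns, stats):
        span = hi - lo
        scaled.append([(v - lo) / span if span else 0.0 for v in col])
    return scaled


# Two sensors with very different scales; the second one is constant in training.
train = [[0.0, 5.0, 10.0], [100.0, 100.0, 100.0]]
test = [[20.0], [100.0]]
stats = fit_min_max(train)
print(apply_min_max(train, stats))
print(apply_min_max(test, stats))
```

Test values can fall outside [0, 1] because the range comes from the training data, which is expected when anomalies push a sensor beyond its normal operating range.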
Thank you very much for sharing the code. According to the paper, I preprocessed the SWaT data, and then ran the code to get the following results:
F1 score: 0.7710816777041943
precision: 0.9705391884380211
recall: 0.6394433253982788
And the command for running the code is:
python main.py -dataset SWaT \
    -save_path_pattern SWaT \
    -slide_stride 1 \
    -slide_win 5 \
    -batch 16 \
    -epoch 30 \
    -comment SWaT \
    -random_seed 6799 \
    -decay 0 \
    -dim 64 \
    -out_layer_num 1 \
    -out_layer_inter_dim 1 \
    -val_ratio 0.1 \
    -report best \
    -topk 15
In addition, when I use the validation set to select the threshold by setting '-report' to 'val', I get
F1 score: 0.5206937055206247
precision: 0.4181091877496671
recall: 0.6899835195019227
Are my parameter settings and preprocessing correct? If they are, what should I do to further improve the accuracy? Thanks!
Hello, thank you very much for your work. This paper and code have inspired me a lot. Thank you very much!
While reproducing the results, I ran into some problems. I am not clear about the source of the MSL dataset, and I wonder if you could give a more detailed explanation. I am a beginner in Python and neural networks and cannot read the code yet, so I use your code as a black box. For the MSL dataset, the F1 score is about 0.88-0.90 over multiple runs. As an experiment, I kept the labels of your dataset unchanged, replaced all other data with random numbers (RAND()), and ran the code again: the F1 score was still 0.88-0.90. I'm not sure why, and I wonder if you could enlighten me. Thank you very much.
In addition, since I cannot read the code well, I may encounter many problems later. I would sincerely like to get in touch with you and hope to get some advice. I didn't find your email address, so I don't know how to contact you. My email is [email protected] and I look forward to hearing from you.
Are you planning to update the training process to support another CUDA version, e.g. 11.6?
Thanks for sharing this wonderful work with the public. To visualize the learned graph from the attention weights, I have used your snippet from issue #30. However, I am still confused about how to average weight_mat when batch_size is not equal to 1. Do I just have to divide weight_mat by batch_size, as in the following snippet?
coeff_weights = model.gnn_layers[0].att_weight_1.cpu().detach().numpy()
edge_index = model.gnn_layers[0].edge_index_1.cpu().detach().numpy()
weight_mat = np.zeros((feature_num, feature_num))
for i in range(len(coeff_weights)):
    edge_i, edge_j = edge_index[:, i]
    edge_i, edge_j = edge_i % feature_num, edge_j % feature_num
    weight_mat[edge_i][edge_j] += coeff_weights[i]
# Next, you could average weight_mat if you use batches or directly use the result if you only use batch=1.
weight_mat /= batch_size
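One way to sidestep the batch-size question is to average each matrix entry by how often that edge actually occurred. The sketch below is pure Python with toy values and is not code from the repository. It assumes, as the modulo in the snippet above suggests, that node indices of the batched graph are offset by feature_num per sample, and that each attention weight is a plain float.

```python
def average_attention_matrix(edge_index, att_weights, feature_num):
    """Average attention weights per (row, column) pair over all occurrences.

    edge_index is a pair of equal-length index lists, as in a 2 x E edge tensor.
    """
    sums = [[0.0] * feature_num for _ in range(feature_num)]
    counts = [[0] * feature_num for _ in range(feature_num)]
    first_row, second_row = edge_index
    for a, b, w in zip(first_row, second_row, att_weights):
        i, j = a % feature_num, b % feature_num  # fold batch offsets back
        sums[i][j] += w
        counts[i][j] += 1
    return [
        [sums[i][j] / counts[i][j] if counts[i][j] else 0.0
         for j in range(feature_num)]
        for i in range(feature_num)
    ]


# Toy batch of 2 samples with 2 sensors each: the same two edges occur twice.
edge_index = [[0, 1, 2, 3], [1, 0, 3, 2]]
att_weights = [0.2, 0.4, 0.6, 0.8]
mat = average_attention_matrix(edge_index, att_weights, feature_num=2)
print(mat)
```

Dividing by batch_size gives the same result whenever every edge appears exactly batch_size times; counting occurrences also covers a smaller final batch or duplicated edges.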
Hello, I have a question about the WADI dataset.
I requested the WADI dataset from iTrust and will follow your preprocessing steps.
There are two versions of the WADI dataset, WADI.A1_9 Oct 2017 and WADI.A2_19 Nov 2019, and a table_WADI.pdf for each version.
Q1. I think you used the former version (WADI.A1_9 Oct 2017), is it right?
Q2. In the attack description document for the version (table_WADI.pdf), the start date of the 9th attack is 10/10/17 10:55:00, but I think it's a typo. In your implementation, are the start and end time of the 9th attack 11/10/17 10:55:00 and 11/10/17 10:56:27, respectively?
I preprocessed SWaT dataset according to scripts/process_swat.py
I set the hyperparameters for model training according to your comment:
#4 (comment)
I got significantly different scores compared to paper results.
F1 score: 0.7699076110866696
precision: 0.9727626459143969
recall: 0.6371745858365192
Can you confirm that I set the correct hyperparameters?
seed=6799
BATCH_SIZE=16
SLIDE_WIN=5
dim=64
out_layer_num=1
SLIDE_STRIDE=1
topk=15
out_layer_inter_dim=128
val_ratio=0.1
decay=0
EPOCH=50
report='best'
Hello, I want to run your code on SWaT and WADI datasets, but how can I get train.csv, test.csv, and list.txt? Is WADI_14days_new.csv in the WADI dataset the original training set (WADI_attackdataLABLE.csv is the original test set)? Then it is processed by process() in TimeDataset.py to become the training data set used in your article?
Looking forward to your reply, thank you very much!
Hello, thank you for your creative paper, which is very helpful to me.
I had some problems running your code. May I ask for some advice?
Hello, thanks for sharing your excellent work!
May I ask how to get the attention weights for each edge? I want to try to draw Figure 3 ( Left ).
Another question is about using directed graphs for anomaly localization. As mentioned in the paper, 1_FIT_001_PV is the attacked object, and 1_MV_001_STATUS is detected as abnormal, and we are checking the weight of the edge pointed to 1_MV_001_STATUS (neighbor-->1_MV_001_STATUS). Do we need to check edges (1_MV_001_STATUS-->neighbor) which have high attention weights? Is this the point of the model using a directed graph?
Thank you very much!
Hi, I am very interested in your model.
I would like to know how it should be set up to reproduce the results in table 3.
If possible, I would also like to ask how to get sensor embedding vectors to draw Figure 2.
Hi, I have a question: what is the meaning of a(T) in Equation 7 of your paper? Is it a value or a tensor?
Thank you for your excellent work!
I do have a question regarding your implementation of your graph_layer. Why do you need the following three lines?
Lines 61 to 63 in 9853899
Thanks!
Thank you so much for being able to open source the code!
Which lines of code correspond to formulas 5 to 8?
I really don't understand how the attention mechanism is added to the model.
When I run the code, the above line is wrong, I think it should be "alpha = softmax(alpha, edge_index_i, num_nodes=size_i)"?
Hello I get the same error:
RuntimeError: softmax() Expected a value of type 'Optional[Tensor]' for argument 'ptr' but instead found type 'int'.
Position: 2
Value: 48
Declaration: softmax(Tensor src, Tensor? index=None, Tensor? ptr=None, int? num_nodes=None, int dim=0) -> (Tensor)
Cast error details: Unable to cast Python instance to C++ type (compile in debug mode for details)
Why the code cannot work with my environment ? It is the first time I ever encounter that type of error... I have CUDA 11.3 and PyTorch 1.10.2.
Thanks for your help !
Originally posted by @Scienceseb in #10 (comment)
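The Declaration line in the error message shows why the positional call fails: the third positional parameter is ptr, so size_i is bound to ptr instead of num_nodes. The sketch below illustrates this with a pure-Python stand-in that copies only the parameter order from that declaration. It is not the torch_geometric implementation, and the toy values are made up.

```python
import math


def softmax(src, index=None, ptr=None, num_nodes=None, dim=0):
    """Stand-in with the parameter order from the error message.

    Normalizes src within each group given by index. dim is unused here.
    """
    if ptr is not None and not isinstance(ptr, (list, tuple)):
        raise TypeError("ptr must be a sequence or None, got " + type(ptr).__name__)
    n = num_nodes if num_nodes is not None else max(index) + 1
    exps = [math.exp(v) for v in src]
    totals = [0.0] * n
    for e, i in zip(exps, index):
        totals[i] += e
    return [e / totals[i] for e, i in zip(exps, index)]


alpha = [0.0, 0.0, 1.0]
edge_index_i = [0, 0, 1]
size_i = 2

try:
    softmax(alpha, edge_index_i, size_i)  # size_i lands in the ptr slot
except TypeError as err:
    print("positional call fails:", err)

# Passing the node count by keyword binds it to num_nodes.
out = softmax(alpha, edge_index_i, num_nodes=size_i)
print(out)
```

This matches the keyword form suggested earlier in this thread, alpha = softmax(alpha, edge_index_i, num_nodes=size_i).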
F1 score: 0.6666666666666666
precision: 0.49609375
recall: 0.9844961240310077
When I run main.py with the default msl dataset, the following error occurs.
My torch_geometric version is 2.0.4. Could you help fix this?
Traceback (most recent call last):
File "main.py", line 256, in <module>
main.run()
File "main.py", line 108, in run
self.train_log = train(self.model, model_save_path,
File "/home/kaifazhe4/test/GDN/train.py", line 69, in train
out = model(x, edge_index).float().to(device)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/kaifazhe4/test/GDN/models/GDN.py", line 166, in forward
gcn_out = self.gnn_layers[i](x, batch_gated_edge_index, node_num=node_num*batch_num, embedding=all_embeddings)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/kaifazhe4/test/GDN/models/GDN.py", line 73, in forward
out, (new_edge_index, att_weight) = self.gnn(x, edge_index, embedding, return_attention_weights=True)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/kaifazhe4/test/GDN/models/graph_layer.py", line 65, in forward
out = self.propagate(edge_index, x=x, embedding=embedding, edges=edge_index,
File "/opt/conda/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 317, in propagate
out = self.message(**msg_kwargs)
File "/home/kaifazhe4/test/GDN/models/graph_layer.py", line 110, in message
alpha = softmax(alpha, edge_index_i, size_i)
RuntimeError: softmax() Expected a value of type 'Optional[Tensor]' for argument 'ptr' but instead found type 'int'.
Hi, I think your work is really awesome!
I am working in an anomaly detection domain where the number of sensor sources at each time step may be different. For example, in the world, there may be sensors A, B, C, and D. From time steps 1 to 5, I may only be able to access to sensors A and B. Whereas from time steps 6 to 8, I may only have access to sensors B, C, and D.
My understanding is that your current architecture directly works on a static number of sensors sources at each time step (both in training and testing). Is your architecture able to model a varying number of sensors at each time step?
Also, would you have any suggested readings or resources that helped you gain a deep understanding of graph neural networks and graph structure learning in general?
I have been trying to run the code for a long time, but it always throws the following error. Can you tell me how to fix it?
alpha = softmax(alpha, edge_index_i,size_i)
RuntimeError: softmax() Expected a value of type 'Optional[Tensor]' for argument 'ptr' but instead found type 'int'.
Position: 2
Value: 864
Declaration: softmax(Tensor src, Tensor? index=None, Tensor? ptr=None, int? num_nodes=None, int dim=0) -> (Tensor)
Cast error details: Unable to cast Python instance to C++ type (compile in debug mode for details)
Hello, thank you for sharing your work, it is interesting and impressive. But I am confused about the evaluation method you used. You calculate scores from test_result and val_result, both of which include predicted, ground, and labels. That confuses me. Why don't you compare the predicted labels with the original labels directly?
By the way, I ran your code and got very high scores when I set '-report' to 'best', as follows. Even when I change the seed value, they do not change.
F1 score_best: 0.9430060816681147
precision_best: 0.9428323197219809
recall_best: 0.9428323197219809
On the other hand, when I set '-report' to 'val', I got very low scores like this:
F1 score_val: 0.01320590790616855
precision_val: 0.01320590790616855
recall_val: 0.01320590790616855
When I change the seed value, the scores change a little but are still very low.
Could you tell me why? Thank you very much!
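One reading of the evaluation question: a forecasting-based detector predicts sensor values, not labels, so labels only appear after the prediction error is turned into a score and compared with a threshold. The sketch below is a simplified illustration, not the repository's evaluation code. It assumes the score is the largest absolute error across sensors at each time step and that the 'val' mode takes the maximum validation score as the threshold.

```python
def anomaly_scores(predicted, observed):
    """Per time step, take the largest absolute prediction error across sensors."""
    return [
        max(abs(p - o) for p, o in zip(p_row, o_row))
        for p_row, o_row in zip(predicted, observed)
    ]


def labels_from_validation_threshold(val_scores, test_scores):
    """Flag test steps whose score exceeds the maximum validation score."""
    threshold = max(val_scores)
    return threshold, [1 if s > threshold else 0 for s in test_scores]


# Toy data: 2 sensors, 2 validation steps, 2 test steps.
val_scores = anomaly_scores([[1.0, 2.0], [1.0, 2.0]], [[1.5, 2.0], [1.0, 2.25]])
test_scores = anomaly_scores([[0.0, 0.0], [0.0, 0.0]], [[0.25, 0.5], [3.0, 0.0]])
threshold, predicted_labels = labels_from_validation_threshold(val_scores, test_scores)
print(threshold, predicted_labels)
```

Under these assumptions a single large validation error raises the threshold for the whole test set, which is one possible reason why a validation-based threshold can score far below the best threshold found on the test set.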