
StemGNN's Introduction

Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting

This repository is the official implementation of Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting.

Requirements

Recommended OS & Python: a Debian/Ubuntu-like Linux with Python 3.7.

To install the Python dependencies, a virtual environment is recommended; run sudo apt install python3.7-venv to install the venv module for Python 3.7. All Python dependencies are verified for pip==20.1.1 and setuptools==41.2.0. Run the following commands to create a venv and install the dependencies:

python3.7 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

Datasets

PEMS03, PEMS04, PEMS07, PEMS08, METR-LA, PEMS-BAY, Solar, Electricity, ECG5000, COVID-19

The raw data can be obtained through the links above. We evaluate traffic flow forecasting on PEMS03, PEMS07, and PEMS08, and traffic speed forecasting on PEMS04, PEMS-BAY, and METR-LA, so we use the traffic flow tables of PEMS03, PEMS07, and PEMS08 and the traffic speed tables of PEMS04, PEMS-BAY, and METR-LA as our datasets. For the Solar dataset, we download the solar power data of Alabama (Eastern States) and merge the 5-minute csv files (137 time series in total). For the Electricity dataset, we delete the header and index of the Electricity file downloaded from the link above. For the COVID-19 dataset, the raw data is under the folder csse_covid_19_data/csse_covid_19_time_series/ of the GitHub link above; we use time_series_covid19_confirmed_global.csv to calculate the daily number of newly confirmed infected people from 1/22/2020 to 5/10/2020. The 25 countries we take into consideration are 'US','Canada','Mexico','Russia','UK','Italy','Germany','France','Belarus','Brazil','Peru','Ecuador','Chile','India','Turkey','Saudi Arabia','Pakistan','Iran','Singapore','Qatar','Bangladesh','Arab','China','Japan','Korea'.

The input csv file should contain no header, and its shape should be T*N, where T denotes the total number of timestamps and N denotes the number of nodes.
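
For example, a conforming input file can be generated with numpy (a toy sketch; the file name and values are placeholders):

    import numpy as np

    T, N = 5000, 140                                  # timestamps x nodes, matching ECG_data.csv below
    data = np.random.rand(T, N).astype(np.float32)    # stand-in for real measurements
    np.savetxt('dataset/my_data.csv', data, delimiter=',')  # writes no header and no index column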

Since the datasets above require non-trivial cleaning before they can be fed into the StemGNN model, we provide a cleaned version of ECG5000 (./dataset/ECG_data.csv) for convenient reproduction. ECG_data.csv has shape 5000*140, where 5000 denotes the number of timestamps and 140 denotes the number of nodes. Run python main.py to trigger training and evaluation on ECG_data.csv.

Training and Evaluation

Both the training procedure and the evaluation procedure are included in main.py. To train and evaluate on a dataset, run the following command:

python main.py --train True --evaluate True --dataset <name of csv file> --output_dir <path to output directory> --n_route <number of nodes> --window_size <length of sliding window> --horizon <predict horizon> --norm_method z_score --train_length 7 --validate_length 2 --test_length 1

Detailed descriptions of the parameters are as follows:

Parameter name    Description of parameter
train             whether to enable training, default True
evaluate          whether to enable evaluation, default True
dataset           file name of the input csv
window_size       length of the sliding window, default 12
horizon           prediction horizon, default 3
train_length      length of training data, default 7
validate_length   length of validation data, default 2
test_length       length of testing data, default 1
epoch             number of training epochs
lr                learning rate
multi_layer       hyper-parameter of StemGNN that controls the parameter count of the hidden layers, default 5
device            device the code runs on, 'cpu' or 'cuda:x'
validate_freq     frequency of validation
batch_size        batch size
norm_method       normalization method, 'z_score' or 'min_max'
early_stop        whether to enable early stopping, default False
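
For reference, the two norm_method options correspond to the standard formulas below (a sketch; the repository's exact implementation, e.g. its handling of zero variance, may differ):

    import numpy as np

    def z_score(x, mean, std):
        # standardize each node's series to zero mean and unit variance
        return (x - mean) / std

    def min_max(x, x_min, x_max):
        # rescale each node's series into [0, 1]
        return (x - x_min) / (x_max - x_min)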

Table 1 Configurations for all datasets

Dataset     train  evaluate  node_cnt  window_size  horizon  norm_method
METR-LA     True   True      207       12           3        z_score
PEMS-BAY    True   True      325       12           3        z_score
PEMS03      True   True      358       12           3        z_score
PEMS04      True   True      307       12           3        z_score
PEMS07      True   True      228       12           3        z_score
PEMS08      True   True      170       12           3        z_score
COVID-19    True   True      25        28           28       z_score
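
For example, the METR-LA row of Table 1 corresponds to an invocation along these lines (the csv file name is an assumption, and note the flag-name mismatches reported in the issues below):

python main.py --train True --evaluate True --dataset METR-LA --window_size 12 --horizon 3 --norm_method z_score --train_length 7 --validate_length 2 --test_length 1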

Results

Our model achieves the following performance on these datasets (see the paper for the full set of 10):

Table 2 (prediction horizon: 3 steps)

Dataset     MAE    RMSE   MAPE (%)
METR-LA     2.56   5.06   6.46
PEMS-BAY    1.23   2.48   2.63
PEMS03      14.32  21.64  16.24
PEMS04      20.24  32.15  10.03
PEMS07      2.14   4.01   5.01
PEMS08      15.83  24.93  9.26

Table 3 (prediction horizon: 28 steps)

Dataset     MAE     RMSE     MAPE (%)
COVID-19    662.24  1023.19  19.3

StemGNN's People

Contributors: conhua, hangzhao-microsoft, microsoft-github-operations[bot], microsoft-github-policy-service[bot], microsoftopensource, moreover0, t-decao


StemGNN's Issues

Difference in implementation and paper

Good day,

Thank you for the paper. I noticed that there are some differences between the implementation described in the paper and the source code:

  1. For example, the spectral sequential cell defined in the paper uses a 1-D convolution operation, but I can't seem to find it in the code (unless I am missing something).

  2. In the training loop, the backcast outputs are not used in the loss function.

  3. Another question I have is that I don't understand how the following:

igfted = torch.matmul(gconv_input, self.weight)
igfted = torch.sum(igfted, dim=1)

is actually the IGFT operation. I understand that, as opposed to the paper, the Chebyshev polynomial is used to compute the GFT operation directly, without needing the eigenvectors of L.
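
For context, the standard Chebyshev recursion used in such approximations is T_0(x) = 1, T_1(x) = x, T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x), applied to the rescaled graph Laplacian; my reading (not confirmed by the authors) is that stacking the resulting T_k(L)x terms yields the gconv_input that self.weight acts on above, with the sum over dim=1 aggregating the Chebyshev orders.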

It would be good if you could explain. Thanks!

Reproducing ECG5000 with min-max scaler

Hi,

I have been trying to reproduce your results, particularly those on the ECG dataset.
Using the min-max scaler as the norm_method causes the following error:

python main.py --dataset ECG_data --norm_method min_max
File "main.py", line 61, in <module>
    _, normalize_statistic = train(train_data, valid_data, args, result_train_file)
File "handler.py", line 169, in train
    json.dump(normalize_statistic, f)

TypeError: Object of type ndarray is not JSON serializable

How can we solve this?
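
A workaround that appears to help (assuming normalize_statistic is a dict containing numpy arrays, as the traceback suggests) is to convert the arrays before dumping:

    import json
    import numpy as np

    # Make every numpy array JSON-friendly before serialization.
    serializable = {k: v.tolist() if isinstance(v, np.ndarray) else v
                    for k, v in normalize_statistic.items()}
    with open(stat_file, 'w') as f:   # stat_file stands in for the repo's output path
        json.dump(serializable, f)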

Could you also tell me the parameters for reproducing the METR-LA results?
My numbers differ substantially from the performance stated in the paper.

Thank you.

Share your data sets, please

Hi,
There seem to be some preprocessing steps you applied that are not all documented. For example, the Electricity set has a 15-minute grain, but in your paper you describe it as having a 1-hour grain. You also mentioned that you removed the first line and the timestamps. It would be great if you simply shared your datasets.
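
For what it's worth, my best guess at the preprocessing is something like the following (a sketch assuming the UCI LD2011_2014.txt file, which uses ';' separators and ',' decimals; whether readings should be summed or averaged per hour is not documented):

    import pandas as pd

    df = pd.read_csv('LD2011_2014.txt', sep=';', decimal=',', index_col=0, parse_dates=True)
    hourly = df.resample('1H').sum()                             # 15-minute grain -> 1-hour grain
    hourly.to_csv('electricity.csv', header=False, index=False)  # drop the header row and timestamps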

Command line arguments do not match the ones in main.py

python main.py --train True --evaluate True --dataset --output_dir --n_route --window_size --horizon --norm_method z_score --train_length 7 --validate_length 2 --test_length 1

--output_dir is not included in main.py,
--validate_length does not match --valid_length,
and so on.

The result can not be reproduced

I have run the command python3 main.py --train True --evaluate True --window_size=12 --horizon=3. However, the performance on ECG is much worse than reported in the paper.

(A screenshot of the evaluation metrics was attached in the original issue.)

Multi to Single (Semi-Single)

Thank you for this. I wonder if it is possible to have, say, 10 inputs and predict only two of them, or, even better, to predict 2 completely different targets?

Forecast n steps ahead

Hi,

I read the paper for this repo and was mightily impressed. I've been trying to apply this algorithm to custom biological data for forecasting.

What would be the best way to go about the code and make that happen? I want to edit as little as possible, as I don't have much time.

I know that you can apply it to testing data, but how do I forecast n steps into the future without having target data?
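
One generic pattern, independent of this repo's internals, is to roll the model's own predictions forward (a sketch; model, series, window_size and horizon are placeholders for your trained model and data):

    import numpy as np

    def forecast_n_steps(model, series, window_size, horizon, n_steps):
        # series: array of shape (T, n_nodes); model is assumed to map
        # (1, window_size, n_nodes) to (1, horizon, n_nodes). Predictions are
        # appended to the history and re-used as inputs until n_steps values exist.
        history = list(series[-window_size:])
        out = []
        while len(out) < n_steps:
            window = np.asarray(history[-window_size:])[None, ...]
            pred = np.asarray(model(window))[0]
            history.extend(pred)
            out.extend(pred)
        return np.asarray(out[:n_steps])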

Conflicting latent correlation layer implementation and equations from paper

Hello,

if I am not mistaken, you have mixed up the temporal and the feature dimension in the GRU input of the latent correlation layer.
So, the input data x of the latent_correlation_layer has the shape (batch_size, window_size, number_of_nodes).
Then, the permuted input of the GRU has the shape (number_of_nodes, batch_size, window_size).
Since you do not set the batch_first flag, the PyTorch GRU implementation expects an input of shape (sequence_length, batch_size, input_size) (see documentation ).

Moreover, in the paper you say that you use the last hidden state.
In your code you simply use the output of the GRU, which contains the hidden state at every single time step, not just the last one.
If you actually used the last hidden state and fixed the allegedly mixed-up dimensions, you would end up with a single vector of length hidden_size (in the PyTorch documentation's terminology), which corresponds to your units input, i.e., the number of nodes.
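
For reference, a minimal check of the PyTorch GRU contract (the sizes here are arbitrary illustrations):

    import torch
    import torch.nn as nn

    gru = nn.GRU(input_size=12, hidden_size=8)   # batch_first=False by default
    x = torch.randn(140, 32, 12)                 # (seq_len, batch, input_size)
    output, h_n = gru(x)
    print(output.shape)   # torch.Size([140, 32, 8]): hidden states at every step
    print(h_n.shape)      # torch.Size([1, 32, 8]): last hidden state only
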
Now, I am wondering about the correct dimensionality of your linear projections W^Q and W^K for the query and key representations.
These are vectors in the current implementation, which would not make much sense in the equations from the paper in combination with a vector R.

My conclusion: it seems to me that there are major conflicts between the paper and the implementation regarding the latent correlation layer, and that the formal approach from the paper either has issues with its dimensionalities or is formulated inaccurately.

Please correct me, if I made any mistake.

Best regards
Chris

Reproducing the COVID-19 results in your paper

Hi,

Thanks for the paper and for the open source code.
I have been trying to reproduce your results, particularly those on the COVID-19 dataset (the smallest dataset, so the fastest to run without a GPU).
Could you share the hyper-parameter settings used for this dataset, at least those you think have a meaningful impact on training? I am sorry, I cannot do a grid search due to very limited computing power.

In addition, when we set the horizon to 28 on the COVID-19 dataset, we get an error when trying to reproduce.
It occurs at line 71 in Forecast_dataloader:
range(self.window_size, self.df_length - self.horizon + 1)
From the paper and GitHub, I gathered that the validation set has a window size of 28, a dataset length of 50 (50 days), and a horizon of 28. With these values it seems expected that the code fails, since we get range(28, 23), which is empty.
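
A quick check of that arithmetic (values taken from the numbers above):

    window_size, horizon, df_length = 28, 28, 50
    idx = range(window_size, df_length - horizon + 1)   # range(28, 23)
    print(len(idx))                                     # 0: no valid sample start index
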
Are there any steps I could take to make the code work?

Thank you in advance,
Peter

Can not run on GPU

Hi, thank you for sharing your code!
But I cannot run it on GPU.
The command line is python main.py --device 'cuda:0',
and the traceback reads:
Training configs: Namespace(batch_size=32, dataset='PeMS07', decay_rate=0.5, device='cuda:0', dropout_rate=0.5, early_stop=False, epoch=25, evaluate=True, exponential_decay_step=5, horizon=12, leakyrelu_rate=0.2, lr=0.0001, multi_layer=5, norm_method='z_score', optimizer='RMSProp', test_length=1, train=True, train_length=7, valid_length=2, validate_freq=1, window_size=3)
/home/cll/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/cuda/__init__.py:104: UserWarning:
NVIDIA RTX A6000 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70 sm_75.
If you want to use the NVIDIA RTX A6000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
Traceback (most recent call last):
File "main.py", line 57, in
_, normalize_statistic = train(train_data, valid_data, args, result_train_file)
File "/home/cll/SD-GLN/baseline_220/StemGNN-master/models/handler.py", line 106, in train
model.to(args.device)
File "/home/cll/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/nn/modules/module.py", line 612, in to
return self._apply(convert)
File "/home/cll/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/home/cll/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 162, in _apply
self.flatten_parameters()
File "/home/cll/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 152, in flatten_parameters
self.batch_first, bool(self.bidirectional))
RuntimeError: CUDA error: no kernel image is available for execution on the device

Meanwhile, the environment is as follows:
Python 3.7.13 (default, Mar 29 2022, 02:18:16)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
>>> torch.__version__
'1.7.1'

Waiting for your reply!
Thanks a lot~


Without using key K and query Q in Latent Correlation Layer gets better results

On the PeMS07 dataset, the original results of this code are:

Performance on test set: MAPE: 0.04 | MAE: 1.70 | RMSE: 2.7136

But if you simply replace the key K and the query Q with mean values, you can get better results.

That is, change the following code at lines 154 and 155 of models/base_model.py

    key = torch.matmul(input, self.weight_key)
    query = torch.matmul(input, self.weight_query)

to

    key = input.mean(dim=2).unsqueeze(dim=2)
    query = input.mean(dim=2).unsqueeze(dim=2)

The new results are:

Performance on test set: MAPE: 0.04 | MAE: 1.60 | RMSE: 2.4902

Graph Creation for Different Tasks

Hi, thank you for making the code available. Could you describe the criteria for creating the graph (how you decide whether there is an edge between two nodes, and what their weights are if the edges are weighted) for the different tasks?

Thank you.

Parameter `units` of the model

Hi,

I'd like to experiment with your model on my own data. I notice that there is a parameter called units in your model. What does it mean?

Where is the backcast loss?

Hi,
I'm very interested in the autoencoding fashion used in the loss function of StemGNN.
Unfortunately, I cannot find the backcast loss in the code.
Could you tell me where the backcast loss is implemented?
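
Based on the paper, what I expected to find is a joint objective along these lines (a sketch with hypothetical names; it assumes a model that returns both a forecast and a backcast head, which the released code apparently does not):

    import torch.nn as nn

    criterion = nn.MSELoss()
    forecast, backcast = model(inputs)       # hypothetical two-headed forward pass
    loss = criterion(forecast, target) + criterion(backcast, inputs)
    loss.backward()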

TypeError: fft_rfft() got an unexpected keyword argument 'onesided'

Hi, thanks for the open source code.

For PyTorch 1.8.0, it is required to use torch.fft.rfft() and torch.fft.irfft() instead of torch.rfft()/torch.irfft(). In the newer versions there is no onesided argument. How do I update the fft code to work with PyTorch 1.8.0? Thanks.
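
A compatibility shim along these lines may work (a sketch, assuming the original calls were torch.rfft(x, 1, onesided=False) and torch.irfft(x, 1, onesided=False)):

    import torch

    def rfft_compat(x):
        # Mimics torch.rfft(x, 1, onesided=False): a real tensor whose trailing
        # dimension of size 2 holds the real and imaginary parts.
        spec = torch.fft.fft(x, dim=-1)
        return torch.stack([spec.real, spec.imag], dim=-1)

    def irfft_compat(x):
        # Mimics torch.irfft(x, 1, onesided=False) for inputs shaped (..., 2).
        spec = torch.complex(x[..., 0], x[..., 1])
        return torch.fft.ifft(spec, dim=-1).real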
