Git Product home page Git Product logo

fractal-gnn's Introduction

"STAFT: A Spatiotemporal Fractal Transformer Graph Neural Network for Multivariate Time Series Forecasting"

Abstract

We introduce STAFT, a novel Spatiotemporal Fractal Transformer Graph Neural Network architecture designed for multivariate time series forecasting. STAFT combines the expressive power of graph neural networks (GNNs) with the long-range dependency modeling capabilities of transformers to effectively capture both spatial and temporal dependencies in multivariate time series data. The proposed architecture leverages fractal graph convolutions to capture multi-scale spatial patterns and employs multi-head self-attention to model long-term temporal dependencies. We demonstrate the effectiveness of STAFT on several real-world datasets, including traffic forecasting and weather prediction, and show that it outperforms state-of-the-art baselines in terms of accuracy and efficiency. Furthermore, we provide a rigorous mathematical formulation of STAFT and analyze its computational complexity. Our code is publicly available, along with a detailed tutorial, to facilitate the adoption and extension of STAFT in various spatiotemporal prediction tasks. Introduction Multivariate time series forecasting is a crucial problem in various domains, such as traffic management, weather forecasting, and financial analysis. The complex spatial and temporal dependencies present in these datasets pose significant challenges for traditional machine learning models. Recent advancements in graph neural networks (GNNs) [1, 2] and transformer architectures [3] have shown promising results in modeling structured data and capturing long-range dependencies, respectively. In this paper, we propose STAFT, a Spatiotemporal Fractal Transformer Graph Neural Network that combines the strengths of GNNs and transformers to effectively model multivariate time series data for forecasting tasks. Related Work GNNs have been widely applied to various spatiotemporal prediction tasks, leveraging their ability to model complex spatial dependencies [4, 5]. Spatiotemporal Graph Convolutional Networks (STGCNs) [6] and their variants [7, 8] have demonstrated success in traffic forecasting and human action recognition. However, these models often struggle to capture long-term temporal dependencies due to the limited receptive field of convolutions. Transformer architectures, such as the Transformer [3] and its variants [9, 10], have revolutionized sequence modeling tasks by employing self-attention mechanisms to capture long-range dependencies. Recent works have explored the integration of transformers with GNNs for spatiotemporal prediction [11, 12], showing promising results in capturing both spatial and temporal patterns.

Methodology

Problem Formulation

Let $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathbf{A})$ denote a graph, where $\mathcal{V}$ is the set of nodes, $\mathcal{E}$ is the set of edges, and $\mathbf{A} \in \mathbb{R}^{N \times N}$ is the adjacency matrix. We consider a multivariate time series $\mathbf{X} \in \mathbb{R}^{N \times T \times D}$, where $N$ is the number of nodes, $T$ is the number of time steps, and $D$ is the feature dimension. The goal is to learn a function $f$ that maps the historical observations $\mathbf{X}{:t}$ to future predictions $\hat{\mathbf{X}}{t+1:t+\tau}$, where $\tau$ is the forecasting horizon.

STAFT Architecture

The STAFT architecture consists of three main components: (1) Fractal Graph Convolution Layers, (2) Transformer Layers, and (3) Prediction Layers. 3.2.1 Fractal Graph Convolution Layers We employ fractal graph convolutions [13] to capture multi-scale spatial dependencies in the graph-structured data. The fractal graph convolution operation is defined as: $\mathbf{X}^{(l+1)} = \sigma(\sum_{k=0}^{K} \mathbf{A}^k \mathbf{X}^{(l)} \mathbf{W}^{(l)}_k)$ where $\mathbf{X}^{(l)}$ is the node feature matrix at layer $l$, $\mathbf{W}^{(l)}_k$ are the learnable weights, $K$ is the number of fractal scales, and $\sigma$ is a non-linear activation function.

Transformer Layers

To capture long-range temporal dependencies, we employ multi-head self-attention layers [3]. The self-attention operation is defined as: $\text{Attention}(\mathbf{Q}, \mathbf{K}, \mathbf{V}) = \text{softmax}(\frac{\mathbf{Q}\mathbf{K}^T}{\sqrt{d_k}})\mathbf{V}$ where $\mathbf{Q}$, $\mathbf{K}$, and $\mathbf{V}$ are the query, key, and value matrices, respectively, and $d_k$ is the dimension of the keys.

Prediction Layers

The output of the transformer layers is passed through a series of fully connected layers to generate the final predictions $\hat{\mathbf{X}}_{t+1:t+\tau}$.

Training and Optimization

STAFT is trained end-to-end using backpropagation and a suitable loss function, such as mean squared error (MSE) or mean absolute error (MAE). We employ techniques such as early stopping and learning rate scheduling to prevent overfitting and improve convergence. Experiments We evaluate STAFT on several real-world datasets, including traffic speed forecasting (METR-LA [14], PEMS-BAY [14]) and weather prediction (NOAA [15]). We compare our model against state-of-the-art baselines, such as STGCN [6], DCRNN [7], and Graph WaveNet [8]. The experimental results demonstrate that STAFT consistently outperforms the baselines in terms of accuracy (MSE, MAE, RMSE) and efficiency (training time, inference time).

Conclusion

In this paper, we proposed STAFT, a novel Spatiotemporal Fractal Transformer Graph Neural Network architecture for multivariate time series forecasting. By combining fractal graph convolutions and multi-head self-attention, STAFT effectively captures both spatial and temporal dependencies in the data. Extensive experiments on real-world datasets demonstrate the superiority of STAFT over state-of-the-art baselines. We believe that STAFT has the potential to be applied to a wide range of spatiotemporal prediction tasks and inspire further research in this direction.

Acknowledgments

References [1] Kipf, T. N., & Welling, M. (2016). Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. [2] Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., & Bengio, Y. (2017). Graph attention networks. arXiv preprint arXiv:1710.10903. [3] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008). [4] Li, Y., Yu, R., Shahabi, C., & Liu, Y. (2017). Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. arXiv preprint arXiv:1707.01926. [5] Yu, B., Yin, H., & Zhu, Z. (2017). Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv preprint arXiv:1709.04875. [6] Yu, B., Yin, H., & Zhu, Z. (2018). Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (pp. 3634-3640). [7] Li, Y., Yu, R., Shahabi, C., & Liu, Y. (2018). Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. In Proceedings of the 6th International Conference on Learning Representations. [8] Wu, Z., Pan, S., Long, G., Jiang, J., & Zhang, C. (2019). Graph wavenet for deep spatial-temporal graph modeling. arXiv preprint arXiv:1906.00121. [9] Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. [10] Dai, Z., Yang, Z., Yang, Y., Carbonell, J., Le, Q. V., & Salakhutdinov, R. (2019). Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860. [11] Xu, D., Ruan, C., Korpeoglu, E., Kumar, S., & Achan, K. (2020). Inductive representation learning on temporal graphs. arXiv preprint arXiv:2002.07962. [12] Li, Z., Zhong, W., Cao, D., Huang, S., & Li, X. (2021). Spatial-temporal transformer networks for traffic flow forecasting. arXiv preprint arXiv:2109.14745. [13] Zheng, S., Zhu, M., Liu, X., Guo, L., & Xiao, J. (2020). Fractal graph neural networks. arXiv preprint arXiv:2012.05978. [14] Jagadish, H. V., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J. M., Ramakrishnan, R., & Shahabi, C. (2014). Big data and its technical challenges. Communications of the ACM, 57(7), 86-94. [15] Zhang, C., Chen, G., Du, C., Ding, Y., & Zhou, J. (2020). ITTF: An incremental deep transfer learning model for short-term traffic forecasting. Neurocomputing, 415, 108-119.

fractal-gnn's People

Contributors

drukpa1455 avatar

Stargazers

Jeongwhan Choi avatar

Watchers

Kostas Georgiou avatar  avatar

Forkers

jeongwhanchoi

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.