
TCCT2021

Convolutional Transformer Architectures Complementary to Time Series Forecasting Transformer Models

Paper: TCCT: Tightly-Coupled Convolutional Transformer on Time Series Forecasting https://arxiv.org/abs/2108.12784

The paper has been accepted by Neurocomputing:

Journal ref.: Neurocomputing, Volume 480, 1 April 2022, Pages 131-145

doi: 10.1016/j.neucom.2022.01.039

Usage

Copy all files into Informer2020-main [1][2], overwriting existing files where necessary.
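
A minimal merge could look like the sketch below (the folder names TCCT2021-main and Informer2020-main are assumptions; adjust them to your local checkout paths):

```python
# Minimal sketch: merge the TCCT files into a local Informer2020 checkout,
# overwriting files that already exist. Requires Python 3.8+ for dirs_exist_ok.
# The folder names below are assumptions; adjust to your local paths.
import shutil

shutil.copytree("TCCT2021-main", "Informer2020-main", dirs_exist_ok=True)
```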

Three architectures are added:

CSPAttention: A self-attention mechanism that mirrors CSPNet from the CNN family. It cuts memory occupation by nearly 30% and time complexity by nearly 50% compared with the canonical self-attention mechanism, while achieving equivalent or superior forecasting accuracy.
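
A minimal PyTorch sketch of the CSP idea (not the exact implementation in this repository; the module and parameter names are illustrative): half of the feature channels pass through attention, the other half bypasses it and is concatenated back, which is where the memory and compute savings come from.

```python
import torch
import torch.nn as nn

class CSPSelfAttention(nn.Module):
    """Sketch of CSP-style self-attention: only half of the channels are
    attended; the other half skips attention and is concatenated back,
    roughly halving the attention cost."""

    def __init__(self, d_model=512, n_heads=4):
        super().__init__()
        assert d_model % 2 == 0 and (d_model // 2) % n_heads == 0
        self.d_half = d_model // 2
        self.attn = nn.MultiheadAttention(self.d_half, n_heads, batch_first=True)
        self.fuse = nn.Linear(d_model, d_model)   # 1x1 projection after concat

    def forward(self, x):                          # x: (batch, seq_len, d_model)
        x_attn, x_skip = x.split(self.d_half, dim=-1)
        y, _ = self.attn(x_attn, x_attn, x_attn)   # attention on one half only
        return self.fuse(torch.cat([y, x_skip], dim=-1))
```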

Dilated Causal Convolution: It replaces the canonical convolution used to connect self-attention blocks. It gives the Transformer model exponential receptive-field growth at negligible extra computational cost, strengthening its learning capability.
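
A sketch of the idea, loosely following the Conv1d + BatchNorm + ELU + MaxPool distilling layer in Informer2020 but with left-only (causal) padding and a dilation factor; names and defaults are illustrative, not the authors' exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCausalConvLayer(nn.Module):
    """Sketch of a dilated causal convolution connecting self-attention blocks.
    Left-only padding keeps it causal; doubling the dilation from one layer to
    the next grows the receptive field exponentially."""

    def __init__(self, d_model, kernel_size=3, dilation=2):
        super().__init__()
        self.left_pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(d_model, d_model, kernel_size, dilation=dilation)
        self.norm = nn.BatchNorm1d(d_model)
        self.act = nn.ELU()
        self.pool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)

    def forward(self, x):                         # x: (batch, seq_len, d_model)
        x = x.transpose(1, 2)                     # -> (batch, d_model, seq_len)
        x = F.pad(x, (self.left_pad, 0))          # pad on the left only
        x = self.pool(self.act(self.norm(self.conv(x))))
        return x.transpose(1, 2)                  # halved sequence length
```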

Passthrough Mechanism: It concatenates feature maps from self-attention blocks at different scales to obtain more fine-grained information. Similar to the feature pyramids commonly used in CNNs and image processing, it expands the feature maps, leading to better forecasting performance of the Transformer model.
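
A sketch of the idea (not the exact implementation here; all names are illustrative): feature maps from self-attention blocks at different scales are length-aligned to the coarsest map, concatenated along the feature axis, and projected back to d_model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Passthrough(nn.Module):
    """Sketch of a passthrough mechanism: outputs of self-attention blocks at
    different scales are pooled to the shortest length, concatenated along the
    feature axis (like a feature pyramid), and projected back to d_model."""

    def __init__(self, d_model, n_scales):
        super().__init__()
        self.proj = nn.Linear(d_model * n_scales, d_model)

    def forward(self, feats):                     # list of (batch, len_i, d_model)
        target_len = min(f.size(1) for f in feats)
        aligned = [
            F.adaptive_max_pool1d(f.transpose(1, 2), target_len).transpose(1, 2)
            for f in feats
        ]
        return self.proj(torch.cat(aligned, dim=-1))
```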

LogSparse Attention [3][4] is also implemented in models/attn.py.
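
As a rough illustration of what "log-sparse" means (not necessarily how models/attn.py builds its mask), each query position attends to itself and to positions at exponentially growing distances, giving O(L log L) attended pairs in total:

```python
import torch

def log_sparse_mask(seq_len: int) -> torch.Tensor:
    """Boolean mask sketch for LogSparse attention: position i may attend to
    itself and to positions i-1, i-2, i-4, i-8, ... (True = allowed)."""
    mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)
    for i in range(seq_len):
        mask[i, i] = True
        step = 1
        while i - step >= 0:
            mask[i, i - step] = True
            step *= 2
    return mask

# Typical use: scores.masked_fill_(~mask, float("-inf")) before the softmax.
```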

Postscript

Note that the TCCT architectures can also be applied to other Transformer or Transformer-like models for time series forecasting, although some adaptation is required.

Reference

[1] Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., & Zhang, W. (2020). Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. arXiv preprint arXiv:2012.07436.

[2] https://github.com/zhouhaoyi/Informer2020

[3] Li, S., Jin, X., Xuan, Y., Zhou, X., Chen, W., Wang, Y. X., & Yan, X. (2019). Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. arXiv preprint arXiv:1907.00235.

[4] https://github.com/AIStream-Peelout/flow-forecast


Issues

Insights about CSP attention

Hi, great work!

But what is the insight behind CSP attention? Does it mean that some of the features have long-term dependencies while the others do not, and that the structure automatically forces the features that need long-term dependencies into the upper half while the others move to the lower half?

It would make more sense to me if its performance merely tied with vanilla attention's while using fewer parameters. Is it an inductive bias that makes CSPAttention outperform the vanilla one?

Thank you

Request

Dear author, I want to run your code for my research, but I found a file missing while running it: exp.exp_informer. Could you share this file? I really want to see the running results of TCCT.
