transdim

MIT License · Python 3.7


Machine learning models have driven important advances in spatiotemporal data modeling, such as forecasting the near-future traffic states of road networks. But what happens when these models are built on the incomplete data commonly collected in real-world systems?

About the Project

In the transdim (transportation data imputation) project, we build machine learning models to help address some of the toughest challenges of spatiotemporal data modeling, from missing data imputation to time series prediction. The strategic aim of this project is to create accurate and efficient solutions for spatiotemporal traffic data imputation and prediction tasks.

In a hurry? Please check out our contents as follows.

Tasks and Challenges

Missing data are there, whether we like them or not. The really interesting question is how to deal with incomplete data.

  • Missing data imputation 🔥

    • Random missing (RM): Each sensor loses its observations completely at random. (★★★)
    • Non-random missing (NM): Each sensor loses its observations during several days. (★★★★)


Example: Tensor completion framework for multi-dimensional missing traffic data imputation.

  • Spatiotemporal prediction 🔥
    • Forecasting without missing values. (★★★)
    • Forecasting with incomplete observations. (★★★★★)


Example: An illustration of single-step rolling prediction task under a matrix factorization framework.
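The two missing-data scenarios listed above (RM and NM) can be sketched with NumPy masks. This is only an illustration on a made-up toy tensor, not the repository's own masking code: the array sizes, the variable names, the 30% missing rate, and the choice to store missing entries as zeros are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy tensor: 4 sensors x 5 days x 6 time intervals per day.
n_sensors, n_days, n_intervals = 4, 5, 6
tensor = rng.uniform(20.0, 60.0, size=(n_sensors, n_days, n_intervals))
missing_rate = 0.3  # assumed value, for illustration only

# Random missing (RM): every entry is dropped independently of the others.
rm_mask = rng.uniform(size=tensor.shape) >= missing_rate

# Non-random missing (NM): a sensor loses a whole day at once, so the
# mask is drawn per (sensor, day) pair and repeated along the time axis.
nm_day_mask = rng.uniform(size=(n_sensors, n_days)) >= missing_rate
nm_mask = np.broadcast_to(nm_day_mask[:, :, None], tensor.shape)

# Missing entries are stored as zeros here (an assumption of this sketch).
rm_tensor = tensor * rm_mask
nm_tensor = tensor * nm_mask

print("observed fraction (RM):", rm_mask.mean())
print("observed fraction (NM):", nm_mask.mean())
```

In the NM case every (sensor, day) slice is either fully observed or fully missing, which is what makes it harder than RM: there are no neighboring observations within the same day to lean on.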

Implementation

Open data

In this repository, we have adapted public data sets for our experiments. For example, the following code reads the Guangzhou data set:

import scipy.io

tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')
tensor = tensor['tensor']
random_matrix = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_matrix.mat')
random_matrix = random_matrix['random_matrix']
random_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/random_tensor.mat')
random_tensor = random_tensor['random_tensor']
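Each call above is followed by an index such as `['tensor']` because `scipy.io.loadmat` returns a dictionary keyed by variable name. The sketch below shows this on a small toy file written to a temporary directory; the file name `toy_tensor.mat` and the array contents are made up for illustration and are not part of the repository's data sets.

```python
import os
import tempfile

import numpy as np
import scipy.io

# Toy stand-in for a data file; the name and shape are hypothetical.
toy = np.arange(24, dtype=float).reshape(2, 3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'toy_tensor.mat')
    scipy.io.savemat(path, {'tensor': toy})

    # loadmat returns a dict: metadata keys start with '__',
    # and each saved variable is stored under its own name.
    contents = scipy.io.loadmat(path)

variable_names = [key for key in contents if not key.startswith('__')]
loaded = contents['tensor']

print(variable_names)
print(loaded.shape)
```

Listing the non-metadata keys this way is a quick check of which variable name to index when opening an unfamiliar `.mat` file.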

If you want to view the original data, please check out the following links:

Model implementation

In our experiments, we have implemented the machine learning models mainly with NumPy and written the Python code in Jupyter notebooks. To evaluate these models, you can download and run the notebooks directly (prerequisite: download the data sets before evaluation).

Task Jupyter Notebook link Gdata Bdata Hdata Sdata Ndata
Missing Data Imputation BTMF βœ… βœ… βœ… βœ… πŸ”Ά
BayesTRMF βœ… βœ… βœ… βœ… πŸ”Ά
TRMF βœ… βœ… βœ… βœ… πŸ”Ά
BPMF βœ… βœ… βœ… βœ… πŸ”Ά
BGCP βœ… βœ… βœ… βœ… βœ…
TF-ALS βœ… βœ… βœ… βœ… βœ…
BTTF πŸ”Ά πŸ”Ά πŸ”Ά πŸ”Ά βœ…
BayesTRTF πŸ”Ά πŸ”Ά πŸ”Ά πŸ”Ά βœ…
BPTF πŸ”Ά πŸ”Ά πŸ”Ά πŸ”Ά βœ…
Single-Step Prediction BTMF βœ… βœ… βœ… βœ… πŸ”Ά
BayesTRMF βœ… βœ… βœ… βœ… πŸ”Ά
TRMF βœ… βœ… βœ… βœ… πŸ”Ά
BTTF πŸ”Ά πŸ”Ά πŸ”Ά πŸ”Ά βœ…
BayesTRTF πŸ”Ά πŸ”Ά πŸ”Ά πŸ”Ά βœ…
TRTF πŸ”Ά πŸ”Ά πŸ”Ά πŸ”Ά βœ…
Multi-Step Prediction BTMF βœ… βœ… βœ… βœ… πŸ”Ά
BayesTRMF βœ… βœ… βœ… βœ… πŸ”Ά
TRMF βœ… βœ… βœ… βœ… πŸ”Ά
BTTF πŸ”Ά πŸ”Ά πŸ”Ά πŸ”Ά βœ…
BayesTRTF πŸ”Ά πŸ”Ά πŸ”Ά πŸ”Ά βœ…
TRTF πŸ”Ά πŸ”Ά πŸ”Ά πŸ”Ά βœ…
  • βœ… β€” Covered
  • πŸ”Ά β€” Does not cover
  • 🚧 β€” Under development

If you have any suggestions, please feel free to contact Xinyu Chen (email: [email protected]).

Recommended email subject: Suggestions on transdim from [+ your name].

Imputation/Prediction performance

  • Imputation example

(a) Time series of actual and estimated speed within two weeks from August 1 to 14.

(b) Time series of actual and estimated speed within two weeks from September 12 to 25.

The imputation performance of BGCP (CP rank r=15 and missing rate α=30%) under the fiber missing scenario with third-order tensor representation, where the estimated result of road segment #1 is selected as an example. In both panels, red rectangles represent fiber missing (i.e., speed observations are lost for a whole day).
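To make the "CP rank r=15" and "third-order tensor representation" in the caption above concrete, the sketch below shows how a rank-r CP model rebuilds a full tensor from three factor matrices. It is not the BGCP sampler: the factor matrices here are random placeholders, and the tensor sizes and the names `U`, `V`, `W` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

rank = 15  # CP rank r, matching the value quoted in the caption
n_segments, n_days, n_intervals = 6, 7, 8  # hypothetical tensor sizes

# One factor matrix per tensor mode; random values stand in for the
# factors that a fitted model would provide.
U = rng.normal(size=(n_segments, rank))
V = rng.normal(size=(n_days, rank))
W = rng.normal(size=(n_intervals, rank))

# CP reconstruction: estimate[i, j, t] = sum_r U[i, r] * V[j, r] * W[t, r]
estimate = np.einsum('ir,jr,tr->ijt', U, V, W)

# Every entry gets a value, including positions that were never observed,
# which is how a fitted factorization can fill in missing fibers.
print(estimate.shape)
```

The estimated series for one road segment, such as the one plotted in the panels above, would correspond to a single slice `estimate[i]` reshaped along the time axis.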

  • Prediction example

[Figures: three prediction examples]

References

Our Publications

  • Xinyu Chen, Lijun Sun (2019). Bayesian temporal factorization for multidimensional time series prediction. arXiv: 1910.06366.

  • Xinyu Chen, Zhaocheng He, Yixian Chen, Yuhuan Lu, Jiawei Wang (2019). Missing traffic data imputation and pattern discovery with a Bayesian augmented tensor factorization model. Transportation Research Part C: Emerging Technologies, 104: 66-77.

  • Xinyu Chen, Zhaocheng He, Lijun Sun (2019). A Bayesian tensor decomposition approach for spatiotemporal traffic data imputation. Transportation Research Part C: Emerging Technologies, 98: 73-84.

  • Xinyu Chen, Zhaocheng He, Jiawei Wang (2018). Spatial-temporal traffic speed patterns discovery and incomplete data recovery via SVD-combined tensor decomposition. Transportation Research Part C: Emerging Technologies, 86: 59-77.

    This project originates from our papers. Please consider citing them if they help your research.

Collaborators

  • Xinyu Chen 💻
  • Jinming Yang 💻
  • Yixian Chen 💻
  • Lijun Sun 💻
  • Tianyang Han 💻

See the list of contributors who participated in this project.

License

This work is released under the MIT license.
