
thequantscientist / lightweight-transformer

A lightweight but efficient Transformer model for accurate univariate stock price forecasting, designed for real-time trading applications. This project adapts the vanilla Transformer architecture for higher-precision financial time series analysis with minimal computational demands.

Home Page: https://dx.doi.org/10.2139/ssrn.4729648

License: GNU General Public License v3.0

Language: Jupyter Notebook (100.00%)
Topics: learning-rate-scheduling, dynamic-masking, stock-prices-prediction, lightweight-transformer


Lightweight Transformer

This project was conducted from November 2023 to April 2024 and submitted to ICPR 2024. This repository contains the implementation of a novel lightweight Transformer model designed specifically for univariate stock price forecasting. Leveraging the capabilities of Transformer architectures, traditionally renowned for their success in natural language processing (NLP), our model adapts this powerful mechanism to the time series forecasting domain, focusing on predicting the future closing prices of stocks (e.g., IBM, AMZN, INTC, CSCO) with high accuracy and efficiency at minimal computational cost.

Project Overview

Motivated by the growing use of AI in live trading systems across the financial industry, this research proposes a lightweight Transformer model whose architecture centers on positional encoding and well-established training techniques for mitigating overfitting, delivering prompt forecasts through a univariate approach on stock closing prices. Using MSE as the loss function and MAE and RMSE as the core evaluation metrics, the proposed Transformer consistently surpasses well-known time series models such as LSTM, SVR, CNN-LSTM, and CNN-BiLSTM, reducing forecasting errors by over 50% on average; the single-step Transformer also proves to be the most efficient of the models compared. Trained on 20-year daily stock datasets for AMZN, INTC, CSCO, and IBM, the model captures sudden downturn shocks, cyclical or seasonal patterns, and long-term dependencies with a high degree of accuracy. On a non-high-end local machine, it takes only 19.36 seconds to generate forecasting results, fitting comfortably within a 1-minute trading window.
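For concreteness, below is a minimal sketch of what such an encoder-only model could look like in PyTorch. The class names and the layer sizes (d_model=64, 4 heads, 2 encoder layers) are illustrative assumptions, not the exact configuration reported in the paper.

```python
import math

import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """Standard sinusoidal positional encoding added to the projected inputs."""

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe.unsqueeze(0))    # shape: (1, max_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        return x + self.pe[:, : x.size(1)]


class LightweightTransformer(nn.Module):
    """Encoder-only Transformer mapping a window of closing prices to the next price."""

    def __init__(self, d_model: int = 64, nhead: int = 4, num_layers: int = 2, dropout: float = 0.1):
        super().__init__()
        self.input_proj = nn.Linear(1, d_model)        # univariate input -> model dimension
        self.pos_enc = PositionalEncoding(d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=4 * d_model,
            dropout=dropout, batch_first=True,         # dropout and layer norm come built into the layer
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 1)              # single-step forecast of the next close

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, 1) of scaled closing prices
        h = self.encoder(self.pos_enc(self.input_proj(x)))
        return self.head(h[:, -1])                     # predict from the last time step
```

Keeping the encoder shallow and narrow, with a single linear head over the last time step, is what keeps the computational footprint small for single-step forecasting.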

Features

  • Lightweight and Optimized Architecture: Specifically designed for stock price forecasting, this model reduces computational requirements without sacrificing accuracy, enabling its use on machines with limited processing power.
  • Univariate Time Series Forecasting: Employs a focused approach on predicting the closing prices of stocks, utilizing historical data to forecast future prices with high precision.
  • Advanced Model Training Techniques: Incorporates dropout regularization, layer normalization, and early stopping to fine-tune the training process, enhancing model performance and preventing overfitting.
  • Dynamic Learning Rate Adjustment: Utilizes the OneCycleLR scheduler to adjust the learning rate during training, facilitating faster convergence and improved model accuracy (see the training sketch after this list).
  • Positional Encoding: Integrates temporal information into the model, allowing it to capture time-dependent patterns in the stock market data effectively.
  • Reproducibility and Consistency: Ensures reliable and reproducible results through fixed random seed initialization and detailed documentation of the data processing and model training pipeline.
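The following hedged sketch shows how these techniques could fit together in a training loop: a fixed seed, the OneCycleLR scheduler, and a simple early-stopping rule, with dropout and layer normalization already built into the encoder layers of the model sketched above. The learning rate, epoch count, and patience are assumptions, and `train_loader` / `val_loader` are placeholders for loaders built as in the data-pipeline sketch under "Technologies Used" below.

```python
import torch
import torch.nn as nn

torch.manual_seed(42)                                  # fixed seed for reproducibility

EPOCHS, PATIENCE = 100, 10                             # assumed values, not the paper's settings
model = LightweightTransformer()                       # from the sketch above
criterion = nn.MSELoss()                               # MSE training loss
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, epochs=EPOCHS, steps_per_epoch=len(train_loader)
)

best_val, bad_epochs = float("inf"), 0                 # early-stopping bookkeeping
for epoch in range(EPOCHS):
    model.train()
    for x, y in train_loader:                          # assumed DataLoader of (window, target) pairs
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()                               # OneCycleLR is stepped once per batch

    model.eval()
    with torch.no_grad():                              # validation pass used for early stopping
        val_loss = sum(criterion(model(x), y).item() for x, y in val_loader) / len(val_loader)
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= PATIENCE:                     # stop once validation stops improving
            break
```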

Technologies Used

  • Python: The primary programming language used for implementing the model and preprocessing data.
  • PyTorch: A powerful, flexible deep learning library utilized for building and training the Transformer model.
  • Pandas: For efficient data manipulation and analysis, particularly useful for handling time series data.
  • Scikit-learn: Employed for data preprocessing tasks, such as scaling and normalization, to prepare data for model training.
  • NumPy: Essential for handling numerical operations, array manipulations, and transformations.
  • Matplotlib/Seaborn (Optional): For visualizing forecasting results and model performance, enhancing interpretability and analysis.
  • torch.optim: Provides the AdamW optimizer, a variant of Adam with decoupled weight decay for better regularization during training.
  • torch.utils.data.DataLoader: Facilitates efficient batching, shuffling, and loading of data during model training and evaluation (see the data-pipeline sketch after this list).
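The sketch below shows how these libraries could be combined to prepare the data. The file name "AMZN.csv", the 60-day window, and the 80/20 chronological split are illustrative assumptions rather than the repository's exact preprocessing.

```python
import numpy as np
import pandas as pd
import torch
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import DataLoader, TensorDataset

df = pd.read_csv("AMZN.csv")                           # hypothetical daily OHLCV file
close = df["Close"].values.reshape(-1, 1)

scaler = MinMaxScaler()                                # scale closing prices to [0, 1]
close_scaled = scaler.fit_transform(close)

def make_windows(series: np.ndarray, window: int = 60):
    """Slice a scaled series into (window, 1) inputs and next-step targets."""
    xs, ys = [], []
    for i in range(len(series) - window):
        xs.append(series[i : i + window])
        ys.append(series[i + window])
    return (torch.tensor(np.array(xs), dtype=torch.float32),
            torch.tensor(np.array(ys), dtype=torch.float32))

X, y = make_windows(close_scaled)
split = int(0.8 * len(X))                              # chronological train/validation split
train_loader = DataLoader(TensorDataset(X[:split], y[:split]), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(X[split:], y[split:]), batch_size=32)
```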

Getting Started

Follow the instructions in the subsequent sections to set up your environment, train the model on your dataset, and evaluate its performance on stock price forecasting tasks using MAE and RMSE. The model has been tested on datasets from several prominent technology companies, demonstrating its ability to accurately capture market trends and sudden downtrends, surpassing RNN-based models.
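As a hedged illustration of the evaluation step, the snippet below computes MAE and RMSE on held-out data, assuming `model`, `val_loader`, and `scaler` from the sketches above.

```python
import numpy as np
import torch

model.eval()
preds, trues = [], []
with torch.no_grad():
    for x, y in val_loader:
        preds.append(model(x).numpy())
        trues.append(y.numpy())

# invert the MinMax scaling so errors are reported in price units
preds = scaler.inverse_transform(np.concatenate(preds))
trues = scaler.inverse_transform(np.concatenate(trues))

mae = np.mean(np.abs(preds - trues))
rmse = np.sqrt(np.mean((preds - trues) ** 2))
print(f"MAE: {mae:.4f}  RMSE: {rmse:.4f}")
```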

