Generative AI for End-to-End Limit Order Book Modelling

A Token-Level Autoregressive Generative Model of Message Flow Using a Deep State Space Network

This repository provides the implementation for the paper: Generative AI for End-to-End Limit Order Book Modelling. The preprint is available here.

The repository is a fork of the original S5 repository.

Developing a generative model of realistic order flow in financial markets is a challenging open problem, with numerous applications for market participants. Addressing this, we propose the first end-to-end autoregressive generative model that generates tokenized limit order book (LOB) messages. These messages are interpreted by a Jax-LOB simulator, which updates the LOB state. To handle long sequences efficiently, the model employs simplified structured state-space (S5) layers to process sequences of order book states and tokenized messages. Using LOBSTER data of NASDAQ equity LOBs, we develop a custom tokenizer for message data that converts groups of successive digits to tokens, similar to tokenization in large language models.
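As a rough illustration of the digit-group tokenization idea described above, the following sketch splits a numeric message field into fixed-width groups of digits and maps each group to one token. It is not the repository's tokenizer (see lob/encoding.py for that); the function names, the group width of 3, and the field width of 9 digits are assumptions chosen for the example.

```python
from typing import List

GROUP = 3  # assumed number of digits per token (illustrative only)


def encode_int(value: int, n_digits: int = 9, group: int = GROUP) -> List[int]:
    """Split a non-negative integer into fixed-width digit groups, one token per group."""
    if n_digits % group != 0:
        raise ValueError("n_digits must be a multiple of group")
    if value < 0 or value >= 10 ** n_digits:
        raise ValueError("value does not fit into n_digits digits")
    padded = str(value).zfill(n_digits)  # left-pad with zeros to a fixed width
    return [int(padded[i:i + group]) for i in range(0, n_digits, group)]


def decode_int(tokens: List[int], group: int = GROUP) -> int:
    """Invert encode_int by concatenating the zero-padded digit groups."""
    return int("".join(str(t).zfill(group) for t in tokens))


tokens = encode_int(1234567)
print(tokens, decode_int(tokens))
```

With a group width of 3, each field position needs a vocabulary of at most 1000 tokens, and a fixed field width keeps every message the same length in tokens.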

Requirements & Installation

To install required packages, run pip install -r requirements.txt.

The GPU installation of JAX can cause problems; further instructions are available here.

Data Download

The data used is NASDAQ LOB data from LOBSTER. After downloading and unpacking the data files, pre-process them for model training. An example command for GOOG is as follows:

python lob/preproc.py --data_dir /path/to/LOBS5/data/GOOG/ --save_dir /path/to/LOBS5/data/GOOG/ --n_tick_range 500 --use_raw_book_repr
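For orientation, the raw LOBSTER message files are header-less CSV files whose first six columns are time (seconds after midnight), event type, order ID, size, price (dollars times 10000), and direction (1 for buy, -1 for sell). The sketch below reads such rows using only the standard library. It is independent of lob/preproc.py; the Message type, the read_messages helper, and the two sample rows are made up for illustration.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class Message(NamedTuple):
    time: float       # seconds after midnight
    event_type: int   # 1 = new limit order, 2 = partial cancel, 3 = deletion, 4/5 = execution
    order_id: int
    size: int
    price: int        # dollar price times 10000
    direction: int    # 1 = buy, -1 = sell


def read_messages(handle: Iterable[str]) -> List[Message]:
    """Parse LOBSTER-style message rows; any columns beyond the sixth are ignored."""
    messages = []
    for rec in csv.reader(handle):
        if not rec:
            continue  # skip blank lines
        t, etype, oid, size, price, direction = rec[:6]
        messages.append(
            Message(float(t), int(etype), int(oid), int(size), int(price), int(direction))
        )
    return messages


# Two hypothetical rows standing in for a real message file.
sample = io.StringIO(
    "34200.017459617,5,0,1,2238200,-1\n"
    "34200.189607670,1,11885113,21,2238100,1\n"
)
messages = read_messages(sample)
print(messages[1].price / 10_000)  # price converted to dollars
```

To read a real file, pass an open file handle (for example, open(path, newline="")) instead of the in-memory sample.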

Repository Structure

Directories and files that ship with the GitHub repo:

lob/                    Source code for LOB models, datasets, etc.
    ar_pred.ipynb           Test set evaluations and plotting for trained model.
    dataloading.py          Dataloading functions.
    encoding.py             Message tokenization: encoding and decoding
    evaluation.py           Model evaluation logic
    inference.py            Logic for model inference loop
    init_train.py           Train state initialisation
    lob_seq_model.py        Defines the LOB deep sequence model, which consists of stacked S5 layers
    lobster_dataloader.py   Defines dataset and dataloading
    preproc.py              Pre-processes LOBSTER data for the model
    run_eval.py             Script to run model inference with trained model
    sweep.py                Hyperparameter sweep (WandB)
    train_helpers.py        Functions for optimization, training and evaluation steps.
    train.py                Training loop code.
    validation_helpers.py   Helper functions for model validation
s5/                     Original S5 code
bin/                    Shell scripts for downloading data and running experiments.
requirements.txt        Package requirements
run_train.py            Training loop entrypoint.

Experiments / Paper Results

Paper results and plots are calculated in lob/ar_pred.ipynb.

Citation

Please use the following when citing our work:

@article{nagy2023generative,
  doi = {},
  url = {https://arxiv.org/abs/xxxx.xxxxx},
  author = {Nagy, Peer and Frey, Sascha and Sapora, Silvia and Li, Kang and Calinescu, Anisoara and Zohren, Stefan and Foerster, Jakob},
  keywords = {generative AI, structured state space models, limit order books, ML},
  title = {Generative AI for End-to-End Limit Order Book Modelling: A Token-Level Autoregressive Generative Model of Message Flow Using a Deep State Space Network},
  publisher = {arXiv},
  year = {2023},
  copyright = {Creative Commons Attribution 4.0 International}
}

Please reach out if you have any questions.
