zcakhaa / deeplob-deep-convolutional-neural-networks-for-limit-order-books

This Jupyter notebook demonstrates our recent work, "DeepLOB: Deep Convolutional Neural Networks for Limit Order Books", published in IEEE Transactions on Signal Processing. We use the FI-2010 dataset and show how the model architecture is constructed. FI-2010 is publicly available, and interested readers can check out the original paper.


deeplob-deep-convolutional-neural-networks-for-limit-order-books's People

Contributors

zcakhaa

deeplob-deep-convolutional-neural-networks-for-limit-order-books's Issues

about mc-dropout

Hello Zihao, in the TensorFlow (v2) source code you provided, I noticed that you use MC dropout for training and validation (by enabling dropout in the prediction step).
But since the prediction score under MC dropout is stochastic, why not run the prediction K times and take the mean prediction score as the final score?
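A minimal sketch of that K-sample averaging (not from the repo; it assumes a tf.keras model whose dropout layers stay active when the model is called with training=True):

    import numpy as np

    def mc_dropout_predict(model, x, n_samples=50):
        """Average n_samples stochastic forward passes, keeping dropout active via training=True."""
        preds = np.stack([model(x, training=True).numpy() for _ in range(n_samples)])
        return preds.mean(axis=0)   # mean class probabilities, shape (batch, n_classes)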

Query on replication - Macro F1 or weighted F1 used? How is skewness handled?

I am trying to reproduce your F1 scores. Could you kindly explain how the F1 is calculated on the FI-2010 dataset? You report 83.4% F1 for k=10.

  1. Is the above F1 value obtained by simple (macro) averaging of the per-class F1 values, or by weighted averaging of the per-class F1 values? (See the sketch after this list.)
  2. k=10 has roughly a 1:3:1 class distribution. Does your model perform equally well across all classes, or does it perform better on the stationary class?
  3. Did you apply any class weighting when training on k=10? If not, may I know which characteristic of your model handles the class skewness?
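For reference, a minimal sketch of the two averaging modes with scikit-learn (the label arrays here are placeholders, not results from the paper):

    import numpy as np
    from sklearn.metrics import f1_score

    # placeholder labels: 0 = down, 1 = stationary, 2 = up
    y_true = np.array([0, 1, 1, 2, 1, 0, 2, 1])
    y_pred = np.array([0, 1, 2, 2, 1, 1, 2, 1])

    macro_f1 = f1_score(y_true, y_pred, average='macro')        # unweighted mean of per-class F1
    weighted_f1 = f1_score(y_true, y_pred, average='weighted')  # per-class F1 weighted by class support
    print(macro_f1, weighted_f1)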

Question about the label in code

Sorry to take up your time. I have a question about the label column you chose: why the 3rd column? In the paper "Benchmark Dataset for Mid-Price Forecasting of Limit Order Book Data with Machine Learning Methods" (page 13), the five labels are said to be based on k = 1, 2, 3, 5 and 10, not on 100 as the "lookback" in your code. Or have I made some mistake?

Which price do you use in the Simple Trading Simulation

'Suppose our model produces a prediction of +1 at time t, we then buy shares at time t+5(taking slippage into account), and hold until -1 appears to sell all u shares (we do nothing if 0 appears).'
Could you tell us which price you use at time t+5, the mid-price at t+5 or something else? And why is the buy executed at t+5 rather than at t?
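One possible reading of that rule, as a minimal sketch (my interpretation, not the authors' simulation code; `preds` holds {-1, 0, +1} predictions and `mid_price` is a hypothetical price series):

    import numpy as np

    def simulate(preds, mid_price, delay=5):
        """Enter long on +1, exit on -1, do nothing on 0; fills happen at the mid-price `delay` steps later."""
        position, entry, pnl = 0, 0.0, 0.0
        for t in range(len(preds) - delay):
            fill = mid_price[t + delay]              # assumed fill price after the reaction delay
            if preds[t] == 1 and position == 0:      # prediction of +1: buy
                position, entry = 1, fill
            elif preds[t] == -1 and position == 1:   # prediction of -1: sell everything
                pnl += fill - entry
                position = 0
        return pnl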

Another question for help

'The labels are then decided based on a threshold (α) for the percentage change (l_t).' What method can be used to set the value of this threshold?

How do you implement the z-score normalization in the pre-processing step?

Thanks for your nice paper and code.
I am a little confused about the z-score normalization in the pre-processing step. Did you implement it independently for each feature (40 features in total)? E.g. for the feature "AskPrice5", do you collect all "AskPrice5" values from the past 5 days to compute the mean and std? If so, the original ordering of the price levels may be destroyed after z-score normalization.
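A minimal sketch of per-feature z-score normalization with statistics from the preceding days (my reading of the question, not the repo's code; `past_days` and `today` are hypothetical arrays of shape (samples, 40)):

    import numpy as np

    def zscore_per_feature(past_days, today):
        """Standardize each of the 40 LOB features with the mean/std of the previous days."""
        mean = past_days.mean(axis=0)          # per-feature mean over, e.g., the past 5 days
        std = past_days.std(axis=0) + 1e-12    # small epsilon to avoid division by zero
        return (today - mean) / std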

The accuracy is very low when applying DeepLOB to 1-second ES futures order book data

Hi, this work is really promising. I tried to apply DeepLOB to 1-second ES order book data (10 levels on each side) from 2022-01-02 to 2022-01-06, with alpha = 0.00004, k = 20, and DecPre normalization (the labels are balanced). The best accuracy I can get is around 35%, and I can't figure out what is wrong with this experiment.
Any suggestions? Thank you very much.


And the best accuracy is around 40% when using Z-score normalization

CNN Kernel Size

Thank you very much for the Jupyter example. I studied the paper and especially the proposed architecture. However, I cannot reconcile the code implementation with what is described in your paper.

On page 5 it is explained that the first CNN layer convolves over p(t) and v(t) using a (1, 2) kernel. The input features have shape (Batch, 100, 40, 1).


But in your example code the kernel is defined along the width, Conv2D(32, (1, 2) ...), not the height. So you already convolve over the levels (horizontally). But your p(t) and v(t) features are arranged vertically as part of the 40 data features per time stamp. Shouldn't you define Conv2D(32, (2, 1) ...)?

    # (1, 2) kernel with stride (1, 2): slides along the 40-feature axis, pairing each (price, volume)
    conv_first1 = Conv2D(32, (1, 2), strides=(1, 2))(input_lmd)
    conv_first1 = keras.layers.LeakyReLU(alpha=0.01)(conv_first1)
    # (4, 1) kernels: convolve along the time axis (100 events) within each feature column
    conv_first1 = Conv2D(32, (4, 1), padding='valid')(conv_first1)
    conv_first1 = keras.layers.LeakyReLU(alpha=0.01)(conv_first1)
    conv_first1 = Conv2D(32, (4, 1), padding='same')(conv_first1)
    conv_first1 = keras.layers.LeakyReLU(alpha=0.01)(conv_first1)

    ...

Also, the paper says nothing about the other two layers with kernel_size = (4, 1). There the kernel is vertical, with height 4. Could you briefly explain the reasoning behind them? Thank you.

Why does the final accuracy become larger when the prediction horizon increases?

I trained DeepLOB on a custom dataset.
In my dataset, each line is 200 ms, with five levels of bid/ask.
I tried k from 1 to 500. Strangely, the accuracy becomes larger as k increases (it does not get smaller at all).
In the original paper, it is the opposite.
Could you provide some clue to help figure this out?

Besides, I found that someone else also encountered this problem, with no explanation given: https://github.com/yuxiangalvin/DeepLOB-Model-Implementation-Project#jnj-lob-1

Look forward to your reply

Are the LSE snapshots taken at equal event intervals or at equal time intervals?

Hello, the paper is very impressive and I learned a lot from it. I have two questions.
1. When predicting on LSE data, are the input order_book snapshots sampled at equal time intervals, or, as with FI-2010, at equal event intervals?
2. In DBLOB, regarding the Bayesian trading strategy, I do not quite understand the exit rule: "We exit the position if H < β2. In next section, we show how these trading parameters affect our profits and risk level." Does this also require p(t) < α to be satisfied at the same time?

Unable to replicate result

Thanks for the awesome work. I can't replicate the results presented in the paper (around 75~85 for both accuracy and F1 score, depending on the horizon) by following your notebook. Is there anything I am missing?

The F1 scores I get:
Macro: 0.5184215198116572
Micro: 0.5824873824271622
Weighted: 0.5740045835109294

Accuracy is similar to the output of your notebook (~0.58).

small model size

I am a bit confused about why the saved model files are only kilobytes in size. Isn't that too small, or am I missing something here?

Which version of keras?

While running the code with the latest version of Keras, I get

ImportError: cannot import name 'CuDNNLSTM' from 'keras.layers'

After searching for a while, I found out that CuDNNLSTM is deprecated. May I know which version of Keras I should use for this notebook?

Thanks!
Dean
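Not an answer from the repo, but one common workaround in TensorFlow 2.x (an assumption about the questioner's setup) is to use the standard keras.layers.LSTM, which dispatches to the cuDNN kernel on GPU when its default activations are kept and recurrent_dropout is 0. A minimal sketch:

    from tensorflow import keras

    # keras.layers.LSTM replaces the removed CuDNNLSTM; it uses the fast cuDNN
    # implementation on GPU as long as the default arguments are not changed.
    inputs = keras.Input(shape=(100, 64))      # hypothetical (timesteps, features) shape
    outputs = keras.layers.LSTM(64)(inputs)    # drop-in replacement for CuDNNLSTM(64)
    model = keras.Model(inputs, outputs)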

Original Dataset

Hi there, great paper. I can't seem to find the original dataset - it may have restricted access. Do you know where we can find the data?
Also, would you consider training on a more imbalanced dataset?

❗ Target leakage in your method

Problem

I was able to successfully transfer the model you proposed to the USD/RUB currency pair, which is traded on the Moscow Exchange, and classify the further behavior of the quotes within the next 10 minutes. However, I was constantly failing to create a profitable trading strategy based on your model using the smoothed target you suggested, l_t = (m_+(t) - m_-(t)) / m_-(t), where m_-(t) averages the mid-prices up to and including time t and m_+(t) averages the next k mid-prices.

It seemed strange to me, since the following two factors support the assumption that at least some kind of strategy can still be built:

  1. High ROC-AUC score (80%) on a well balanced binary classification.
  2. Beautiful and meaningful coloring of the price plot. Indeed, where there was an upward trend, the target turned out to be much greater than zero (green zones), and where there was a downward trend, it turned out to be much less (red zones).

Then I realized that the target you proposed contains a rather obvious information leak. The smoothed target can be decomposed as

l_t = (m_+(t) - p_t) / m_-(t) + (p_t - m_-(t)) / m_-(t),

and the second term depends only on information that is already available at time t.

And this is what makes your target useless from the trading perspective, since at time t you cannot make deals at price m_-(t).

To prove this, I trained a scikit-learn LogisticRegression using only one feature, the difference between the current price and the m_-(t) term of the proposed target. The ROC-AUC of such a simple model increased to 81%.

Then I refitted your model on the "noisy" alternative target l'_t = (m_+(t) - p_t) / p_t, which is meaningful from the trading perspective (unlike the previous one).

ROC-AUC dropped to a value slightly greater than 50%, and this was not enough to beat even the bid-ask spread.

Conclusion

My experiment shows that the success you reached using the above target is not due to any "smoothing" and reduced noise, as you say, but only due to the flawed target design. You simply made your target conditionally dependent on trivially extractable feature information. Moreover, it is absolutely useless from the trading perspective, since at time t we cannot make deals at price m_-(t), but only at something near p(t).
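A minimal sketch of the two label constructions and the leaked feature discussed in this issue (my reconstruction, not the authors' code; `mid` is a hypothetical array of mid-prices):

    import numpy as np

    def build_targets(mid, k=10):
        """Smoothed target, tradable alternative target, and the leaked (p_t vs m_-) feature."""
        mid = np.asarray(mid, dtype=float)
        n = len(mid)
        win_mean = np.convolve(mid, np.ones(k) / k, mode='valid')  # win_mean[i] = mean(mid[i:i+k])
        t = np.arange(k, n - k)                  # times with a full past and future window
        m_minus = win_mean[t - k]                # mean of the previous k mid-prices
        m_plus = win_mean[t + 1]                 # mean of the next k mid-prices
        p_t = mid[t]
        smoothed = (m_plus - m_minus) / m_minus  # smoothed target questioned in this issue
        tradable = (m_plus - p_t) / p_t          # alternative target anchored at the current price
        leak = (p_t - m_minus) / m_minus         # feature fully known at time t
        return smoothed, tradable, leak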

Unable to replicate the dataset's labelling

Hello,

First of all, thank you for this great repo, I really loved your work.

I'm trying to run a POC with your pre-trained network on my data.
To verify my labelling code, I am testing it on the dataset you used (DecPre normalization) in order to replicate its labels, but I get the wrong labels.
I read your paper and the benchmark dataset paper and used exactly the same method:

  1. Calculate the mid-price: p
  2. Calculate the mean mid-price over the next k events (k = 10, 20, 30, 50, 100): mk
  3. Label the data according to whether (mk - p) / p is higher than alpha (1), stationary (2), or lower than -alpha (3), where alpha is 0.00002 (0.002%). (See the sketch after this list.)
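For reference, a minimal sketch of the steps above (my reading of them, not the repo's code; `mid` is a hypothetical array of mid-prices):

    import numpy as np

    def benchmark_labels(mid, k=10, alpha=0.00002):
        """3-class labels: 1 = up, 2 = stationary, 3 = down, from the mean of the next k mid-prices."""
        mid = np.asarray(mid, dtype=float)
        n = len(mid)
        labels = np.full(n, 2, dtype=int)            # default: stationary
        for t in range(n - k):
            mk = mid[t + 1 : t + k + 1].mean()       # mean mid-price over the next k events
            change = (mk - mid[t]) / mid[t]          # percentage change l_t
            if change > alpha:
                labels[t] = 1                        # up
            elif change < -alpha:
                labels[t] = 3                        # down
        return labels[: n - k]                       # drop the tail without a full future window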

What am I doing wrong?

Thanks in advance

The label Extraction

Hello,

I have a question about the labelling. When you label the data, you use the average price of the next k timesteps relative to the current price. Are these k timesteps natural (clock) timesteps, or the next k events (i.e. changes in the LOB)?

Thank you

How can you train on 6m of LSE data with the fit method?

Hi,
deeplob.fit(trainX_CNN, trainY_CNN, epochs=200, batch_size=64, verbose=2, validation_data=(testX_CNN, testY_CNN))
This is the only training call I can find in the code. In the paper, you use 6m of LSE data for training and 3m for validation, but I cannot train on them with fit; the data is too large to hold in memory.
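One common workaround (not from the repo; it assumes the training and validation windows have been saved as .npy files on disk) is to stream batches with a keras Sequence instead of loading everything into memory. A minimal sketch:

    import numpy as np
    from tensorflow import keras

    class LOBSequence(keras.utils.Sequence):
        """Streams (100, 40, 1) windows and one-hot labels from memory-mapped .npy files."""
        def __init__(self, x_path, y_path, batch_size=64):
            self.x = np.load(x_path, mmap_mode='r')   # e.g. shape (n_samples, 100, 40, 1), stays on disk
            self.y = np.load(y_path, mmap_mode='r')   # e.g. shape (n_samples, 3)
            self.batch_size = batch_size

        def __len__(self):
            return int(np.ceil(len(self.x) / self.batch_size))

        def __getitem__(self, idx):
            sl = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
            return np.asarray(self.x[sl]), np.asarray(self.y[sl])

    # hypothetical file names; fit() accepts a Sequence in place of in-memory arrays
    train_seq = LOBSequence('trainX.npy', 'trainY.npy')
    val_seq = LOBSequence('valX.npy', 'valY.npy')
    # deeplob.fit(train_seq, validation_data=val_seq, epochs=200, verbose=2)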
