Question about the response array for a LSTM (many-to-one) model about models HOT 13 CLOSED

mlpack commented on June 3, 2024

Question about the response array for a LSTM (many-to-one) model

from models.

Comments (13)

rcurtin commented on June 3, 2024

Hey @InterTriplete2010, I believe that our RNN responses require a response for each individual time step. So, in your example above, if the input point is [1 2 3] and your intention is that the RNN returns 6 at the end of that sequence, try a responses Cube with 3 slices, where the responses for that point are [1 3 6]. (In essence, at each time step, the response is the sum of the input seen so far. You could also, if you only wanted the RNN to respond at the end of the sequence, make the response [0 0 6], but my intuition is that providing the partial sum at each time step would work better.)

So, the shape of responses would be 1 row by 15 columns by 3 slices (e.g., 1 dimension, 15 samples, 3 time steps).
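
To make the suggested layout concrete, here is a minimal plain-Python sketch (no mlpack calls) of per-time-step responses. It assumes the toy data is 15 sequences of length 3 built from the numbers 1 to 45, and it mimics the cube as nested lists indexed slice, then row, then column; the variable names are illustrative only.

```python
from itertools import accumulate

# Assumption: 15 sequences of length 3 built from 1..45 (toy data).
n_points, n_steps = 15, 3
sequences = [[3 * i + t + 1 for t in range(n_steps)] for i in range(n_points)]

# Running sum at each time step, e.g. [1, 2, 3] -> [1, 3, 6].
partial_sums = [list(accumulate(seq)) for seq in sequences]

# Cube-style nesting: slice (time step) -> row (dimension) -> column (sample).
inputs = [[[seq[t] for seq in sequences]] for t in range(n_steps)]
responses = [[[ps[t] for ps in partial_sums]] for t in range(n_steps)]

# rows x columns x slices
shape = (len(responses[0]), len(responses[0][0]), len(responses))
print(shape)
print([responses[t][0][0] for t in range(n_steps)])
```

The first column read across the three slices gives the partial sums for the first point; only the last slice holds the full sums.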

I believe that that will work... want to give it a try and see what happens?

InterTriplete2010 commented on June 3, 2024

Hi @rcurtin, thank you for your response. Unfortunately, it doesn't seem to be working. I tried both ways ([0 0 6] and [1 3 6]), but no luck. I hope I am not doing something stupid, even though the input and response cube appear to be correct to me.

Something that I read in the documentation of RNN (I hope I didn't misinterpret the parameters) is that it should be possible to predict only the last output by setting "single" to true in the line of code that I posted: RNN<MeanSquaredError<>, HeInitialization> model(rho, true); (https://mlpack.org/doc/mlpack-3.1.0/doxygen/classmlpack_1_1ann_1_1RNN.html#aa07a59fdcbe988264200c8f593c73bbf). But when I did that, I could not get any meaningful result.

Also, my final goal is to apply LSTM to MEL coefficients. The reason I decided to run this simple example was to figure out if I was doing something wrong with the setup of the MEL coefficients, since I was not able to get any meaningful result. So ideally, I would really need to predict only the last output.

Is there anything else that I am missing here or possibly doing wrong? Do you have any examples of an RNN applied to this type of problem with mlpack?

Thank you!!!
Alex.

rcurtin commented on June 3, 2024

Ahh, sorry that my example didn't work. Here are a couple of examples of RNNs being used in mlpack:

https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/rnn_reber_test.cpp
https://github.com/mlpack/examples/blob/master/lstm_stock_prediction/lstm_stock_prediction.cpp
https://github.com/mlpack/examples/blob/master/lstm_electricity_consumption/lstm_electricity_consumption.cpp

I'm not sure if any of those use single mode. Anyway, I am not the biggest expert on the RNN code (I was guessing in the last response), but it seems that you can use single mode by having only a single slice in your responses. So, in your situation, your responses would have 1 row, 15 columns, and 1 slice. (So, if the slices of the first column of the input data were [1, 2, 3], then the only slice of the response data for the first column would be [6].)

Maybe give that a shot and see if that helps?

InterTriplete2010 commented on June 3, 2024

Unfortunately, I have already looked at these examples and none of them use single mode. I have also already tried a response with 1 row, 15 columns, and 1 slice (that was the example that I posted). That didn't return any good result. Additionally, when I test the model, it still returns 3 outputs instead of one. I really hope this can be done with mlpack, because I really don't want to switch to Python.

rcurtin commented on June 3, 2024

I played with it and wrote this simple example code, which seems to work:

#include <mlpack/core.hpp>
#include <mlpack/methods/ann/rnn.hpp>
#include <mlpack/methods/ann/loss_functions/mean_squared_error.hpp>
#include <mlpack/methods/ann/init_rules/he_init.hpp>

using namespace ens;
using namespace mlpack::ann;

int main()
{
  arma::cube inputData(1, 3, 3);
  inputData.slice(0) = arma::mat("1 2 4");
  inputData.slice(1) = arma::mat("2 2 5");
  inputData.slice(2) = arma::mat("3 3 6");

  arma::cube responses(1, 3, 1);
  responses.slice(0) = arma::mat("6 7 15");

  // Build RNN in single mode.
  RNN<MeanSquaredError<>, HeInitialization> model(3, true);
  model.Add<IdentityLayer<>>();
  model.Add<LSTM<>>(1, 5 /* 5 cells */, 3);
  model.Add<LeakyReLU<>>();
  model.Add<Linear<>>(5, 1);

  // Train for 1000 epochs...
  ens::Adam opt(0.1, 1, 0.9, 0.999, 1e-8, 3000, 1e-8, true);

  model.Train(inputData, responses, opt, ens::ProgressBar());

  arma::cube predictions;

  model.Predict(inputData, predictions);

  std::cout << "Predictions:\n" << predictions;
}

In that case, I set things up exactly like your problem (but with only 3 data points, not 15), and it seems to work. I'm sure the network needs additional tuning to actually perform well---the predictions didn't look great (but they at least went the right direction---the output gets greater for each successive input).
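
For readers mapping the cube in that example back to sequences, here is a small plain-Python sketch (not mlpack code). It assumes, as described earlier in the thread, that each slice is one time step and each column is one data point; the list names are illustrative.

```python
# Slices copied from the C++ example: each slice is one time step,
# each column within a slice is one data point.
input_slices = [
    [1, 2, 4],  # time step 0
    [2, 2, 5],  # time step 1
    [3, 3, 6],  # time step 2
]

# Reading each column across the slices gives one sequence per data point.
sequences = [list(seq) for seq in zip(*input_slices)]

# Single-mode target: one value per sequence (its sum), stored in one slice.
response_slice = [sum(seq) for seq in sequences]

print(sequences)
print(response_slice)
```

The sums it computes match the single response slice "6 7 15" used in the example.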

Anyway, maybe you can adapt that example to your case? Or if you are still having trouble, maybe the issue is not the input shape? I'm sure we can get it narrowed down. 👍

InterTriplete2010 commented on June 3, 2024

Thank you for sharing this code with me. Yes, it seems to be identical to mine. As a matter of fact, the results are similar: the prediction output increases. As you increase the number of neurons in the LSTM layer (I tried up to 1000), results get better too, but they are still far from the correct prediction and the validation loss is obviously pretty high. However, what worries me is that if I use the exact same parameters in Python (I replicated my code and yours in Python using Keras), I get an extremely accurate prediction and the validation loss is also in a good range. For instance, if I try to predict [50 51 52], I get ~154 in Python. I also get a single output, while in mlpack I get 3 outputs per sample. I honestly think that the format of the input and response is correct, at least based on the documentation, but the LSTM doesn't seem to get trained. I assume the algorithm is exactly the same as the one used in Python, so it should return similar results. I am sure there must be a parameter in the RNN that I am not setting up correctly, but I cannot figure out which one. I tried several things, but none of them seem to work.

rcurtin commented on June 3, 2024

I'm not sure the network structure I used there is the best for this task, but I agree that if the RNN is in single mode, we should get back only one prediction. So I wonder if maybe there is a bug there; unfortunately, the internal implementation of RNN is not something I know closely, so I am not sure what is going on.

zoq commented on June 3, 2024

Somehow this issue must have slipped past me. I'll have to take a closer look at the code; in the meantime, would you mind sharing the Keras example code you used for comparison?

InterTriplete2010 commented on June 3, 2024

Here it is. Thank you both for your help.

from numpy import array
from keras.preprocessing.text import one_hot
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers.core import Activation, Dropout, Dense
from keras.layers import Flatten, LSTM
from keras.layers import GlobalMaxPooling1D
from keras.models import Model
from keras.layers.embeddings import Embedding
from sklearn.model_selection import train_test_split
from keras.preprocessing.text import Tokenizer
from keras.layers import Input
from keras.layers.merge import Concatenate
from keras.layers import Bidirectional

import pandas as pd
import numpy as np
import re

X = np.array([x+1 for x in range(45)])
X = X.reshape(15,3,1)
#print(X)
Y = list()
for x in X:
    Y.append(x.sum())

Y = np.array(Y)
print(Y)
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(3, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

history = model.fit(X, Y, epochs=500, validation_split=0.2, verbose=1)

test_input = array([50,51,52])
test_input = test_input.reshape((1, 3, 1))
#print(test_input.shape)
test_output = model.predict(test_input, verbose=0)
print(test_output)
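
One thing that may be worth checking when porting this script is the axis order. Keras takes (samples, time steps, features), while the responses described earlier in the thread are laid out as rows = dimensions, columns = samples, slices = time steps. Below is a plain-Python sketch of that reordering; `keras_to_cube_layout` is a hypothetical helper written for illustration, not part of Keras or mlpack.

```python
def keras_to_cube_layout(batch):
    """Reorder batch[sample][step][feature] into cube[slice][row][col],
    where slice = time step, row = feature, column = sample."""
    n_samples = len(batch)
    n_steps = len(batch[0])
    n_features = len(batch[0][0])
    return [
        [[batch[s][t][f] for s in range(n_samples)] for f in range(n_features)]
        for t in range(n_steps)
    ]

# Same toy data as X in the script above: 1..45 arranged as (15, 3, 1).
X = [[[3 * i + t + 1] for t in range(3)] for i in range(15)]
cube = keras_to_cube_layout(X)

# Many-to-one targets as a single slice: 1 row x 15 columns of sequence sums.
targets = [[sum(step[0] for step in sample) for sample in X]]

print(len(cube), len(cube[0]), len(cube[0][0]))  # slices, rows, columns
print(targets[0][:3])
```

If the two programs are fed data in different axis orders, their results are not directly comparable, so this is a cheap thing to rule out before looking for a bug in the RNN itself.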

zoq commented on June 3, 2024

Thanks for the code!

zoq commented on June 3, 2024

Just a quick update, I started to look into the issue, but it will probably be over the weekend before I can provide a solution here.

InterTriplete2010 commented on June 3, 2024

Hi @zoq, just wondering if you got a chance to look at the code. No pressure at all. I am just eager to try it, when you are done :-) Thank you so much again for your help.

mlpack-bot commented on June 3, 2024

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
