Comments (13)

rcurtin avatar rcurtin commented on June 3, 2024 1

Hey @InterTriplete2010, I believe that our RNN responses require a response for each individual time step. So, in your example above, if the input point is [1 2 3] and your intention is that the RNN returns 6 at the end of that sequence, try a responses Cube with 3 slices, where the responses for that point are [1 3 6]. (In essence, at each time step, the response is the sum of the input seen so far. You could also, if you only wanted the RNN to respond at the end of the sequence, make the response [0 0 6], but my intuition is that providing the partial sum at each time step would work better.)

So, the shape of responses would be 1 row by 15 columns by 3 slices (i.e., 1 dimension, 15 samples, 3 time steps).

I believe that that will work... want to give it a try and see what happens?
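The two response layouts described above (a running sum at every time step, or zeros followed by the final sum) can be sketched in plain Python. This is only an illustration: the input values 1 through 45, split into 15 sequences of 3 steps, are an assumption, and the lists are per-sample sequences rather than an actual Armadillo cube.

```python
from itertools import accumulate

# Assumed data: 15 sequences of 3 time steps, [1, 2, 3], [4, 5, 6], ..., [43, 44, 45].
sequences = [[3 * i + t + 1 for t in range(3)] for i in range(15)]

# Option 1: the response at each time step is the sum of the inputs seen so far.
running_sums = [list(accumulate(seq)) for seq in sequences]

# Option 2: respond only at the last time step, with zeros before it.
last_only = [[0] * (len(seq) - 1) + [sum(seq)] for seq in sequences]

print(running_sums[0])  # [1, 3, 6]
print(last_only[0])     # [0, 0, 6]
```

In the cube layout, each of these 15 lists would become one column, with its three values spread across the three slices.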

from models.

InterTriplete2010 avatar InterTriplete2010 commented on June 3, 2024

Hi @rcurtin, thank you for your response. Unfortunately, it doesn't seem to be working. I tried both ways ([0 0 6] and [1 3 6]), but no luck. I hope I am not doing something stupid, even though the input and response cube appear to be correct to me.

Something that I read in the documentation of RNN (I hope I didn't misinterpret its parameters) is that it should be possible to predict only the last output by setting "single" to true: RNN<MeanSquaredError<>, HeInitialization> model(rho,true); in the line of code that I posted (https://mlpack.org/doc/mlpack-3.1.0/doxygen/classmlpack_1_1ann_1_1RNN.html#aa07a59fdcbe988264200c8f593c73bbf). But when I did that, I could not get any meaningful result.

Also, my final goal is to apply LSTM to MEL coefficients. The reason I decided to run this simple example was to figure out if I was doing something wrong with the setup of the MEL coefficients, since I was not able to get any meaningful result. So ideally, I would really need to predict only the last output.

Is there anything else that I am missing here or possibly doing wrong? Do you have any example of an RNN applied to this type of problem with mlpack?

Thank you!!!
Alex.

rcurtin avatar rcurtin commented on June 3, 2024

Ahh, sorry that my example didn't work. Here are a couple of examples of RNNs being used in mlpack:

https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/rnn_reber_test.cpp
https://github.com/mlpack/examples/blob/master/lstm_stock_prediction/lstm_stock_prediction.cpp
https://github.com/mlpack/examples/blob/master/lstm_electricity_consumption/lstm_electricity_consumption.cpp

I'm not sure if any of those use single mode. Anyway, I am not the biggest expert on the RNN code (I was guessing in the last response), but it seems that you can use single mode by having only a single slice in your responses. So, in your situation, your responses would have 1 row, 15 columns, and 1 slice. (So, if the slices of the first column of the input data were [1, 2, 3], then the only slice of the response data for the first column would be [6].)

Maybe give that a shot and see if that helps?
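The single-mode layout suggested above (3 input slices, 1 response slice) can be sketched with nested lists indexed as [slice][row][column]. The values 1 through 45 are an assumption for illustration; only the shapes come from the discussion, and the `shape` helper is hypothetical, not part of mlpack or Armadillo.

```python
n_samples, n_steps = 15, 3

# Inputs: 3 slices (time steps), each with 1 row and 15 columns (samples).
# Assumed data: column i holds the sequence [3i + 1, 3i + 2, 3i + 3].
inputs = [[[3 * i + t + 1 for i in range(n_samples)]] for t in range(n_steps)]

# Single-mode responses: 1 slice, 1 row, 15 columns, one sum per sequence.
responses = [[[sum(inputs[t][0][i] for t in range(n_steps))
               for i in range(n_samples)]]]

def shape(cube):
    """Return (rows, columns, slices) for a [slice][row][column] nested list."""
    return (len(cube[0]), len(cube[0][0]), len(cube))

print(shape(inputs))       # (1, 15, 3)
print(shape(responses))    # (1, 15, 1)
print(responses[0][0][0])  # 6
```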

InterTriplete2010 avatar InterTriplete2010 commented on June 3, 2024

Unfortunately, I have already looked at these examples and none of them use single mode. I have also already tried a response with 1 row, 15 columns, and 1 slice (that was the example I posted), and it didn't return any good results. Additionally, when I test the model, it still returns 3 outputs instead of one. I really hope this can be done with mlpack, because I really don't want to switch to Python.

rcurtin avatar rcurtin commented on June 3, 2024

I played with it and wrote this simple example code, which seems to work:

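#include <mlpack/core.hpp>
#include <mlpack/methods/ann/rnn.hpp>
// Layer definitions (IdentityLayer, LSTM, LeakyReLU, Linear).
#include <mlpack/methods/ann/layer/layer.hpp>
#include <mlpack/methods/ann/loss_functions/mean_squared_error.hpp>
#include <mlpack/methods/ann/init_rules/he_init.hpp>
// Optimizers and callbacks (ens::Adam, ens::ProgressBar).
#include <ensmallen.hpp>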

using namespace ens;
using namespace mlpack::ann;

int main()
{
  arma::cube inputData(1, 3, 3);
  inputData.slice(0) = arma::mat("1 2 4");
  inputData.slice(1) = arma::mat("2 2 5");
  inputData.slice(2) = arma::mat("3 3 6");

  arma::cube responses(1, 3, 1);
  responses.slice(0) = arma::mat("6 7 15");

  // Build RNN in single mode.
  RNN<MeanSquaredError<>, HeInitialization> model(3, true);
  model.Add<IdentityLayer<>>();
  model.Add<LSTM<>>(1, 5 /* 5 cells */, 3);
  model.Add<LeakyReLU<>>();
  model.Add<Linear<>>(5, 1);

  // Train for 1000 epochs...
  ens::Adam opt(0.1, 1, 0.9, 0.999, 1e-8, 3000, 1e-8, true);

  model.Train(inputData, responses, opt, ens::ProgressBar());

  arma::cube predictions;

  model.Predict(inputData, predictions);

  std::cout << "Predictions:\n" << predictions;
}

In that case, I set things up exactly like your problem (but with only 3 data points, not 15), and it seems to work. I'm sure the network needs additional tuning to actually perform well. The predictions didn't look great, but they at least went in the right direction: the output gets greater for each successive input.
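On the "Train for 1000 epochs" comment in the code: the Adam call passes a batch size of 1 and a maximum of 3000 iterations. Assuming that each iteration processes one data point (which is how that comment appears to read the setting), the number of full passes over the data depends on the data set size. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def epochs_from_iterations(max_iterations: int, num_points: int) -> float:
    """Full passes over the data, assuming one iteration processes one point."""
    return max_iterations / num_points

print(epochs_from_iterations(3000, 3))   # 1000.0
print(epochs_from_iterations(3000, 15))  # 200.0
```

Under that assumption, keeping the same optimizer settings for the 15-sample problem would give 200 passes rather than 1000.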

Anyway, maybe you can adapt that example to your case? Or if you are still having trouble, maybe the issue is not the input shape? I'm sure we can get it narrowed down. 👍

InterTriplete2010 avatar InterTriplete2010 commented on June 3, 2024

Thank you for sharing this code with me. Yes, it seems to be identical to mine. As a matter of fact, the results are similar: the prediction output increases. As you increase the number of neurons in the LSTM layer (I tried up to 1000), the results get better too, but they are still far from the correct prediction, and the validation loss is obviously pretty high.

However, what worries me is that if I use the exact same parameters in Python (I replicated my code and yours using Keras), I get an extremely accurate prediction and the validation loss is also in a good range. For instance, if I try to predict [50 51 52], I get ~154 in Python. I also get a single output, while in mlpack I get 3 outputs per sample. I honestly think that the format of the input and response is correct, at least based on the documentation, but the LSTM doesn't seem to get trained. I assume the algorithm is exactly the same as the one used in Python, so it should return similar results. I am sure there must be a parameter in the RNN that I am not setting up correctly, but I cannot figure out which one. I tried several things, but none of them seem to work.

rcurtin avatar rcurtin commented on June 3, 2024

I'm not sure the network structure I used there is the best for this task, but I agree that if the RNN is in single mode that we should get back only one prediction. So I wonder if maybe there is a bug there; but unfortunately the internal implementation of RNN is not something I know closely, so I am not sure what is going on.

zoq avatar zoq commented on June 3, 2024

Somehow this issue must have slipped past me. I'll have to take a closer look at the code; in the meantime, do you mind sharing the Keras example code you used as a comparison?

InterTriplete2010 avatar InterTriplete2010 commented on June 3, 2024

Here it is. Thank you both for your help.

from numpy import array
from keras.preprocessing.text import one_hot
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers.core import Activation, Dropout, Dense
from keras.layers import Flatten, LSTM
from keras.layers import GlobalMaxPooling1D
from keras.models import Model
from keras.layers.embeddings import Embedding
from sklearn.model_selection import train_test_split
from keras.preprocessing.text import Tokenizer
from keras.layers import Input
from keras.layers.merge import Concatenate
from keras.layers import Bidirectional

import pandas as pd
import numpy as np
import re

X = np.array([x+1 for x in range(45)])
X = X.reshape(15,3,1)
#print(X)
Y = list()
for x in X:
    Y.append(x.sum())

Y = np.array(Y)
print(Y)
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(3, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

history = model.fit(X, Y, epochs=500, validation_split=0.2, verbose=1)

test_input = array([50,51,52])
test_input = test_input.reshape((1, 3, 1))
#print(test_input.shape)
test_output = model.predict(test_input, verbose=0)
print(test_output)

zoq avatar zoq commented on June 3, 2024

Thanks for the code!

zoq avatar zoq commented on June 3, 2024

Just a quick update, I started to look into the issue, but it will probably be over the weekend before I can provide a solution here.

InterTriplete2010 avatar InterTriplete2010 commented on June 3, 2024

Hi @zoq, just wondering if you got a chance to look at the code. No pressure at all. I am just eager to try it, when you are done :-) Thank you so much again for your help.

mlpack-bot avatar mlpack-bot commented on June 3, 2024

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
