Comments (13)
Hey @InterTriplete2010, I believe that our RNN responses require a response for each individual time step. So, in your example above, if the input point is [1 2 3] and your intention is that the RNN returns 6 at the end of that sequence, try a responses cube with 3 slices, where the responses for that point are [1 3 6]. (In essence, at each time step, the response is the sum of the input seen so far. You could also, if you only wanted the RNN to respond at the end of the sequence, make the response [0 0 6], but my intuition is that providing the partial sum at each time step would work better.)
So, the shape of responses would be 1 row by 15 columns by 3 slices (i.e., 1 dimension, 15 samples, 3 time steps).
I believe that will work... want to give it a try and see what happens?
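To make the two response layouts described above concrete, here is a minimal sketch in plain Python. It is not mlpack code: nested lists stand in for a cube (outer index = time step, i.e. one "slice" per step), and the helper names `partial_sum_responses` and `final_only_responses` are hypothetical, chosen only for illustration.

```python
from itertools import accumulate


def partial_sum_responses(sequences):
    """Return responses[t][i]: the sum of the first t+1 values of sequence i."""
    running = [list(accumulate(seq)) for seq in sequences]
    # Transpose so the outer index is the time step (one "slice" per step).
    return [list(step) for step in zip(*running)]


def final_only_responses(sequences):
    """Return responses that are 0 everywhere except the last time step."""
    steps = len(sequences[0])
    return [
        [sum(seq) if t == steps - 1 else 0 for seq in sequences]
        for t in range(steps)
    ]


# Two example points, each a sequence of 3 time steps.
sequences = [[1, 2, 3], [4, 5, 6]]
per_step = partial_sum_responses(sequences)
final_only = final_only_responses(sequences)

print(per_step)    # [[1, 4], [3, 9], [6, 15]]
print(final_only)  # [[0, 0], [0, 0], [6, 15]]
```

For the point [1 2 3], reading down the first column of `per_step` gives 1, 3, 6, and reading down `final_only` gives 0, 0, 6, matching the two options suggested above.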
from models.
Hi @rcurtin, thank you for your response. Unfortunately, it doesn't seem to be working. I tried both ways ([0 0 6] and [1 3 6]), but no luck. I hope I am not doing something stupid, even though the input and response cube appear to be correct to me.
Something that I read in the documentation of RNN (I hope I didn't misinterpret the parameters of RNN) is that it should be possible to predict only the last output by setting "single" to true in the line of code that I posted: RNN<MeanSquaredError<>, HeInitialization> model(rho,true); (https://mlpack.org/doc/mlpack-3.1.0/doxygen/classmlpack_1_1ann_1_1RNN.html#aa07a59fdcbe988264200c8f593c73bbf). But when I did it, I could not get any meaningful result.
Also, my final goal is to apply LSTM to MEL coefficients. The reason I decided to run this simple example was to figure out if I was doing something wrong with the setup of the MEL coefficients, since I was not able to get any meaningful result. So ideally, I would really need to predict only the last output.
Is there anything else that I am missing here or possibly doing wrong? Do you have any example of an RNN applied to this type of problem with mlpack?
Thank you!!!
Alex.
Ahh, sorry that my example didn't work. Here are a couple of examples of RNNs being used in mlpack:
https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/rnn_reber_test.cpp
https://github.com/mlpack/examples/blob/master/lstm_stock_prediction/lstm_stock_prediction.cpp
https://github.com/mlpack/examples/blob/master/lstm_electricity_consumption/lstm_electricity_consumption.cpp
I'm not sure if any of those use single mode. Anyway, I am not the biggest expert on the RNN code (I was guessing in the last response), but it seems that you can use single mode by having only a single slice in your responses. So, in your situation, your responses would have 1 row, 15 columns, and 1 slice. (So, if the slices of the first column of the input data were [1, 2, 3], then the only slice of the response data for the first column would be [6].)
Maybe give that a shot and see if that helps?
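As a rough illustration of the single-slice layout suggested above, the following plain-Python sketch checks the shapes involved. Nested lists indexed as `cube[slice][row][col]` stand in for an Armadillo cube; `cube_shape` and `single_mode_responses` are hypothetical helpers, not part of mlpack, and the three example points are made up for the illustration.

```python
def cube_shape(cube):
    """Return (rows, cols, slices) for a cube stored as cube[slice][row][col]."""
    return len(cube[0]), len(cube[0][0]), len(cube)


def single_mode_responses(inputs):
    """Sum every column over all slices, giving a cube with exactly one slice."""
    rows, cols, slices = cube_shape(inputs)
    summed = [
        [sum(inputs[s][r][c] for s in range(slices)) for c in range(cols)]
        for r in range(rows)
    ]
    return [summed]


# 1 row, 3 columns (points), 3 slices (time steps).
# Reading a column across slices gives one point: [1 2 3], [2 2 3], [4 5 6].
inputs = [
    [[1, 2, 4]],  # slice 0: first time step of each point
    [[2, 2, 5]],  # slice 1: second time step of each point
    [[3, 3, 6]],  # slice 2: third time step of each point
]
responses = single_mode_responses(inputs)

print(cube_shape(inputs))     # (1, 3, 3)
print(cube_shape(responses))  # (1, 3, 1)
print(responses[0][0])        # [6, 7, 15]
```

With 15 points instead of 3, the same check would report (1, 15, 3) for the input and (1, 15, 1) for the responses.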
Unfortunately, I have already looked at these examples and none of them use single mode. And I have already tried to use a response with 1 row, 15 columns, and 1 slice (that was the example that I posted). That didn't return any good result. Additionally, when I test the model, it still returns 3 outputs instead of one. I really hope this can be done with mlpack, because I really don't want to switch to Python.
I played with it and wrote this simple example code, which seems to work:
#include <mlpack/core.hpp>
#include <mlpack/methods/ann/rnn.hpp>
#include <mlpack/methods/ann/layer/layer.hpp>
#include <mlpack/methods/ann/loss_functions/mean_squared_error.hpp>
#include <mlpack/methods/ann/init_rules/he_init.hpp>

using namespace ens;
using namespace mlpack::ann;

int main()
{
  // Input cube: 1 row (dimension), 3 columns (points), 3 slices (time steps).
  arma::cube inputData(1, 3, 3);
  inputData.slice(0) = arma::mat("1 2 4");
  inputData.slice(1) = arma::mat("2 2 5");
  inputData.slice(2) = arma::mat("3 3 6");

  // One response slice: the sum of each point's sequence.
  arma::cube responses(1, 3, 1);
  responses.slice(0) = arma::mat("6 7 15");

  // Build RNN in single mode.
  RNN<MeanSquaredError<>, HeInitialization> model(3, true);
  model.Add<IdentityLayer<>>();
  model.Add<LSTM<>>(1, 5 /* 5 cells */, 3);
  model.Add<LeakyReLU<>>();
  model.Add<Linear<>>(5, 1);

  // Train for 1000 epochs (3000 iterations over 3 points, batch size 1)...
  ens::Adam opt(0.1, 1, 0.9, 0.999, 1e-8, 3000, 1e-8, true);
  model.Train(inputData, responses, opt, ens::ProgressBar());

  arma::cube predictions;
  model.Predict(inputData, predictions);
  std::cout << "Predictions:\n" << predictions;
}
In that case, I set things up exactly like your problem (but with only 3 data points, not 15), and it seems to work. I'm sure the network needs additional tuning to actually perform well---the predictions didn't look great (but they at least went the right direction---the output gets greater for each successive input).
Anyway, maybe you can adapt that example to your case? Or if you are still having trouble, maybe the issue is not the input shape? I'm sure we can get it narrowed down. 👍
Thank you for sharing this code with me. Yes, it seems to be identical to mine. As a matter of fact, the results are similar, that is, the prediction output increases. As you increase the number of neurons of the LSTM layer (I tried up to 1000), results get better too, but they are still far away from the correct prediction and the validation loss is obviously pretty high. However, what worries me is that if I try to use the exact same parameters in Python (I replicated my code and yours in Python using Keras) I get an extremely accurate prediction and the validation loss is also in a good range. For instance, if I try to predict [50 51 52] I get ~154 in Python. And I also get a single output, while in mlpack I get 3 outputs per sample. I honestly think that the format of the input and response are correct, at least based on the documentation. But the LSTM doesn't seem to be able to get trained. I assume the algorithm is exactly the same as the one used in Python, so it should return similar results. I am sure there must be a parameter in the RNN that I am not setting up correctly, but I cannot figure out which one. I tried several things, but none of them seem to be working.
I'm not sure the network structure I used there is the best for this task, but I agree that if the RNN is in single mode we should get back only one prediction. So I wonder if maybe there is a bug there; but unfortunately the internal implementation of RNN is not something I know closely, so I am not sure what is going on.
Somehow this issue must have slipped past me. I'll have to take a closer look at the code; in the meantime, would you mind sharing the Keras example code you used as a comparison?
Here it is. Thank you both for your help.
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# 15 sequences of 3 time steps with 1 feature each: [1 2 3], [4 5 6], ...
X = np.array([x + 1 for x in range(45)])
X = X.reshape(15, 3, 1)
#print(X)

# The target for each sequence is the sum of its 3 values.
Y = list()
for x in X:
    Y.append(x.sum())
Y = np.array(Y)
print(Y)

model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(3, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
history = model.fit(X, Y, epochs=500, validation_split=0.2, verbose=1)

# Predict the sum of an unseen sequence (the true sum is 153).
test_input = np.array([50, 51, 52])
test_input = test_input.reshape((1, 3, 1))
#print(test_input.shape)
test_output = model.predict(test_input, verbose=0)
print(test_output)
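Since the Keras script stores its data as (samples, time steps, features) while the mlpack layout discussed in this thread is 1 row by 15 columns by 3 slices (dimensions, samples, time steps), a small sketch of the axis reordering may help when comparing the two. It uses plain Python lists rather than NumPy or Armadillo, and `keras_to_slices` is a hypothetical helper named only for this illustration.

```python
def keras_to_slices(batch):
    """Convert batch[sample][step][feature] into cube[step][feature][sample]."""
    samples = len(batch)
    steps = len(batch[0])
    features = len(batch[0][0])
    return [
        [[batch[n][t][f] for n in range(samples)] for f in range(features)]
        for t in range(steps)
    ]


# Same data as the Keras script: 1..45 split into 15 sequences of 3 steps.
values = list(range(1, 46))
batch = [[[values[3 * n + t]] for t in range(3)] for n in range(15)]

cube = keras_to_slices(batch)
targets = [sum(step[0] for step in seq) for seq in batch]

print(len(cube), len(cube[0]), len(cube[0][0]))  # 3 slices, 1 row, 15 columns
print(cube[0][0][:3])                            # [1, 4, 7]
print(targets[0], targets[-1])                   # 6 132
```

Each slice of `cube` holds one time step for all 15 samples, and `targets` is what a single response slice would contain.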
Thanks for the code!
Just a quick update, I started to look into the issue, but it will probably be over the weekend before I can provide a solution here.
Hi @zoq, just wondering if you got a chance to look at the code. No pressure at all. I am just eager to try it, when you are done :-) Thank you so much again for your help.
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍