coreylynch / grid-lstm
Torch7 implementation of Grid LSTM as described here: http://arxiv.org/pdf/1507.01526v2.pdf
This error keeps recurring, but at seemingly random intervals. It also prevents sampling from, or resuming training from, the affected checkpoint.
Here's the traceback:
saving checkpoint to cv/lm_lstm_epoch0.10_1.2833.t7
/home/____/torch/install/bin/luajit: /home/____/torch/install/share/lua/5.1/torch/File.lua:210: write error: wrote 39776036 blocks instead of 41251350 at /tmp/luarocks_torch-scm-1-1419/torch7/lib/TH/THDiskFile.c:340
stack traceback:
[C]: in function 'write'
/home/____/torch/install/share/lua/5.1/torch/File.lua:210: in function </home/____/torch/install/share/lua/5.1/torch/File.lua:107>
[C]: in function 'write'
/home/____/torch/install/share/lua/5.1/torch/File.lua:210: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/nn/Module.lua:154: in function 'write'
/home/____/torch/install/share/lua/5.1/torch/File.lua:210: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/torch/File.lua:220: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/torch/File.lua:228: in function 'writeObject'
...
/home/____/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/nn/Module.lua:154: in function 'write'
/home/____/torch/install/share/lua/5.1/torch/File.lua:210: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/torch/File.lua:235: in function 'writeObject'
/home/____/torch/install/share/lua/5.1/torch/File.lua:388: in function 'save'
train.lua:420: in main chunk
[C]: in function 'dofile'
...than/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
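A short write like the one in this traceback often points to a full disk or a quota, although the traceback alone does not confirm that. Whatever the cause, one way to keep a failed save from leaving behind an unusable checkpoint is to write to a temporary file and rename it only on success. The following is a minimal sketch of that idea, not code from train.lua: `safe_save` and `writer` are hypothetical names, and the demo writer is a stand-in so the sketch runs without Torch.

```lua
-- Sketch: write a checkpoint to a temporary file first, then rename it into
-- place, so a failed or partial write never replaces or creates `path`.
-- `safe_save` and `writer` are hypothetical names, not part of train.lua.
local function safe_save(path, writer)
  local tmp = path .. ".tmp"
  -- pcall catches errors raised by the writer (such as a "write error")
  local ok, err = pcall(writer, tmp)
  if not ok then
    os.remove(tmp)              -- discard the partial file, if any
    return false, tostring(err)
  end
  local renamed, rename_err = os.rename(tmp, path)
  if not renamed then
    os.remove(tmp)
    return false, rename_err
  end
  return true
end

-- Stand-in writer so the sketch runs without Torch. In train.lua the writer
-- could instead be: function(p) torch.save(p, checkpoint) end
local function demo_writer(p)
  local f = assert(io.open(p, "wb"))
  f:write("checkpoint bytes")
  f:close()
end

local ok = safe_save("demo_checkpoint.t7", demo_writer)
print(ok and "checkpoint saved" or "checkpoint skipped")
```

With this pattern a failed save is reported and skipped, and training can continue to the next checkpoint instead of aborting.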
Hi,
Training runs fine, but sampling gives the following error:
th ./sample.lua cv/lm_lstm_epoch7.01_1.3587.t7
nngraph/gmodule.lua:362: Got 3 inputs instead of 5
stack traceback:
[C]: in function 'error'
/home/user/torch/install/share/lua/5.1/nngraph/gmodule.lua:362: in function 'forward'
./sample.lua:173: in main chunk
[C]: in function 'dofile'
/home/user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406670
char-rnn on the same machine does not give this error.
I updated the nngraph package:
luarocks install nngraph
The issue remains. Thank you for looking into it and for sharing.
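One possible reading of "Got 3 inputs instead of 5" is sketched below. It rests on assumptions that this thread does not confirm: that the checkpoint has 2 layers, that the network takes one character input plus a cell and a hidden state per layer, and that the sampling script only adds the second state when the model name is exactly `lstm`. The function names are hypothetical.

```lua
-- Sketch: count the inputs passed to forward() at each sampling step:
-- one character input plus the recurrent state tensors for every layer.
-- Assumption: 'lstm' and 'grid_lstm' carry two state tensors per layer
-- (cell and hidden); any other model carries one.
local function states_per_layer(model)
  if model == "lstm" or model == "grid_lstm" then return 2 end
  return 1
end

local function expected_inputs(model, num_layers)
  return 1 + num_layers * states_per_layer(model)
end

-- A check that only recognises "lstm" treats grid_lstm as a one-state model.
local function inputs_with_lstm_only_check(model, num_layers)
  local per_layer = (model == "lstm") and 2 or 1
  return 1 + num_layers * per_layer
end

print(expected_inputs("grid_lstm", 2))              -- 5
print(inputs_with_lstm_only_check("grid_lstm", 2))  -- 3
```

If that is what happens, the sampling script would need to build two state tensors per layer for `grid_lstm` as well.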
Hi:
I'm new to Torch/Lua, but I've run karpathy's original on a GPU with no problems. I've installed the latest cudnn 7.5, whose library is libcudnn.so.5, but when I test your version I get the following error, which references libcudnn R4:
$ th train.lua -model grid_lstm
/home/aaron/torch/install/bin/luajit: /home/aaron/torch/install/share/lua/5.1/trepl/init.lua:384: /home/aaron/torch/install/share/lua/5.1/trepl/init.lua:384: /home/aaron/torch/install/share/lua/5.1/cudnn/ffi.lua:1278: 'libcudnn (R4) not found in library path.
Please install CuDNN from https://developer.nvidia.com/cuDNN
My paths are:
$ export CPATH=/usr/local/cuda/include:$CPATH
$ export LIBRARY_PATH=/usr/local/cuda/lib64:$LIBRARY_PATH
$ export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
Is there a change I can make to the code to fix this problem?
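The message mentions R4, which suggests that the installed Torch cudnn bindings look for libcudnn.so.4 while only libcudnn.so.5 is present. That is a reading of the error text, not something confirmed in this thread. A quick way to see which versions are actually visible on the library path is sketched below; `find_library` is a hypothetical diagnostic helper, not part of Torch or cudnn.

```lua
-- Sketch: list which directories in a colon-separated search path contain a
-- given library file. `find_library` is a hypothetical helper for diagnosis.
local function find_library(search_path, filename)
  local hits = {}
  for dir in string.gmatch(search_path or "", "[^:]+") do
    local candidate = dir .. "/" .. filename
    local f = io.open(candidate, "rb")
    if f then
      f:close()
      hits[#hits + 1] = candidate
    end
  end
  return hits
end

local path = os.getenv("LD_LIBRARY_PATH")
for _, name in ipairs({ "libcudnn.so.4", "libcudnn.so.5" }) do
  local hits = find_library(path, name)
  print(name, #hits > 0 and table.concat(hits, ", ") or "not found")
end
```

If only libcudnn.so.5 is found, the mismatch is between the bindings and the installed library rather than in the grid-lstm code.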
Thanks,
th train.lua -model grid.lstm
using CUDA on GPU 0...
loading data files...
cutting off end of data so that the batches/sequences divide evenly
reshaping tensor...
data load done. Number of data batches in train: 423, val: 23, test: 0
vocab size: 65
creating an grid.lstm with 2 layers
number of parameters in the model: 0
cloning criterion
/usr/local/torch/install/bin/luajit: train.lua:315: attempt to index field 'rnn' (a nil value)
stack traceback:
train.lua:315: in function 'opfunc'
/usr/local/torch/install/share/lua/5.1/optim/adam.lua:33: in function 'adam'
train.lua:381: in main chunk
[C]: in function 'dofile'
...ocal/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405ea0
Hi, I set everything up as described in char-rnn's README and can train char-rnn on my computer (so I assume the environment is configured properly), but training grid-lstm fails with the error messages above.
Could you kindly tell me what's wrong with it?
Thanks!
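The log above shows the model name as `grid.lstm`, whereas the other report in this thread runs `-model grid_lstm`. An unrecognised name would be consistent with both "number of parameters in the model: 0" and the nil `rnn` field, though that is an inference from the log. A minimal sketch of validating the name up front follows; the list of accepted names and the helper `check_model_name` are assumptions for illustration, not taken from train.lua.

```lua
-- Sketch: reject an unknown -model value before any network is built.
-- Assumption: the accepted names below are illustrative, not read from train.lua.
local known_models = { lstm = true, grid_lstm = true, gru = true, rnn = true }

-- Hypothetical helper: returns true, or false plus a readable message.
local function check_model_name(name)
  if known_models[name] then return true end
  local names = {}
  for k in pairs(known_models) do names[#names + 1] = k end
  table.sort(names)
  return false, string.format("unknown -model '%s' (expected one of: %s)",
                              tostring(name), table.concat(names, ", "))
end

print(check_model_name("grid_lstm"))  -- true
print(check_model_name("grid.lstm"))  -- false, followed by the message
```

Failing early with a message like this is easier to diagnose than a nil index deep inside the training loop.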
Hi,
I'm just wondering why you use this form in https://github.com/coreylynch/grid-lstm/blob/master/model/GridLSTM.lua#L31
local next_h = nn.CMulTable()({out_gate, nn.Tanh()(next_c)})
In the paper it is:
local next_h = nn.Tanh()(nn.CMulTable()({out_gate, next_c}))
Thank you for your response
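The two forms are not equivalent. The first, gate times tanh of the cell, is the form used in the standard LSTM. The second applies tanh after the gating. The scalar sketch below only illustrates the numerical difference and does not explain why the repository chose one over the other; the input values are arbitrary, and tanh is defined locally because `math.tanh` is not available in every Lua version.

```lua
-- Sketch: compare the two ways of computing next_h for one scalar unit.
local function tanh(x)
  local e = math.exp(2 * x)
  return (e - 1) / (e + 1)
end

-- Arbitrary illustrative values for the output gate and the new cell state.
local out_gate, next_c = 0.5, 2.0

local h_repo  = out_gate * tanh(next_c)   -- form used in GridLSTM.lua
local h_paper = tanh(out_gate * next_c)   -- form quoted from the paper

print(string.format("gate * tanh(c) = %.4f", h_repo))   -- 0.4820
print(string.format("tanh(gate * c) = %.4f", h_paper))  -- 0.7616
```

In the first form the magnitude of the output can never exceed the gate value, while in the second it is bounded only by 1.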
I'd like to sample from the checkpoints to see whether the network has learned longer-range dependencies or shows more creativity than the vanilla LSTM, but there's no sample.lua file in this repository, and the default sample.lua can't handle the grid-lstm model.