salesforce / pytorch-qrnn

PyTorch implementation of the Quasi-Recurrent Neural Network - up to 16 times faster than NVIDIA's cuDNN LSTM

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

pytorch-qrnn's Issues

Multi-GPU [Torch DataParallel]

Could you get it to work with torch.nn.DataParallel(model).cuda()? I could not, though perhaps I did not try hard enough. I can't tell whether it's a wrong-GPU problem or whether CuPy simply doesn't support it.

It runs pretty fast on a single GPU, though: a bit faster than a vanilla LSTM on 4 GPUs, but not by much without scaling to multiple GPUs...

Problem with QRNN num_layers=2, layers=None, and input_size != hidden_size

This code will not work:

import torch
from torchqrnn import QRNN

seq_len, batch_size, hidden_size = 7, 20, 256
size = (seq_len, batch_size, 32)
X = torch.autograd.Variable(torch.rand(size), requires_grad=True).cuda()
print(X.size())

qrnn = QRNN(32, hidden_size, num_layers=2, dropout=0.4)
qrnn.cuda()
output, hidden = qrnn(X)

print(output.size(), hidden.size())

I think the problem is caused by this line:

pytorch-qrnn/torchqrnn/qrnn.py

Line 142 in b646880

 self.layers = torch.nn.ModuleList(layers if layers else [QRNNLayer(input_size, hidden_size, **kwargs) for _ in range(num_layers)]) 

Aren't the initialization parameters supposed to be QRNNLayer(hidden_size, hidden_size, **kwargs) starting from the second layer?
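The sizing proposed in the question can be sketched without touching the library. The helper below, `layer_sizes`, is hypothetical (it is not part of torchqrnn); it only computes the (input, hidden) pair each stacked layer would need if the first layer consumes `input_size` and every later layer consumes the previous layer's `hidden_size` output.

```python
def layer_sizes(input_size, hidden_size, num_layers):
    """Return the (in_features, out_features) pair for each stacked layer.

    Hypothetical helper, not part of torchqrnn: only the first layer sees
    the raw input; later layers see the previous layer's hidden output.
    """
    return [
        (input_size if i == 0 else hidden_size, hidden_size)
        for i in range(num_layers)
    ]


# Sizes from the failing example above: 32 input features, 256 hidden units.
sizes = layer_sizes(32, 256, num_layers=2)
print(sizes)  # [(32, 256), (256, 256)]
```

Each pair could then be passed to a layer constructor in place of the single (input_size, hidden_size) pair used for every layer in the quoted line. Whether that matches the maintainers' intended fix is an assumption.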

Error when running your example and on AWD-LSTM-LM

I am using Python 3.5.2 and PyTorch 0.4.

>>> import torch
>>> from torchqrnn import QRNN
>>> 
>>> seq_len, batch_size, hidden_size = 7, 20, 256
>>> size = (seq_len, batch_size, hidden_size)
>>> X = torch.autograd.Variable(torch.rand(size), requires_grad=True).cuda()
>>> 
>>> qrnn = QRNN(hidden_size, hidden_size, num_layers=2, dropout=0.4)
>>> qrnn.cuda()
QRNN(
  (layers): ModuleList(
    (0): QRNNLayer(
      (linear): Linear(in_features=256, out_features=768, bias=True)
    )
    (1): QRNNLayer(
      (linear): Linear(in_features=256, out_features=768, bias=True)
    )
  )
)
>>> output, hidden = qrnn(X)
/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/functional.py:995: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/functional.py:1006: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 164, in forward
    input, hn = layer(input, None if hidden is None else hidden[i])
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 99, in forward
    C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 178, in forward
    if hidden_init is None: return GPUForgetMult()(f, x) if use_cuda else CPUForgetMult()(f, x)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 120, in forward
    self.compile()
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 102, in compile
    program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/compiler.py", line 52, in __init__
    include_names)
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/interface.py", line 200, in nvrtcCreateProgram
    c_char_p(encode_str(src)), c_char_p(encode_str(name)),
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/interface.py", line 54, in encode_str
    return s.encode("utf-8")
AttributeError: 'bytes' object has no attribute 'encode'

Also, QRNN fails on the AWD-LSTM-LM repo with a similar error:

  File "/home/oeganea/awd-lstm-lm/model.py", line 81, in forward
    raw_output, new_h = rnn(raw_output, hidden[l])
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 99, in forward
    C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 179, in forward
    return GPUForgetMult()(f, x, hidden_init) if use_cuda else CPUForgetMult()(f, x, hidden_init)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 120, in forward
    self.compile()
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 102, in compile
    program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/compiler.py", line 52, in __init__
    include_names)
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/interface.py", line 200, in nvrtcCreateProgram
    c_char_p(encode_str(src)), c_char_p(encode_str(name)),
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/interface.py", line 54, in encode_str
    return s.encode("utf-8")
AttributeError: 'bytes' object has no attribute 'encode'
Exception ignored in: <bound method Program.__del__ of <pynvrtc.compiler.Program object at 0x7f5aca839ef0>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/compiler.py", line 56, in __del__
    self._interface.nvrtcDestroyProgram(self._program)
AttributeError: 'Program' object has no attribute '_program'

Any ideas? Thanks!

Legacy autograd Runtime error

Hi all,

I am currently working on a small image-captioning project. I came across QRNN and thought of replacing the LSTM with it. Everything worked fine with the LSTM, albeit with longer training times, but as soon as I replace the LSTM with QRNN I get this error:

Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)

Even when running the sample code provided in this repo, I get the same error:
import torch
from torchqrnn import QRNN

seq_len, batch_size, hidden_size = 7, 20, 256
size = (seq_len, batch_size, hidden_size)
X = torch.autograd.Variable(torch.rand(size), requires_grad=True).cuda()

qrnn = QRNN(hidden_size, hidden_size, num_layers=2, dropout=0.4)
qrnn.cuda()
output, hidden = qrnn(X)

print(output.size(), hidden.size())

RuntimeError: Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)

Please tell me how to get rid of this error. Thanks

Support local development by removing dependency on PyCy when not used

Pull Request: #4

Package install ASCII error from long_description=open('README')?

When installing the repo in my Docker container, the install only succeeds if I remove the long_description=open('README.md').read() line from the setup.py script.
https://github.com/salesforce/pytorch-qrnn/blob/master/setup.py#L8

I'm not sure what I could change in the Docker image to fix this without forking the repo and removing that line. I'm running Python 3.5.

Have you gotten this error? I guess Python/ASCII reading problems will always be with us. Would love to find a workaround and not have to fork. Thanks!

Step 3/3 : RUN pip install cupy pynvrtc git+https://github.com/Smerity/pytorch-qrnn
 ---> Running in 25baaa71b04c
Collecting git+https://github.com/Smerity/pytorch-qrnn
  Cloning https://github.com/Smerity/pytorch-qrnn to /tmp/pip-a0ij_ogt-build
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-a0ij_ogt-build/setup.py", line 8, in <module>
        long_description=open('README.md').read(),
      File "/opt/conda/envs/pytorch-py35/lib/python3.5/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 5411: ordinal not in range(128)
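A minimal sketch of what the traceback suggests: `open()` without an explicit encoding falls back to the locale's default, and an ASCII default cannot decode a UTF-8 character in the README. The file written below is a stand-in created for the demonstration, not the repo's actual README.

```python
import io
import os
import tempfile

# Write a small file containing a non-ASCII character. U+2019 (right single
# quote) is encoded in UTF-8 as the bytes e2 80 99, so it starts with the
# same 0xe2 byte reported in the traceback above.
path = os.path.join(tempfile.mkdtemp(), "README.md")
with io.open(path, "w", encoding="utf-8") as fh:
    fh.write(u"QRNN\u2019s README")

# Reading with an ASCII codec (what a container with an unset or C locale
# may default to) fails on that byte.
try:
    with io.open(path, encoding="ascii") as fh:
        fh.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Passing the encoding explicitly does not depend on the locale.
with io.open(path, encoding="utf-8") as fh:
    long_description = fh.read()

print(ascii_failed, long_description)
```

On the packaging side this would mean reading the README with an explicit `encoding="utf-8"`. On the user side, setting a UTF-8 locale in the image (for example `LANG=C.UTF-8` on images that provide it) is a commonly used workaround that avoids a fork; whether it applies to this particular image is an assumption.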

Could you update the code to PyTorch 0.4?

Strings are encoded twice by both QRNN and Pynvrtc?

Usually the library runs fine (I often boot up fresh Ubuntu 16.04 cloud GPU instances on many providers), but on one occasion, on a SnarkAI instance in particular, I saw this error:

'bytes' object has no attribute 'encode'
caused by this line:

pytorch-qrnn/torchqrnn/forget_mult.py

Line 102 in daadb0f

program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())

which calls the Program constructor in pynvrtc.

Upon further inspection, it seems that pynvrtc also performs encode() of its own:

https://github.com/NVIDIA/pynvrtc/blob/fffa9f6f4a7ee1d452346cbdf68b84b5246ccffb/pynvrtc/interface.py#L200

That line goes through the encode_str function, which encodes the string into UTF-8 bytes when running on Python 3.

However, I'm running Python 3.6 on all machines...

Removing the encode() in QRNN seems to have made it work on that machine, but I have to wonder: (1) why did only that machine have the issue, and (2) would removing the .encode() in QRNN be all right in all other cases?

Edit: Apparently this is caused by this recent PR merge: NVIDIA/pynvrtc#2. Shouldn't QRNN be updated accordingly?
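The double encoding can be reproduced in isolation. In the sketch below, `encode_str` is a stand-in that copies only the Python 3 branch visible in the tracebacks (`return s.encode("utf-8")`), `as_text` is a hypothetical helper, and the kernel string is a placeholder rather than the real CUDA source.

```python
def encode_str(s):
    """Stand-in for the Python 3 branch of pynvrtc's encode_str shown in the
    tracebacks: it calls .encode unconditionally, so it expects str."""
    return s.encode("utf-8")


def as_text(s):
    """Hypothetical helper: return str whether given str or UTF-8 bytes."""
    return s.decode("utf-8") if isinstance(s, bytes) else s


kernel = 'extern "C" __global__ void noop() {}'  # placeholder source

# Pre-encoding on the caller's side reproduces the reported error.
try:
    encode_str(kernel.encode())
    error = None
except AttributeError as exc:
    error = str(exc)

# Passing text (or normalising to text first) works with this encode_str.
ok = encode_str(as_text(kernel.encode()))
print(error)
print(ok == kernel.encode("utf-8"))
```

This would also explain why only some machines fail: a fresh install picks up whichever pynvrtc version is current. A pynvrtc build from before that merge presumably still expects bytes, so dropping `.encode()` unconditionally may break older installs; that is an inference from the thread, not something verified here.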

Any updates on bidirectional QRNN?

RuntimeError: matrix and matrix expected

When I run python3 multigpu_dataparallel.py, the following error occurs. Can you please help? Thank you so much!

Traceback (most recent call last):
  File "multigpu_dataparallel.py", line 51, in <module>
    model(x)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "multigpu_dataparallel.py", line 23, in forward
    out, hidden = self.rnn(x)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 164, in forward
    input, hn = layer(input, None if hidden is None else hidden[i])
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 70, in forward
    Y = self.linear(source)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/linear.py", line 54, in forward
    return self._backend.Linear()(input, self.weight, self.bias)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/functions/linear.py", line 10, in forward
    output.addmm(0, 1, input, weight.t())
RuntimeError: matrix and matrix expected at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMathBlas.cu:237

[WinError 126] The specified module could not be found - Any idea of the error source?

Hi there,

I am trying to replace my LSTM architecture with your QRNN. Following the README, everything installed successfully on my machine. However, when running the example provided, I keep getting the error below. Any idea what causes it?

============================================================

OSError Traceback (most recent call last)
in
8 qrnn = QRNN(hidden_size, hidden_size, num_layers=2, dropout=0.4)
9 qrnn.cuda()
---> 10 output, hidden = qrnn(X)
11
12 print(output.size(), hidden.size())

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
545 result = self._slow_forward(*input, **kwargs)
546 else:
--> 547 result = self.forward(*input, **kwargs)
548 for hook in self._forward_hooks.values():
549 hook_result = hook(self, input, result)

~\Anaconda3\lib\site-packages\torchqrnn\qrnn.py in forward(self, input, hidden)
162
163 for i, layer in enumerate(self.layers):
--> 164 input, hn = layer(input, None if hidden is None else hidden[i])
165 next_hidden.append(hn)
166

~\Anaconda3\lib\site-packages\torchqrnn\qrnn.py in forward(self, X, hidden)
97 # Forget Mult
98 # For testing QRNN without ForgetMult CUDA kernel, C = Z * F may be useful
---> 99 C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
100
101 # Apply (potentially optional) output gate

~\Anaconda3\lib\site-packages\torchqrnn\forget_mult.py in forward(self, f, x, hidden_init, use_cuda)
176 ###
177 # Avoiding 'RuntimeError: expected a Variable argument, but got NoneType' when hidden_init is None
--> 178 if hidden_init is None: return GPUForgetMult()(f, x) if use_cuda else CPUForgetMult()(f, x)
179 return GPUForgetMult()(f, x, hidden_init) if use_cuda else CPUForgetMult()(f, x, hidden_init)
180

~\Anaconda3\lib\site-packages\torchqrnn\forget_mult.py in forward(self, f, x, hidden_init)
118
119 def forward(self, f, x, hidden_init=None):
--> 120 self.compile()
121 seq_size, batch_size, hidden_size = f.size()
122 result = f.new(seq_size + 1, batch_size, hidden_size)

~\Anaconda3\lib\site-packages\torchqrnn\forget_mult.py in compile(self)
100 def compile(self):
101 if self.ptx is None:
--> 102 program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())
103 GPUForgetMult.ptx = program.compile()
104

~\Anaconda3\lib\site-packages\pynvrtc\compiler.py in __init__(self, src, name, headers, include_names, lib_name)
47 headers=[], include_names=[],
48 lib_name=''):
---> 49 self._interface = NVRTCInterface(lib_name)
50 self._program = self._interface.nvrtcCreateProgram(src, name,
51 headers,

~\Anaconda3\lib\site-packages\pynvrtc\interface.py in __init__(self, lib_path)
85 def __init__(self, lib_path=''):
86 self._lib = None
---> 87 self._load_nvrtc_lib(lib_path)
88
89 def _load_nvrtc_lib(self, lib_path):

~\Anaconda3\lib\site-packages\pynvrtc\interface.py in _load_nvrtc_lib(self, lib_path)
107 name = lib_path
108
--> 109 self._lib = cdll.LoadLibrary(name)
110
111 self._lib.nvrtcCreateProgram.argtypes = [

~\Anaconda3\lib\ctypes\__init__.py in LoadLibrary(self, name)
424
425 def LoadLibrary(self, name):
--> 426 return self._dlltype(name)
427
428 cdll = LibraryLoader(CDLL)

~\Anaconda3\lib\ctypes\__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
346
347 if handle is None:
--> 348 self._handle = _dlopen(self._name, mode)
349 else:
350 self._handle = handle

OSError: [WinError 126] The specified module could not be found
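WinError 126 at `cdll.LoadLibrary(name)` means Windows could not locate the requested DLL or one of its dependencies. A first diagnostic step, sketched below, is to check whether any NVRTC DLL is reachable through PATH. The helper `find_dlls` is hypothetical, and the `nvrtc64_*.dll` file-name pattern is an assumption based on how CUDA Toolkit installs usually name the library; adjust it to the installed version.

```python
import fnmatch
import os


def find_dlls(pattern="nvrtc64_*.dll", directories=None):
    """List files matching `pattern` in the given directories.

    Hypothetical diagnostic helper. By default it scans the directories on
    PATH, which is one of the places Windows searches when loading a DLL.
    """
    if directories is None:
        directories = os.environ.get("PATH", "").split(os.pathsep)
    matches = []
    for directory in directories:
        if not directory or not os.path.isdir(directory):
            continue
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # unreadable directory: skip it rather than abort
        for name in names:
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                matches.append(os.path.join(directory, name))
    return matches


print(find_dlls() or "no NVRTC DLL found on PATH")
```

If nothing is found, the CUDA Toolkit's bin directory is probably missing from PATH (or the toolkit is not installed); if a DLL is found, the failure may instead come from one of its dependencies. Both readings are guesses from the traceback alone.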

Can QRNN be used in an online manner?

The input fed to QRNN must have dimensions [Batch, Time, Channels] or [Time, Batch, Channels]. Does that mean each sequence has to be fed into QRNN in full, both when training and when testing? I am trying to use it for an online task, but I suspect it may not work in an online manner.
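As far as the recurrence itself goes, nothing forces the whole sequence to be present at once. The sketch below uses plain Python scalars rather than the library, with the ForgetMult form quoted elsewhere in these issues, h_t = f_t * x_t + (1 - f_t) * h_{t-1}. It shows that feeding a sequence in chunks while carrying the last state forward gives the same values as one full pass. The gate and input values are made up for illustration.

```python
def forget_mult(f, x, h0=0.0):
    """Run h_t = f_t * x_t + (1 - f_t) * h_{t-1} over scalar sequences."""
    h, out = h0, []
    for f_t, x_t in zip(f, x):
        h = f_t * x_t + (1.0 - f_t) * h
        out.append(h)
    return out


f = [0.9, 0.2, 0.5, 0.7, 0.1, 0.6]
x = [1.0, -2.0, 0.5, 3.0, -1.0, 2.0]

# Offline: the whole sequence in one call.
full = forget_mult(f, x)

# Online: two timesteps at a time, carrying the last state forward.
streamed, h = [], 0.0
for start in range(0, len(x), 2):
    chunk = forget_mult(f[start:start + 2], x[start:start + 2], h0=h)
    streamed.extend(chunk)
    h = chunk[-1]

print(full == streamed)  # True
```

The tracebacks in these issues show the QRNN forward taking an optional `hidden` argument and returning the new hidden state, which is the hook such chunked use would rely on. A window larger than 1 would additionally need the preceding input(s) carried across chunks; that case is not covered by this sketch.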

AttributeError: 'bytes' object has no attribute 'encode'

AttributeError Traceback (most recent call last)

in
49 model.train()
50 optimizer.zero_grad()
---> 51 classes = model(data)
52 loss = focal_loss(classes, focal_label) + margin_loss(classes, margin_label)
53 loss.backward()

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

in forward(self, batch)
63 # x = torch.nn.utils.rnn.pack_padded_sequence(x, lengths)
64 # self.rnn.flatten_parameters()
---> 65 x, _ = self.rnn(x)
66 # x, _ = torch.nn.utils.rnn.pad_packed_sequence(x)
67 x = self.dropout(x)

~/.local/lib/python3.6/site-packages/torchqrnn/qrnn.py in forward(self, input, hidden)
162
163 for i, layer in enumerate(self.layers):
--> 164 input, hn = layer(input, None if hidden is None else hidden[i])
165 next_hidden.append(hn)
166

~/.local/lib/python3.6/site-packages/torchqrnn/qrnn.py in forward(self, X, hidden)
97 # Forget Mult
98 # For testing QRNN without ForgetMult CUDA kernel, C = Z * F may be useful
---> 99 C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
100
101 # Apply (potentially optional) output gate

~/.local/lib/python3.6/site-packages/torchqrnn/forget_mult.py in forward(self, f, x, hidden_init, use_cuda)
176 ###
177 # Avoiding 'RuntimeError: expected a Variable argument, but got NoneType' when hidden_init is None
--> 178 if hidden_init is None: return GPUForgetMult()(f, x) if use_cuda else CPUForgetMult()(f, x)
179 return GPUForgetMult()(f, x, hidden_init) if use_cuda else CPUForgetMult()(f, x, hidden_init)
180

~/.local/lib/python3.6/site-packages/torchqrnn/forget_mult.py in forward(self, f, x, hidden_init)
118
119 def forward(self, f, x, hidden_init=None):
--> 120 self.compile()
121 seq_size, batch_size, hidden_size = f.size()
122 result = f.new(seq_size + 1, batch_size, hidden_size)

~/.local/lib/python3.6/site-packages/torchqrnn/forget_mult.py in compile(self)
100 def compile(self):
101 if self.ptx is None:
--> 102 program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())
103 GPUForgetMult.ptx = program.compile()
104

~/.local/lib/python3.6/site-packages/pynvrtc/compiler.py in __init__(self, src, name, headers, include_names, lib_name)
50 self._program = self._interface.nvrtcCreateProgram(src, name,
51 headers,
---> 52 include_names)
53
54 def __del__(self):

~/.local/lib/python3.6/site-packages/pynvrtc/interface.py in nvrtcCreateProgram(self, src, name, headers, include_names)
198 include_names_array[:] = encode_str_list(include_names)
199 code = self._lib.nvrtcCreateProgram(byref(res),
--> 200 c_char_p(encode_str(src)), c_char_p(encode_str(name)),
201 len(headers),
202 headers_array, include_names_array)

~/.local/lib/python3.6/site-packages/pynvrtc/interface.py in encode_str(s)
52 if is_python2:
53 return s
---> 54 return s.encode("utf-8")
55
56

AttributeError: 'bytes' object has no attribute 'encode'

RuntimeError: size mismatch when using a window size of 2

Hi,

I use QRNN, and it works well with a window size of 1 (the default), but when I tried a window of size 2, it gave me the following error.

Could you please help me with it?

RuntimeError Traceback (most recent call last)

/deep_learning/rup/.../model_semi_parallel.py in forward(self, state)
--> 173 h_new, states = self.rnn(input_enc, states)

/opt/conda/lib/python3.5/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
475 result = self._slow_forward(*input, **kwargs)
476 else:
--> 477 result = self.forward(*input, **kwargs)
478 for hook in self._forward_hooks.values():
479 hook_result = hook(self, input, result)

/opt/conda/lib/python3.5/site-packages/torchqrnn/qrnn.py in forward(self, input, hidden)
162
163 for i, layer in enumerate(self.layers):
--> 164 input, hn = layer(input, None if hidden is None else hidden[i])
165 next_hidden.append(hn)
166

/opt/conda/lib/python3.5/site-packages/torchqrnn/qrnn.py in forward(self, X, hidden)
68
69 # Matrix multiplication for the three outputs: Z, F, O
---> 70 Y = self.linear(source)
71 # Convert the tensor back to (batch, seq_len, len([Z, F, O]) * hidden_size)
72 if self.output_gate:

/opt/conda/lib/python3.5/site-packages/torch/nn/modules/linear.py in forward(self, input)
53
54 def forward(self, input):
---> 55 return F.linear(input, self.weight, self.bias)
56
57 def extra_repr(self):

/opt/conda/lib/python3.5/site-packages/torch/nn/functional.py in linear(input, weight, bias)
1024 return torch.addmm(bias, input, weight.t())
1025
-> 1026 output = input.matmul(weight.t())
1027 if bias is not None:
1028 output += bias

RuntimeError: size mismatch, m1: [20 x 1920], m2: [640 x 1920] at /opt/conda/conda-bld/pytorch_1532576276790/work/aten/src/THC/generic/THCTensorMathBlas.cu:249

Is there TensorFlow code for QRNN?

Hi, I want to do more research on QRNN. Is there a TensorFlow package for QRNN?

ForgetMult equation in code is different from the paper

In this code, ForgetMult computes a simple recurrent equation:
h_t = f_t * x_t + (1 - f_t) * h_{t-1}
but in the paper it is
h_t = f_t * h_{t-1} + (1 - f_t) * x_t
Which one is correct?
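As recurrences, the two forms differ only in which gate value means "keep the old state": substituting 1 - f_t for f_t turns one into the other. The sketch below checks that numerically on made-up scalar values. It shows the algebraic equivalence only; which convention the trained gate in this repo follows is not something the sketch can settle.

```python
def code_form(f, x, h0=0.0):
    """h_t = f_t * x_t + (1 - f_t) * h_{t-1}  (as quoted from the code)."""
    h, out = h0, []
    for f_t, x_t in zip(f, x):
        h = f_t * x_t + (1.0 - f_t) * h
        out.append(h)
    return out


def paper_form(f, x, h0=0.0):
    """h_t = f_t * h_{t-1} + (1 - f_t) * x_t  (as quoted from the paper)."""
    h, out = h0, []
    for f_t, x_t in zip(f, x):
        h = f_t * h + (1.0 - f_t) * x_t
        out.append(h)
    return out


f = [0.25, 0.5, 0.75, 0.125]
x = [1.0, -2.0, 0.5, 3.0]

a = code_form(f, x)
b = paper_form([1.0 - f_t for f_t in f], x)  # complement the gate
same = all(abs(p - q) < 1e-12 for p, q in zip(a, b))

# With the same gate values the two forms do give different states.
different = code_form(f, x) != paper_form(f, x)
print(same, different)
```

So a model trained with either form can represent the same function; the gate simply learns the complementary value. Mixing weights trained under one convention with code using the other would not be interchangeable.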

Is there a sample dataset to demo the project on a seq2seq model?

Bad squeeze in CPUForgetMult

Hi,

It looks like I've encountered a small bug when batch_size=1 during CPU inference (I haven't checked on GPU yet). I've found that, during the forward pass in CPUForgetMult, a general squeeze over all dimensions is applied when appending each h to the resulting list of tensors, concretely:

result.append(h.squeeze())

It turns out the size of h at each iteration is (1, batch_size, feats), so when we squeeze with batch_size=1 the resulting tensor has size (feats,), and the final torch.stack(result) has size (seq_len, feats).
This causes an error when, in the QRNN forward pass, we do C[-1:, :, :] to access every sample along the batch dimension (i.e. dimension 1), which no longer exists because of the squeeze. We can simply restrict the squeeze to dimension 0 (in the batch_first=False layout, which is the only one available at the moment).
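The shape arithmetic in this report can be checked without PyTorch. The helper below, `squeezed_shape`, is hypothetical; it mimics the documented rule of `torch.squeeze` on plain shape tuples (no `dim`: drop every size-1 axis; with `dim`: drop only that axis, and only if its size is 1).

```python
def squeezed_shape(shape, dim=None):
    """Shape after a squeeze, following torch.squeeze's documented rule.

    Hypothetical helper for illustration: with no dim every size-1 axis is
    dropped; with a dim only that axis is dropped, and only if its size is 1.
    """
    if dim is None:
        return tuple(n for n in shape if n != 1)
    return tuple(n for i, n in enumerate(shape) if not (i == dim and n == 1))


feats = 256
for batch_size in (20, 1):
    h = (1, batch_size, feats)  # size of h at each step, per the report
    print(batch_size, squeezed_shape(h), squeezed_shape(h, dim=0))
```

With batch_size=20 both variants give (20, 256), which is why the bug stays hidden; with batch_size=1 the bare squeeze gives (256,) while the dim-0 squeeze keeps (1, 256), so the stacked result retains its batch axis.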

Bidirectional QRNN?

When is the bidirectional QRNN going to be ready? It says it would be available in the near future, but I guess I underestimated the span of time represented by 'near'. I'm wondering if it is being developed at all.

Error in executing QRNN

Hello,
I get an error when I run my code, which is modified from the example.

import time

import numpy as np

import torch
import torch.nn as nn

import torchqrnn.forget_mult
from torchqrnn import QRNN

class Model(nn.Module):

    def __init__(self, hidden_size=1024, layers=3, vocab=100):
        super(Model, self).__init__()

        self.embedding = nn.Embedding(vocab, hidden_size)

        self.rnn = QRNN(hidden_size, hidden_size, num_layers=layers)

    def forward(self, x):
        x = self.embedding(x)
        out, hidden = self.rnn(x)
        return out[:-1]

H = 256
SEQ = 100
BATCH = 64

H = 1024
SEQ = 500
BATCH = 128

LOOPS = 500

np.random.seed(42)
torch.manual_seed(42)
torch.cuda.manual_seed(42)

x = torch.autograd.Variable(torch.LongTensor(np.random.randint(0, 100, [BATCH, SEQ])))
x = x.cuda()

np.random.seed(42)
torch.manual_seed(42)
torch.cuda.manual_seed(42)

model = Model(H)
model = model.cuda()
model(x)

The error message is:

Traceback (most recent call last):
  File "train.py", line 42, in <module>
    model(x)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "train.py", line 22, in forward
    out, hidden = self.rnn(x)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torchqrnn/qrnn.py", line 160, in forward
    input, hn = layer(input, None if hidden is None else hidden[i])
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torchqrnn/qrnn.py", line 95, in forward
    C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torchqrnn/forget_mult.py", line 175, in forward
    if hidden_init is None: return GPUForgetMult()(f, x) if use_cuda else CPUForgetMult()(f, x)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torchqrnn/forget_mult.py", line 127, in forward
    self.forget_mult(grid=grid, block=(grid_hidden_size, 1), args=[result.data_ptr(), f.data_ptr(), x.data_ptr(), seq_size, batch_size, hidden_size], stream=self.stream)
  File "cupy/cuda/function.pyx", line 143, in cupy.cuda.function.Function.__call__
TypeError: 'float' object cannot be interpreted as an index

I am using CUDA 8.0 and Python 2.7. Thanks.
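One possible cause, offered as an assumption rather than a confirmed diagnosis: under Python 2.7 `math.ceil` returns a float, whereas under Python 3 it returns an int, and the traceback shows CuPy rejecting a float where it expects an integer. If the launch grid is computed with `math.ceil` (the computation is not visible in the traceback), an integer-only ceiling division would behave the same on both interpreters. The names `ceil_div` and `grid_hidden_size` and the 512 cap below are hypothetical.

```python
import math


def ceil_div(a, b):
    """Ceiling of a / b using integer arithmetic only."""
    return -(-a // b)


hidden_size = 1024                        # H from the script above
grid_hidden_size = min(hidden_size, 512)  # hypothetical block width

# Integer on both Python 2 and Python 3.
blocks = ceil_div(hidden_size, grid_hidden_size)

# Equivalent alternative: wrap math.ceil in int() explicitly.
also_int = int(math.ceil(hidden_size / float(grid_hidden_size)))

print(blocks, also_int, isinstance(blocks, int))
```

Running the library under Python 3 would sidestep the issue in the same way, if this reading of the traceback is right.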
