Git Product home page Git Product logo

pytorch-qrnn's Introduction

Quasi-Recurrent Neural Network (QRNN) for PyTorch

Updated to support multi-GPU environments via DataParallel - see the the multigpu_dataparallel.py example.

This repository contains a PyTorch implementation of Salesforce Research's Quasi-Recurrent Neural Networks paper.

The QRNN provides similar accuracy to the LSTM but can be betwen 2 and 17 times faster than the highly optimized NVIDIA cuDNN LSTM implementation depending on the use case.

To install, simply run:

pip install cupy pynvrtc git+https://github.com/salesforce/pytorch-qrnn

If you use this code or our results in your research, please cite:

@article{bradbury2016quasi,
  title={{Quasi-Recurrent Neural Networks}},
  author={Bradbury, James and Merity, Stephen and Xiong, Caiming and Socher, Richard},
  journal={International Conference on Learning Representations (ICLR 2017)},
  year={2017}
}

Software Requirements

This codebase requires Python 3, PyTorch, pynvrtc (NVIDIA's Python Bindings to NVRTC), and CuPy. While the codebase contains a CPU implementation of the QRNN, the GPU QRNN implementation is used by default if possible. Requirements are provided in requirements.txt.

Example Usage

We've updated the previously released Salesforce Research AWD-LSTM language modeling codebase to support use of the AWD-QRNN. With the same number of parameters as the LSTM and less well tuned hyper parameters, the QRNN model trains over twice as quickly and achieves nearly equivalent state-of-the-art language modeling results. For full details, refer to the AWD-LSTM-LM repository.

Usage

The QRNN API is meant to be drop-in compatible with the LSTM for many standard use cases. As such, the easiest thing to do is replace any GRU or LSTM module with the QRNN.

Note: bidirectional QRNN is not yet supported though will be in the near future.

import torch
from torchqrnn import QRNN

seq_len, batch_size, hidden_size = 7, 20, 256
size = (seq_len, batch_size, hidden_size)
X = torch.autograd.Variable(torch.rand(size), requires_grad=True).cuda()

qrnn = QRNN(hidden_size, hidden_size, num_layers=2, dropout=0.4)
qrnn.cuda()
output, hidden = qrnn(X)

print(output.size(), hidden.size())

The full documentation for the QRNN is listed below:

QRNN(input_size, hidden_size, num_layers, dropout=0):
    Applies a multiple layer Quasi-Recurrent Neural Network (QRNN) to an input sequence.

    Args:
        input_size: The number of expected features in the input x.
        hidden_size: The number of features in the hidden state h. If not specified, the input size is used.
        num_layers: The number of QRNN layers to produce.
        layers: List of preconstructed QRNN layers to use for the QRNN module (optional).
        save_prev_x: Whether to store previous inputs for use in future convolutional windows (i.e. for a continuing sequence such as in language modeling). If true, you must call reset to remove cached previous values of x. Default: False.
        window: Defines the size of the convolutional window (how many previous tokens to look when computing the QRNN values). Supports 1 and 2. Default: 1.
        zoneout: Whether to apply zoneout (i.e. failing to update elements in the hidden state) to the hidden state updates. Default: 0.
        output_gate: If True, performs QRNN-fo (applying an output gate to the output). If False, performs QRNN-f. Default: True.
        use_cuda: If True, uses fast custom CUDA kernel. If False, uses naive for loop. Default: True.

    Inputs: X, hidden
        - X (seq_len, batch, input_size): tensor containing the features of the input sequence.
        - hidden (layers, batch, hidden_size): tensor containing the initial hidden state for the QRNN.

    Outputs: output, h_n
        - output (seq_len, batch, hidden_size): tensor containing the output of the QRNN for each timestep.
        - h_n (layers, batch, hidden_size): tensor containing the hidden state for t=seq_len

The included QRNN layer supports convolutional windows of size 1 or 2 but will be extended in the future to support arbitrary convolutions.

If you are using convolutional windows of size 2 (i.e. looking at the inputs from two previous timesteps to compute the input) and want to run over a long sequence in batches, such as when using BPTT, you can set save_prev_x=True and call reset when you wish to reset the cached previous inputs.

If you want flexibility in the definition of each QRNN layer, you can construct individual QRNNLayer modules and pass them to the QRNN module using the layer argument.

Speed

Speeds are between 2 and 17 times faster than NVIDIA's cuDNN LSTM, with the difference as a result of varying batch size and sequence length. The largest gains are for small batch sizes or long sequence lengths, both highlighting the LSTMs parallelization difficulty due to forced sequentiality. For full information, refer to the Quasi-Recurrent Neural Networks paper.

Figure 4 from QRNN paper

Pictured above is Figure 4 from the QRNN paper:
Left: Training speed for two-layer 640-unit PTB LM on a batch of 20 examples of 105 timesteps. “RNN” and “softmax” include the forward and backward times, while “optimization overhead” includes gradient clipping, L2 regularization, and SGD computations.
Right: Inference speed advantage of a 320-unit QRNN layer alone over an equal-sized cuDNN LSTM layer for data with the given batch size and sequence length. Training results are similar.

Extending the QRNN speed advantage to other recurrent architectures with ForgetMult

The QRNN architecture's speed advantage comes from two primary sources: the ability to batch all computations into a few large matrix multiplications and the use of a fast element-wise recurrence function. This recurrence function, named ForgetMult, is general and can be used in other scenarios. The ForgetMult takes two arguments - the candidate input x and forget gates f - and computes h = f * x + (1 - f) * hm1 where hm1 is the previous hidden state output.

The QRNN class is a thin wrapper around this that performs the large matrix multiplications for the candidate x, the forget gates f, and the output gates o. Any other operation which requires recurrence and can have precomputed values for the candidate x and forget gates f can use this fast form of recurrence.

Example usage of the ForgetMult module: output = ForgetMult()(f, x, hidden).

    ForgetMult computes a simple recurrent equation:
    h_t = f_t * x_t + (1 - f_t) * h_{t-1}

    This equation is equivalent to dynamic weighted averaging.

    Inputs: X, hidden
        - X (seq_len, batch, input_size): tensor containing the features of the input sequence.
        - F (seq_len, batch, input_size): tensor containing the forget gate values, assumed in range [0, 1].
        - hidden_init (batch, input_size): tensor containing the initial hidden state for the recurrence (h_{t-1}).
        - cuda: If True, use the fast element-wise CUDA kernel for recurrence. If False, uses naive for loop. Default: True.

Want to help out?

First, thanks! :)

Open tasks that are interesting:

  • Modify the ForgetMult CUDA kernel to produce a BackwardForgetMult. This will enable a bidirectional QRNN. The input should be the same - f and x - but the kernel should walk backwards through the inputs.
  • Bidirectional QRNN support (requires the modification above)
  • Support PyTorch's PackedSequence such that variable length sequences are correctly masked
  • Show how to use the underlying fast recurrence operator ForgetMult in other generic ways

pytorch-qrnn's People

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

pytorch-qrnn's Issues

Strings are encoded twice by both QRNN and Pynvrtc?

Usually the library runs fine (I often boot up fresh cloud GPU, Ubuntu 16.04 instances on many providers), but, on one occasion - instance by SnarkAI in particular, I saw this error:

'bytes' object has no attribute 'encode'
caused by this line:

program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())

which calls upon the constructor in pynvrtc

Upon further inspection, it seems that pynvrtc also performs encode() of its own:

https://github.com/NVIDIA/pynvrtc/blob/fffa9f6f4a7ee1d452346cbdf68b84b5246ccffb/pynvrtc/interface.py#L200

Which calls upon the encode_str function - which encodes the string into utf-8 bytes if using python 3

However I'm also running Python 3.6 on all machines...

Removing the encode() in QRNN seems to have made it work on that machine for me - but I have to wonder 1. why only that machine had the issue and 2. would removing the .encode() in QRNN be alright for all other cases?

Edit: Apparently this is caused by this recent PR merge: NVIDIA/pynvrtc#2 Shouldn't QRNN be updated accordingly?

RuntimeError: matrix and matrix expected

When I run python3 multigpu_dataparallel.py, the following error occured. Can you please help? Thank you so much!

Traceback (most recent call last):
File "multigpu_dataparallel.py", line 51, in
model(x)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "multigpu_dataparallel.py", line 23, in forward
out, hidden = self.rnn(x)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 164, in forward
input, hn = layer(input, None if hidden is None else hidden[i])
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 70, in forward
Y = self.linear(source)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/linear.py", line 54, in forward
return self._backend.Linear()(input, self.weight, self.bias)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/functions/linear.py", line 10, in forward
output.addmm
(0, 1, input, weight.t())
RuntimeError: matrix and matrix expected at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensorMathBlas.cu:237

Error in executing QRNN

Hello,
I got error when I run my modified code from example.

import time

import numpy as np

import torch
import torch.nn as nn

import torchqrnn.forget_mult
from torchqrnn import QRNN

class Model(nn.Module):

    def __init__(self, hidden_size=1024, layers=3, vocab=100):
        super(Model, self).__init__()

        self.embedding = nn.Embedding(vocab, hidden_size)

        self.rnn = QRNN(hidden_size, hidden_size, num_layers=layers)

    def forward(self, x):
        x = self.embedding(x)
        out, hidden = self.rnn(x)
        return out[:-1]

H = 256
SEQ = 100
BATCH = 64

H = 1024
SEQ = 500
BATCH = 128

LOOPS = 500

np.random.seed(42)
torch.manual_seed(42)
torch.cuda.manual_seed(42)

x = torch.autograd.Variable(torch.LongTensor(np.random.randint(0, 100, [BATCH, SEQ])))
x = x.cuda()

np.random.seed(42)
torch.manual_seed(42)
torch.cuda.manual_seed(42)

model = Model(H)
model = model.cuda()
model(x)

The error message is:

Traceback (most recent call last):
  File "train.py", line 42, in <module>
    model(x)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "train.py", line 22, in forward
    out, hidden = self.rnn(x)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torchqrnn/qrnn.py", line 160, in forward
    input, hn = layer(input, None if hidden is None else hidden[i])
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torchqrnn/qrnn.py", line 95, in forward
    C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torchqrnn/forget_mult.py", line 175, in forward
    if hidden_init is None: return GPUForgetMult()(f, x) if use_cuda else CPUForgetMult()(f, x)
  File "/home/michael/.virtualenvs/fyp-pytorch/local/lib/python2.7/site-packages/torchqrnn/forget_mult.py", line 127, in forward
    self.forget_mult(grid=grid, block=(grid_hidden_size, 1), args=[result.data_ptr(), f.data_ptr(), x.data_ptr(), seq_size, batch_size, hidden_size], stream=self.stream)
  File "cupy/cuda/function.pyx", line 143, in cupy.cuda.function.Function.__call__
TypeError: 'float' object cannot be interpreted as an index

I am using CUDA 8.0 and python 2.7. Thanks.

[WinError 126] The specified module could not be found - Any idea of the error source?

Hi there,

I am trying to replace my LSTM architecture with your interesting QRNN. Following your readme file, everything is installed successfully on my machine. However, while running the example provided, I keep getting this issue. Any idea of the reason?

============================================================

OSError Traceback (most recent call last)
in
8 qrnn = QRNN(hidden_size, hidden_size, num_layers=2, dropout=0.4)
9 qrnn.cuda()
---> 10 output, hidden = qrnn(X)
11
12 print(output.size(), hidden.size())

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in call(self, *input, **kwargs)
545 result = self._slow_forward(*input, **kwargs)
546 else:
--> 547 result = self.forward(*input, **kwargs)
548 for hook in self._forward_hooks.values():
549 hook_result = hook(self, input, result)

~\Anaconda3\lib\site-packages\torchqrnn\qrnn.py in forward(self, input, hidden)
162
163 for i, layer in enumerate(self.layers):
--> 164 input, hn = layer(input, None if hidden is None else hidden[i])
165 next_hidden.append(hn)
166

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in call(self, *input, **kwargs)
545 result = self._slow_forward(*input, **kwargs)
546 else:
--> 547 result = self.forward(*input, **kwargs)
548 for hook in self._forward_hooks.values():
549 hook_result = hook(self, input, result)

~\Anaconda3\lib\site-packages\torchqrnn\qrnn.py in forward(self, X, hidden)
97 # Forget Mult
98 # For testing QRNN without ForgetMult CUDA kernel, C = Z * F may be useful
---> 99 C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
100
101 # Apply (potentially optional) output gate

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in call(self, *input, **kwargs)
545 result = self._slow_forward(*input, **kwargs)
546 else:
--> 547 result = self.forward(*input, **kwargs)
548 for hook in self._forward_hooks.values():
549 hook_result = hook(self, input, result)

~\Anaconda3\lib\site-packages\torchqrnn\forget_mult.py in forward(self, f, x, hidden_init, use_cuda)
176 ###
177 # Avoiding 'RuntimeError: expected a Variable argument, but got NoneType' when hidden_init is None
--> 178 if hidden_init is None: return GPUForgetMult()(f, x) if use_cuda else CPUForgetMult()(f, x)
179 return GPUForgetMult()(f, x, hidden_init) if use_cuda else CPUForgetMult()(f, x, hidden_init)
180

~\Anaconda3\lib\site-packages\torchqrnn\forget_mult.py in forward(self, f, x, hidden_init)
118
119 def forward(self, f, x, hidden_init=None):
--> 120 self.compile()
121 seq_size, batch_size, hidden_size = f.size()
122 result = f.new(seq_size + 1, batch_size, hidden_size)

~\Anaconda3\lib\site-packages\torchqrnn\forget_mult.py in compile(self)
100 def compile(self):
101 if self.ptx is None:
--> 102 program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())
103 GPUForgetMult.ptx = program.compile()
104

~\Anaconda3\lib\site-packages\pynvrtc\compiler.py in init(self, src, name, headers, include_names, lib_name)
47 headers=[], include_names=[],
48 lib_name=''):
---> 49 self._interface = NVRTCInterface(lib_name)
50 self._program = self._interface.nvrtcCreateProgram(src, name,
51 headers,

~\Anaconda3\lib\site-packages\pynvrtc\interface.py in init(self, lib_path)
85 def init(self, lib_path=''):
86 self._lib = None
---> 87 self._load_nvrtc_lib(lib_path)
88
89 def _load_nvrtc_lib(self, lib_path):

~\Anaconda3\lib\site-packages\pynvrtc\interface.py in _load_nvrtc_lib(self, lib_path)
107 name = lib_path
108
--> 109 self._lib = cdll.LoadLibrary(name)
110
111 self._lib.nvrtcCreateProgram.argtypes = [

~\Anaconda3\lib\ctypes_init_.py in LoadLibrary(self, name)
424
425 def LoadLibrary(self, name):
--> 426 return self._dlltype(name)
427
428 cdll = LibraryLoader(CDLL)

~\Anaconda3\lib\ctypes_init_.py in init(self, name, mode, handle, use_errno, use_last_error)
346
347 if handle is None:
--> 348 self._handle = _dlopen(self._name, mode)
349 else:
350 self._handle = handle

OSError: [WinError 126] The specified module could not be found

AttributeError: 'bytes' object has no attribute 'encode'

AttributeError Traceback (most recent call last)

in
49 model.train()
50 optimizer.zero_grad()
---> 51 classes = model(data)
52 loss = focal_loss(classes, focal_label) + margin_loss(classes, margin_label)
53 loss.backward()

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

in forward(self, batch)
63 # x = torch.nn.utils.rnn.pack_padded_sequence(x, lengths)
64 # self.rnn.flatten_parameters()
---> 65 x, _ = self.rnn(x)
66 # x, _ = torch.nn.utils.rnn.pad_packed_sequence(x)
67 x = self.dropout(x)

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

~/.local/lib/python3.6/site-packages/torchqrnn/qrnn.py in forward(self, input, hidden)
162
163 for i, layer in enumerate(self.layers):
--> 164 input, hn = layer(input, None if hidden is None else hidden[i])
165 next_hidden.append(hn)
166

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

~/.local/lib/python3.6/site-packages/torchqrnn/qrnn.py in forward(self, X, hidden)
97 # Forget Mult
98 # For testing QRNN without ForgetMult CUDA kernel, C = Z * F may be useful
---> 99 C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
100
101 # Apply (potentially optional) output gate

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

~/.local/lib/python3.6/site-packages/torchqrnn/forget_mult.py in forward(self, f, x, hidden_init, use_cuda)
176 ###
177 # Avoiding 'RuntimeError: expected a Variable argument, but got NoneType' when hidden_init is None
--> 178 if hidden_init is None: return GPUForgetMult()(f, x) if use_cuda else CPUForgetMult()(f, x)
179 return GPUForgetMult()(f, x, hidden_init) if use_cuda else CPUForgetMult()(f, x, hidden_init)
180

~/.local/lib/python3.6/site-packages/torchqrnn/forget_mult.py in forward(self, f, x, hidden_init)
118
119 def forward(self, f, x, hidden_init=None):
--> 120 self.compile()
121 seq_size, batch_size, hidden_size = f.size()
122 result = f.new(seq_size + 1, batch_size, hidden_size)

~/.local/lib/python3.6/site-packages/torchqrnn/forget_mult.py in compile(self)
100 def compile(self):
101 if self.ptx is None:
--> 102 program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())
103 GPUForgetMult.ptx = program.compile()
104

~/.local/lib/python3.6/site-packages/pynvrtc/compiler.py in init(self, src, name, headers, include_names, lib_name)
50 self._program = self._interface.nvrtcCreateProgram(src, name,
51 headers,
---> 52 include_names)
53
54 def del(self):

~/.local/lib/python3.6/site-packages/pynvrtc/interface.py in nvrtcCreateProgram(self, src, name, headers, include_names)
198 include_names_array[:] = encode_str_list(include_names)
199 code = self._lib.nvrtcCreateProgram(byref(res),
--> 200 c_char_p(encode_str(src)), c_char_p(encode_str(name)),
201 len(headers),
202 headers_array, include_names_array)

~/.local/lib/python3.6/site-packages/pynvrtc/interface.py in encode_str(s)
52 if is_python2:
53 return s
---> 54 return s.encode("utf-8")
55
56

AttributeError: 'bytes' object has no attribute 'encode'

Problem with QRNN num_layers=2, layers=None, and input_size != hidden_size

This code will not work:

import torch
from torchqrnn import QRNN

seq_len, batch_size, hidden_size = 7, 20, 256
size = (seq_len, batch_size, 32)
X = torch.autograd.Variable(torch.rand(size), requires_grad=True).cuda()
print(X.size())

qrnn = QRNN(32, hidden_size, num_layers=2, dropout=0.4)
qrnn.cuda()
output, hidden = qrnn(X)

print(output.size(), hidden.size())

I think the problem is caused by this line:

self.layers = torch.nn.ModuleList(layers if layers else [QRNNLayer(input_size, hidden_size, **kwargs) for _ in range(num_layers)])

Aren't the initialization parameters supposed to be QRNNLayer(hidden_size, hidden_size, **kwargs) starting from the second layer?

Error when running your example and on AWD-LSTM-LM

I have python 3.5.2, pytorch 0.4.

>>> import torch
>>> from torchqrnn import QRNN
>>> 
>>> seq_len, batch_size, hidden_size = 7, 20, 256
>>> size = (seq_len, batch_size, hidden_size)
>>> X = torch.autograd.Variable(torch.rand(size), requires_grad=True).cuda()
>>> 
>>> qrnn = QRNN(hidden_size, hidden_size, num_layers=2, dropout=0.4)
>>> qrnn.cuda()
QRNN(
  (layers): ModuleList(
    (0): QRNNLayer(
      (linear): Linear(in_features=256, out_features=768, bias=True)
    )
    (1): QRNNLayer(
      (linear): Linear(in_features=256, out_features=768, bias=True)
    )
  )
)
>>> output, hidden = qrnn(X)
/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/functional.py:995: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/functional.py:1006: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 164, in forward
    input, hn = layer(input, None if hidden is None else hidden[i])
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 99, in forward
    C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 178, in forward
    if hidden_init is None: return GPUForgetMult()(f, x) if use_cuda else CPUForgetMult()(f, x)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 120, in forward
    self.compile()
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 102, in compile
    program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/compiler.py", line 52, in __init__
    include_names)
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/interface.py", line 200, in nvrtcCreateProgram
    c_char_p(encode_str(src)), c_char_p(encode_str(name)),
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/interface.py", line 54, in encode_str
    return s.encode("utf-8")
AttributeError: 'bytes' object has no attribute 'encode'

Also, QRNN fails on the AWD-LSTM-LM repo with a similar error:

  File "/home/oeganea/awd-lstm-lm/model.py", line 81, in forward
    raw_output, new_h = rnn(raw_output, hidden[l])
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/qrnn.py", line 99, in forward
    C = ForgetMult()(F, Z, hidden, use_cuda=self.use_cuda)
  File "/home/oeganea/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 179, in forward
    return GPUForgetMult()(f, x, hidden_init) if use_cuda else CPUForgetMult()(f, x, hidden_init)
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 120, in forward
    self.compile()
  File "/usr/local/lib/python3.5/dist-packages/torchqrnn/forget_mult.py", line 102, in compile
    program = Program(kernel.encode(), 'recurrent_forget_mult.cu'.encode())
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/compiler.py", line 52, in __init__
    include_names)
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/interface.py", line 200, in nvrtcCreateProgram
    c_char_p(encode_str(src)), c_char_p(encode_str(name)),
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/interface.py", line 54, in encode_str
    return s.encode("utf-8")
AttributeError: 'bytes' object has no attribute 'encode'
Exception ignored in: <bound method Program.__del__ of <pynvrtc.compiler.Program object at 0x7f5aca839ef0>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pynvrtc/compiler.py", line 56, in __del__
    self._interface.nvrtcDestroyProgram(self._program)
AttributeError: 'Program' object has no attribute '_program'

Any ideas ? Thanks!

RuntimeError: size mismatch if use the window size of 2

Hi,

I use QRNN, and it works well with the window size of 1 (default), but when I tried the window if size 2, it gave me the following error?

Could you please help me with it?

RuntimeErrorTraceback (most recent call last)

/deep_learning/rup/.../model_semi_parallel.py in forward(self, state)
--> 173 h_new, states = self.rnn(input_enc, states)

/opt/conda/lib/python3.5/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
475 result = self._slow_forward(*input, **kwargs)
476 else:
--> 477 result = self.forward(*input, **kwargs)
478 for hook in self._forward_hooks.values():
479 hook_result = hook(self, input, result)

/opt/conda/lib/python3.5/site-packages/torchqrnn/qrnn.py in forward(self, input, hidden)
162
163 for i, layer in enumerate(self.layers):
--> 164 input, hn = layer(input, None if hidden is None else hidden[i])
165 next_hidden.append(hn)
166

/opt/conda/lib/python3.5/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
475 result = self._slow_forward(*input, **kwargs)
476 else:
--> 477 result = self.forward(*input, **kwargs)
478 for hook in self._forward_hooks.values():
479 hook_result = hook(self, input, result)

/opt/conda/lib/python3.5/site-packages/torchqrnn/qrnn.py in forward(self, X, hidden)
68
69 # Matrix multiplication for the three outputs: Z, F, O
---> 70 Y = self.linear(source)
71 # Convert the tensor back to (batch, seq_len, len([Z, F, O]) * hidden_size)
72 if self.output_gate:

/opt/conda/lib/python3.5/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
475 result = self._slow_forward(*input, **kwargs)
476 else:
--> 477 result = self.forward(*input, **kwargs)
478 for hook in self._forward_hooks.values():
479 hook_result = hook(self, input, result)

/opt/conda/lib/python3.5/site-packages/torch/nn/modules/linear.py in forward(self, input)
53
54 def forward(self, input):
---> 55 return F.linear(input, self.weight, self.bias)
56
57 def extra_repr(self):

/opt/conda/lib/python3.5/site-packages/torch/nn/functional.py in linear(input, weight, bias)
1024 return torch.addmm(bias, input, weight.t())
1025
-> 1026 output = input.matmul(weight.t())
1027 if bias is not None:
1028 output += bias

RuntimeError: size mismatch, m1: [20 x 1920], m2: [640 x 1920] at /opt/conda/conda-bld/pytorch_1532576276790/work/aten/src/THC/generic/THCTensorMathBlas.cu:249

Bad squeeze in CPUForgetMult

Hi,

It looks like I've encountered a lil bug when batch_size=1 at CPU inference ( haven't checked on GPU yet ). I've found that, whilst forwarding in CPUForgetMult, there is a general squeeze for all dimensions when appending each h to the resulting list of tensors, concretely:

result.append(h.squeeze())

It turns out the size of h at each iteration is (1, batch_size, feats), so when we squeeze with batch_size=1 the resulting tensor is of size (feats,), resulting in a final stack torch.stack(result) of size (seq_len, feats).
This will cause an error when, in QRNN forward, we do C[-1:, :, :] trying to access every sample in batch dimension (i.e. 1) which does not exist because of the squeeze. We can just specify the specific squeeze dimension to be 0 (in batch_first=False option, which is the only one available atm).

Legacy autograd Runtime error

Hi all,

I am currently doing a small project on image captioning. I came across QRNN and thought of replacing LSTM with QRNN. Everything was working fine with LSTM with longer training times but as soon as I replace LSTM with QRNN, I am getting this error.

Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)

Even on running the sample code provided on this repo I am getting the same above error.
import torch
from torchqrnn import QRNN

seq_len, batch_size, hidden_size = 7, 20, 256
size = (seq_len, batch_size, hidden_size)
X = torch.autograd.Variable(torch.rand(size), requires_grad=True).cuda()

qrnn = QRNN(hidden_size, hidden_size, num_layers=2, dropout=0.4)
qrnn.cuda()
output, hidden = qrnn(X)

print(output.size(), hidden.size())

RUntime error:Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)

Please tell me how to get rid of this error. Thanks

Bidirectional QRNN?

When is the bidirectional QRNN going to be ready? It says it would be available in the near future, but I guess I underestimated the span of time represented by 'near'. I'm wondering if it is being developed at all.

Package install ASCII error from long_description=open('README')?

When pulling the repo in my Docker container, I'm able to run only if I remove the long_description=open('README.md').read() line from the setup.py script.
https://github.com/salesforce/pytorch-qrnn/blob/master/setup.py#L8

Not sure what I could do in the Docker to fix this, without making a fork and removing that line. Running in Python 3.5.

Have you gotten this error? I guess Python/ASCII reading problems will always be with us. Would love to find a workaround and not have to fork. Thanks!

Step 3/3 : RUN pip install cupy pynvrtc git+https://github.com/Smerity/pytorch-qrnn
 ---> Running in 25baaa71b04c
Collecting git+https://github.com/Smerity/pytorch-qrnn
  Cloning https://github.com/Smerity/pytorch-qrnn to /tmp/pip-a0ij_ogt-build
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-a0ij_ogt-build/setup.py", line 8, in <module>
        long_description=open('README.md').read(),
      File "/opt/conda/envs/pytorch-py35/lib/python3.5/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 5411: ordinal not in range(128)

Multi-GPU [Torch DataParallel]

Could you guys get it to work with torch.nn.DataParallel(model).cuda()? I could not, but perhaps did not try hard enough. Can't tell if it's a wrong-GPU problem, or CuPy won't support it.

Runs pretty fast on 1x GPUs though. A bit faster than 4x GPUs for vanilla LSTM, but not by much without scaling to multiple GPUs...

Can QRNN be used in a online manner?

The dimension of the input fed to QRNN should be [Batch, Time, Channels] or [Time, Batch, Channels]. So all of each sequence should be fed into QRNN when training or testing? I am trying using it for a online task, but I think it may not be able to work in a online manner?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.