
Comments (9)

pkyoung commented on July 19, 2024

I managed to execute train.py, but I have not yet confirmed whether training was successful.
The quick (and dirty) remedy for the above error was:

diff --git a/speech/models/seq2seq.py b/speech/models/seq2seq.py
index b2881e3..65e3a38 100644
--- a/speech/models/seq2seq.py
+++ b/speech/models/seq2seq.py
@@ -87,7 +87,7 @@ class Seq2Seq(model.Model):

         hx = torch.zeros((x.shape[0], x.shape[2]), requires_grad=False)
         if self.is_cuda:
-            hx.cuda()
+            hx = hx.cuda()
         ax = None; sx = None;
         for t in range(y.size()[1] - 1):
             sample = (out and self.scheduled_sampling)
@@ -119,7 +119,7 @@ class Seq2Seq(model.Model):
         if state is None:
             hx = torch.zeros((x.shape[0], x.shape[2]), requires_grad=False)
             if self.is_cuda:
-                hx.cuda()
+                hx = hx.cuda()
             ax = None; sx = None;
         else:
             hx, ax, sx = state
@@ -164,7 +164,7 @@ class Seq2Seq(model.Model):
         Infer a likely output. No beam search yet.
         """
         x, y = self.collate(*batch)
-        end_tok = y.data[0, -1] # TODO
+        end_tok = y.data[0, -1].cuda() # TODO
         t = y
         if self.is_cuda:
             x = x.cuda()
@@ -172,7 +172,7 @@ class Seq2Seq(model.Model):
         x = self.encode(x)

         # needs to be the start token, TODO
-        y = t[:, 0:1]
+        y = t[:, 0:1].cuda()
         _, argmaxs = self.infer_decode(x, y, end_tok, max_len)
         argmaxs = argmaxs.cpu().data.numpy()
         return [seq.tolist() for seq in argmaxs]
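For context, a minimal sketch (mine, not from the patch) of why the original line was a no-op: torch.Tensor.cuda() returns a GPU copy of the tensor instead of moving it in place, so the result has to be reassigned. Shapes here are illustrative.

import torch

hx = torch.zeros((4, 8), requires_grad=False)
if torch.cuda.is_available():
    hx.cuda()            # returns a new CUDA tensor; hx itself is unchanged
    print(hx.device)     # still cpu
    hx = hx.cuda()       # reassigning keeps the GPU copy
    print(hx.device)     # cuda:0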

And there was also an error in train.py:

diff --git a/train.py b/train.py
index a04eb6c..6141ba0 100644
--- a/train.py
+++ b/train.py
@@ -10,6 +10,7 @@ import torch
 import torch.nn as nn
 import torch.optim
 import tqdm
+import copy

 import speech
 import speech.loader as loader
@@ -30,7 +31,7 @@ def run_epoch(model, optimizer, train_ldr, it, avg_loss):
         loss.backward()

         grad_norm = nn.utils.clip_grad_norm(model.parameters(), 200)
-        loss = loss.data[0]
+        loss = loss.item()

         optimizer.step()
         prev_end_t = end_t
@@ -54,11 +55,13 @@ def eval_dev(model, ldr, preproc):
     model.set_eval()

     for batch in tqdm.tqdm(ldr):
-        preds = model.infer(batch)
-        loss = model.loss(batch)
-        losses.append(loss.data[0])
+        batch_ = copy.deepcopy(batch)
+        preds = model.infer(batch_)
+        batch_ = copy.deepcopy(batch)
+        loss = model.loss(batch_)
+        losses.append(loss.item())
         all_preds.extend(preds)
-        all_labels.extend(batch[1])
+        all_labels.extend(list(batch)[1])

     model.set_train()
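
As an aside (my gloss, not part of the patch): loss.data[0] had to become loss.item() because on PyTorch 0.4+ a reduced loss is a 0-dimensional tensor that can no longer be indexed. A minimal sketch:

import torch

loss = torch.tensor(1.5, requires_grad=True)  # 0-dim scalar, like a reduced loss
print(loss.item())                            # 1.5 -- the supported accessor
# print(loss.data[0])                         # IndexError on 0-dim tensors in 0.4+

The copy.deepcopy(batch) calls appear to guard against infer() consuming or mutating the batch before loss() reuses it.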


arattari commented on July 19, 2024

@silnos If you do manage to fix this, please post an update/solution!


arattari commented on July 19, 2024

Thanks, I implemented this on my copy as well. I'm still getting another error:
RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. Have you encountered something similar at all?

I'm also wondering: have you run this on LibriSpeech?
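
For reference (my own minimal reproduction, not from this thread): that assertion comes from NLLLoss/CrossEntropyLoss when a target index falls outside [0, n_classes), which usually means the label vocabulary is larger than the model's output layer.

import torch
import torch.nn as nn

criterion = nn.NLLLoss()
log_probs = torch.randn(2, 5).log_softmax(dim=1)  # model with 5 classes
targets = torch.tensor([1, 7])                    # 7 is >= n_classes
criterion(log_probs, targets)  # Assertion `cur_target >= 0 && cur_target < n_classes' failed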


arattari commented on July 19, 2024

@silnos, could you share your config file? Are you using the default included example?


pkyoung commented on July 19, 2024

@arattari Yes, I am using the default JSON file in examples/timit. I haven't tried LibriSpeech.

With the TIMIT example, I got an error rate around 2x.x %, and I am not sure yet whether that is reasonable.
I am going to look into the code soon.


eeric commented on July 19, 2024

With Python 3.6, PyTorch 0.4.1, and CUDA 9.0, I got the following error when running train.py with the TIMIT example:

$ python train.py examples/timit/seq2seq_config.json
Traceback (most recent call last):
  File "train.py", line 146, in <module>
    run(config)
  File "train.py", line 104, in run
    run_state = run_epoch(model, optimizer, train_ldr, *run_state)
  File "train.py", line 29, in run_epoch
    loss = model.loss(batch)
  File "/path/to/speech/models/seq2seq.py", line 57, in loss
    out, alis = self.forward_impl(x, y)
  File "/path/to/speech/models/seq2seq.py", line 68, in forward_impl
    out, alis = self.decode(x, y)
  File "/path/to/speech/models/seq2seq.py", line 103, in decode
    hx = self.dec_rnn(ix.squeeze(dim=1), hx)
  File "/path/to/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/path/to/lib64/python3.6/site-packages/torch/nn/modules/rnn.py", line 794, in forward
    self.bias_ih, self.bias_hh,
  File "/path/to/lib64/python3.6/site-packages/torch/nn/_functions/rnn.py", line 53, in GRUCell
    gh = F.linear(hidden, w_hh)
  File "/path/to/lib64/python3.6/site-packages/torch/nn/functional.py", line 1026, in linear
    output = input.matmul(weight.t())
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #2 'mat2'

I fixed it by changing

hx = self.dec_rnn(ix.squeeze(dim=1), hx)

to

hx = self.dec_rnn(ix.squeeze(dim=1), hx.cuda())
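
A minimal reproduction of this device mismatch with a GRUCell (sizes illustrative; assumes a CUDA device):

import torch
import torch.nn as nn

cell = nn.GRUCell(16, 16).cuda()  # weights live on the GPU
ix = torch.randn(4, 16).cuda()
hx = torch.zeros(4, 16)           # hidden state left on the CPU

# cell(ix, hx)                    # RuntimeError: expected torch.cuda.FloatTensor
out = cell(ix, hx.cuda())         # moving hx onto the GPU fixes it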


eeric commented on July 19, 2024

With Python 3.6, PyTorch 0.4.1, and CUDA 9.0, I got the following error:

Traceback (most recent call last):
  File "train.py", line 148, in <module>
    run(config)
  File "train.py", line 110, in run
    dev_loss, dev_cer = eval_dev(model, dev_ldr, preproc)
  File "train.py", line 57, in eval_dev
    preds = model.infer(batch)
  File "/path/to/speech/models/seq2seq.py", line 176, in infer
    _, argmaxs = self.infer_decode(x, y, end_tok, max_len)
  File "/path/to/speech/models/seq2seq.py", line 155, in infer_decode
    if torch.sum(y.data == end_tok) == y.numel():
RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.LongTensor for argument #2 'other'

Do you have an idea how to solve this?

I fixed it by changing

if torch.sum(y.data == end_tok) == y.numel():

to

if torch.sum(y.cpu() == end_tok).tolist() == y.numel():
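
A minimal sketch of the underlying mismatch (values illustrative): on PyTorch 0.4, comparing a CUDA tensor against a CPU tensor raises exactly this error, so both operands have to live on the same device first.

import torch

y = torch.tensor([[2, 2]]).cuda()  # predictions on the GPU
end_tok = torch.tensor(2)          # end token left on the CPU

# torch.sum(y == end_tok)          # RuntimeError: type mismatch on PyTorch 0.4
done = torch.sum(y.cpu() == end_tok).item() == y.numel()
print(done)                        # True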


NAM-hj commented on July 19, 2024

For anyone using the "transducer" model:

After changing the check_type() function in libs/transducer/functions/transducer.py to the code below, I was able to get past this error.

def check_type(var, t, name):
    #if type(var) is not t:
    if var.type() != str(t).split("'")[1]:
        raise TypeError("{} must be {}".format(name, t))

After this, I added code to certify_inputs() and forward():

def certify_inputs(log_probs, labels, lengths, label_lengths):
    if log_probs.is_cuda:
        check_type(log_probs, torch.cuda.FloatTensor, "log_probs")
    else:
        check_type(log_probs, torch.FloatTensor, "log_probs")

    if labels.is_cuda:
        check_type(labels, torch.cuda.IntTensor, "labels")
    else:
        check_type(labels, torch.IntTensor, "labels")
    if label_lengths.is_cuda:
        check_type(label_lengths, torch.cuda.IntTensor, "label_lengths")
    else:
        check_type(label_lengths, torch.IntTensor, "label_lengths")

    if lengths.is_cuda:
        check_type(lengths, torch.cuda.IntTensor, "lengths")
    else:
        check_type(lengths, torch.IntTensor, "lengths")
   ..........
    def forward(self, log_probs, labels, lengths, label_lengths):
        """
        Computes the Transducer cost for a minibatch of examples.

        Arguments:
            log_probs (FloatTensor): The log probabilities should
                be of shape
                (minibatch, input len, output len, vocab size).
            labels (IntTensor): 1D tensor of labels for each example
                consecutively.
            lengths (IntTensor): 1D tensor of the number of activation time-steps
                for each example.
            label_lengths (IntTensor): 1D tensor of label lengths for
                each example.

        Returns:
            costs (FloatTensor): .
        """
        is_cuda = log_probs.is_cuda
        certify_inputs(log_probs, labels, lengths, label_lengths)
        log_probs = log_probs.cpu()
        labels = labels.cpu() 
        lengths = lengths.cpu()
        label_lengths = label_lengths.cpu()
        ...............
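
For clarity, a small sketch (mine, not part of the patch) of the string comparison the modified check_type() relies on: str(t) for a tensor type looks like "<class 'torch.FloatTensor'>", so splitting on the single quote recovers exactly the name that Tensor.type() returns.

import torch

var = torch.zeros(3)
t = torch.FloatTensor

print(var.type())              # torch.FloatTensor
print(str(t).split("'")[1])    # torch.FloatTensor
assert var.type() == str(t).split("'")[1]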


NAM-hj avatar NAM-hj commented on July 19, 2024

After these changes, I got the following warnings while training:

WARNING: Forward backward likelihood mismatch 0.000084
WARNING: Forward backward likelihood mismatch 0.000092
WARNING: Forward backward likelihood mismatch 0.000046

