
Comments (9)

pkyoung commented on July 19, 2024

I managed to execute train.py, but I have not yet confirmed whether training was successful.
The quick (and dirty) remedy for the above error was:

diff --git a/speech/models/seq2seq.py b/speech/models/seq2seq.py
index b2881e3..65e3a38 100644
--- a/speech/models/seq2seq.py
+++ b/speech/models/seq2seq.py
@@ -87,7 +87,7 @@ class Seq2Seq(model.Model):

         hx = torch.zeros((x.shape[0], x.shape[2]), requires_grad=False)
         if self.is_cuda:
-            hx.cuda()
+            hx = hx.cuda()
         ax = None; sx = None;
         for t in range(y.size()[1] - 1):
             sample = (out and self.scheduled_sampling)
@@ -119,7 +119,7 @@ class Seq2Seq(model.Model):
         if state is None:
             hx = torch.zeros((x.shape[0], x.shape[2]), requires_grad=False)
             if self.is_cuda:
-                hx.cuda()
+                hx = hx.cuda()
             ax = None; sx = None;
         else:
             hx, ax, sx = state
@@ -164,7 +164,7 @@ class Seq2Seq(model.Model):
         Infer a likely output. No beam search yet.
         """
         x, y = self.collate(*batch)
-        end_tok = y.data[0, -1] # TODO
+        end_tok = y.data[0, -1].cuda() # TODO
         t = y
         if self.is_cuda:
             x = x.cuda()
@@ -172,7 +172,7 @@ class Seq2Seq(model.Model):
         x = self.encode(x)

         # needs to be the start token, TODO
-        y = t[:, 0:1]
+        y = t[:, 0:1].cuda()
         _, argmaxs = self.infer_decode(x, y, end_tok, max_len)
         argmaxs = argmaxs.cpu().data.numpy()
         return [seq.tolist() for seq in argmaxs]
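For context, a minimal sketch (mine, not from the patch) of why the original line was a no-op: torch.Tensor.cuda() returns a GPU copy of the tensor instead of moving it in place, so the result has to be reassigned. Shapes here are illustrative.

import torch

hx = torch.zeros((4, 8), requires_grad=False)
if torch.cuda.is_available():
    hx.cuda()            # returns a new CUDA tensor; hx itself is unchanged
    print(hx.device)     # still cpu
    hx = hx.cuda()       # reassigning keeps the GPU copy
    print(hx.device)     # cuda:0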

And there was also an error in train.py:

diff --git a/train.py b/train.py
index a04eb6c..6141ba0 100644
--- a/train.py
+++ b/train.py
@@ -10,6 +10,7 @@ import torch
 import torch.nn as nn
 import torch.optim
 import tqdm
+import copy

 import speech
 import speech.loader as loader
@@ -30,7 +31,7 @@ def run_epoch(model, optimizer, train_ldr, it, avg_loss):
         loss.backward()

         grad_norm = nn.utils.clip_grad_norm(model.parameters(), 200)
-        loss = loss.data[0]
+        loss = loss.item()

         optimizer.step()
         prev_end_t = end_t
@@ -54,11 +55,13 @@ def eval_dev(model, ldr, preproc):
     model.set_eval()

     for batch in tqdm.tqdm(ldr):
-        preds = model.infer(batch)
-        loss = model.loss(batch)
-        losses.append(loss.data[0])
+        batch_ = copy.deepcopy(batch)
+        preds = model.infer(batch_)
+        batch_ = copy.deepcopy(batch)
+        loss = model.loss(batch_)
+        losses.append(loss.item())
         all_preds.extend(preds)
-        all_labels.extend(batch[1])
+        all_labels.extend(list(batch)[1])

     model.set_train()
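
As an aside (my gloss, not part of the patch): loss.data[0] had to become loss.item() because on PyTorch 0.4+ a reduced loss is a 0-dimensional tensor that can no longer be indexed. A minimal sketch:

import torch

loss = torch.tensor(1.5, requires_grad=True)  # 0-dim scalar, like a reduced loss
print(loss.item())                            # 1.5 -- the supported accessor
# print(loss.data[0])                         # IndexError on 0-dim tensors in 0.4+

The copy.deepcopy(batch) calls appear to guard against infer() consuming or mutating the batch before loss() reuses it.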


arattari commented on July 19, 2024

@silnos If you do manage to fix this, please post an update/solution!


arattari commented on July 19, 2024

Thanks, I implemented this on my copy as well. I'm still getting another error:
RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. Have you encountered something similar at all?

I'm also wondering: have you run this on LibriSpeech?
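
For reference (my own minimal reproduction, not from this thread): that assertion comes from NLLLoss/CrossEntropyLoss when a target index falls outside [0, n_classes), which usually means the label vocabulary is larger than the model's output layer.

import torch
import torch.nn as nn

criterion = nn.NLLLoss()
log_probs = torch.randn(2, 5).log_softmax(dim=1)  # model with 5 classes
targets = torch.tensor([1, 7])                    # 7 is >= n_classes
criterion(log_probs, targets)  # Assertion `cur_target >= 0 && cur_target < n_classes' failed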


arattari commented on July 19, 2024

@silnos, could you share your config file? Are you using the default included example?


pkyoung commented on July 19, 2024

@arattari Yes, I am using the default JSON file in examples/timit. I haven't tried LibriSpeech.

With the TIMIT example, I got an error rate around 2x.x %, and I am not sure yet whether that is reasonable.
I am going to look into the code soon.


eeric commented on July 19, 2024

With Python 3.6, PyTorch 0.4.1, and CUDA 9.0, I got the following error when running train.py with the TIMIT example:

$ python train.py examples/timit/seq2seq_config.json
Traceback (most recent call last):
  File "train.py", line 146, in <module>
    run(config)
  File "train.py", line 104, in run
    run_state = run_epoch(model, optimizer, train_ldr, *run_state)
  File "train.py", line 29, in run_epoch
    loss = model.loss(batch)
  File "/path/to/speech/models/seq2seq.py", line 57, in loss
    out, alis = self.forward_impl(x, y)
  File "/path/to/speech/models/seq2seq.py", line 68, in forward_impl
    out, alis = self.decode(x, y)
  File "/path/to/speech/models/seq2seq.py", line 103, in decode
    hx = self.dec_rnn(ix.squeeze(dim=1), hx)
  File "/path/to/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/path/to/lib64/python3.6/site-packages/torch/nn/modules/rnn.py", line 794, in forward
    self.bias_ih, self.bias_hh,
  File "/path/to/lib64/python3.6/site-packages/torch/nn/_functions/rnn.py", line 53, in GRUCell
    gh = F.linear(hidden, w_hh)
  File "/path/to/lib64/python3.6/site-packages/torch/nn/functional.py", line 1026, in linear
    output = input.matmul(weight.t())
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #2 'mat2'

I fixed it by changing

hx = self.dec_rnn(ix.squeeze(dim=1), hx)

to

hx = self.dec_rnn(ix.squeeze(dim=1), hx.cuda())
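
A minimal reproduction of this device mismatch with a GRUCell (sizes illustrative; assumes a CUDA device):

import torch
import torch.nn as nn

cell = nn.GRUCell(16, 16).cuda()  # weights live on the GPU
ix = torch.randn(4, 16).cuda()
hx = torch.zeros(4, 16)           # hidden state left on the CPU

# cell(ix, hx)                    # RuntimeError: expected torch.cuda.FloatTensor
out = cell(ix, hx.cuda())         # moving hx onto the GPU fixes it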


eeric commented on July 19, 2024

With Python 3.6, PyTorch 0.4.1, and CUDA 9.0, I got the following error:

Traceback (most recent call last):
  File "train.py", line 148, in <module>
    run(config)
  File "train.py", line 110, in run
    dev_loss, dev_cer = eval_dev(model, dev_ldr, preproc)
  File "train.py", line 57, in eval_dev
    preds = model.infer(batch)
  File "/path/to/speech/models/seq2seq.py", line 176, in infer
    _, argmaxs = self.infer_decode(x, y, end_tok, max_len)
  File "/path/to/speech/models/seq2seq.py", line 155, in infer_decode
    if torch.sum(y.data == end_tok) == y.numel():
RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.LongTensor for argument #2 'other'

Do you have an idea how to solve this?

I fixed it by changing

if torch.sum(y.data == end_tok) == y.numel():

to

if torch.sum(y.cpu() == end_tok).tolist() == y.numel():
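
A minimal sketch of the underlying mismatch (values illustrative): on PyTorch 0.4, comparing a CUDA tensor against a CPU tensor raises exactly this error, so both operands have to live on the same device first.

import torch

y = torch.tensor([[2, 2]]).cuda()  # predictions on the GPU
end_tok = torch.tensor(2)          # end token left on the CPU

# torch.sum(y == end_tok)          # RuntimeError: type mismatch on PyTorch 0.4
done = torch.sum(y.cpu() == end_tok).item() == y.numel()
print(done)                        # True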


NAM-hj commented on July 19, 2024

For anyone using the "transducer" model:

After changing the check_type() function in libs/transducer/functions/transducer.py to the code below, I was able to get past this error.

def check_type(var, t, name):
    #if type(var) is not t:
    if var.type() != str(t).split("'")[1]:
        raise TypeError("{} must be {}".format(name, t))

After this, I added code to certify_inputs() and forward():

def certify_inputs(log_probs, labels, lengths, label_lengths):
    if log_probs.is_cuda:
        check_type(log_probs, torch.cuda.FloatTensor, "log_probs")
    else:
        check_type(log_probs, torch.FloatTensor, "log_probs")

    if labels.is_cuda:
        check_type(labels, torch.cuda.IntTensor, "labels")
    else:
        check_type(labels, torch.IntTensor, "labels")
    if label_lengths.is_cuda:
        check_type(label_lengths, torch.cuda.IntTensor, "label_lengths")
    else:
        check_type(label_lengths, torch.IntTensor, "label_lengths")

    if lengths.is_cuda:
        check_type(lengths, torch.cuda.IntTensor, "lengths")
    else:
        check_type(lengths, torch.IntTensor, "lengths")
   ..........
    def forward(self, log_probs, labels, lengths, label_lengths):
        """
        Computes the Transducer cost for a minibatch of examples.

        Arguments:
            log_probs (FloatTensor): The log probabilities should
                be of shape
                (minibatch, input len, output len, vocab size).
            labels (IntTensor): 1D tensor of labels for each example
                consecutively.
            lengths (IntTensor): 1D tensor of the number of activation time-steps
                for each example.
            label_lengths (IntTensor): 1D tensor of label lengths for
                each example.

        Returns:
            costs (FloatTensor): .
        """
        is_cuda = log_probs.is_cuda
        certify_inputs(log_probs, labels, lengths, label_lengths)
        log_probs = log_probs.cpu()
        labels = labels.cpu() 
        lengths = lengths.cpu()
        label_lengths = label_lengths.cpu()
        ...............
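
For clarity, a small sketch (mine, not part of the patch) of the string comparison the modified check_type() relies on: str(t) for a tensor type looks like "<class 'torch.FloatTensor'>", so splitting on the single quote recovers exactly the name that Tensor.type() returns.

import torch

var = torch.zeros(3)
t = torch.FloatTensor

print(var.type())              # torch.FloatTensor
print(str(t).split("'")[1])    # torch.FloatTensor
assert var.type() == str(t).split("'")[1]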


NAM-hj avatar NAM-hj commented on July 19, 2024

After these changes, I got the following warnings while training:

WARNING: Forward backward likelihood mismatch 0.000084
WARNING: Forward backward likelihood mismatch 0.000092
WARNING: Forward backward likelihood mismatch 0.000046

