Hi, I am trying to run your project. I use CUDA 10.1, all requirements are installed (wit

The size of tensor a (xx) must match the size of tensor b (yy) about stylespeech HOT 9 CLOSED

keonlee9420 commented on June 3, 2024

The size of tensor a (xx) must match the size of tensor b (yy)

from stylespeech.

Comments (9)

keonlee9420 commented on June 3, 2024

Hi @DiDimus, if you have several GPUs in your machine, please try specifying a GPU index, for example CUDA_VISIBLE_DEVICES=0 for the first GPU.
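For reference, the variable can also be set from inside Python rather than on the command line. This is only a minimal sketch: it assumes the assignment runs before anything initializes CUDA (safest is before `import torch`), and it is not taken from the project's code.

```python
import os

# Set this before the first CUDA call (safest: before importing torch);
# once CUDA is initialized, changing the variable has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # expose only the first physical GPU

# Inside this process the selected GPU is then addressed as device index 0.
visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print(f"GPUs visible to this process: {visible}")
```

On the command line the equivalent is to prefix the usual command with `CUDA_VISIBLE_DEVICES=0`.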

DiDimus commented on June 3, 2024

Thanks, but the result is exactly the same. I think the problem is in the software environment. Do you have a Docker image for this project? Which OS do you use?

keonlee9420 commented on June 3, 2024

Gotcha, you can refer to this:
https://github.com/keonlee9420/Daft-Exprt/blob/main/Dockerfile

I think the Dockerfile should also work for this project. Please try it out and let me know the result.

Vadim2S commented on June 3, 2024

This is clearly a bug in the project code related to the predicted tensor size:

With duration_control = 0.3 I get RuntimeError: The size of tensor a (25) must match the size of tensor b (31)
x shape is torch.Size([1, 31, 256]) ; mask shape is torch.Size([1, 25])
The correct value is 31 (104*0.3, rounded)

With duration_control = 0.5 I get RuntimeError: The size of tensor a (47) must match the size of tensor b (52)
x shape is torch.Size([1, 52, 256]) ; mask shape is torch.Size([1, 47])
The correct value is 52 (104*0.5)

With duration_control = 1.0 everything is OK
x shape is torch.Size([1, 104, 256]) ; mask shape is torch.Size([1, 104])

With duration_control = 2.0 everything is OK
x shape is torch.Size([1, 208, 256]) ; mask shape is torch.Size([1, 208])
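The error message is a broadcasting failure: two shapes can only be combined if, aligned from the last axis, each pair of sizes is equal or one of them is 1. Below is a small pure-Python sketch of that rule applied to the shapes above. The helper `broadcast_mismatch` is hypothetical, and the mask shape `(1, 25, 1)` assumes the mask gets a trailing axis added before it is applied to `x`, which may not be exactly what the project does.

```python
from itertools import zip_longest


def broadcast_mismatch(shape_a, shape_b):
    """Return the first (size_a, size_b) pair that breaks broadcasting, or None."""
    # Shapes are aligned from the trailing axis; a missing axis counts as size 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for size_a, size_b in pairs:
        if size_a != size_b and size_a != 1 and size_b != 1:
            return (size_a, size_b)
    return None


# duration_control = 0.3: mask (assumed to be unsqueezed on the last axis) vs x
print(broadcast_mismatch((1, 25, 1), (1, 31, 256)))    # (25, 31)

# duration_control = 1.0: the time axes agree, so nothing is reported
print(broadcast_mismatch((1, 104, 1), (1, 104, 256)))  # None
```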

DiDimus commented on June 3, 2024

Yes, @Vadim2S. Problem found, thanks. The Docker setup from Daft-Exprt didn't help :(

keonlee9420 commented on June 3, 2024

Hey guys, I just found out that you had an issue with control values lower than 1. Sorry for the late correction, and thanks to @Vadim2S, I can confirm that there is an error in the current code. I'll fix it and push soon. Thank you all for the report!

Vadim2S commented on June 3, 2024

Temporary workaround:

In /model/modules.py, line 177, class LengthRegulator(nn.Module):

change

    if max_len is not None:
        output = pad(output, max_len)
    else:
        output = pad(output)

to:

    if max_len is not None:
        output = pad(output, max_len)
        # VVS workaround: report the padded length so the mask matches the padded output
        mel_len.clear()
        mel_len.append(output.shape[1])
    else:
        output = pad(output)

P.S. The predicted durations are real-valued, and in LengthRegulator.expand you do:

    for i, vec in enumerate(batch):
        expand_size = predicted[i].item()
        out.append(vec.expand(max(int(expand_size), 0), -1))
    out = torch.cat(out, 0)

So of course out ends up shorter than max_len because int() truncates each duration. I presume you have to extend out to max_len later.
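To make the truncation effect concrete, here is a small sketch with made-up numbers. The per-phoneme durations and the helper `expanded_length` are hypothetical. It also assumes the control value is multiplied into each duration before the `int()` call, which the snippet above does not show.

```python
def expanded_length(durations, duration_control):
    """Total frames when each scaled duration is truncated with int()."""
    return sum(max(int(d * duration_control), 0) for d in durations)


durations = [3, 5, 2, 4, 6]  # hypothetical per-phoneme frame counts, 20 in total

for control in (0.3, 0.5, 1.0, 2.0):
    truncated = expanded_length(durations, control)
    target = round(sum(durations) * control)
    print(f"control={control}: expanded={truncated}, scaled total={target}")
```

With these numbers the expanded length falls short of the scaled total for 0.3 (3 vs 6) and 0.5 (9 vs 10) and matches it for 1.0 and 2.0, the same pattern as in the report above.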

keonlee9420 commented on June 3, 2024

I fixed the code and it's working now. The problem originated from the value of max_len at inference time in VarianceAdaptor: it should be None, but the max_len of the reference audio was wrongly passed.
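A rough illustration of why the max_len value matters, using plain lists instead of tensors. The helpers `pad_frames` and `mask_from_lengths` are hypothetical stand-ins, not the project's functions, and the sketch assumes the mask width is derived from the longest reported length.

```python
def pad_frames(sequences, max_len=None, pad_value=0):
    """Right-pad every sequence to max_len, or to the longest one if max_len is None."""
    target = max_len if max_len is not None else max(len(s) for s in sequences)
    return [list(s) + [pad_value] * (target - len(s)) for s in sequences]


def mask_from_lengths(lengths):
    """True marks padding; the mask is as wide as the longest length."""
    width = max(lengths)
    return [[i >= n for i in range(width)] for n in lengths]


expanded = [[1] * 25]                  # one utterance, 25 frames after expansion
lengths = [len(s) for s in expanded]
mask = mask_from_lengths(lengths)

wrong = pad_frames(expanded, max_len=31)    # max_len taken from another utterance
right = pad_frames(expanded, max_len=None)  # inference: pad to the batch itself

print(len(wrong[0]), len(mask[0]))  # 31 25 -> widths disagree
print(len(right[0]), len(mask[0]))  # 25 25 -> widths agree
```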

Vadim2S commented on June 3, 2024

Thanks! Tested. Low duration values work OK now!
