I made a notebook for the style transfer (see …)

Style transfer COLAB about flowtron HOT 10 OPEN

nvidia commented on May 23, 2024 9

Style transfer COLAB

from flowtron.

Comments (10)

ninaburns commented on May 23, 2024

I am new to style transfer and this really helped me follow along, thank you. Just curious, what was the reason for the 'byte mask fix' in flowtron.py of your fork? Is that necessary to run your style transfer example?

karkirowle commented on May 23, 2024

@ninaburns there is some discussion of why that's needed in this issue.
Basically, if you want to use the latest PyTorch version, you need this fix. That is the preferred route in COLAB, because otherwise you also have to install an older version of PyTorch, which takes more time. Since you have to set up your environment in COLAB every session, that is decisive. The disadvantage is that I have to point the COLAB at my bug-fixed fork, and the COLAB might break over time (i.e., the PyTorch version is not pinned).
I also tried setting up with the previous version, which introduced all sorts of problems with the GPU computation. I think the less efficient GPU computations hit COLAB's hard limit.
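
For readers who have not seen this class of fix, here is a minimal sketch of what a "byte mask fix" usually looks like. It is not the diff from the fork, which is not shown in this thread. The helper name `get_mask_from_lengths` and the tensor shapes are illustrative assumptions. The underlying facts are that PyTorch 1.2 and later return `torch.bool` from comparisons, and that `~` on a `uint8` tensor is a bitwise NOT rather than a logical one.

```python
import torch


def get_mask_from_lengths(lengths):
    """Return a bool mask that is True at valid (non-padded) positions.

    Illustrative helper; the name and layout are assumptions, not the fork's code.
    """
    max_len = int(lengths.max())
    ids = torch.arange(max_len, device=lengths.device)
    # Comparisons yield torch.bool on PyTorch >= 1.2, so no .byte() cast is needed.
    return ids.unsqueeze(0) < lengths.unsqueeze(1)


lengths = torch.tensor([3, 1])
mask = get_mask_from_lengths(lengths)

# On a bool tensor, ~ is a logical NOT, so this marks the padded positions.
padding = ~mask
scores = torch.zeros(2, 3).masked_fill(padding, float("-inf"))

# On a uint8 tensor, ~ is a bitwise NOT: 1 becomes 254 and 0 becomes 255,
# so every element is non-zero and the "inverted" mask no longer selects padding.
byte_mask = mask.to(torch.uint8)
inverted_bytes = ~byte_mask
```

Keeping masks as `torch.bool` end to end avoids both the deprecation warning for `uint8` masks and the silent bitwise inversion.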

ninaburns commented on May 23, 2024

Gotcha. I should have seen the issue you mentioned! Thanks for the explanation, it makes sense.

DamienToomey commented on May 23, 2024

Hi, has anybody managed to do style transfer with the code shared by @karkirowle ?

I have also read the content from issue #9 but I am still struggling.

In particular, I am trying to reproduce demo 4.4.4, Sampling the Posterior (Unseen speaker). I am using the LibriTTS model, with the wav file ravdess_surprised_prior.wav (emotion "surprised") from the demo as the style.

I use average_over_time = True and speaker_id = 2092 in the code from @karkirowle.

The generated audio is of good quality but it does not contain the "surprised" emotion which is distinctly heard in the demo.

karkirowle commented on May 23, 2024

Hi @DamienToomey !
If I understand correctly, you are using the LibriTTS model and, as the style, the single surprised wav file from the NVIDIA demo page that you linked.
Please note that the results will almost certainly differ from those on the demo site. The demo uses the Sally TTS model, which is not publicly available as far as I know. Also, we don't know the seed, standard deviation, etc. that were used for synthesis.
To get better results, my suggestion would be to use more audio files from RAVDESS that contain the same style but come from different speakers. This should work better, because averaging cancels out the per-speaker noise and copies only the surprised style, so the results are more robust. If you are not satisfied with the variety of the intonation, you can try changing the random seed and the std. It might be that it still doesn't work after all this. I hope this helps!
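
A minimal sketch of the averaging idea, assuming each style clip has already been passed through the model to obtain a latent `z` shaped `(1, n_mel_channels, n_frames)`. The function name, the shapes, and the random stand-in tensors are hypothetical, and this is not necessarily how the notebook aggregates latents.

```python
import torch


def average_style_latents(z_values):
    """Average latents over time within each clip, then across clips.

    z_values: list of tensors shaped (1, n_mel_channels, n_frames_i);
    the number of frames may differ from clip to clip.
    Returns a tensor shaped (n_mel_channels, 1).
    """
    # Mean over the time axis gives one (1, n_mel_channels) vector per clip.
    per_clip = torch.cat([z.mean(dim=2) for z in z_values], dim=0)
    # Mean over clips; speaker-specific deviations tend to cancel out here.
    return per_clip.mean(dim=0)[:, None]


# Random stand-ins for latents of three clips from different speakers.
torch.manual_seed(0)
z_values = [torch.randn(1, 80, n_frames) for n_frames in (120, 95, 143)]
z_style = average_style_latents(z_values)
```

Averaging over time first gives every clip equal weight regardless of its length, so one long recording does not dominate the style estimate.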

rafaelvalle commented on May 23, 2024

@karkirowle I sent you an e-mail a few days ago regarding your article on Flowtron and style transfer.
Did you receive it?

karkirowle commented on May 23, 2024

@rafaelvalle No, I haven't. Which e-mail address? Did you use the e-mail address in the COLAB or on my blog? I can have a second look or try resending it just in case?

rafaelvalle commented on May 23, 2024

@karkirowle I sent and re-sent it to [email protected]
@karkirowle Sent yet another one with a suggestion for experiments.

rafaelvalle commented on May 23, 2024

Please take a look at the link below for a style transfer demo.
https://github.com/NVIDIA/flowtron/blob/master/inference_style_transfer.ipynb

TotzkePaul commented on May 23, 2024

I get this error with inference_style_transfer.ipynb.
I'm on Windows and using the LJS model.

TypeError Traceback (most recent call last)
in ()
8 in_lens = torch.LongTensor([text.shape[1]]).cuda()
9 with torch.no_grad():
---> 10 z = model(mel, sid, text, in_lens, None)[0]
11 z_values.append(z.permute(1, 2, 0))

C:\Users\totzke\miniconda3\envs\flow\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

H:\TTS\flowtron\flowtron.py in forward(self, mel, speaker_vecs, text, in_lens, out_lens)
592 for i, flow in enumerate(self.flows):
593 mel, log_s, gate, attn = flow(
--> 594 mel, encoder_outputs, mask, out_lens)
595 log_s_list.append(log_s)
596 attns_list.append(attn)

H:\TTS\flowtron\flowtron.py in forward(self, mel, text, mask, out_lens)
394 # backwards flow, send padded zeros back to end
395 for k in range(mel.size(1)):
--> 396 mel[:, k] = mel[:, k].roll(out_lens[k].item(), dims=0)
397
398 mel, log_s, gates, attn = self.ar_step(mel, text, mask, out_lens)

TypeError: 'NoneType' object is not subscriptable
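
Reading the traceback: the notebook cell passes `None` as the fifth argument, `forward()` receives it as `out_lens`, and line 396 then indexes it (`out_lens[k]`), which raises the TypeError. One possible workaround, untested here, is to build `out_lens` from the mel shape the same way the cell builds `in_lens` from the text shape. This assumes that this version of flowtron.py expects per-item mel frame counts in `out_lens` and that `mel` is shaped `(batch, n_mel_channels, n_frames)`.

```python
import torch

# Stand-in for the notebook's mel batch; the shape
# (batch, n_mel_channels, n_frames) is an assumption.
mel = torch.zeros(1, 80, 217)

# One frame count per batch item. For a single unpadded clip this is
# just the size of the time axis.
out_lens = torch.LongTensor([mel.shape[2]])

# The indexing from flowtron.py line 396 works on a tensor but not on None.
first_len = out_lens[0].item()

# Hypothetical call, mirroring the notebook cell (add .cuda() to out_lens
# when the model and the other inputs live on the GPU):
# z = model(mel, sid, text, in_lens, out_lens)[0]
```

If the installed flowtron.py and the notebook come from different commits, checking out matching versions of both may also resolve the mismatch.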
