tacotron's People

Contributors

barronalex, candlewill, syoyo

tacotron's Issues

High Memory Usage?

Hi all,

Has anyone else noticed that memory usage during training is very high?

My training data files are:
925M Jun 16 08:09 mels.npy
56K Jun 16 08:09 meta.pkl
7.9K Jun 16 08:09 speech_lens.npy
21K Jun 16 13:45 stft_mean.npy
5.8G Jun 16 08:09 stfts.npy
21K Jun 16 13:45 stft_std.npy
7.9K Jun 16 08:09 text_lens.npy
771K Jun 16 08:09 texts.npy
However, "top" shows the training process using more than 20 GB of resident memory (RES).

Thanks

Jian
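The 5.8 GB stfts.npy alone will occupy that much RAM once fully loaded (more if it is cast to float64). One way to cut the footprint, assuming the training loop only slices batches out of the array, is to memory-map the file with numpy instead of loading it; whether this repo's loader can work with a memmap is an assumption. A minimal sketch with a small stand-in file:

```python
import numpy as np

# Create a stand-in array the same way the pipeline saves stfts.npy
# (the real file is ~5.8 GB; this small example just shows the mechanics).
demo = np.random.rand(100, 1025).astype(np.float32)
np.save("stfts_demo.npy", demo)

# mmap_mode='r' maps the file into virtual memory instead of reading it
# all into RAM; pages are loaded lazily as slices are touched.
stfts = np.load("stfts_demo.npy", mmap_mode="r")

# Slicing still works and only copies the pages it needs.
batch = np.array(stfts[:32])  # one batch materialized in real memory
```

With this, resident memory grows only with the batches actually read, at the cost of disk reads during training.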

Annealing rate question

In models/tacotron.py, annealing_rate is set to 1 and the initial learning rate init_lr to 0.0005. Does that mean there is no learning-rate decay in this model? Am I right?
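Assuming the usual exponential-annealing formulation (the exact schedule in the repo may differ), annealing_rate = 1 does indeed make the learning rate constant, since the decay factor is always 1. A sketch:

```python
def learning_rate(step, init_lr=0.0005, annealing_rate=1.0, decay_steps=1000):
    # Exponential annealing: lr shrinks by `annealing_rate` every `decay_steps`.
    # With annealing_rate == 1.0 the factor is always 1, i.e. no decay.
    return init_lr * annealing_rate ** (step / decay_steps)

print(learning_rate(0))      # 0.0005
print(learning_rate(50000))  # still 0.0005 when annealing_rate == 1.0
```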

loss exploded

Has anyone run into this issue? During training, a "loss exploded" error was thrown and training stopped.
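A common mitigation for exploding losses, not necessarily what this repo already does, is to clip gradients by their global norm before applying them (TensorFlow has tf.clip_by_global_norm for this). A minimal numpy sketch of the idea:

```python
import numpy as np

def clip_by_global_norm(grads, clip_norm=1.0):
    # Scale all gradients down together if their combined L2 norm exceeds
    # clip_norm; this caps the update size without changing its direction.
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, clip_norm / (global_norm + 1e-12))
    return [g * scale for g in grads], global_norm

grads = [np.array([3.0, 4.0])]          # global norm = 5
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
print(norm, clipped[0])                 # direction preserved, norm capped at 1
```

Lowering the learning rate is the other usual lever when the loss blows up.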

MultiSpeaker test.py

Hello,

We have been experimenting with a multi-speaker dataset, and since we started hearing some intelligible words/sentences during training, we tried to synthesize more via test.py.
The problem, as far as we understand, is using the speaker embedding at test time; inference usually breaks in CBHG in the ops file.
Could you please address the general problem of testing with a multi-speaker dataset such as VCTK?
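One common multi-speaker conditioning scheme (a sketch of the general idea, not necessarily how this repo wires it) is to look up a learned speaker embedding and concatenate it to every encoder timestep; at inference the same lookup must happen with an explicitly chosen speaker id:

```python
import numpy as np

n_speakers, embed_dim, T, enc_dim = 4, 8, 10, 16
rng = np.random.default_rng(0)
speaker_table = rng.normal(size=(n_speakers, embed_dim))  # learned in practice
encoder_out = rng.normal(size=(T, enc_dim))

def condition_on_speaker(encoder_out, speaker_id):
    # Broadcast the speaker vector across time and concatenate per step.
    emb = speaker_table[speaker_id]
    tiled = np.tile(emb, (encoder_out.shape[0], 1))
    return np.concatenate([encoder_out, tiled], axis=-1)

cond = condition_on_speaker(encoder_out, speaker_id=2)
print(cond.shape)  # (10, 24)
```

If test.py never supplies a speaker id, the graph built for inference will have a shape mismatch wherever the embedding was concatenated during training, which would explain the CBHG failure.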

General Question

Hi, just asking for educational purposes: do you think it is easier to implement the paper "Tacotron: Towards End-to-End Speech Synthesis" or Tacotron 2 ("Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions")?

help with the output file

Right after inference, one file is generated and saved to ./log/nancy/tacotron/test. As a beginner, I don't understand how to make use of this file or how to get an audio file for some input text. Please help.
Thanks
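If the generated file turns out to be a raw waveform saved as a numpy array (an assumption; if it is a spectrogram, you would first need phase reconstruction, e.g. librosa.griffinlim), a float array in [-1, 1] can be written to a playable WAV with just numpy and the standard library:

```python
import wave

import numpy as np

def write_wav(path, samples, sample_rate=16000):
    # Convert float samples in [-1, 1] to 16-bit PCM and write a mono WAV.
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)           # 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())

# Demo: a 1-second 440 Hz tone stands in for the model's output array,
# which would normally come from np.load() on the generated file.
t = np.linspace(0, 1, 16000, endpoint=False)
write_wav("out.wav", 0.5 * np.sin(2 * np.pi * 440 * t))
```

The sample rate must match whatever the preprocessing used, or the audio will play at the wrong pitch and speed.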

Terrible result at 180k steps with arctic dataset

I trained on the arctic dataset for 180k steps (nearly 70 hours on a GTX 1080), but the output audio is terrible and unlistenable.
Here is my loss graph: [screenshot attached to the issue, 2017-08-25]

P.S.: I did not change anything in the code. Please help me resolve it. Thank you in advance.

UTF-8 characters for non-English words.

I replaced the open() calls with codecs.open(txt_file, 'r', 'utf-8') to support non-English characters.

Everything works fine during training. But if I use non-English characters in prompts.txt with test.py, I get an error:

"xxxx" is not a valid scope name

If I remove the non-English characters, the script runs without errors, but no speech is generated, only noise.

Only tf.errors.OutOfRangeError is raised, which I think is intentional?

c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\framework\op_kernel.cc:1158]
Out of range: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 32, current size 0)
[[Node: batch = QueueDequeueUpToV2[component_types=[DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, batch/n)]]
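The "not a valid scope name" error arises because TensorFlow restricts scope/op names to ASCII characters such as [A-Za-z0-9_.\-/], and the test script apparently derives a scope name from the prompt text. Assuming that is the cause (the exact character set TF enforces varies slightly by version), one workaround is to sanitize the string before it reaches the scope:

```python
import re

def to_valid_scope_name(text, fallback="prompt"):
    # Keep only characters TensorFlow accepts in scope names;
    # non-ASCII letters are replaced (transliterating first would be nicer).
    name = re.sub(r"[^A-Za-z0-9_.\-/]", "_", text)
    name = name.strip("_")
    return name or fallback

print(to_valid_scope_name("héllo wörld"))  # h_llo_w_rld
```

This only fixes the crash; generating actual speech for non-English characters also requires the character vocabulary used at training time to contain those characters.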

Any way to freeze the model?

Hi, I was trying to freeze the model, but since the code uses TensorFlow's queue-based data loader, I cannot easily find the input nodes and feed data through sess.run(output, feed_dict). Has anyone managed to freeze this model? Did you add extra nodes to rebuild the inputs? Thanks.
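A sketch of the usual TF1 freezing recipe: rebuild the model with a feedable placeholder instead of the queue-based loader, restore the checkpoint, then fold variables into constants. The tiny graph and node names below are hypothetical stand-ins for the real model:

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # Stand-in for rebuilding the model with a placeholder input
    # instead of the FIFOQueue-based input pipeline.
    x = tf.placeholder(tf.float32, shape=[None, 4], name="inputs")
    w = tf.Variable(tf.ones([4, 2]), name="w")
    y = tf.identity(tf.matmul(x, w), name="output")

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # In practice, restore the trained checkpoint here with
        # tf.train.Saver().restore(sess, checkpoint_path).
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, graph.as_graph_def(), output_node_names=["output"])

# All variables are now constants; the GraphDef can be serialized and
# later run with feed_dict={"inputs:0": ...}.
print([n.name for n in frozen.node])
```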

Inference error

Hello,

I downloaded the Nancy weights and tried 'python3 test.py < prompts.txt' with one sentence in the prompts.txt file. I get the following error. Is this a TensorFlow version issue? I'm on Ubuntu 16.04 and my TF version is 1.12.0:

Traceback (most recent call last):
File "test.py", line 91, in
test(model, config, prompts)
File "test.py", line 31, in test
model = model(config, batch_inputs, train=False)
File "/home/kbalak18/Tacotron/models/tacotron.py", line 191, in init
self.seq2seq_output, self.output = self.inference(inputs, train)
File "/home/kbalak18/Tacotron/models/tacotron.py", line 137, in inference
(seq2seq_output, _), attention_state, _ = decoder.dynamic_decode(dec, maximum_iterations=config.max_decode_iter)
File "/home/kbalak18/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/decoder.py", line 209, in dynamic_decode
zero_outputs = _create_zero_outputs(decoder.output_size,
File "/home/kbalak18/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/basic_decoder.py", line 101, in output_size
sample_id=self._helper.sample_ids_shape)
File "/home/kbalak18/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/helper.py", line 144, in sample_ids_shape
return self._sample_ids_shape
AttributeError: 'InferenceHelper' object has no attribute '_sample_ids_shape'

Training time

How long did it take you to train this model from scratch on your mac using one GPU?

French Model

I recently trained a French model using a dataset totaling 10 hours (https://datashare.is.ed.ac.uk/handle/10283/2353). The result quality is low compared to English with the "nancy" dataset at the same loss values. Are there limits on the total number of hours of training data, or other constraints on the data? In other words, how can we improve our results?

Attention RNN

I am wondering whether the attention RNN described in the paper is included in this implementation. If so, could someone point out the lines of code where it is used? I ask because the paper seems to keep track of the attention states and use them as input to predict the next timestep's attention. This makes sense, as a similar mechanism appears in the Chorowski et al. paper (and is also used in the Tacotron 2 paper) to keep the attention moving forward. But I could be wrong in my interpretation.
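As a pure-numpy sketch of the recurrent-attention idea in question (content-based scoring only; details like location features and the actual cell type are simplified away), the key point is that the attention RNN consumes the previous context vector, so past alignments influence the next timestep's alignment:

```python
import numpy as np

rng = np.random.default_rng(0)
T, enc_dim, dec_dim = 6, 8, 8
H = rng.normal(size=(T, enc_dim))                        # encoder memory
W = rng.normal(size=(dec_dim + enc_dim, dec_dim)) * 0.1  # toy RNN weights

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

s = np.zeros(dec_dim)        # attention-RNN state
context = np.zeros(enc_dim)  # previous attention context
for step in range(3):
    # The RNN input includes the previous context, so earlier alignments
    # feed into the query that produces the next alignment.
    s = np.tanh(np.concatenate([s, context]) @ W)
    scores = H @ s                          # content-based energies
    alpha = softmax(scores)                 # alignment over encoder steps
    context = alpha @ H                     # new context vector
    print(step, alpha.round(3))
```

If the implementation instead computes attention purely from the decoder state without feeding the context back, it would lack this forward-pushing behavior.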

weights.zip download issue

Executing the following command gives me an unresolved-host error:

./download_weights.sh

--2018-06-09 01:51:24-- https://www.dropbox.com/s/8lq7y9bhglthdjm/tacotron_weights.zip
Resolving www.dropbox.com (www.dropbox.com)... 162.125.66.1, 2620:100:6022:1::a27d:4201
Connecting to www.dropbox.com (www.dropbox.com)|162.125.66.1|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://uce63a8aeb49c0def2795b1fbf40.dl.dropboxusercontent.com/cd/0/get/AIZ13hjTdc5cinjHjaaGngEl_PmJy2e_bgfsmJzaO9yBbKZm0YzGBpahyElIrgVnwgpU4-52DgPgZC6i8kX3vLj2xocAaVijus2AayncSXD2sZW0N4h8a4RxvwbvvBW1P6df0HF_SzSrSqBtl5kptiTQDjWWvFsVmfhczQHDMCN972zP_P_EOaEhbJaaWBGqWVo/file [following]
--2018-06-09 01:51:24-- https://uce63a8aeb49c0def2795b1fbf40.dl.dropboxusercontent.com/cd/0/get/AIZ13hjTdc5cinjHjaaGngEl_PmJy2e_bgfsmJzaO9yBbKZm0YzGBpahyElIrgVnwgpU4-52DgPgZC6i8kX3vLj2xocAaVijus2AayncSXD2sZW0N4h8a4RxvwbvvBW1P6df0HF_SzSrSqBtl5kptiTQDjWWvFsVmfhczQHDMCN972zP_P_EOaEhbJaaWBGqWVo/file
Resolving uce63a8aeb49c0def2795b1fbf40.dl.dropboxusercontent.com (uce63a8aeb49c0def2795b1fbf40.dl.dropboxusercontent.com)... failed: Name or service not known.
wget: unable to resolve host address ‘uce63a8aeb49c0def2795b1fbf40.dl.dropboxusercontent.com’
mv: cannot stat 'tacotron_weights.zip': No such file or directory
unzip: cannot find or open weights/nancy/tacotron_weights.zip, weights/nancy/tacotron_weights.zip.zip or weights/nancy/tacotron_weights.zip.ZIP.
