Comments (2)
I tried to fine-tune the YourTTS model on my own small dataset and hit the same error as in this issue.
My dataset contains 256 audio clips, prepared in LJSpeech format.
Thanks for the solutions suggested above, but since I would rather not change the source code, I looked into the problem a bit.
To get straight to the point: I think the cause of this error is not multi-speaker versus single-speaker data. It occurs when the dataset is relatively small.
I trained from scratch using only the LJSpeech-1.1 dataset and the error did not occur, so the single-speaker format is not the problem.
I then made a subset of LJSpeech with only the first 1024 samples and trained from scratch again. This time the error was reproduced.
From the Python log, we can see the error occurs during the evaluation stage of the training.
By default, the evaluation split proportion is 0.01. In this test, the evaluation set would contain 1024*0.01≈10 samples, which is fewer than the default batch size of 32.
Explicitly setting `eval_split_size=32` solves the problem.
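The arithmetic above can be sketched in a few lines of Python. This is only an illustration: `eval_set_size` and `has_full_eval_batch` are hypothetical helpers, not part of the trainer, and the rule that a value above 1 is an absolute sample count while a value of 1 or below is a fraction is an assumption about how `eval_split_size` is interpreted.

```python
def eval_set_size(num_samples, eval_split_size=0.01, eval_split_max_size=None):
    """Estimate how many samples end up in the evaluation set.

    Assumption: eval_split_size > 1 is an absolute count,
    otherwise it is a fraction of the dataset.
    """
    if eval_split_size > 1:
        return int(eval_split_size)
    size = int(num_samples * eval_split_size)
    if eval_split_max_size is not None:
        size = min(size, eval_split_max_size)
    return size


def has_full_eval_batch(num_samples, batch_size, eval_split_size=0.01,
                        eval_split_max_size=None):
    """Return True if the evaluation set can fill at least one batch."""
    size = eval_set_size(num_samples, eval_split_size, eval_split_max_size)
    return size >= batch_size


print(eval_set_size(1024))                                 # 10
print(has_full_eval_batch(1024, 32))                       # False
print(has_full_eval_batch(1024, 32, eval_split_size=32))   # True
```

With 1024 samples and the default proportion, only 10 samples are available for evaluation, so a batch of 32 cannot be filled. Asking for 32 evaluation samples explicitly removes the shortfall.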
Also be aware that if any samples are discarded because of `MAX_AUDIO_LEN_IN_SECONDS` and the evaluation set ends up smaller than the batch size, the same problem will occur.
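To illustrate the point about discarded samples, here is a small sketch. It assumes the length filter is also applied to evaluation samples. `usable_eval_samples` is a hypothetical helper and the durations are made-up values, not measurements from a real run.

```python
def usable_eval_samples(eval_durations_sec, max_audio_len_sec):
    """Count evaluation clips that survive a maximum-length filter."""
    return sum(1 for d in eval_durations_sec if d <= max_audio_len_sec)


# Made-up example: 32 evaluation clips, two of them longer than the limit.
durations = [5.0] * 30 + [12.5, 14.0]
batch_size = 32
max_audio_len_in_seconds = 10

remaining = usable_eval_samples(durations, max_audio_len_in_seconds)
print(remaining)                # 30
print(remaining >= batch_size)  # False: no full evaluation batch is left
```

Even an evaluation split that nominally matches the batch size can fall short once long clips are dropped, so it may be worth leaving some headroom.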
To conclude, this bug occurs when the actual size of the evaluation set is smaller than one batch. The train/eval split proportion, discarded samples, and inappropriate hyperparameters (such as an inconsistency between `BATCH_SIZE` and `eval_split_max_size`) can all cause the problem.
from trainer.
I was able to fix this by doing the following:
- In the LJSpeech formatter I added `speaker_name = cols[1]` so that the formatter outputs the name of the speaker (stored in the second column of my CSV).
- Brought in 4 other datasets with different speakers.
- Used only 16kHz datasets and changed the sampling rate to 16000.
- Deleted the already generated speaker embedding files.
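
The formatter change in the first bullet might look roughly like the sketch below. It is not the library's formatter: the function name `ljspeech_with_speaker`, the `id|speaker|text` column order, the `wavs/` sub-folder and the returned dictionary keys are all assumptions made for illustration. Only `speaker_name = cols[1]` comes from the description above.

```python
import os


def ljspeech_with_speaker(root_path, meta_file):
    """Parse an LJSpeech-style metadata file that carries a speaker column.

    Assumed line layout (hypothetical): file_id|speaker|text
    """
    items = []
    with open(os.path.join(root_path, meta_file), encoding="utf-8") as f:
        for line in f:
            cols = line.rstrip("\n").split("|")
            if len(cols) < 3:
                continue  # skip blank or malformed lines
            wav_file = os.path.join(root_path, "wavs", cols[0] + ".wav")
            speaker_name = cols[1]  # speaker stored in the second column
            text = cols[2]
            items.append(
                {"text": text, "audio_file": wav_file, "speaker_name": speaker_name}
            )
    return items
```

Each returned item carries its own speaker name instead of one fixed name for the whole dataset, which is what lets several speakers share a single metadata file.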
Now the training is working. So I believe this recipe must be run against a multi-speaker dataset. Single-speaker datasets may not be supported.
Maybe it can work with 22kHz audio, but I did not test it, as I only have 16kHz multi-speaker datasets and a single 22kHz one.