Comments (7)
@AnzCol Hi Aonan, what are your thoughts here?
from uis-rnn.
@AnzCol I think what @hbredin means is: what if we simply define m_t = x_t? Will it still work? Did we run such experiments (my impression is no)?
Personally I don't think it's going to work well.
My understanding (@AnzCol please correct me if I'm wrong) is that the training process pushes the m_t of each speaker to better fit a normal distribution, which is not guaranteed for the distribution of x_t. The power of the GRU here is that it learns from the training dataset to transform the distribution of speaker embeddings into a more clusterable one.
@hbredin Does this explanation make sense to you?
> @AnzCol I think what @hbredin means is: what if we simply define m_t = x_t? Will it still work? Did we run such experiments (my impression is no)?
This is what I meant, indeed.
> My understanding (@AnzCol please correct me if I'm wrong) is that the training process pushes the m_t of each speaker to better fit a normal distribution, which is not guaranteed for the distribution of x_t. The power of the GRU here is that it learns from the training dataset to transform the distribution of speaker embeddings into a more clusterable one.
Except you are still using raw x_t in Equation 11, so the distribution of speaker embeddings is not changed. Or did I miss something?
> @hbredin Does this explanation make sense to you?
Not quite sure -- I think I have to think a bit more about this...
I would really like to see an ablation study with m_t = x_t :-)
@hbredin Not sure whether this example makes sense: consider two clusters whose distributions of x_t largely overlap with each other, but whose distributions of m_t are better separated. Eq. 11 acts as a regularizer: m_t should not deviate too much from x_t.
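To make the regularization argument concrete, here is a minimal sketch. It assumes Eq. 11 is an isotropic Gaussian observation model, x_t ~ N(m_t, sigma^2 I), and it simplifies the mean to a single GRU output m_t. The function name `gaussian_nll` and all numeric values are hypothetical and are not taken from the uis-rnn code. Under that assumption, the negative log-likelihood is a scaled squared distance between x_t and m_t plus a constant, so maximizing the likelihood penalizes an m_t that drifts away from x_t.

```python
import math


def gaussian_nll(x, m, sigma=1.0):
    """Negative log-likelihood of x under N(m, sigma^2 I) (assumed form of Eq. 11)."""
    d = len(x)
    # Squared Euclidean distance between the observation and the mean.
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, m))
    # Distance term plus the normalization constant of the Gaussian.
    return sq_dist / (2 * sigma ** 2) + 0.5 * d * math.log(2 * math.pi * sigma ** 2)


# Hypothetical 3-dimensional values, for illustration only.
x_t = [0.2, -0.1, 0.4]         # observed speaker embedding
m_close = [0.25, -0.05, 0.35]  # GRU output close to x_t
m_far = [2.0, 1.5, -1.0]       # GRU output far from x_t

nll_close = gaussian_nll(x_t, m_close)
nll_far = gaussian_nll(x_t, m_far)
print(nll_close < nll_far)  # the far mean is penalized more heavily
```

The sketch only shows the penalty term. It says nothing about whether m_t ends up better separated than x_t, which is the part that would need the ablation requested above.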
Closing as I got the answers I was looking for :-)
Thanks @AnzCol and @wq2012 !