mycrazycracy / tf-kaldi-speaker
Neural speaker recognition/verification system based on Kaldi and TensorFlow
License: Apache License 2.0
Hi,
Thanks for your great work. Does the newest version support multi-GPU training? I see the multi-GPU setup in the file "run_train_nnet.sh".
Why do you use PLDA for AMSoftmax/ArcSoftmax/ASoftmax? Did you try simple cosine similarity?
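For reference, cosine scoring between two embeddings needs no trained back-end at all; a minimal sketch in plain Python (`cosine_score` is a hypothetical helper, not part of this repo):

```python
import math

def cosine_score(a, b):
    """Cosine similarity between two speaker embeddings (plain lists).
    Hypothetical helper for illustration, not part of this repo."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

A score near 1 means the two embeddings point in the same direction; verification then reduces to comparing the score against a threshold.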
Hi,
Can you please give a brief idea of how to configure angular softmax as the loss function in the script?
I am not able to understand the training steps.
It's strange that in run.sh the following lines are commented out:
# # Make a version with reverberated speech
# rvb_opts=()
# rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/smallroom/rir_list")
# rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/mediumroom/rir_list")
However, --rir-set-parameters is a required argument of steps/data/reverberate_data_dir.py, so commenting out these lines causes an error.
Can I ask why they are commented out, and whether your experiments included the reverberation-augmented training data? I am having trouble reproducing your results, so I want to make sure our training data is the same. Thanks!
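For comparison, the stock Kaldi voxceleb/v2 recipe passes --rir-set-parameters roughly like this (a sketch assuming the RIRS_NOISES corpus has already been downloaded into the recipe directory; exact option values vary by recipe):

```shell
# Sketch of the reverberation step from the standard Kaldi voxceleb/v2
# recipe; paths and option values are assumptions, adjust to your setup.
rvb_opts=()
rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/smallroom/rir_list")
rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/mediumroom/rir_list")

python steps/data/reverberate_data_dir.py \
  "${rvb_opts[@]}" \
  --speech-rvb-probability 1 \
  --pointsource-noise-addition-probability 0 \
  --isotropic-noise-addition-probability 0 \
  --num-replications 1 \
  --source-sampling-rate 16000 \
  data/train data/train_reverb
```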
I didn't find the enrollment and testing process. How can I distinguish between these two parts? I want to separate the enrollment utterances from the testing utterances. What should I do?
The commit history shows that NetVLAD/GhostVLAD pooling was added to the experiments. What are the results of these pooling strategies with a TDNN?
Hi,
thanks for the great work.
When I run the sre/v1 egs, I get errors in stages 2-8 such as:
python: can't open file 'python steps/data/augment_data_dir_new.py': [Errno 2] No such file or directory
python: can't open file 'utils/sample_validset_spk2utt.py': [Errno 2] No such file or directory
nnet/run_train_nnet.sh: line 63: /tf_gpu/bin/activate: No such file or directory
I checked for these files in the original path kaldi/egs/wsj/s5/, but none of them exist there.
Does that mean we need to write the missing files ourselves to get the related functionality?
Thanks
Cheers
Thank you for sharing this great project.
What additional features would need to be added to support text-dependent speaker verification?
Thanks.
When I extracted embeddings (stage=8), I encountered a problem: when the utterance length is larger than the chunk size, extraction stalls. To continue extracting, I have to set a larger chunk size to avoid segmentation. Is this a bug, and how can I deal with it?
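One way to handle utterances longer than the chunk size, instead of raising the chunk size, is to split the utterance into chunks and average the per-chunk embeddings. A sketch of the splitting logic (hypothetical strategy, not necessarily what the repo's extract.py does):

```python
def chunk_ranges(num_frames, chunk_size, min_chunk_size=25):
    """Split num_frames into (start, end) frame ranges of roughly
    chunk_size frames each; a tail shorter than min_chunk_size is merged
    into the previous chunk so no tiny chunk is fed to the network.
    Hypothetical strategy for illustration, not the repo's code."""
    if num_frames <= chunk_size:
        return [(0, num_frames)]
    ranges = []
    start = 0
    while start + chunk_size < num_frames:
        ranges.append((start, start + chunk_size))
        start += chunk_size
    tail = num_frames - start
    if tail >= min_chunk_size:
        ranges.append((start, num_frames))
    else:
        # merge the short tail into the last chunk
        prev_start, _ = ranges.pop()
        ranges.append((prev_start, num_frames))
    return ranges
```

Each range would be fed through the network separately and the resulting embeddings averaged (optionally weighted by chunk length).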
I saw code for the GE2E loss which has been commented out. Did you try GE2E loss in your experiments? If so, how did it perform?
Hi Dr. Liu,
Thank you very much for your sharing. I have seen that your EER result (EER = 0.02) is state of the art, but I have a few questions: (1) I don't see the prediction code; I just want to try inference. (2) How many days did you train on the VoxCeleb dataset?
Looking forward to your reply. Thank you!
How can I request the pretrained SRE models?
I am running run.sh. It worked well up to stage 7, where I got the error below:
File "nnet/lib/train.py", line 8, in
from misc.utils import ValidLoss, load_lr, load_valid_loss, save_codes_and_config, compute_cos_pairwise_eer
ModuleNotFoundError: No module named 'misc'
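A likely cause (an assumption, since the exact layout depends on your checkout) is that the tf-kaldi-speaker repo root, which contains the misc/ package, is not on the Python path when train.py is launched from the egs directory:

```shell
# Hypothetical fix: put the tf-kaldi-speaker checkout on PYTHONPATH so
# "from misc.utils import ..." can be resolved. Adjust the path to your setup.
export TF_KALDI_ROOT=/path/to/tf-kaldi-speaker
export PYTHONPATH=$TF_KALDI_ROOT:$PYTHONPATH
```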
Hello Yi Liu,
Thank you very much for your solution!
I trained on the VoxCeleb 1 & 2 datasets with xvector_nnet_tdnn_amsoftmax_m0.20_linear_bn_1e-2_tdnn4_att. Everything works as expected: training, extracting embeddings, and evaluation all run well.
But when I tried to use your pre-trained models on the same dataset for extracting embeddings (stage=8), I got this error:
ValueError: Cannot feed value of shape (1, 859, 24) for Tensor u'pred_features:0', which has shape '(?, ?, 30)'
Environment:
tensorflow-gpu==1.12
cuda==9.0.0
net = xvector_nnet_tdnn_amsoftmax_m0.20_linear_bn_1e-2_mhe0.01_2
How can I fix this error? Thanks in advance!
Full log:
# nnet/wrap/extract_wrapper.sh --gpuid -1 --env tf_cpu --min-chunk-size 25 --chunk-size 10000 --normalize false --node tdnn6_dense /home/psadmin/projects/voxceleb/exp/xvector_nnet_tdnn_amsoftmax_m0.20_linear_bn_1e-2_mhe0.01_2 "ark:apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 scp:/home/psadmin/projects/voxceleb/data/voxceleb_train/split40/1/feats.scp ark:- | select-voiced-frames ark:- scp,s,cs:/home/psadmin/projects/voxceleb/data/voxceleb_train/split40/1/vad.scp ark:- |" "ark:| copy-vector ark:- ark,scp:/home/psadmin/projects/voxceleb/exp/xvector_nnet_tdnn_amsoftmax_m0.20_linear_bn_1e-2_mhe0.01_2/xvectors_voxceleb_train/xvector.1.ark,/home/psadmin/projects/voxceleb/exp/xvector_nnet_tdnn_amsoftmax_m0.20_linear_bn_1e-2_mhe0.01_2/xvectors_voxceleb_train/xvector.1.scp"
# Started at Tue Aug 4 13:11:43 MSK 2020
#
INFO:tensorflow:Extract embedding from tdnn6_dense
2020-08-04 13:11:46.819647: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-08-04 13:11:48.224681: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-08-04 13:11:48.224811: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] retrieving CUDA diagnostic information for host: softs-server-07
2020-08-04 13:11:48.224871: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:170] hostname: softs-server-07
2020-08-04 13:11:48.225023: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:194] libcuda reported version is: 440.64.0
2020-08-04 13:11:48.225144: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:198] kernel reported version is: 440.64.0
2020-08-04 13:11:48.225172: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:305] kernel version seems to match DSO: 440.64.0
INFO:tensorflow:Extract embedding from node tdnn6_dense
WARNING:tensorflow:From /home/psadmin/projects/kaldi-tf/tf-kaldi-speaker/model/pooling.py:23: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
copy-vector ark:- ark,scp:/home/psadmin/projects/voxceleb/exp/xvector_nnet_tdnn_amsoftmax_m0.20_linear_bn_1e-2_mhe0.01_2/xvectors_voxceleb_train/xvector.1.ark,/home/psadmin/projects/voxceleb/exp/xvector_nnet_tdnn_amsoftmax_m0.20_linear_bn_1e-2_mhe0.01_2/xvectors_voxceleb_train/xvector.1.scp
apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 scp:/home/psadmin/projects/voxceleb/data/voxceleb_train/split40/1/feats.scp ark:-
select-voiced-frames ark:- scp,s,cs:/home/psadmin/projects/voxceleb/data/voxceleb_train/split40/1/vad.scp ark:-
INFO:tensorflow:[INFO] Key id00012-21Uxsk56VDQ-00001 length 859.
INFO:tensorflow:Reading checkpoints...
INFO:tensorflow:Restoring parameters from /home/psadmin/projects/voxceleb/exp/xvector_nnet_tdnn_amsoftmax_m0.20_linear_bn_1e-2_mhe0.01_2/nnet/model-2610000
INFO:tensorflow:Succeed to load checkpoint model-2610000
Traceback (most recent call last):
File "nnet/lib/extract.py", line 90, in <module>
embedding = trainer.predict(feature)
File "/home/psadmin/projects/kaldi-tf/tf-kaldi-speaker/model/trainer.py", line 724, in predict
ERROR (select-voiced-frames[5.5.762~1-0062]:Write():kaldi-matrix.cc:1404) Failed to write matrix to stream
[ Stack-Trace: ]
/home/psadmin/projects/kaldi-tf/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x8b7) [0x7fed417d3d1d]
select-voiced-frames(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x11) [0x40e76d]
/home/psadmin/projects/kaldi-tf/kaldi/src/lib/libkaldi-matrix.so(kaldi::MatrixBase<float>::Write(std::ostream&, bool) const+0x1a7) [0x7fed41a174ad]
select-voiced-frames(kaldi::TableWriterArchiveImpl<kaldi::KaldiObjectHolder<kaldi::MatrixBase<float> > >::Write(std::string const&, kaldi::MatrixBase<float> const&)+0x1d6) [0x40ef40]
select-voiced-frames(main+0x580) [0x40cf50]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fed39fe8555]
select-voiced-frames() [0x40c909]
embeddings = self.sess.run(self.embeddings, feed_dict={self.pred_features: features})
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 929, in run
WARNING (select-voiced-frames[5.5.762~1-0062]:Write():util/kaldi-holder-inl.h:57) Exception caught writing Table object. kaldi::KaldiFatalError
WARNING (select-voiced-frames[5.5.762~1-0062]:Write():util/kaldi-table-inl.h:1057) Write failure to standard output
ERROR (select-voiced-frames[5.5.762~1-0062]:Write():util/kaldi-table-inl.h:1515) Error in TableWriter::Write
[ Stack-Trace: ]
/home/psadmin/projects/kaldi-tf/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x8b7) [0x7fed417d3d1d]
select-voiced-frames(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x11) [0x40e76d]
select-voiced-frames(main+0x5d3) [0x40cfa3]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fed39fe8555]
select-voiced-frames() [0x40c909]
run_metadata_ptr)
WARNING (select-voiced-frames[5.5.762~1-0062]:Close():util/kaldi-table-inl.h:1089) Error closing stream: wspecifier is ark:-
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
ERROR (select-voiced-frames[5.5.762~1-0062]:~TableWriter():util/kaldi-table-inl.h:1539) Error closing TableWriter [in destructor].
[ Stack-Trace: ]
/home/psadmin/projects/kaldi-tf/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x8b7) [0x7fed417d3d1d]
select-voiced-frames(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x11) [0x40e76d]
select-voiced-frames(kaldi::TableWriter<kaldi::KaldiObjectHolder<kaldi::MatrixBase<float> > >::~TableWriter()+0x59) [0x412893]
select-voiced-frames(main+0x82b) [0x40d1fb]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fed39fe8555]
select-voiced-frames() [0x40c909]
terminate called after throwing an instance of 'kaldi::KaldiFatalError'
what(): kaldi::KaldiFatalError
str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 859, 24) for Tensor u'pred_features:0', which has shape '(?, ?, 30)'
ERROR (apply-cmvn-sliding[5.5.762~1-0062]:Write():kaldi-matrix.cc:1404) Failed to write matrix to stream
[ Stack-Trace: ]
/home/psadmin/projects/kaldi-tf/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x8b7) [0x7f618cf15d1d]
apply-cmvn-sliding(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x11) [0x40a4a9]
/home/psadmin/projects/kaldi-tf/kaldi/src/lib/libkaldi-matrix.so(kaldi::MatrixBase<float>::Write(std::ostream&, bool) const+0x1a7) [0x7f618d1594ad]
apply-cmvn-sliding(kaldi::TableWriterArchiveImpl<kaldi::KaldiObjectHolder<kaldi::MatrixBase<float> > >::Write(std::string const&, kaldi::MatrixBase<float> const&)+0x29e) [0x40ad70]
apply-cmvn-sliding(main+0x335) [0x4091c2]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f618572a555]
apply-cmvn-sliding() [0x408dc9]
WARNING (apply-cmvn-sliding[5.5.762~1-0062]:Write():util/kaldi-holder-inl.h:57) Exception caught writing Table object. kaldi::KaldiFatalError
WARNING (apply-cmvn-sliding[5.5.762~1-0062]:Write():util/kaldi-table-inl.h:1057) Write failure to standard output
ERROR (apply-cmvn-sliding[5.5.762~1-0062]:Write():util/kaldi-table-inl.h:1515) Error in TableWriter::Write
[ Stack-Trace: ]
/home/psadmin/projects/kaldi-tf/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x8b7) [0x7f618cf15d1d]
apply-cmvn-sliding(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x11) [0x40a4a9]
apply-cmvn-sliding(main+0x388) [0x409215]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f618572a555]
apply-cmvn-sliding() [0x408dc9]
WARNING (apply-cmvn-sliding[5.5.762~1-0062]:Close():util/kaldi-table-inl.h:1089) Error closing stream: wspecifier is ark:-
ERROR (apply-cmvn-sliding[5.5.762~1-0062]:~TableWriter():util/kaldi-table-inl.h:1539) Error closing TableWriter [in destructor].
[ Stack-Trace: ]
/home/psadmin/projects/kaldi-tf/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x8b7) [0x7f618cf15d1d]
apply-cmvn-sliding(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x11) [0x40a4a9]
apply-cmvn-sliding(kaldi::TableWriter<kaldi::KaldiObjectHolder<kaldi::MatrixBase<float> > >::~TableWriter()+0x59) [0x412971]
apply-cmvn-sliding(main+0x5e0) [0x40946d]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f618572a555]
apply-cmvn-sliding() [0x408dc9]
terminate called after throwing an instance of 'kaldi::KaldiFatalError'
what(): kaldi::KaldiFatalError
/bin/sh: line 1: 6753 Aborted apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 scp:/home/psadmin/projects/voxceleb/data/voxceleb_train/split40/1/feats.scp ark:-
6754 | select-voiced-frames ark:- scp,s,cs:/home/psadmin/projects/voxceleb/data/voxceleb_train/split40/1/vad.scp ark:-
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
self.run()
File "/usr/lib64/python2.7/threading.py", line 765, in run
self.__target(*self.__args, **self.__kwargs)
File "/home/psadmin/projects/kaldi-tf/tf-kaldi-speaker/dataset/kaldi_io.py", line 387, in cleanup
raise SubprocessFailed('cmd %s returned %d !' % (cmd,ret))
SubprocessFailed: cmd apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 scp:/home/psadmin/projects/voxceleb/data/voxceleb_train/split40/1/feats.scp ark:- | select-voiced-frames ark:- scp,s,cs:/home/psadmin/projects/voxceleb/data/voxceleb_train/split40/1/vad.scp ark:- returned 134 !
Exception KeyboardInterrupt in <module 'threading' from '/usr/lib64/python2.7/threading.pyc'> ignored
# Accounting: time=873 threads=1
# Ended (code 1) at Tue Aug 4 13:26:16 MSK 2020, elapsed time 873 seconds
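The ValueError in the log above says the features being fed have 24 dimensions while the pretrained graph expects 30, i.e. the acoustic features were extracted with a different config (e.g. filterbank/MFCC dimension) than the one the pretrained model was trained on. A tiny guard that makes the mismatch explicit before the TensorFlow feed fails (a sketch; `check_feature_dim` is a hypothetical helper):

```python
def check_feature_dim(features_shape, expected_dim):
    """Raise a clear error when the acoustic feature dimension does not
    match the dimension the model graph was built with. Hypothetical
    helper for illustration, not part of this repo."""
    actual = features_shape[-1]
    if actual != expected_dim:
        raise ValueError(
            "feature dim %d != model input dim %d; re-extract features "
            "with the config the pretrained model was trained on"
            % (actual, expected_dim))
    return True
```

The actual fix is on the data side: regenerate the features with the same extraction config (number of mel bins / cepstral coefficients) as the pretrained model.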
Hello,
thanks for the great work, it is really useful!
I have a question about how to set the number of steps per epoch for KaldiDataRandomQueue.
As far as I know, an epoch means training the neural network on all of the training data for one cycle; every example is used at least once. An epoch consists of many steps, and each step processes batch_size examples.
But I don't see how the code for KaldiDataRandomQueue makes sure all training data is used at least once per epoch, so I'm having trouble setting the number of steps.
Can you explain how I can make sure the whole training set is seen, and how to set the number of steps?
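If the queue samples batches at random, one pragmatic convention is to define an "epoch" as the number of steps after which, in expectation, every example has been drawn once. A sketch (an assumption about how to count, not the repo's definition):

```python
import math

def steps_per_epoch(num_train_examples, batch_size):
    """Steps after which, in expectation, each example has been sampled
    once when batches are drawn uniformly at random. Only an expectation:
    random sampling gives no guarantee that every example is seen."""
    return math.ceil(num_train_examples / batch_size)
```

With random sampling there is no hard guarantee of full coverage in any finite number of steps; if exact once-per-epoch coverage matters, a shuffled-list sampler is the usual alternative.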
Thank you in advance.
Thanks for releasing this useful implementation; it greatly helps my work.
However, I notice there are GhostVLAD pooling experiments in the RESULTS.md file, but I did not find the relevant function in pooling.py. I currently need to run some tests on this popular pooling method.
Could you provide the GhostVLAD pooling function? I truly appreciate your help.
The jumpahead function of the random module was removed in Python 3. I removed it from the code and found that the EER degraded. Is this function actually useful in model training? I think os.urandom already gives us good randomness.
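As a Python 3 replacement for random.jumpahead(), one common pattern is to give each data-loading worker its own random.Random instance seeded from os.urandom mixed with the worker id, so workers stay decorrelated even after forking. A sketch (`worker_rng` is a hypothetical name):

```python
import os
import random

def worker_rng(worker_id):
    """Independent RNG per data-loading worker: seed from os.urandom so
    forked workers do not share generator state, and mix in worker_id so
    the seeds differ even if the entropy bytes happened to coincide."""
    seed = int.from_bytes(os.urandom(8), "little") ^ worker_id
    return random.Random(seed)
```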