covarep / covarep
A Cooperative Voice Analysis Repository for Speech Technologies
Home Page: http://covarep.github.io/covarep
License: Other
Covarep
A Cooperative Voice Analysis Repository for Speech Technologies
Version (after 1.4.1)
http://covarep.github.io/covarep

Covarep is an open-source repository of advanced speech processing algorithms. It is stored as a GitHub project (https://github.com/covarep/covarep) where researchers in speech processing can store original implementations of published algorithms.

Over the past few decades a vast array of advanced speech processing algorithms has been developed, often offering significant improvements over the existing state of the art. Such algorithms can have a reasonably high degree of complexity and, hence, can be difficult to re-implement accurately based on article descriptions. Another issue is the so-called 'bug magnet effect', with re-implementations frequently having significant differences from the original ones. The consequence of all this has been that many promising developments have been under-exploited or discarded, with researchers tending to stick to conventional analysis methods.

By developing Covarep we are hoping to address this by encouraging authors to include original implementations of their algorithms, thus resulting in a single de facto version for the speech community to refer to.

We envisage a range of benefits to the repository:
1) Reproducible research: Covarep will allow fairer comparison of algorithms in published articles.
2) Encouraged usage: the free availability of these algorithms will encourage researchers from a wide range of speech-related disciplines to exploit them for their own applications.
3) Feedback: as a GitHub project, users will be able to offer comments on algorithms, report bugs, suggest improvements, etc.

Scope
We welcome contributions from a wide range of speech processing areas, including (but not limited to): speech analysis, synthesis, conversion, transformation, enhancement, glottal source/voice quality analysis, etc.

Contribute!
We believe that the Covarep repository has great potential benefit to the speech research community and we hope that you will consider contributing your published algorithms to it. If you have any questions, comments, issues, etc. regarding Covarep, please contact us at one of the email addresses below. Please forward this email to others who may be interested. Please also have a look at the website http://covarep.github.io/covarep and the Covarep.pdf document in the documentation directory for more information.

Octave
Most of the functions in Covarep are Octave compatible. However, it is necessary to install the following packages: tsa, optimization, signal.

How to cite
If you publish experimental results obtained by using Covarep, please cite the repository using the following publication:
G. Degottex, J. Kane, T. Drugman, T. Raitio and S. Scherer, "COVAREP - A collaborative voice analysis repository for speech technologies", In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy 2014.
Also, within the text of your paper, please mention the version used. E.g. "... we compared with methods X, Y, Z available in [Covarep](v1.0.1)."

Maintainers
Gilles Degottex <[email protected]> University of Crete, Heraklion, Greece
John Kane <[email protected]> Trinity College Dublin, Dublin, Ireland
Thomas Drugman <[email protected]> University of Mons, Mons, Belgium
Tuomo Raitio <[email protected]> Aalto University, Espoo, Finland
Stefan Scherer <[email protected]> University of Southern California, Los Angeles, USA
Currently the sample-rate check only tests whether the sample rate is above 16 kHz and, if so, down-samples the audio to 16 kHz. I suggest changing the test to catch any sample rate that is "not equal" to 16000 Hz and resampling in that case.
Line 77: if fs~=16000
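The difference between the two tests can be seen on a few example sample rates. This is only a sketch: the names target_fs, rates, needs_resample_old and needs_resample_new are made up for illustration and do not come from the COVAREP code.

```matlab
% Compare the current check (fs > 16000) with the proposed one (fs ~= 16000).
target_fs = 16000;
rates = [8000 16000 44100];               % example input sample rates (Hz)

needs_resample_old = rates > target_fs;   % current behaviour: 8 kHz input is left untouched
needs_resample_new = rates ~= target_fs;  % proposed behaviour: 8 kHz input is also resampled

disp([rates; needs_resample_old; needs_resample_new]);

% With the proposed test, the resampling step itself could then be written as
%   x = resample(x, target_fs, fs);
% which needs the Signal Processing Toolbox (MATLAB) or the signal package (Octave).
```

With the proposed test, audio recorded below 16 kHz is up-sampled rather than passed through at its original rate.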
Add CPP and its time/quefrency smoothing variants to the glottalsource code
I recently found an issue with formant_CGDZP, which breaks at:
120 formantPeaks=[formantPeaks ; peakIndex];
and
170 while(possibleValues(1)==0) %discard zero entries
At line 120 the crash is caused by a size mismatch when concatenating.
At line 170 the crash is caused by an attempt to access an empty array.
I made a fix at work and will commit it soon, once I have double-checked that it works.
Cheers
To reproduce: add zeros (about 500 may be sufficient) at the beginning of the input file and it will break.
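The two failure modes can be guarded against along the following lines. This is a sketch, not the fix that was committed: it assumes each row of formantPeaks is meant to hold a fixed number of peak slots (numSlots = 5 is an assumption), and the example values are invented.

```matlab
% Guard 1: pad or truncate each peak list to a fixed width before stacking rows.
numSlots = 5;                                % assumed number of peak slots per frame
formantPeaks = zeros(0, numSlots);
candidates = {[3 7 12], [2 5 9 14 20 27 31]}; % peak lists of differing lengths
for k = 1:numel(candidates)
    peakIndex = candidates{k};
    row = zeros(1, numSlots);
    n = min(numel(peakIndex), numSlots);
    row(1:n) = peakIndex(1:n);               % unused slots stay at zero
    formantPeaks = [formantPeaks ; row];     % widths now always match
end

% Guard 2: check for emptiness before reading the first element.
possibleValues = [0 0 0];                    % e.g. a frame taken from leading silence
while ~isempty(possibleValues) && possibleValues(1) == 0
    possibleValues(1) = [];                  % discard zero entries
end
```

The short-circuit && ensures possibleValues(1) is never evaluated once the array is empty.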
From Chin Chii Yeh: "I think that line 99 and 100 should be placed above line 93, so that the size of win and ccp and cc will be the same when min(order, dftlen/2-1)=dftlen/2-1. If not there might be error in line 128."
In particular in the glottalsource files, a lot of the code could be made more efficient by making it more vectorised.
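As a generic illustration of the kind of change meant here (not taken from the glottalsource files), a per-frame energy loop can be replaced by a reshape followed by a column-wise sum. The signal and frame length are arbitrary example values.

```matlab
% Frame energy computed with a loop and with a vectorised equivalent.
x = sin(2*pi*120*(0:1599)'/16000);   % 100 ms test tone at 16 kHz
frame_len = 160;                     % 10 ms non-overlapping frames
n_frames = floor(numel(x)/frame_len);

% Loop version
e_loop = zeros(n_frames, 1);
for k = 1:n_frames
    seg = x((k-1)*frame_len+1 : k*frame_len);
    e_loop(k) = sum(seg.^2);
end

% Vectorised version: one frame per column, then sum down each column
frames = reshape(x(1:n_frames*frame_len), frame_len, n_frames);
e_vec = sum(frames.^2, 1)';
```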
@ThomasDrugman, can we clip the values directly inside the pitch_srh.m code?
Or was this behavior intended?
See discussion at fb3a151.
When loading system_net_creak in Octave, the following errors pop up:
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnweight\dotprod.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnweight\dotprod.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnweight\dotprod.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnnetinput\netsum.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnnetinput\netsum.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nntransfer\tansig.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nntransfer\logsig.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nntransfer\tansig.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nntransfer\logsig.m
warning: skipping over ''
warning: no constructor for class network
warning: load: element has been converted to a structure
Since system_net_creak isn't successfully loaded, creaky_voice_do_detection fails at line 87
> creak_pp=sim(net,X);
error: Structure doesn't seem to be a neural network
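The warnings above name dotprod, netsum, tansig and logsig, which suggests a plain feed-forward network. One possible workaround, sketched here under the assumption that the layer weights and biases can be exported from MATLAB as ordinary matrices, is to compute the forward pass by hand instead of calling sim. The weights below are invented, and any input or output normalisation stored in the network object is ignored.

```matlab
% Manual forward pass of a two-layer feed-forward network (hypothetical weights).
tansig_f = @(n) 2 ./ (1 + exp(-2*n)) - 1;   % same definition as the toolbox tansig
logsig_f = @(n) 1 ./ (1 + exp(-n));         % same definition as the toolbox logsig

W1 = [0.5 -0.2; 0.1 0.3; -0.4 0.8];  b1 = [0; 0.1; -0.1];  % hidden layer, 3 units
W2 = [0.7 -0.5 0.2];                 b2 = 0.05;            % output layer, 1 unit

X = [0 1; 0 -1];                            % 2 features x 2 frames

H = tansig_f(bsxfun(@plus, W1*X, b1));      % dotprod + netsum + tansig
Y = logsig_f(bsxfun(@plus, W2*H, b2));      % dotprod + netsum + logsig
```

Y then plays the role of creak_pp, one posterior value per frame.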
I'm trying to extract the COVAREP features from a database sampled at 8 kHz. The files were all collected in exactly the same way; however, I'm facing some problems running the pitch_srh and gci_sedreams methods on a few of them (less than 5%, I would say).
In pitch_srh I get the following error:
Attempted to access Spec(4005); index out of bounds because numel(Spec)=4000.
Error in pitch_srh>SRH_EstimatePitch (line 154)
SRHs(freq)=(Spec(freq)+Spec(2*freq)+Spec(3*freq)+Spec(4*freq)+Spec(5*freq))-(Spec(round(1.5*freq))+Spec(round(2.5*freq))+Spec(round(3.5*freq))+Spec(round(4.5*freq)));
Error in pitch_srh (line 92)
[F0s,SRHVal,time] = SRH_EstimatePitch(res',fs,f0min,f0max,hopsize);
Error in COVAREP_feature_extraction (line 107)
[srh_f0,srh_vuv,~,srh_time] = pitch_srh(x,fs,F0min,F0max, ...
In gci_sedreams I get the following error:
Improper assignment with rectangular empty matrix.
Error in gci_sedreams (line 179)
gci(Ind)=start+posi-1;
Error in COVAREP_feature_extraction (line 117)
GCI = gci_sedreams(x,fs,F0med,1); % SEDREAMS GCI detection
Do you think this could have anything to do with the 8 kHz sampling frequency?
Thanks!
José
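The first error is consistent with the harmonic indices running past the end of the spectrum: at fs = 8000 the reported Spec has 4000 bins, so 5*freq exceeds that for any freq above 800. A sketch of one way to avoid this is to drop the terms that fall beyond the last bin. The synthetic spectrum, the 1 Hz bin spacing and the search range are assumptions for illustration, not values from pitch_srh.

```matlab
% Summation of residual harmonics with out-of-range terms dropped.
fs = 8000;
Spec = zeros(1, fs/2);            % assumed layout: 1 Hz bins up to Nyquist
f0 = 200;
Spec(f0*(1:5)) = 1;               % five harmonics of a 200 Hz voice

f0min = 50;  f0max = 1000;        % hypothetical search range (Hz)
harm  = 1:5;                      % harmonic multiples
inter = 1.5:1:4.5;                % inter-harmonic multiples

SRHs = zeros(1, f0max);
for freq = f0min:f0max
    h = harm*freq;          hIdx = h(h <= numel(Spec));   % keep in-range harmonics
    q = round(inter*freq);  qIdx = q(q <= numel(Spec));   % keep in-range inter-harmonics
    SRHs(freq) = sum(Spec(hIdx)) - sum(Spec(qIdx));
end
[~, f0_est] = max(SRHs);
```

An alternative with the same effect on the crash would be to cap the upper search limit at floor(numel(Spec)/5).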
Memory usage goes out of reasonable bounds.
I will fix it with a temporary workaround:
% Compute the fs-point magnitude spectrum one frame (column) at a time
specMat = zeros(fs, size(frameMatWinMean,2));
for i = 1:size(frameMatWinMean,2)
specMat(:,i) = abs( fft(frameMatWinMean(:,i),fs) );
end
In addition, I will add clear commands to delete unnecessary matrices whenever possible.
There is a built-in findpeaks function in MATLAB and also one in VOICEBOX. Both functions output the peak amplitudes and index locations, but in a different order. I propose renaming the VOICEBOX one to findpeaks_VB.m to clear this up.
Function COVAREP_feature_extraction.m, line 160:
MCEP(m,:) = spec2mfcc(hspec2spec(Ete), fs, MCEP_ord)';
The function 'spec2mfcc' is not defined anymore. I think I solved the problem using:
MCEP(m,:) = hspec2fwcep(Ete, fs, MCEP_ord)';
To test that the methods' results do not vary from one version to the next.
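Such a check could compare the current output of each method against a stored reference, within a tolerance. In this sketch the method is a made-up stand-in, and the reference is computed on the spot where a real test would load it from a file saved by an earlier version.

```matlab
% Regression check: current output versus stored reference.
method = @(x) cumsum(x.^2);              % hypothetical stand-in for a COVAREP method
x = sin(2*pi*100*(0:159)/16000);         % deterministic test signal

reference = method(x);                   % in practice: loaded from a saved reference file
current   = method(x);                   % output of the version under test

tol = 1e-9;                              % assumed acceptable numerical drift
max_err = max(abs(current(:) - reference(:)));
passed = max_err <= tol;
fprintf('max abs difference: %g (passed = %d)\n', max_err, passed);
```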
I've been feeding get_kd_creak_features() individual audio files of vowels, segmented out of a larger recording. I've found that with some audio files with duration less than 100 ms, pitch_srh() returns an index error. I've put up a sample audio file that has this problem: http://jofrhwld.github.io/assets/test_11000.wav
> [x,fs] = wavread("test_11000.wav");
> [H2H1,res_p,ZCR,F0,F0mean,enerN,pow_std,creakF0] = get_kd_creak_features(x,fs);
error: SRH_EstimatePitch: A(I): index out of bounds; value 1 out of bound 0
error: called from:
error: covarep/glottalsource/pitch_srh.m at line 171, column 10
error: covarep/glottalsource/pitch_srh.m at line 87, column 24
error: covarep/glottalsource/creaky_voice_detection/private/get_kd_creak_features.m at line 73, column 11
I wasn't sure if this was an issue with duration or samples, and I got the same error on the same bit of audio sampled at 16000.
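If the failure comes from the signal being shorter than one analysis window, a guard before framing would avoid the empty-array access. The sketch below assumes a 100 ms window and a 10 ms hop, which are guesses based on the durations reported above rather than values read from pitch_srh.m, and it zero-pads short inputs as one possible policy.

```matlab
% Ensure the input holds at least one full analysis frame before framing.
fs = 11000;
frame_dur = 0.1;                              % assumed analysis window (s)
hop_dur   = 0.01;                             % assumed hop size (s)
x = sin(2*pi*150*(0:round(0.08*fs)-1)'/fs);   % 80 ms vowel-like segment

frame_len = round(frame_dur*fs);
if numel(x) < frame_len
    % Too short for a single frame: pad with zeros up to one frame
    x = [x; zeros(frame_len - numel(x), 1)];
end

hop = round(hop_dur*fs);
n_frames = 1 + floor((numel(x) - frame_len)/hop);
```

Returning empty outputs with a warning would be an equally reasonable choice for inputs this short.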
Add a small set of "annoying" recordings and run all methods on them systematically to check the methods are robust enough (do not crash easily).
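A minimal harness for this could loop over the difficult inputs and record which methods throw. The recordings here are synthetic and the two function handles are invented stand-ins for COVAREP methods; the second one deliberately fails on very short input.

```matlab
% Run every method on every difficult recording and record crashes.
fs = 16000;
recordings = { zeros(fs,1), ...                    % pure silence
               zeros(10,1), ...                    % far too short
               [zeros(500,1); ones(fs,1)], ...     % leading zeros
               ones(fs,1) };                       % constant DC offset
funcs = { @(x,fs) sqrt(mean(x.^2)), ...            % hypothetical stand-in methods
          @(x,fs) x(round(0.1*fs)) };              % needs at least 100 ms of audio

crashed = false(numel(funcs), numel(recordings));
for m = 1:numel(funcs)
    for r = 1:numel(recordings)
        try
            funcs{m}(recordings{r}, fs);
        catch err
            crashed(m,r) = true;
            fprintf('method %d failed on recording %d: %s\n', m, r, err.message);
        end
    end
end
```

The crashed matrix then gives a per-method, per-recording summary that can be checked after every change.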
I suggest adding one to the website in order to advertise the project.
A text version of this list could also be kept in the doc directory.