
covarep's Introduction

                                Covarep
        A Cooperative Voice Analysis Repository for Speech Technologies
                            Version (after 1.4.1)
                    http://covarep.github.io/covarep



Covarep is an open-source repository of advanced speech processing algorithms
and is stored as a GitHub project (https://github.com/covarep/covarep) where
researchers in speech processing can store original implementations of published
algorithms.

Over the past few decades a vast array of advanced speech processing algorithms
have been developed, often offering significant improvements over the existing
state-of-the-art. Such algorithms can have a reasonably high degree of
complexity and, hence, can be difficult to accurately re-implement based on
article descriptions. Another issue is the so-called 'bug magnet effect' with
re-implementations frequently having significant differences from the original
ones. The consequence of all this has been that many promising developments
have been under-exploited or discarded, with researchers tending to stick to
conventional analysis methods.

By developing Covarep we are hoping to address this by encouraging authors to
include original implementations of their algorithms, thus resulting in a
single de facto version for the speech community to refer to.

We envisage a range of benefits to the repository:

1) Reproducible research: Covarep will allow fairer comparison of algorithms
in published articles.

2) Encouraged usage: the free availability of these algorithms will encourage
researchers from a wide range of speech-related disciplines to exploit them
for their own applications.

3) Feedback: as a GitHub project, users will be able to offer comments on
algorithms, report bugs, suggest improvements, etc.

Scope
    We welcome contributions from a wide range of speech processing areas,
    including (but not limited to): Speech analysis, synthesis, conversion,
    transformation, enhancement, glottal source/voice quality analysis, etc.

Contribute!
    We believe that the Covarep repository has a great potential benefit to the
    speech research community and we hope that you will consider contributing
    your published algorithms to it. If you have any questions, comments, issues,
    etc. regarding Covarep, please contact us at one of the email addresses below.
    Please forward this email to others who may be interested.

Please also have a look at the website http://covarep.github.io/covarep and the
Covarep.pdf document in the documentation directory for more information.

Octave
    Most of the functions in Covarep are Octave compatible.
    However, it is necessary to install the following packages:
        tsa, optimization, signal.
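
    A minimal sketch of loading these packages at the start of an Octave
    session. It assumes the packages are already installed and that the
    optimization package carries its Octave Forge name, optim (to my knowledge
    that is how it is published). The guard keeps the snippet harmless under
    MATLAB.

```matlab
% Octave only: load the packages listed above before calling Covarep.
% Assumption: they were installed beforehand, for example with
%   pkg install -forge tsa optim signal
% (the installer may ask for further dependencies).
if exist('OCTAVE_VERSION', 'builtin') ~= 0
    pkg load tsa
    pkg load optim
    pkg load signal
end
```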

How to cite
    If you publish experiment results obtained by using Covarep, please cite
    the repository using the following publication:
      G. Degottex, J. Kane, T. Drugman, T. Raitio and S. Scherer, "COVAREP - A
      collaborative voice analysis repository for speech technologies", In
      Proc. IEEE International Conference on Acoustics, Speech and Signal
      Processing (ICASSP), Florence, Italy 2014.

    Also, within the text of your paper, please mention the version used.
    E.g. "... we compared with methods X, Y, Z available in [Covarep](v1.0.1)."


Maintainers
    Gilles Degottex <[email protected]>
        University of Crete, Heraklion, Greece

    John Kane <[email protected]>
        Trinity College Dublin, Dublin, Ireland
        
    Thomas Drugman <[email protected]>
        University of Mons, Mons, Belgium
        
    Tuomo Raitio <[email protected]>
        Aalto University, Espoo, Finland

    Stefan Scherer <[email protected]>
        University of Southern California, Los Angeles, USA
        

covarep's People

Contributors

alexis-michaud, codacola, gillesdegottex, goiosunsw, jckane, mwlodarczak, schererstefan, thomasdrugman, wangy319


covarep's Issues

formant_CGDZP breaks if fed with constant input at beginning of wav-file

I recently found an issue with formant_CGDZP, which breaks at:

120 formantPeaks=[formantPeaks ; peakIndex];

and

170 while(possibleValues(1)==0) %discard zero entries

At line 120 the crash is caused by a dimension mismatch when concatenating.
At line 170 the crash is caused by an attempt to index into an empty array.

I made a fix at work and will commit it soon, once I have double-checked that it works.
Cheers

To reproduce: add zeros (about 500 are probably sufficient) at the beginning of the file and it will break.
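
Both failure modes can be guarded against in a generic way. The sketch below is not the formant_CGDZP code or the fix mentioned above. It reuses the variable names from the two quoted lines, and the fixed row width nPeaks and the example values are assumptions made for illustration. It also assumes that the while loop at line 170 strips leading zeros one element at a time.

```matlab
% Sketch only: not the actual formant_CGDZP code or the committed fix.
nPeaks = 5;                        % hypothetical fixed row width
formantPeaks = zeros(0, nPeaks);   % empty, but with the right column count

peakIndex = [12 40];               % a frame that yielded fewer peaks than nPeaks
row = zeros(1, nPeaks);
nKeep = min(numel(peakIndex), nPeaks);
row(1:nKeep) = peakIndex(1:nKeep);     % pad (or truncate) to the fixed width
formantPeaks = [formantPeaks; row];    % widths now always agree

% Discard leading zero entries without indexing into an empty array.
possibleValues = [0 0 0];          % e.g. what a silent, all-zero frame gives
firstNonZero = find(possibleValues ~= 0, 1);
if isempty(firstNonZero)
    possibleValues = [];           % nothing usable in this frame
else
    possibleValues = possibleValues(firstNonZero:end);
end
```

Padding every row to a fixed width is only one option. Skipping frames that yield no usable peaks would also avoid the mismatch.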

env_te.m dftlen, order and winlen can be inconsistent

From Chin Chii Yeh: "I think that line 99 and 100 should be placed above line 93, so that the size of win and ccp and cc will be the same when min(order, dftlen/2-1)=dftlen/2-1. If not there might be error in line 128."

Make code more `vectorised'

In particular in the glottalsource files, a lot of the code could be made more efficient by making it more vectorised.
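
As an illustration of the kind of change meant here, the sketch below computes a per-frame energy first with a loop and then in vectorised form. It is not taken from the glottalsource files. The test signal, the frame length and the variable names are all made up for the example.

```matlab
% Illustrative only: not taken from the glottalsource code.
x = sin(2*pi*100*(0:799)'/8000);   % 100 ms test tone at 8 kHz
frameLen = 80;                     % hypothetical frame length in samples
nFrames = floor(numel(x) / frameLen);

% Loop version: one frame at a time.
energyLoop = zeros(1, nFrames);
for k = 1:nFrames
    seg = x((k-1)*frameLen + (1:frameLen));
    energyLoop(k) = sum(seg.^2);
end

% Vectorised version: reshape into a frameLen-by-nFrames matrix,
% then sum each column in a single call.
frames = reshape(x(1:nFrames*frameLen), frameLen, nFrames);
energyVec = sum(frames.^2, 1);
```

Both versions give the same values. The vectorised one trades the loop for a temporary matrix, so on very long files memory use should be checked as well.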

Octave: Error in loading system_net_creak

When loading system_net_creak in Octave, the following errors pop up:

warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnweight\dotprod.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnweight\dotprod.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnweight\dotprod.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnnetinput\netsum.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nnnetinput\netsum.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nntransfer\tansig.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nntransfer\logsig.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nntransfer\tansig.m
warning: skipping over ''
warning: load: can't find the file C:\Program Files\MATLAB\R2011a\toolbox\nnet\nnet\nntransfer\logsig.m
warning: skipping over ''
warning: no constructor for class network
warning: load: element has been converted to a structure

Since system_net_creak isn't successfully loaded, creaky_voice_do_detection fails at line 87

> creak_pp=sim(net,X);
error: Structure doesn't seem to be a neural network
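
One possible workaround, sketched here under assumptions, is to store the network weights as plain matrices and evaluate the layers by hand, so that no network object is needed. The tansig and logsig formulas are the standard ones. The two-layer layout, the weight names (W1, b1, W2, b2) and all numeric values are hypothetical. Any input or output normalisation stored in the original network object is not reproduced.

```matlab
% Sketch only: the weights are made-up placeholders, not system_net_creak.
tansig_fn = @(n) 2 ./ (1 + exp(-2*n)) - 1;   % same curve as tanh(n)
logsig_fn = @(n) 1 ./ (1 + exp(-n));

W1 = [0.5 -0.2; 0.1 0.3; -0.4 0.8];  % hidden layer: 3 units, 2 inputs
b1 = [0.1; -0.1; 0];
W2 = [0.7 -0.5 0.2];                 % output layer: 1 unit
b2 = 0.05;

X = [0.2 0.9; -0.3 0.4];             % 2 features x 2 frames (columns = frames)

% bsxfun adds the bias column to every frame and also works in
% older Octave and MATLAB releases.
hidden = tansig_fn(bsxfun(@plus, W1 * X, b1));
creak_pp = logsig_fn(bsxfun(@plus, W2 * hidden, b2));
```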

Problems running COVAREP_feature_extraction

I'm trying to extract the COVAREP features from a database sampled at 8 kHz. The files were all collected in exactly the same way, but pitch_srh and gci_sedreams fail on a few of them (less than 5%, I would say).

In pitch_srh I get the following error:
Attempted to access Spec(4005); index out of bounds because numel(Spec)=4000.

Error in pitch_srh>SRH_EstimatePitch (line 154)
SRHs(freq)=(Spec(freq)+Spec(2*freq)+Spec(3*freq)+Spec(4*freq)+Spec(5*freq))-(Spec(round(1.5*freq))+Spec(round(2.5*freq))+Spec(round(3.5*freq))+Spec(round(4.5*freq)));

Error in pitch_srh (line 92)
[F0s,SRHVal,time] = SRH_EstimatePitch(res',fs,f0min,f0max,hopsize);

Error in COVAREP_feature_extraction (line 107)
[srh_f0,srh_vuv,~,srh_time] = pitch_srh(x,fs,F0min,F0max, ...

In gci_sedreams I get the following error:
Improper assignment with rectangular empty matrix.

Error in gci_sedreams (line 179)
gci(Ind)=start+posi-1;

Error in COVAREP_feature_extraction (line 117)
GCI = gci_sedreams(x,fs,F0med,1); % SEDREAMS GCI detection

Do you think this could have anything to do with the 8K sampling frequency?

Thanks!
José
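
The pitch_srh error is consistent with the 5*freq term running past the end of the spectrum: 4005 = 5 x 801, and Spec has only 4000 elements. The sketch below shows one way to cap the search range. It mirrors the expression in the trace but is not the shipped code. The stand-in spectrum and the f0min and f0max values are assumptions chosen to reproduce the situation.

```matlab
% Sketch only: mirrors the expression in the error trace, not a shipped fix.
Spec = rand(1, 4000);        % stand-in spectrum; numel(Spec) = 4000 as reported
f0min = 50;                  % hypothetical search range in Hz
f0max = 900;                 % deliberately too high for this spectrum length

% The highest index the expression touches is 5*freq, so cap the range.
f0maxSafe = min(f0max, floor(numel(Spec) / 5));

SRHs = zeros(1, f0maxSafe);
for freq = f0min:f0maxSafe
    SRHs(freq) = (Spec(freq) + Spec(2*freq) + Spec(3*freq) ...
                  + Spec(4*freq) + Spec(5*freq)) ...
               - (Spec(round(1.5*freq)) + Spec(round(2.5*freq)) ...
                  + Spec(round(3.5*freq)) + Spec(round(4.5*freq)));
end
```

With a 4000-bin spectrum this limits the search to 800 Hz. Whether that limit suits a given recording is a separate question.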

SRH_pitch.m has serious issues with long files

Memory usage goes out of reasonable bounds on long files.
I will fix it with a temporary workaround:

% Fill the magnitude spectrum one frame (column) at a time.
specMat = zeros(fs, size(frameMatWinMean,2));
for i = 1:size(frameMatWinMean,2)
    specMat(:,i) = abs( fft(frameMatWinMean(:,i),fs) );
end

In addition, I will add clear commands to delete matrices that are no longer needed wherever possible.

findpeaks.m function overloaded by VOICEBOX

MATLAB has a built-in findpeaks function and VOICEBOX ships one as well. Both return the peak amplitudes and index locations, but in a different order. I propose renaming the VOICEBOX one to findpeaks_VB.m to clear this up.

COVAREP_feature_extraction.m - spec2mfcc not defined

Function COVAREP_feature_extraction.m, line 160:
MCEP(m,:) = spec2mfcc(hspec2spec(Ete), fs, MCEP_ord)';
The function 'spec2mfcc' is not defined anymore. I think I solved the problem using:
MCEP(m,:) = hspec2fwcep(Ete, fs, MCEP_ord)';

pitch_srh breaks on short audio

I've been feeding get_kd_creak_features() individual audio files of vowels, segmented out of a larger recording. I've found that with some audio files with duration less than 100 ms, pitch_srh() returns an index error. I've put up a sample audio file that has this problem: http://jofrhwld.github.io/assets/test_11000.wav

> [x,fs] = wavread("test_11000.wav");
> [H2H1,res_p,ZCR,F0,F0mean,enerN,pow_std,creakF0] = get_kd_creak_features(x,fs);
error: SRH_EstimatePitch: A(I): index out of bounds; value 1 out of bound 0
error: called from:
error:   covarep/glottalsource/pitch_srh.m at line 171, column 10
error:   covarep/glottalsource/pitch_srh.m at line 87, column 24
error:   covarep/glottalsource/creaky_voice_detection/private/get_kd_creak_features.m at line 73, column 11

I wasn't sure if this was an issue with duration or samples, and I got the same error on the same bit of audio sampled at 16000.
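
Until pitch_srh handles this case itself, a caller-side guard is one option. The sketch below assumes a minimum analysis length of 100 ms, a figure inferred only from the durations reported above and not from the pitch_srh source. The stand-in signal and the variable names are made up.

```matlab
% Sketch only: the 100 ms minimum is an assumption inferred from the report.
fs = 16000;
x = randn(round(0.06 * fs), 1);      % stand-in for a 60 ms vowel segment

minLen = round(0.100 * fs);          % assumed minimum analysis length
if numel(x) < minLen
    % Zero-pad so that at least one full frame exists ...
    xPadded = [x; zeros(minLen - numel(x), 1)];
else
    xPadded = x;
end
% ... then analyse xPadded instead of x, or skip such files entirely.
```

Zero-padding changes the signal statistics, so skipping very short segments may be the safer choice for feature extraction.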

Add tests for quality assurance

Add a small set of "annoying" recordings and run all methods on them systematically to check the methods are robust enough (do not crash easily).
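
A minimal sketch of what such a check could look like. The two "methods" are stand-in function handles so that the snippet runs on its own. In practice they would be handles to Covarep functions. The test signals are synthetic examples of the problem cases reported in other issues (silence, very short input, leading zeros).

```matlab
% Sketch of a crash-robustness check; the methods below are stand-ins.
fs = 16000;
signals = {zeros(fs, 1), randn(round(0.05*fs), 1), [zeros(500, 1); randn(fs, 1)]};
names = {'silence', 'short', 'leading zeros'};

rmsLevel = @(x, fs) sqrt(mean(x.^2));   % stand-in that never fails
pickSample = @(x, fs) x(fs);            % stand-in that fails on short input
methodList = {rmsLevel, pickSample};

failures = {};
for m = 1:numel(methodList)
    for s = 1:numel(signals)
        try
            methodList{m}(signals{s}, fs);
        catch err
            % Record which method failed on which signal, and why.
            failures{end+1} = sprintf('%s on "%s": %s', ...
                func2str(methodList{m}), names{s}, err.message);
        end
    end
end
fprintf('%d failure(s)\n', numel(failures));
```

Collecting failures instead of stopping at the first error gives one report per run, which is easier to compare across commits.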

Add a list of available methods

I suggest adding one to the website in order to advertise the project.
A plain-text duplicate of this list could also be kept in the doc directory.
