Biosignal Challenge 2020 (CVUT)

Link to the event description:

http://sami.fel.cvut.cz/biosignalchallenge2020/

About

This small repo is designed to solve the Biosignal Challenge described at the link above. In short, the challenge is to count the number of syllables and measure the length of speech in recordings of children of various ages. These two values are then used to calculate the articulation rate.

Articulation rate (quoted from the challenge webpage)

Articulation rate (AR) is a prosodic feature that indicates the number of spoken speech units per time. It is typically measured during connected speech where all types of pauses including silence, respiration and hesitations (such as /ah/, /um/, etc.) are excluded. Therefore, the articulation rate is mainly viewed as a representation of speech motor control since the linguistic effects are reduced.

Most researchers agree that AR can be affected by certain variables, which include the length of utterance, locus of the word or phrase in the sentence and speaking context as well as speaking task such as reading, picture description, spontaneous speech, etc.

Different studies use different metrics to quantify AR. The most commonly used metrics are words per minute (WPM), syllables per second (SPS) and phonemes per second (PPS). The most suitable metric for AR estimation is SPS since syllables can be detected more easily than words or phonemes.

The goal of the Biosignal Challenge 2020 is to use the computing environment MATLAB to develop an algorithm for AR estimation in human speech signals by detecting the number of syllables (NOS) and measuring the duration of fluent speech (DFS), which requires excluding all types of pauses in each utterance.
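
As a rough illustration of how the two quantities combine, the sketch below computes AR in syllables per second as NOS divided by DFS. The interval values and the syllable count are made up for the example and do not come from the challenge data; the variable names are hypothetical as well.

```matlab
% Hypothetical fluent-speech intervals [start end] in seconds,
% i.e. what remains after all pauses have been excluded.
speechIntervals = [0.50 2.00;
                   2.75 5.25;
                   6.00 8.00];

% Duration of fluent speech (DFS): sum of the interval lengths.
dfs = sum(speechIntervals(:, 2) - speechIntervals(:, 1));   % 1.5 + 2.5 + 2.0 = 6.0 s

% Hypothetical number of syllables (NOS) detected in those intervals.
nos = 24;

% Articulation rate in syllables per second (SPS).
ar = nos / dfs;
fprintf('DFS = %.2f s, NOS = %d, AR = %.2f syllables/s\n', dfs, nos, ar);
```

With these example numbers the script reports an AR of 4.00 syllables/s.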

How to run

Network training

Go into the detect_speech or detect_syls directory and run setup.m in that folder to configure the MATLAB path.

Detection on Biosignal Challenge 2020 contest files

Go into the runFolder directory, put all .wav files there, and run the test.p file.

Params

Check the configuration file loadSpeechParams or loadSylsParams and all the paths set in it. The following parameters should be set according to your labeled data and sound files:

  • params.wavsFolder
  • params.GTPath
  • params.tgFolder
  • params.tgTierName
  • params.tgIntervalName
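
A configuration for these fields might look like the fragment below. Only the field names and the data/wavs location come from this README; every other value is a hypothetical placeholder, and the comments reflect a guess at what each field means (GT as ground truth, tg as TextGrid). Check loadSpeechParams or loadSylsParams for the actual expected formats.

```matlab
% All values are placeholders; adapt them to your own data layout.
params.wavsFolder     = fullfile('data', 'wavs');       % folder with the .wav recordings
params.GTPath         = fullfile('data', 'gt.csv');     % hypothetical ground-truth file
params.tgFolder       = fullfile('data', 'textgrids');  % hypothetical folder with TextGrid labels
params.tgTierName     = 'speech';                       % hypothetical tier name inside each TextGrid
params.tgIntervalName = 'speech';                       % hypothetical label of the intervals of interest
```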

The params files also contain all configuration parameters, including options for the feature extractor and the layer configuration. After each training run they are saved next to the checkpoints, so every configuration is stored together with the results it produced.

The split of the labeled dataset is defined by params.trnProportion together with params.trnNames and params.valNames. params.trnProportion splits the labeled set into training and validation portions; any files listed explicitly in params.trnNames or params.valNames are added to the corresponding set. To skip proportional splitting and rely only on the listed filenames, set params.trnProportion to zero.
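
The splitting rule can be sketched roughly as follows. This is not the repository's implementation: the file names are invented, and the assumption that the proportion applies only to files not named explicitly (and that unnamed files stay unused when the proportion is zero) is one reading of the description above.

```matlab
% Hypothetical file list and settings; only the params field names come from the README.
allNames = {'a.wav', 'b.wav', 'c.wav', 'd.wav', 'e.wav', 'f.wav'};
params.trnNames      = {'a.wav'};   % always goes to the training set
params.valNames      = {'b.wav'};   % always goes to the validation set
params.trnProportion = 0.75;        % share of the remaining files used for training

% Explicitly named files are assigned first.
isTrn  = ismember(allNames, params.trnNames);
isVal  = ismember(allNames, params.valNames);
trnSet = allNames(isTrn);
valSet = allNames(isVal);

% Remaining files are shuffled and split proportionally (assumed behaviour).
% With trnProportion == 0 this step is skipped and only the named files are used.
rest = allNames(~isTrn & ~isVal);
if params.trnProportion > 0
    rest   = rest(randperm(numel(rest)));
    nTrn   = round(params.trnProportion * numel(rest));
    trnSet = [trnSet, rest(1:nTrn)];
    valSet = [valSet, rest(nTrn+1:end)];
end

fprintf('%d training files, %d validation files\n', numel(trnSet), numel(valSet));
```

Here four files remain after the named ones are assigned, so three of them join the training set and one joins the validation set.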

Validate by running test.p

To validate the result, pick the best network from the checkpoint folder. Put this checkpoint (it is the network itself) into the net folder together with the trainOpt file. The testing script uses this network and its options to evaluate all files. Also put all .wav files into runFolder; this is required for running test.p and was designed this way by the challenge organizers. Finally, run test.p.

Data

Put all .wav files into the data/wavs folder in the project root. They are git-ignored by default, so this step must be done manually. To run test.p, the files must also be in runFolder.

Toolboxes used

  • audio_system_toolbox
  • neural_network_toolbox
  • signal_toolbox
  • statistics_toolbox

Authors

Yeva Prysiazhniuk & Ilia Shipachev

Biblio links

  • Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-yiin Chang, Tara Sainath, "Deep Learning for Audio Signal Processing," IEEE Journal of Selected Topics in Signal Processing, pp. 206–219, 2019.
  • Huy Phan, Philipp Koch, Fabrice Katzberg, Marco Maass, Radoslaw Mazur, Alfred Mertins, "Audio Scene Classification with Deep Recurrent Neural Networks," INTERSPEECH, 2017.
  • Federico Colangelo, Federica Battisti, Alessandro Neri, Marco Carli, "Convolutional Recurrent Neural Network for Audio Events Classification," Detection and Classification of Acoustic Scenes and Events, 2018.
  • T. Kisler, U. D. Reichel, F. Schiel, "Multilingual processing of speech via web services," Computer Speech & Language, vol. 45, pp. 326–347, 2017.
  • K. Klessa, M. Karpiński, A. Wagner, "Annotation Pro – a new software tool for annotation of linguistic and paralinguistic features," Proceedings of the Tools and Resources for the Analysis of Speech Prosody (TRASP) Workshop, pp. 51–54, 2013.
  • T. Bořil, R. Skarnitzl, "Tools rPraat and mPraat," Text, Speech, and Dialogue, pp. 367–374, 2016.
  • The MathWorks, Inc., MATLAB and Audio Toolbox Release 2020a, Natick, Massachusetts, United States, 2020.
  • Sepp Hochreiter, Jürgen Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
  • Alec Wright, Eero-Pekka Damskägg, Vesa Välimäki, "Real-Time Black-Box Modelling with Recurrent Neural Networks," in Proceedings of the 22nd International Conference on Digital Audio Effects, Birmingham, UK, 2019.
  • The MathWorks, Inc., "Classify Sound Using Deep Learning - MATLAB & Simulink," 2020. [Online]. Available: https://www.mathworks.com/help/audio/gs/classify-sound-using-deep-learning.html.

biosignal_challenge_2020's Issues

Extract function causes problems

Hi,
I've been running into problems when testing your algorithm on some of our data.
It seems you are using the function extract from some toolbox, but my desktop is unaware of it.
In particular, running articulation_rate displays the message

Undefined function 'extract' for input arguments of type 'double'.

My desktop is picking extract function from here:

which extract

C:\Program Files\MATLAB\R2019_a\toolbox\slcoverage\@cvdata\extract.m % cvdata method

and I think that's not the one you are using.
Can you please specify the origin of extract?

Thanks
Vojta
