127-Hours-Malay-Conversational-Speech-Data-by-Mobile-Phone

Description

The 127 Hours - Malay Conversational Speech Data by Mobile Phone dataset contains conversational speech from 142 native Malay speakers, with a balanced gender ratio. To keep the dialogues fluent and natural, speakers chose a few familiar topics from a given list and held conversations on them. The recordings were made on various mobile phones in quiet indoor environments. The audio format is 16 kHz, 16-bit, uncompressed WAV. All audio was manually transcribed, with the text content, the start and end time of each effective sentence, and speaker identification.

For more details, please refer to the link: https://www.nexdata.ai/datasets/1280?source=Github

Specifications

Format

16 kHz, 16-bit, uncompressed WAV, mono channel;
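
As a quick sanity check on delivered files, the stated format can be verified with Python's standard `wave` module. This is a minimal sketch, not part of the dataset tooling: the function names `check_wav_format` and `pcm_bytes` are hypothetical, and the file path passed in is whatever path the audio has on your machine.

```python
import wave

# Expected values taken from the Format line above.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit
EXPECTED_CHANNELS = 1  # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the stated format."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        if wav.getcomptype() != "NONE":
            problems.append(f"compression type is {wav.getcomptype()}")
    return problems


def pcm_bytes(hours):
    """Raw PCM payload size in bytes for a duration at the stated format."""
    seconds = int(hours * 3600)
    return seconds * EXPECTED_RATE_HZ * EXPECTED_SAMPLE_WIDTH_BYTES * EXPECTED_CHANNELS
```

An empty list from `check_wav_format` means the file matches. If the 127 hours counts total mono audio (an assumption; the source does not say how duration is measured), `pcm_bytes(127)` gives about 14.6 GB of raw samples, excluding WAV headers.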

Environment

quiet indoor environment, without echo;

Recording content

dozens of topics are specified, and the speakers hold recorded conversations on those topics;

Demographics

142 speakers in total, 46% male and 54% female;

Annotation

transcription text, speaker identification, gender, and noise symbols;

Device

Android mobile phone, iPhone;

Language

Malay;

Application scenarios

speech recognition; voiceprint recognition;

Accuracy rate

the word accuracy rate is at least 98%
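
The source does not define how the word accuracy rate is measured. One common reading, assumed here, is 1 minus the word error rate of a transcript against a verified reference, where errors are word-level substitutions, insertions, and deletions. The sketch below illustrates that reading only; the function names and the sample sentences are made up for the example and are not taken from the dataset.

```python
def word_edit_distance(reference, hypothesis):
    """Return (word-level Levenshtein distance, number of reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # previous[j] holds the distance between the ref prefix so far and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion of a reference word
                current[j - 1] + 1,      # insertion of an extra word
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Word accuracy as 1 - (edit distance / reference length)."""
    errors, ref_len = word_edit_distance(reference, hypothesis)
    if ref_len == 0:
        raise ValueError("reference transcript is empty")
    return 1.0 - errors / ref_len


# Illustrative sentences only: one substitution in four words.
print(word_accuracy("saya suka makan nasi", "saya suka minum nasi"))  # 0.75
```

Under this definition, a 98% threshold means at most 2 word errors per 100 reference words.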

Licensing Information

Commercial License
