
Voice2Txt

By using audio CAPTCHAs on different websites, we want to build a large database in which a text word (the one typed by the CAPTCHA user) is linked to the fingerprint of an audio file (the one played in the CAPTCHA). Why? We want to create subtitles for media by matching the fingerprints of spoken words in a video against the fingerprints in our database, so that subtitles can be displayed in real time or with minimal latency. Because humans contribute the transcriptions, the database factors in the accents and tones with which words are spoken.

With this, we aim to fully automate the current subtitling workflow, improve the accuracy of subtitles, and open the door to translating a subtitle (once it is complete) into other languages. It is a win-win model: people contribute to the subtitling, and we make the subtitles possible. We would also open-source the database to create an ecosystem in which other applications can be built on it.

Components

  1. MobilePrototype - A demo that mimics the idea as an Android application
  2. WebMockUp - A static UI prototype of the web interface
  3. NodeJSApplication - A work-in-progress web application built using NodeJS

How it works

  • Sample generation through Audio Captcha

    1. Short audio phrases with varying accents and tones are played to users
    2. The text each user types for a phrase is stored against its audio source
    3. The fpcalc command-line tool is used to generate a fingerprint for the audio clip
    4. The fingerprint & text are saved in the database
  • Subtitle Generation for an uploaded video sample

    1. The audio is stripped from the uploaded video sample
    2. A fingerprint for the audio sample is generated
    3. The sample database is searched for a match to this fingerprint
    4. If a match is found, the text is returned and a subtitle is constructed
    5. The subtitle, which is plain text, is translated into other languages through machine translation
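
The fingerprinting and lookup steps above can be sketched in NodeJS as follows. This is a minimal illustration, not the project's implementation: it assumes fpcalc is on the PATH and is run with its `-raw` option, which prints the fingerprint as comma-separated integers. The function names, the in-memory `samples` array, the bit-similarity rule, and the 0.9 threshold are all hypothetical; the source does not say how matches are scored or stored.

```javascript
// Sketch only: names, threshold, and the matching rule are illustrative assumptions.
const { execFileSync } = require("node:child_process");

// Parse the KEY=VALUE lines printed by `fpcalc -raw` (DURATION=..., FINGERPRINT=...).
function parseFpcalcOutput(text) {
  const fields = {};
  for (const line of text.split(/\r?\n/)) {
    const eq = line.indexOf("=");
    if (eq > 0) fields[line.slice(0, eq)] = line.slice(eq + 1);
  }
  if (!fields.FINGERPRINT) {
    throw new Error("no FINGERPRINT line in fpcalc output");
  }
  return {
    duration: Number(fields.DURATION),
    fingerprint: fields.FINGERPRINT.split(",").map(Number),
  };
}

// Run fpcalc on an audio file and return its parsed output.
function fingerprintFile(audioPath) {
  const out = execFileSync("fpcalc", ["-raw", audioPath], { encoding: "utf8" });
  return parseFpcalcOutput(out);
}

// Count the set bits in a 32-bit integer.
function popcount32(n) {
  let count = 0;
  while (n !== 0) {
    n &= n - 1; // clears the lowest set bit
    count++;
  }
  return count;
}

// Fraction of identical bits over the overlapping part of two fingerprints (0..1).
function similarity(a, b) {
  const len = Math.min(a.length, b.length);
  if (len === 0) return 0;
  let differing = 0;
  for (let i = 0; i < len; i++) differing += popcount32(a[i] ^ b[i]);
  return 1 - differing / (32 * len);
}

// Return the best-scoring stored sample at or above the threshold, or null.
function findBestMatch(samples, fingerprint, threshold = 0.9) {
  let best = null;
  for (const sample of samples) {
    const score = similarity(sample.fingerprint, fingerprint);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { text: sample.text, score };
    }
  }
  return best;
}
```

With a stored sample such as `{ text: "hello", fingerprint: [...] }`, calling `findBestMatch(samples, fingerprintFile("clip.wav").fingerprint)` would return the matching text and its score, or `null` when nothing clears the threshold. A linear scan like this is only practical for small databases; a real deployment would likely need an index.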

Dependencies

  1. NodeJS
  2. MeteorJS
  3. fpcalc command-line tool

Contributors

jugautam, parallelthought, pinakinanda
