speech-recognition's Introduction

Automatic Speech Recognition (ASR)

Project Overview

We will build a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline. The completed pipeline accepts raw audio as input and returns a predicted transcription of the spoken language. The full pipeline is summarized in the figure below.

  • STEP 1 is a pre-processing step that converts raw audio to one of two feature representations that are commonly used for ASR.
  • STEP 2 is an acoustic model that accepts audio features as input and returns a probability distribution over all potential transcriptions. After learning about the basic types of neural networks that are often used for acoustic modeling, we will run our own investigations to design our own acoustic model.
  • STEP 3 in the pipeline takes the output from the acoustic model and returns a predicted transcription.
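
The source does not say how STEP 3 turns the acoustic model's output into text. As a minimal sketch, assuming the model is trained with CTC (the `model_1/ctc` node name in the error trace further down suggests this, but it is not confirmed), a greedy decoder picks the most likely symbol per frame, collapses repeats, and drops the blank symbol. The function name `greedy_ctc_decode` and the three-symbol alphabet are hypothetical and used only for illustration.

```python
from itertools import groupby


def greedy_ctc_decode(frame_probs, alphabet, blank=0):
    """Greedy CTC decoding of per-frame probability rows (hypothetical helper)."""
    # 1. Pick the index of the most likely symbol in every frame.
    best = [max(range(len(row)), key=row.__getitem__) for row in frame_probs]
    # 2. Collapse runs of the same index into a single index.
    collapsed = [index for index, _ in groupby(best)]
    # 3. Drop the blank symbol and map the remaining indices to characters.
    return "".join(alphabet[index] for index in collapsed if index != blank)


# Assumed toy alphabet: index 0 is the CTC blank.
alphabet = ["-", "a", "b"]
frame_probs = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.1, 0.8],    # b
    [0.2, 0.1, 0.7],    # b (repeat, collapsed)
]
print(greedy_ctc_decode(frame_probs, alphabet))  # aab
```

Greedy decoding is the simplest option; beam search, optionally with a language model, is a common alternative when accuracy matters more than speed.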

Dataset

We begin by investigating the LibriSpeech dataset, which will be used to train and evaluate our models. The pipeline first converts raw audio to feature representations that are commonly used for ASR. We then move on to building neural networks that map these audio features to transcribed text.
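
To make the feature-extraction step concrete, here is a minimal sketch of one commonly used representation, a log power spectrogram, written with NumPy. The source does not name the two representations or give their parameters, so the function name `log_spectrogram`, the 20 ms window, the 10 ms step, and the Hann window are all assumptions. The 16 kHz sample rate matches LibriSpeech audio.

```python
import numpy as np


def log_spectrogram(samples, sample_rate=16000, window_ms=20, step_ms=10, eps=1e-10):
    """Return a (num_frames, num_bins) log power spectrogram (assumed settings)."""
    samples = np.asarray(samples, dtype=float)
    window = int(sample_rate * window_ms / 1000)  # samples per analysis window
    step = int(sample_rate * step_ms / 1000)      # hop between window starts
    if len(samples) < window:
        raise ValueError("audio is shorter than one analysis window")

    # Slice the signal into overlapping frames of `window` samples each.
    num_frames = 1 + (len(samples) - window) // step
    frames = np.stack([samples[i * step:i * step + window] for i in range(num_frames)])

    # Taper each frame, take the real FFT, and convert power to a log scale.
    frames = frames * np.hanning(window)
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude ** 2 + eps)


# One second of synthetic noise stands in for a real recording.
one_second = np.random.default_rng(0).standard_normal(16000)
print(log_spectrogram(one_second).shape)  # (99, 161)
```

With these assumed settings, each frame covers 320 samples, so the real FFT yields 320 / 2 + 1 = 161 values per frame.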

speech-recognition's People

Contributors

imgbotapp, soheil-mp

speech-recognition's Issues

Got tensor shape error

Hi, I just ran an experiment using my own dataset, but I got this error:

InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument:  Specified a list with shape [?,13] from a tensor with shape [20,161]
	 [[node model_1/bidirectional/forward_bidir_rnn/TensorArrayUnstack/TensorListFromTensor (defined at <ipython-input-20-d1f3145faba3>:88) ]]
	 [[model_1/ctc/Cast_3/_180]]
  (1) Invalid argument:  Specified a list with shape [?,13] from a tensor with shape [20,161]
	 [[node model_1/bidirectional/forward_bidir_rnn/TensorArrayUnstack/TensorListFromTensor (defined at <ipython-input-20-d1f3145faba3>:88) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_4456]

This might be caused by the shape of my data, but I am confused by this error. Could you tell me what I should do? Thank you very much!
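
One possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the recurrent layer was built for 13 features per frame, while each batch of 20 examples carries 161 features per frame. If so, the model's input dimension and the feature extraction settings disagree. The sketch below checks this before training starts; `check_feature_dim` and the example shapes are hypothetical, not part of the repository.

```python
def check_feature_dim(batch_shape, model_input_dim):
    """Verify that a (batch, time, features) shape matches the model input size."""
    feature_dim = batch_shape[-1]
    if feature_dim != model_input_dim:
        raise ValueError(
            f"model expects {model_input_dim} features per frame, "
            f"but the batch provides {feature_dim}"
        )
    return feature_dim


# Hypothetical batch: 20 examples, 584 frames, 161 features per frame.
try:
    check_feature_dim((20, 584, 161), model_input_dim=13)
except ValueError as error:
    print(error)  # model expects 13 features per frame, but the batch provides 161
```

If the check fails, either rebuild the model with the feature size the data actually has, or switch the feature extraction to the representation the model was built for.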

Got a NaN result at training_loss and Validation_Loss

Hi there! I have solved my first problem, but now I have another one. When I trained the model on my data, both loss and val_loss were NaN. How can I fix this? Is there a problem with my data? Thank you very much.
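
The issue does not include enough detail to pin down the cause. As a hedged starting point, assuming the model is trained with a CTC loss, two data problems are worth ruling out: non-finite values in the features, and transcripts with more symbols than the audio has frames (CTC cannot align such a pair, so the loss becomes infinite). The helper `find_bad_examples` and the toy data are hypothetical.

```python
import math


def find_bad_examples(examples):
    """Return (index, reason) pairs for examples likely to break CTC training.

    Each example is a (features, label) pair, where features is a list of
    frames (lists of floats) and label is a list of integer symbol ids.
    """
    problems = []
    for index, (features, label) in enumerate(examples):
        if len(features) == 0:
            problems.append((index, "no frames"))
        elif any(not math.isfinite(value) for frame in features for value in frame):
            problems.append((index, "non-finite feature value"))
        elif len(label) > len(features):
            problems.append((index, "label longer than input"))
    return problems


examples = [
    ([[0.1, 0.2], [0.3, 0.4]], [1]),   # fine
    ([[float("nan"), 0.0]], [1]),      # NaN in the features
    ([[0.1, 0.2]], [1, 2, 3]),         # three symbols but only one frame
]
print(find_bad_examples(examples))
# [(1, 'non-finite feature value'), (2, 'label longer than input')]
```

If the acoustic model shortens the time axis (for example with strided convolutions), compare the label length against the shortened length instead. When the data passes these checks, a lower learning rate or gradient clipping are the usual next things to try.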

Need to train it on the custom data

I am currently working on a project that uses custom data. I am unable to process my data with the code from the repo, and I get this error: train_corpus.json not found
