End-to-End Attention-Based Large Vocabulary Speech Recognition

License: MIT License

Attention-based Speech Recognizer

The reference implementation for the paper

End-to-End Attention-based Large Vocabulary Speech Recognition.
Dzmitry Bahdanau, Jan Chorowski, Dmitriy Serdyuk, Philemon Brakel, Yoshua Bengio.

(arXiv draft, submitted to ICASSP 2016).

How to use

Install all the dependencies (see the list below).
Set your environment variables by running source env.sh.

Then proceed to exp/wsj for instructions on how to replicate our results on the Wall Street Journal (WSJ) dataset (available from the Linguistic Data Consortium as LDC93S6B and LDC94S13B).

Dependencies

Python packages: pykwalify, toposort, pyyaml, numpy, pandas, pyfst
kaldi
kaldi-python

If you already have the dataset in HDF5 format, the models can be trained without Kaldi.
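
Before moving on to exp/wsj, it can help to confirm that the Python packages listed above are importable. The following is a minimal sketch rather than part of the repository: the helper name find_missing is hypothetical, and the mapping from package names to module names (in particular yaml for pyyaml and fst for pyfst) is an assumption that should be checked against your installation.

```python
import importlib

# Assumed mapping from the package names listed above to importable module
# names. "yaml" (for pyyaml) and "fst" (for pyfst) are assumptions.
REQUIRED_MODULES = {
    "pykwalify": "pykwalify",
    "toposort": "toposort",
    "pyyaml": "yaml",
    "numpy": "numpy",
    "pandas": "pandas",
    "pyfst": "fst",
}


def find_missing(modules):
    """Return a sorted list of package names whose module fails to import."""
    missing = []
    for package, module_name in modules.items():
        try:
            importlib.import_module(module_name)
        except ImportError:
            missing.append(package)
    return sorted(missing)


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing Python packages: " + ", ".join(missing))
    else:
        print("All listed Python packages are importable.")
```

Kaldi and kaldi-python are not covered by this check; as noted above, they are only needed when the dataset is not already available in HDF5 format.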

Subtrees

The repository contains custom modified versions of Theano, Blocks, Fuel, picklable-itertools, and Blocks-extras as subtrees (see the git subtree documentation for more information about subtrees). To ensure that these specific versions are used, we recommend uninstalling any regular installations of these packages, in addition to sourcing env.sh.
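
One way to check that the bundled versions are the ones actually picked up is to look at where each module is loaded from. The sketch below is illustrative only: the helper names (is_inside, module_origin, check_subtrees) are hypothetical, and the import names used in the usage note (theano, blocks, fuel, picklable_itertools, blocks_extras) are assumptions about how the bundled packages are imported.

```python
import importlib
import os


def is_inside(path, root):
    """Return True if path is root itself or lies somewhere below it."""
    path = os.path.realpath(path)
    root = os.path.realpath(root)
    # os.path.join(root, "") appends a trailing separator so that
    # "/repo-old/x" is not mistaken for a child of "/repo".
    return path == root or path.startswith(os.path.join(root, ""))


def module_origin(module_name):
    """Import a module and return the file it was loaded from, or None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__file__", None)


def check_subtrees(module_names, repo_root):
    """Map each module name to True (loaded from inside repo_root),
    False (loaded from elsewhere) or None (not importable)."""
    result = {}
    for name in module_names:
        origin = module_origin(name)
        result[name] = None if origin is None else is_inside(origin, repo_root)
    return result
```

After sourcing env.sh, calling check_subtrees(["theano", "blocks", "fuel", "picklable_itertools", "blocks_extras"], repo_root) with repo_root set to your checkout of this repository should report True for every entry. A False value suggests that a regular installation is shadowing the bundled version.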

License

MIT
