Speech commands classification with recurrent neural networks
The file Speech_commands_classification_with_recurrent_neural_networks.pdf
is the report: it describes the architectures used and presents the results and comparisons.
Dataset: Speech Commands Dataset
https://www.kaggle.com/c/tensorflow-speech-recognition-challenge/data
- test and compare different network architectures (at least one of them should be a Long Short-Term Memory (LSTM) network)
- investigate the influence of parameter changes on the results
- present a confusion matrix (with appropriate discussion)
- in case of accuracy or efficiency problems, a subset of classes can be selected and tested (e.g. only “yes” and “no” commands)
- pay special attention to the “silence” and “unknown” classes and test different approaches to them (e.g. a separate network for their recognition)
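
The confusion-matrix and “silence”/“unknown” points above can be illustrated with a minimal sketch. It is not taken from the report. It assumes the Kaggle challenge's labelling, where ten target commands are kept, the _background_noise_ folder provides silence, and every other word counts as unknown. The helper names (to_class, confusion_matrix, per_class_recall) and the toy labels are hypothetical.

```python
# Ten target commands of the Kaggle challenge; all other words count as "unknown".
TARGET_COMMANDS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]
CLASSES = TARGET_COMMANDS + ["silence", "unknown"]


def to_class(folder_name):
    """Map a dataset folder name to one of the 12 evaluation classes (hypothetical helper)."""
    if folder_name in TARGET_COMMANDS:
        return folder_name
    if folder_name == "_background_noise_":
        return "silence"
    return "unknown"


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Return matrix[i][j] = number of samples of true class i predicted as class j."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def per_class_recall(matrix, classes=CLASSES):
    """Recall per class: diagonal entry divided by the row sum (empty rows are skipped)."""
    recall = {}
    for i, name in enumerate(classes):
        total = sum(matrix[i])
        if total:
            recall[name] = matrix[i][i] / total
    return recall


# Toy data for illustration only; these are not results from the report.
true_folders = ["yes", "no", "bed", "_background_noise_", "yes", "cat"]
predicted = ["yes", "yes", "unknown", "silence", "yes", "no"]

y_true = [to_class(folder) for folder in true_folders]
cm = confusion_matrix(y_true, predicted)
recall = per_class_recall(cm)

for name in sorted(recall):
    print(f"{name}: recall {recall[name]:.2f}")
```

Reading the matrix row by row shows which classes are confused with each other. Low recall in the “unknown” row, for example, would suggest trying a separate detector for that class, as proposed in the list above.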
Useful resources:
https://www.kaggle.com/davids1992/speech-representation-and-data-exploration
https://www.coursera.org/lecture/nlp-sequence-models/recurrent-neural-network-model-ftkzt