Recurrent neural network-based speech recognition using MATLAB
Praveen Edward James,
Mun Hou Kit,
Chockalingam Aravind Vaithilingam and
Alan Tan Wee Chiat
International Journal of Intelligent Enterprise, 2020, vol. 7, issue 1/2/3, 56-66
Abstract:
The purpose of this paper is to design an efficient recurrent neural network (RNN)-based speech recognition system in software using long short-term memory (LSTM). The design process involves speech acquisition, pre-processing, feature extraction, training and pattern recognition for a spoken-sentence recognition system built on an LSTM-RNN. The network comprises five layers: an input layer, a fully connected layer, a hidden LSTM layer, a softmax layer and a sequential output layer. A vocabulary of 80 words, which constitutes 20 sentences, is used. The depth of the hidden layer is set to 20, 42 and 60, and the accuracy of each configuration is determined. The results reveal that a maximum accuracy of 89% is achieved when the depth of the hidden layer is 42. Since the depth of the hidden layer is fixed for a given task, further performance gains can be sought by increasing the number of hidden layers.
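As a rough illustration of the architecture described above, the following is a minimal MATLAB sketch (not the authors' code) of a five-layer LSTM-RNN built with the Deep Learning Toolbox. The feature dimension and training options are illustrative assumptions; the 42 hidden units and 20 sentence classes follow the configuration reported in the abstract.

% Minimal sketch of the five-layer LSTM-RNN described in the abstract.
% numFeatures is an assumption (e.g. MFCC coefficients per frame);
% the hidden-layer depth of 42 is the best-performing value reported.
numFeatures    = 13;   % assumed feature-vector size per frame
numHiddenUnits = 42;   % hidden-layer depth (paper also tests 20 and 60)
numClasses     = 20;   % 20 spoken sentences

layers = [
    sequenceInputLayer(numFeatures)                  % input layer
    lstmLayer(numHiddenUnits, 'OutputMode', 'last')  % hidden LSTM layer
    fullyConnectedLayer(numClasses)                  % fully connected layer
    softmaxLayer                                     % softmax layer
    classificationLayer];                            % sequential (classification) output

options = trainingOptions('adam', 'MaxEpochs', 100, 'Verbose', false);
% net = trainNetwork(XTrain, YTrain, layers, options);  % XTrain: cell array of feature sequences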
Keywords: speech recognition; feature extraction; pre-processing; recurrent neural network; RNN; long short-term memory; LSTM; hidden layer; MATLAB.
Date: 2020
Downloads: http://www.inderscience.com/link.php?id=104645 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijient:v:7:y:2020:i:1/2/3:p:56-66