Speech recognition is an emerging research area focused on human-computer interaction (HCI) and expert systems. Speech signals are often difficult to process because audio signals are non-stationary. This paper presents a speaker-independent speech recognition system, tested on isolated words from three oriental languages: Urdu, Persian, and Pashto. The proposed approach combines the discrete wavelet transform (DWT) with a feed-forward artificial neural network (FFANN): the DWT is used for feature extraction and the FFANN for classification. Isolated word recognition is accomplished by capturing speech signals, creating a code bank of speech samples, and then applying pre-processing techniques. To classify a wave sample, a four-layer FFANN model trained with resilient back-propagation (Rprop) is used. The proposed system yields high accuracy for two and five classes. With a db-8 level-5 DWT filter, accuracy rates of 98.40%, 95.73%, and 95.20% are achieved with 10, 15, and 20 classes, respectively. The Haar level-5 DWT filter yields accuracy rates of 97.20%, 94.40%, and 91% for 10, 15, and 20 classes, respectively. The proposed system also outperforms a baseline method against which it is compared. It can be used as a communication interface to computing and mobile devices in low-literacy regions.
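The feature-extraction step can be illustrated with a minimal sketch of a level-5 Haar DWT. The abstract does not say which features are derived from the wavelet coefficients, so the sub-band energy vector below is an assumption made for illustration, and the function names `haar_step`, `haar_dwt`, and `subband_energies` are hypothetical rather than taken from the paper.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        # Pad odd-length input by repeating the last sample (an assumption).
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_dwt(signal, levels=5):
    """Multi-level decomposition, returned as [cA_n, cD_n, ..., cD_1]."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

def subband_energies(signal, levels=5):
    """Assumed feature vector: energy of each sub-band (levels + 1 values)."""
    return [sum(c * c for c in band) for band in haar_dwt(signal, levels)]

# Example: a short synthetic frame stands in for a recorded word.
frame = [math.sin(0.3 * n) for n in range(64)]
features = subband_energies(frame, levels=5)
```

A vector like `features` would then be passed to the classifier. A db-8 decomposition follows the same pattern with longer filters and is usually taken from a wavelet library rather than written by hand.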
|Number of pages||21|
|Journal||Malaysian Journal of Computer Science|
|Publication status||Published - 1 Sept 2015|