LPC AND ITS DERIVATIVES FOR STUTTERED SPEECH RECOGNITION
Keywords: Stuttering, linear prediction coefficient, linear prediction cepstral coefficient, line spectral frequency
Abstract: Stuttering, or stammering, is a disruption in the normal flow of speech by dysfluencies, which can be repetitions or prolongations of phonemes or syllables. Stuttering cannot be permanently cured, though it may go into remission, and stutterers can learn to shape their speech into fluent speech with appropriate speech pathology treatment. Linear Prediction Coefficient (LPC), Linear Prediction Cepstral Coefficient (LPCC) and Line Spectral Frequency (LSF) were used for feature extraction, while a Multilayer Perceptron (MLP) was used as the classifier. The samples were obtained from UCLASS (University College London Archive of Stuttered Speech) release 1. The LPCC-MLP system had the highest overall sensitivity and precision and the lowest overall misclassification rate. However, the LPCC-MLP system had difficulty with F3: its sensitivity to F3 was negligible, its precision was moderate, and its misclassification rate, although low, was above 10%.
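The abstract names the features but not the analysis settings, so the following is only a minimal sketch of how LPC and LPCC features are commonly computed for one speech frame. It assumes the autocorrelation method with a Hamming window, the Levinson-Durbin recursion, and the standard LPC-to-cepstrum recursion. The function names, the prediction order of 12 and the number of cepstral coefficients are illustrative assumptions, not values taken from the paper. LSF extraction and the MLP classifier are not covered.

```python
import math


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for the predictor
    x_hat[n] = sum_k a[k] * x[n - k]; returns (a[1..order], residual energy)."""
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lpc_to_lpcc(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model
    1 / (1 - sum_k a[k] z^-k), via the standard recursion."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def lpc_features(frame, order=12, n_ceps=12):
    """Hypothetical helper: LPC and LPCC vectors for one frame.
    order and n_ceps are assumed defaults, not the paper's settings."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    r = autocorrelation(windowed, order)
    if r[0] == 0.0:  # silent frame: no meaningful predictor
        return [0.0] * order, [0.0] * n_ceps
    a, _ = levinson_durbin(r, order)
    return a, lpc_to_lpcc(a, n_ceps)
```

In a pipeline of the kind the abstract describes, a helper like `lpc_features` would be called once per frame and the resulting vectors passed to the classifier. How frames are segmented and pooled is not stated in the abstract and is left open here.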
Copyright of articles that appear in Jurnal Teknologi belongs exclusively to Penerbit Universiti Teknologi Malaysia (Penerbit UTM Press). This copyright covers the rights to reproduce the article, including reprints, electronic reproductions, or any other reproductions of similar nature.