A COMPARATIVE STUDY OF STATISTICAL AND NATURAL LANGUAGE PROCESSING TECHNIQUES FOR SENTIMENT ANALYSIS
Keywords: Natural language processing, sentiment analysis, word sense disambiguation
Sentiment analysis has emerged as one of the most powerful tools in business intelligence. With the aim of proposing an effective sentiment analysis technique, we performed experiments on the sentiments of 3,424 tweets using both statistical and natural language processing (NLP) techniques as part of our background study. For the statistical approach, machine learning algorithms such as Support Vector Machines (SVMs), decision trees and Naïve Bayes were explored. The results show that SVM consistently outperformed the other algorithms in both classifications. For sentiment analysis using NLP techniques, we used two different methods for part-of-speech (POS) tagging. The tagged output was then used for word sense disambiguation (WSD) using WordNet, followed by sentiment identification using SentiWordNet. Our experimental results indicate that adjectives and adverbs are sufficient to infer the sentiment of tweets, compared with other combinations. Overall, the statistical approach records higher accuracy than the NLP approach by approximately 17%.
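The final step of the NLP pipeline described above (scoring POS-tagged words against a sentiment lexicon, restricted to chosen parts of speech) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the tweet has already been tokenized, POS-tagged and sense-disambiguated, and it uses a small hypothetical lexicon, `TOY_LEXICON`, with made-up scores in place of SentiWordNet. The function names, tag names and the zero threshold for the neutral class are also assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy lexicon standing in for SentiWordNet.
# Keys are (lowercased word, POS tag); values are (positive, negative) scores.
# The scores are invented for illustration and are not real SentiWordNet values.
TOY_LEXICON: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("good", "ADJ"): (0.75, 0.0),
    ("terrible", "ADJ"): (0.0, 0.625),
    ("happily", "ADV"): (0.5, 0.0),
    ("badly", "ADV"): (0.0, 0.5),
    ("love", "VERB"): (0.5, 0.0),
}


def score_tweet(
    tagged: List[Tuple[str, str]],
    allowed_pos: Sequence[str] = ("ADJ", "ADV"),
) -> float:
    """Sum (positive - negative) over words whose POS tag is in allowed_pos."""
    total = 0.0
    for word, pos in tagged:
        if pos not in allowed_pos:
            continue  # ignore parts of speech outside the chosen combination
        positive, negative = TOY_LEXICON.get((word.lower(), pos), (0.0, 0.0))
        total += positive - negative
    return total


def classify(
    tagged: List[Tuple[str, str]],
    allowed_pos: Sequence[str] = ("ADJ", "ADV"),
) -> str:
    """Map the aggregate score to a label; a score of zero is treated as neutral."""
    score = score_tweet(tagged, allowed_pos)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    tweet = [("The", "DET"), ("battery", "NOUN"), ("is", "VERB"), ("terrible", "ADJ")]
    print(classify(tweet))  # adjectives and adverbs only
    print(classify(tweet, allowed_pos=("ADJ", "ADV", "VERB")))  # a wider POS combination
```

Changing `allowed_pos` is one simple way to compare part-of-speech combinations, in the spirit of the comparison the abstract reports. A real experiment would replace the toy lexicon with SentiWordNet lookups on disambiguated WordNet senses.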
Copyright of articles that appear in Jurnal Teknologi belongs exclusively to Penerbit Universiti Teknologi Malaysia (Penerbit UTM Press). This copyright covers the rights to reproduce the article, including reprints, electronic reproductions, or any other reproductions of similar nature.