SENTIMENT CLASSIFICATION OF UNSTRUCTURED DATA USING LEXICAL BASED TECHNIQUES

Authors

  • Nurul Fathiyah Shamsudin, Faculty of Information & Communication Technology, Universiti Teknikal Malaysia Melaka, Melaka, Malaysia
  • Halizah Basiron, Faculty of Information & Communication Technology, Universiti Teknikal Malaysia Melaka, Melaka, Malaysia
  • Zurina Saaya, Faculty of Information & Communication Technology, Universiti Teknikal Malaysia Melaka, Melaka, Malaysia
  • Ahmad Fadzli Nizam Abdul Rahman, Faculty of Information & Communication Technology, Universiti Teknikal Malaysia Melaka, Melaka, Malaysia
  • Mohd Hafiz Zakaria, Faculty of Information & Communication Technology, Universiti Teknikal Malaysia Melaka, Melaka, Malaysia
  • Nurulhalim Hassim, Faculty of Engineering Technology, Universiti Teknikal Malaysia Melaka, Melaka, Malaysia

DOI:

https://doi.org/10.11113/jt.v77.6497

Keywords:

Sentiment analysis, lexical based approach, term counting, term score summation, average on sentence, average on comments

Abstract

Sentiment analysis is the computational study of people’s opinions, feedback, attitudes, and emotions toward entities, individuals, issues, events, topics and their attributes. Much research has been conducted for languages such as English, Spanish, French, and German. However, little research has been conducted to harvest information from Malay text and structure it into meaningful data. The objective of this paper is to introduce a lexical-based method for analysing the sentiment of Facebook comments in Malay. Three lexical-based techniques are implemented to identify the sentiment of Facebook comments: term counting, term score summation, and average on comments. Accuracy, precision and recall are computed and compared for all three techniques. The results show that the average on comments method outperforms the other two techniques.
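
The abstract names the three techniques without defining them, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that term counting compares the number of positive and negative lexicon terms, that term score summation adds the lexicon scores of the matched terms, and that average on comments divides that sum by the number of matched terms in the comment. The lexicon, its scores, the tokenizer, and all function names are hypothetical.

```python
import re

# Hypothetical Malay sentiment lexicon. The words and their scores are
# illustrative assumptions, not the lexicon used in the paper.
LEXICON = {
    "bagus": 2.0,    # good
    "cantik": 3.0,   # beautiful
    "suka": 1.0,     # like
    "teruk": -3.0,   # terrible
    "buruk": -2.0,   # bad
    "benci": -4.0,   # hate
}


def tokenize(comment):
    """Lowercase the comment and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", comment.lower())


def matched_scores(comment):
    """Return the lexicon scores of the tokens found in the lexicon."""
    return [LEXICON[t] for t in tokenize(comment) if t in LEXICON]


def term_counting(comment):
    """Assumed definition: positive term count minus negative term count."""
    scores = matched_scores(comment)
    positives = sum(1 for s in scores if s > 0)
    negatives = sum(1 for s in scores if s < 0)
    return positives - negatives


def term_score_summation(comment):
    """Assumed definition: sum of the scores of all matched terms."""
    return sum(matched_scores(comment))


def average_on_comment(comment):
    """Assumed definition: score sum divided by the number of matched terms."""
    scores = matched_scores(comment)
    return sum(scores) / len(scores) if scores else 0.0


def label(value, threshold=0.0):
    """Map a numeric value to a label; `threshold` is an assumed neutral band."""
    if value > threshold:
        return "positive"
    if value < -threshold:
        return "negative"
    return "neutral"


comment = "Tempat ini bagus, saya suka, tetapi saya benci servisnya"
print(label(term_counting(comment)))         # two positive terms vs. one negative
print(label(term_score_summation(comment)))  # 2.0 + 1.0 - 4.0
print(label(average_on_comment(comment)))    # sum divided by three matched terms
```

In this sketch, term counting and term score summation can disagree on the same comment, because one strongly negative term can outweigh two mildly positive ones. With a zero threshold, the averaged value always has the same sign as the sum, so the two only produce different labels when a non-zero neutral band is chosen; how the paper separates them is not stated in the abstract.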

Published

2015-11-26

How to Cite

SENTIMENT CLASSIFICATION OF UNSTRUCTURED DATA USING LEXICAL BASED TECHNIQUES. (2015). Jurnal Teknologi, 77(18). https://doi.org/10.11113/jt.v77.6497