IMPROVING A DEEP NEURAL NETWORK GENERATIVE-BASED CHATBOT MODEL
DOI: https://doi.org/10.11113/aej.v14.20663

Keywords: Deep Learning, Artificial Neural Network, Generative-based Chatbot, Hyperparameter Optimization, Attentive Sequence-to-Sequence

Abstract
A chatbot is an application developed in the field of machine learning, which has become a hot topic of research in recent years. The majority of today's chatbots integrate the Artificial Neural Network (ANN) approach with a Deep Learning environment, resulting in a new generation of chatbot known as the Generative-Based Chatbot. Current chatbot applications, however, often fail to exploit the optimum capacity of the network environment because of its complex nature, which leads to low accuracy and a high loss rate. In this paper, we conduct an experiment to evaluate the performance of a chatbot model when manipulating selected hyperparameters that can contribute greatly to a well-performing model, without modifying any major structures or algorithms. The experiment involves training two models: the Attentive Sequence-to-Sequence model (the baseline) and the Attentive Sequence-to-Sequence model with hyperparameter optimization. Both models were trained on the Cornell Movie-Dialogue Corpus for 10 epochs. The comparison shows that after optimization the model achieved 87% accuracy and a 0.51% loss rate, compared with 79% accuracy and a 1.05% loss rate before optimization.
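To make the experimental setup concrete, the sketch below shows how such an attentive sequence-to-sequence model might be assembled in TensorFlow/Keras with its tunable hyperparameters gathered in one place. This is a minimal illustration, not the authors' implementation: the vocabulary size, embedding dimension, hidden units, dropout rate, learning rate, and the choice of the Adam optimizer are all assumed values; only the 10 training epochs come from the abstract.

```python
# A minimal sketch (not the authors' code) of an attentive sequence-to-sequence
# chatbot model, with the tunable hyperparameters collected up front.
import tensorflow as tf

# Hypothetical hyperparameters of the kind tuned in such an experiment (assumed values).
VOCAB_SIZE = 8000       # tokens in the question/answer vocabulary (assumed)
EMBED_DIM = 256         # embedding size (assumed)
HIDDEN_UNITS = 512      # LSTM units (assumed)
DROPOUT_RATE = 0.2      # dropout applied inside the recurrent layers (assumed)
LEARNING_RATE = 1e-3    # optimizer learning rate (assumed)
EPOCHS = 10             # matches the 10 epochs reported in the abstract

# Encoder: embeds the input utterance and encodes it with an LSTM.
enc_inputs = tf.keras.Input(shape=(None,), name="encoder_tokens")
enc_embed = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)(enc_inputs)
enc_outputs, enc_h, enc_c = tf.keras.layers.LSTM(
    HIDDEN_UNITS, return_sequences=True, return_state=True,
    dropout=DROPOUT_RATE)(enc_embed)

# Decoder: embeds the (teacher-forced) response and decodes with an LSTM
# initialised from the encoder's final state.
dec_inputs = tf.keras.Input(shape=(None,), name="decoder_tokens")
dec_embed = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)(dec_inputs)
dec_outputs = tf.keras.layers.LSTM(
    HIDDEN_UNITS, return_sequences=True,
    dropout=DROPOUT_RATE)(dec_embed, initial_state=[enc_h, enc_c])

# Additive (Bahdanau-style) attention over the encoder outputs.
context = tf.keras.layers.AdditiveAttention()([dec_outputs, enc_outputs])
merged = tf.keras.layers.Concatenate()([dec_outputs, context])
logits = tf.keras.layers.Dense(VOCAB_SIZE, activation="softmax")(merged)

model = tf.keras.Model([enc_inputs, dec_inputs], logits)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"])
model.summary()

# Training would then take the form (data tensors not shown here):
# model.fit([questions, answers_in], answers_out, epochs=EPOCHS, batch_size=64)
```

In the experiment described above, hyperparameter optimization would amount to varying values such as the dropout rate and learning rate, or swapping the optimizer, and comparing the resulting accuracy and loss against the baseline configuration.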