Sentiment Classification Using BERT-CNN and SMOTE: A Case Study on Hotel Reviews Dataset
Abstract
User-generated content is increasingly important in the hospitality industry, and extracting actionable insights from customer reviews requires sentiment analysis tools that can handle the complexities of natural language. Traditional methods often struggle with contextual dependencies and nuanced emotional expressions. This research applies a BERT-CNN hybrid model, which combines BERT’s contextual language understanding with a CNN’s feature extraction capabilities, to address these challenges and improve sentiment classification accuracy. On a dataset of 1,828 hotel reviews from Eastparc Hotel Yogyakarta, the model achieved an accuracy of 99.59%, with precision and recall exceeding 0.99. SMOTE was applied to address class imbalance, which improved the model’s ability to generalize across sentiment classes. Training and validation loss curves converged steadily, indicating stable learning and minimal overfitting. The results provide insights into customer satisfaction and support targeted recommendations for improving service quality and operational strategies. The study demonstrates the practicality of integrating advanced machine learning architectures into sentiment analysis, enabling the hospitality sector to turn unstructured feedback into meaningful insights. The findings contribute both to academic work in natural language processing and to practical innovation in customer experience management. Future research may extend this framework to other domains.
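The abstract names SMOTE as the class-balancing step without describing how it works. The sketch below illustrates the general idea under stated assumptions: each synthetic sample is an interpolation between a minority-class point and one of its nearest minority-class neighbours. It is not the study's implementation. The function name `smote`, the toy 2-D vectors, and the choice of `k` are hypothetical, and the abstract does not say which feature space SMOTE was applied to.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and neighbour.
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy 2-D feature vectors standing in for review features (hypothetical data).
negative_reviews = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(negative_reviews, n_synthetic=6, k=2)
balanced = negative_reviews + new_points
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the original samples already occupy. In practice, oversampling is usually applied to the training split only, so that synthetic points do not leak into evaluation data.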
Pages: 1569–1581
Copyright (c) 2024 Yerik Afrianto Singgalen

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).