IndoBERT-Based Sentiment Analysis for Understanding Hotel Guests’ Preferences


  • Yerik Afrianto Singgalen (*), Atma Jaya Catholic University of Indonesia, Jakarta, Indonesia
  • (*) Corresponding Author
Keywords: Sentiment Analysis; IndoBERT; Hotel Guest Preferences; Hospitality Industry; Natural Language Processing

Abstract

The rapid growth of the hospitality industry and its increasing reliance on online reviews call for sentiment analysis tools that can capture customer preferences effectively. This study applies IndoBERT, a pre-trained language model for the Indonesian language, to classify sentiment in hotel guest reviews. Using a dataset of 715 reviews, the study followed the Knowledge Discovery in Databases (KDD) framework for systematic data preprocessing, feature extraction, and machine learning analysis. IndoBERT achieved precision, recall, and F1-scores of 1.00 for both the positive (657 reviews) and negative (53 reviews) sentiment classes. The ROC curve analysis yielded a mean AUC score of 0.86, which the study reports as validating the model's robustness and reliability. The results highlight IndoBERT's ability to capture linguistic nuance and contextual meaning, and they offer actionable insights into the factors that influence guest satisfaction, such as cleanliness, staff behavior, and service quality. This research contributes to natural language processing applications in regional contexts and has practical implications for improving service strategies in the hospitality sector. Future research should extend the model to other industries and explore multimodal approaches for a more comprehensive understanding of customer behavior.
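
The evaluation metrics named in the abstract (per-class precision, recall, F1-score, and ROC AUC) can be illustrated with a minimal sketch. The function names `per_class_report` and `roc_auc`, the toy labels, and the toy scores below are all hypothetical and are not the study's code or data; the sketch only shows how such figures are typically derived from true labels, predicted labels, and positive-class scores.

```python
def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for each label in y_true."""
    report = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1, tp + fn)
    return report


def roc_auc(y_true, scores, positive="positive"):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT the study's reviews): 6 positive, 2 negative.
y_true = ["positive"] * 6 + ["negative"] * 2
y_pred = ["positive"] * 5 + ["negative"] + ["negative", "positive"]
# Hypothetical model scores for the positive class, one per review.
scores = [0.9, 0.8, 0.95, 0.7, 0.85, 0.4, 0.2, 0.6]

report = per_class_report(y_true, y_pred)
auc = roc_auc(y_true, scores)

for label, (precision, recall, f1, support) in report.items():
    print(f"{label}: P={precision:.2f} R={recall:.2f} F1={f1:.2f} n={support}")
print(f"AUC={auc:.2f}")
```

Reporting metrics per class, as the abstract does, matters when the classes are imbalanced (657 positive versus 53 negative reviews in the study), because a single overall accuracy figure can hide weak performance on the minority class.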





Article History
Submitted: 2025-01-28
Published: 2025-02-28