Comparison of the Naive Bayes and K-Nearest Neighbor Algorithms for Sentiment Analysis of Digital Wallet Users on the Google Play Store


  • M Adhe Akbar (*), Universitas Teknokrat Indonesia, Bandar Lampung, Indonesia
  • Fenty Ariany, Universitas Teknokrat Indonesia, Bandar Lampung, Indonesia
  • (*) Corresponding Author
Keywords: Sentiment Analysis; Digital Wallet; Naive Bayes; K-Nearest Neighbor; McNemar Test

Abstract

The rapid growth of digital wallet adoption in Indonesia, now numbering millions of active users, has generated a large volume of reviews on the Google Play Store. These reviews contain valuable insights into customer satisfaction but are often underutilized because unstructured text is difficult to process. This study compares the performance of the probabilistic Naive Bayes algorithm and the distance-based K-Nearest Neighbor (KNN) algorithm in classifying user sentiment for the DANA, OVO, DOKU, and LinkAja applications. The dataset comprises 18,869 reviews and exhibits a mild class imbalance, with negative sentiment accounting for 57.54% of reviews. To preserve the class distribution of the original data, the study applies stratified sampling without synthetic balancing techniques (such as SMOTE), followed by preprocessing with the Sastrawi library and Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction. Hyperparameters were tuned with GridSearchCV for Naive Bayes and with the Elbow Method to select the k value for KNN. Naive Bayes with a smoothing parameter alpha of 0.1 performed best, with an accuracy of 88.5% and an AUC of 0.9237, outperforming KNN at k=27, which obtained an accuracy of 87.4%. The McNemar test confirmed that this difference is statistically significant (p-value of 0.0045). Naive Bayes was also 129 times faster than KNN in the prediction process. Given its advantages in both accuracy and prediction time, Naive Bayes is recommended for real-time sentiment analysis in the financial technology ecosystem.
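
The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, a small synthetic set of made-up reviews instead of the 18,869-review dataset, skips the Sastrawi preprocessing, approximates the Elbow Method by scanning a few k values for the lowest cross-validated error, and uses the exact binomial form of the McNemar test (the abstract does not say which variant was used). The helper names `make_review`, `timed_predict`, and `mcnemar_exact` are hypothetical.

```python
import math
import random
import time

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Synthetic toy reviews (1 = positive, 0 = negative); NOT the study's data.
rng = random.Random(0)
POS_WORDS = ["bagus", "cepat", "mudah", "lancar", "mantap", "aman"]
NEG_WORDS = ["error", "gagal", "lambat", "kecewa", "hilang", "parah"]
NEUTRAL = ["aplikasi", "saldo", "transfer", "login", "akun", "bayar"]

def make_review(label):
    """Build a short fake review: two class words, two neutral words, one noisy word."""
    vocab = POS_WORDS if label == 1 else NEG_WORDS
    words = rng.sample(vocab, 2) + rng.sample(NEUTRAL, 2) + [rng.choice(POS_WORDS + NEG_WORDS)]
    return " ".join(words)

labels = [1 if i % 5 < 2 else 0 for i in range(300)]  # 60% negative, mildly imbalanced
texts = [make_review(y) for y in labels]

# Stratified split keeps the class ratio in both partitions; no SMOTE is applied.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# Naive Bayes: grid search over the smoothing parameter alpha (grid values are assumed).
alpha_grid = [0.01, 0.1, 0.5, 1.0]
nb_search = GridSearchCV(
    make_pipeline(TfidfVectorizer(), MultinomialNB()),
    param_grid={"multinomialnb__alpha": alpha_grid},
    cv=5,
    scoring="accuracy",
)
nb_search.fit(X_train, y_train)
nb_model = nb_search.best_estimator_

# KNN: scan candidate k values and keep the one with the lowest CV error
# (a simple stand-in for reading the elbow off an error-versus-k plot).
k_values = [1, 3, 5, 7, 9, 11]
cv_error = {}
for k in k_values:
    candidate = make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=k))
    cv_error[k] = 1.0 - cross_val_score(candidate, X_train, y_train, cv=5).mean()
best_k = min(cv_error, key=cv_error.get)
knn_model = make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=best_k))
knn_model.fit(X_train, y_train)

def timed_predict(model, X):
    """Return predictions as a plain list plus the elapsed prediction time in seconds."""
    start = time.perf_counter()
    pred = model.predict(X)
    return pred.tolist(), time.perf_counter() - start

def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on the cases where exactly one model is correct."""
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree, so there is no evidence of a difference
    tail = sum(math.comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

nb_pred, nb_time = timed_predict(nb_model, X_test)
knn_pred, knn_time = timed_predict(knn_model, X_test)
b, c, p_value = mcnemar_exact(y_test, nb_pred, knn_pred)

print("best alpha:", nb_search.best_params_["multinomialnb__alpha"], "| best k:", best_k)
print("NB accuracy:", accuracy_score(y_test, nb_pred), "| KNN accuracy:", accuracy_score(y_test, knn_pred))
print("prediction time (s): NB", nb_time, "| KNN", knn_time)
print("McNemar discordant pairs:", (b, c), "| p-value:", p_value)
```

On toy data the printed numbers carry no meaning and will not resemble the reported results. The sketch only shows how the stages fit together: both models are scored on the same stratified test split, and the McNemar test looks only at the reviews on which the two models disagree.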


Article History
Submitted: 2026-01-24
Published: 2026-03-06
How to Cite
Akbar, M., & Ariany, F. (2026). Komparasi Algoritma Naive Bayes dan K-Nearest Neighbor untuk Analisis Sentimen Pengguna Dompet Digital pada Google Play Store. Building of Informatics, Technology and Science (BITS), 7(4), 2335−2348. https://doi.org/10.47065/bits.v7i4.9285
Section
Articles