Enhancing K-NN Performance With SMOTE for Sentiment Analysis of Streaming App Reviews


  • Ni Putu Eka Patrycia Dewi (*), Institut Bisnis dan Teknologi Indonesia, Bali, Indonesia
  • Christina Purnama Yanti, Institut Bisnis dan Teknologi Indonesia, Bali, Indonesia
  • I Made Marthana Yusa, Institut Bisnis dan Teknologi Indonesia, Bali, Indonesia
  • (*) Corresponding Author
Keywords: Sentiment Analysis; K-Nearest Neighbor (KNN); SMOTE; Streaming App; Pre-processing

Abstract

This research aims to analyze the sentiment of user reviews for a popular streaming app on both the Play Store and App Store using the K-Nearest Neighbor (K-NN) method. As the user base expands, reviews increasingly influence app development, guiding improvements and optimizing user experience. However, the large volume of reviews renders manual analysis inefficient and prone to inconsistencies, underscoring the necessity of sentiment analysis to quickly and accurately capture user perceptions. Review data were collected from both platforms, with preprocessing steps such as data cleaning, tokenization, and normalization applied to ensure data consistency. The Synthetic Minority Over-sampling Technique (SMOTE) was used to address class imbalance, enhancing the reliability of classification results. Findings indicate that SMOTE improved model accuracy, raising it from 74% to 82.9% for Play Store data and from 79% to 84.1% for App Store data. Furthermore, a notable difference in sentiment dominance was observed, with positive sentiment prevailing on the Play Store, while negative sentiment was more prevalent on the App Store. These insights reveal that, overall, the app is well received, although certain areas highlighted in negative reviews require further attention to improve user satisfaction.
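The two techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the feature representation, the value of k, or the tooling used, so the toy 2-D vectors, the helper names (`euclidean`, `smote`, `knn_predict`), and all parameter values below are assumptions chosen only to show how SMOTE interpolation and K-NN majority voting work in principle.

```python
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours.
    Assumes the minority class has at least two samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the chosen sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training points closest to query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D vectors standing in for vectorised reviews (made-up values).
positive = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
            [0.7, 0.3], [0.95, 0.05], [0.75, 0.2]]
negative = [[0.1, 0.9], [0.2, 0.8]]

# Oversample the minority class until both classes have the same size.
n_missing = len(positive) - len(negative)
negative_balanced = negative + smote(negative, n_missing, k=1)

X = positive + negative_balanced
y = ["positive"] * len(positive) + ["negative"] * len(negative_balanced)

print(Counter(y))
print(knn_predict(X, y, [0.15, 0.85], k=3))
```

In a real evaluation, oversampling would normally be applied to the training split only, so that synthetic points do not leak into the test data; whether the study did so is not stated in the abstract.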


Article History
Submitted: 2024-11-03
Published: 2024-12-30
How to Cite
Patrycia Dewi, N. P. E., Yanti, C. P., & Yusa, I. M. M. (2024). Enchancing K-NN Performance With SMOTE for Sentiment Analysis of Streaming App Reviews. Building of Informatics, Technology and Science (BITS), 6(3), 1966-1976. https://doi.org/10.47065/bits.v6i3.6190