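Supervised Learning-Based Classification of Web Text Documents for Modeling a Malay Culture Learning Application in Indonesia (Klasifikasi Text Dokumen Web Berbasis Supervised Learning Sebagai Pemodelan Aplikasi Pembelajaran Kebudayaan Melayu di Indonesia)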


  • Mustakim Mustakim (*), Universitas Islam Negeri Sultan Syarif Kasim Riau, Pekanbaru, Indonesia
  • Febi Nur Salisah, Universitas Islam Negeri Sultan Syarif Kasim Riau, Pekanbaru, Indonesia
  • Suryani Suryani, Universitas Islam Negeri Sultan Syarif Kasim Riau, Pekanbaru, Indonesia
  • (*) Corresponding Author
Keywords: KNN; NBC; PNN; Random Forest; SVM

Abstract

Indonesia, as the largest archipelagic country, is home to diverse cultures, including the Malay culture of Riau Province. The Malay culture website studied here hosts numerous text documents, including articles, news, and personal documents, uploaded by members of the cultural community. This study aims to support the preservation of Malay culture through technology by implementing a digital learning system based on machine learning. Previous research identified weaknesses in how intelligent systems and machine learning algorithms were applied. To improve the system's accuracy and performance, this study tests five classification algorithms: Random Forest, Support Vector Machine (SVM), Naïve Bayes Classifier (NBC), K-Nearest Neighbor (KNN), and Probabilistic Neural Network (PNN). Random Forest achieved the highest accuracy at 91.17%, followed by KNN at 88.23%, SVM and NBC at 82.35% each, and PNN at 76.47%. The resulting Digital Learning System (DLS) received positive feedback, with a User Acceptance Test (UAT) score of 86% and a 100% success rate in black-box testing, demonstrating stable performance across various devices. The research contributes a Malay cultural preservation application that uses machine learning algorithms to enhance both accuracy and functionality.
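The accuracy figures above come from comparing each classifier's predictions against known labels. As a minimal sketch of that idea for one of the five algorithms, the block below implements KNN over bag-of-words vectors and scores it on a hold-out set. The cosine similarity measure, the whitespace tokenizer, the value of k, and the toy corpus with its two labels are all assumptions made for illustration; the abstract does not specify the study's features, parameters, or data.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a real pipeline would also clean and stem)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_predict(train, text, k=3):
    """Return the majority label among the k training documents most similar to `text`."""
    query = Counter(tokenize(text))
    ranked = sorted(train, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Share of hold-out documents whose predicted label matches the true label."""
    hits = sum(knn_predict(train, text, k) == label for text, label in test)
    return hits / len(test)


# Hypothetical toy corpus: texts and labels are invented, not taken from the study.
TRAIN_DOCS = [
    ("tari zapin dance performance music", "art"),
    ("gurindam poem verse literature pantun", "literature"),
    ("zapin dancers music stage festival", "art"),
    ("pantun verse recital poem tradition", "literature"),
    ("music festival dance stage", "art"),
    ("poem literature manuscript verse", "literature"),
]
TEST_DOCS = [
    ("dance music on the festival stage", "art"),
    ("a pantun is a verse poem", "literature"),
]

# Vectorize the training documents once, then score the hold-out set.
train = [(Counter(tokenize(text)), label) for text, label in TRAIN_DOCS]
print(f"hold-out accuracy: {accuracy(train, TEST_DOCS):.2%}")
```

The same `accuracy` function could score any of the other classifiers by swapping out `knn_predict`, which is the sense in which the five algorithms are comparable on a single metric.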

Article History
Submitted: 2025-10-09
Published: 2025-12-31
How to Cite
Mustakim, M., Salisah, F. N., & Suryani, S. (2025). Klasifikasi Text Dokumen Web Berbasis Supervised Learning Sebagai Pemodelan Aplikasi Pembelajaran Kebudayaan Melayu di Indonesia. Building of Informatics, Technology and Science (BITS), 7(3), 2097-2108. https://doi.org/10.47065/bits.v7i3.8499
Section: Articles