Komparasi Metode Perhitungan Jarak K-Means Paling Baik Terhadap Pembentukan Pola Kunjungan Wisatawan Mancanegara


  • Lalu Mutawalli * STMIK Lombok, Praya, Indonesia
  • Sofiansyah Fadli STMIK Lombok, Praya, Indonesia https://orcid.org/0000-0002-4297-9873
  • Supardianto Supardianto Universitas Teknologi Mataram, Mataram, Indonesia
  • (*) Corresponding Author
Keywords: K-Means; Cluster; Destinations; Tourism; Foreign

Abstract

Understanding the visiting patterns of foreign tourists is an urgent matter, because such patterns provide data-driven knowledge that supports better decisions. The pattern examined here is the clustering of foreign tourist visits to tourist destinations in Jakarta. Data mining is an approach for extracting knowledge patterns from a dataset, and K-Means is a data mining algorithm for clustering, in which data are grouped by the similarity of their features and attributes. This study compares the Euclidean Distance, Manhattan Distance, and Haversine Distance methods to find which produces the most representative clusters for the datasets. The datasets are not normally distributed because they contain outliers. The DBSCAN algorithm is therefore used to handle the outliers without removing or trimming data, since removal would discard a significant number of values and could yield information that does not match the empirical facts. Five clusters were formed, based on the result of the elbow calculation. K-Means with Euclidean distance yielded a Silhouette Score of 0.36, an Inertia of 0.86, and a Davies-Bouldin Index of 2.39. The Manhattan method yielded a Silhouette Score of 0.65, an Inertia of 1.46, and a Davies-Bouldin Index of 0.47. The Haversine method yielded a Silhouette Score of 0.36, an Inertia of 0.03, and a Davies-Bouldin Index of 2.39. On these figures, the Manhattan method has the highest Silhouette Score and the lowest Davies-Bouldin Index of the three.
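The comparison described in the abstract can be illustrated with a minimal sketch: a K-Means loop whose distance function is swappable, plus a Silhouette Score computed with the same distance. This is not the authors' implementation. The function names, the mean-based centroid update, and the assumption that Haversine inputs are (latitude, longitude) pairs in degrees are all illustrative choices that the abstract does not confirm.

```python
import math
import random

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def haversine(p, q, radius=6371.0):
    """Great-circle distance in km; assumes (latitude, longitude) in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(min(1.0, h)))

def kmeans(points, k, distance, iterations=100, seed=0):
    """Lloyd-style K-Means with a pluggable distance (hypothetical helper).

    Centroids are updated with the coordinate-wise mean, which is an
    assumption of this sketch, not something stated in the source.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: distance(p, centroids[c]))
                  for p in points]
        # Update step: recompute each centroid from its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids

def silhouette(points, labels, distance):
    """Mean silhouette coefficient over all points, using the given distance."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(distance(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(distance(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != l)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data (made up for illustration, not the Jakarta dataset).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels, toy_centroids = kmeans(toy_points, 2, euclidean)
toy_score = silhouette(toy_points, toy_labels, euclidean)
```

Passing `manhattan` or `haversine` instead of `euclidean` changes both the assignment step and the score, which is the kind of comparison the study reports. Inertia values computed under different distance functions are on different scales, so they are not directly comparable across methods.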




Article History
Submitted: 2023-10-07
Published: 2023-10-26
Abstract View: 662 times
PDF Download: 551 times
How to Cite
Mutawalli, L., Fadli, S., & Supardianto, S. (2023). Komparasi Metode Perhitungan Jarak K-Means Paling Baik Terhadap Pembentukan Pola Kunjungan Wisatawan Mancanegara. Journal of Information System Research (JOSH), 5(1), 159-166. https://doi.org/10.47065/josh.v5i1.4377
Section
Articles