Feature Selection Using Eigen Vector to Improve the Performance of K-Means Clustering in Data Grouping
Abstract
A large number of attributes in a data set can affect the number of iterations that K-Means Clustering needs to group the data. In this research, Eigen Vector is used to perform feature selection on the data set, and the selected attributes are then clustered using K-Means Clustering. Two data sets are used: the Wine Quality Dataset from the UCI Machine Learning Repository, with 11 attributes, 4898 records, and 7 classes, and the South German Credit Dataset from kaggle.com, with 20 attributes, 1000 records, and 2 classes. Without feature selection, K-Means required 11 iterations on the Wine Quality Dataset and 10 iterations on the South German Credit Dataset. With Eigen Vector feature selection, it required 5 and 4 iterations, respectively. Clustering quality was evaluated using the Sum of Square Error (SSE). Without feature selection, the SSE was 678.5735 on the Wine Quality Dataset and 1534.3167 on the South German Credit Dataset; with Eigen Vector feature selection, it was 383.0517 and 469.0698, respectively. These results indicate that the proposed method, K-Means Clustering with feature selection using Eigen Vector, reduces the clustering error and the number of iterations.
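The pipeline described above (feature selection, then K-Means, then iteration count and SSE) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the eigenvector is turned into a selection criterion, so the sketch assumes features are ranked by their absolute loading on the principal eigenvector of the feature correlation matrix. The function names `eigen_feature_scores`, `select_features`, and `kmeans`, the `keep` parameter, and the synthetic data are all hypothetical, and the output is not expected to match the figures reported in the abstract.

```python
import numpy as np


def eigen_feature_scores(X):
    """Score features by absolute loading on the principal eigenvector.

    ASSUMPTION: the abstract does not define the criterion; this uses the
    eigenvector of the correlation matrix with the largest eigenvalue.
    Columns must not be constant, or the correlation is undefined.
    """
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    principal = eigvecs[:, -1]               # eigenvector of largest eigenvalue
    return np.abs(principal)


def select_features(X, keep):
    """Keep the `keep` highest-scoring columns (hypothetical helper)."""
    scores = eigen_feature_scores(X)
    idx = np.sort(np.argsort(scores)[::-1][:keep])
    return X[:, idx], idx


def kmeans(X, k, max_iter=100, seed=0):
    """Plain Lloyd-style K-Means returning labels, centroids, iterations, SSE."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = None
    for iteration in range(1, max_iter + 1):
        # Euclidean distance from every point to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    # SSE: squared distance of each point to its assigned centroid, summed.
    sse = float(((X - centroids[labels]) ** 2).sum())
    return labels, centroids, iteration, sse


# Synthetic data standing in for a real data set (assumption).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
X[:100, :2] += 4.0  # two informative features separate the two groups

X_sel, kept = select_features(X, keep=3)
labels, centroids, n_iter, sse = kmeans(X_sel, k=2)
print("kept columns:", kept, "iterations:", n_iter, "SSE:", round(sse, 4))
```

Running `kmeans` on both `X` and `X_sel` with the same `k` and `seed` gives the kind of side-by-side comparison of iteration counts and SSE that the abstract reports. Note that SSE computed on fewer columns is not directly comparable to SSE on the full attribute set unless the data are scaled consistently.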
Pages: 1010−1017
Copyright (c) 2022 Nugroho Syahputra, Muhammad Zarlis, Syahril Efendi

This work is licensed under a Creative Commons Attribution 4.0 International License.