Cluster Optimization in K-Means Clustering with Dataset Dimensionality Reduction Using the Gini Index
Abstract
In K-Means clustering, the number of attributes in a dataset can affect the number of iterations needed to group the data. One way to address this problem is to reduce the dimensionality of the dataset. In this study, the authors use the Gini Index to remove attributes that have no effect on the dataset before clustering with K-Means. The test dataset is Absenteeism at work, obtained from the UCI Machine Learning Repository, with 20 attributes, 740 data records, and 4 attribute classes. In the tests, conventional K-Means (without attribute reduction) required 9 iterations, while K-Means with Gini Index attribute reduction required 6 iterations. Clustering quality was evaluated using the Sum of Square Error (SSE). The SSE was 1391.613 for conventional K-Means and 440.912 for K-Means with Gini Index attribute reduction. These results indicate that reducing the dimensions of the dataset with the Gini Index lowers the clustering error and the number of iterations in K-Means clustering.
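The pipeline summarized in the abstract (score attributes with the Gini Index, drop the weak ones, run K-Means, then report the iteration count and SSE) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact Gini formulation, the selection threshold, the number of clusters, or the centroid initialization, so the function names (`gini_impurity`, `gini_index`, `select_attributes`, `kmeans`), the `keep` parameter, and the random initialization are all hypothetical choices.

```python
import numpy as np


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_c^2) of a vector of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def gini_index(values, labels):
    """Weighted Gini impurity of the labels after splitting on each distinct
    value of one attribute (assumed categorical or already discretized)."""
    total = len(labels)
    score = 0.0
    for v in np.unique(values):
        mask = values == v
        score += mask.sum() / total * gini_impurity(labels[mask])
    return score


def select_attributes(X, y, keep):
    """Indices of the `keep` attributes with the lowest Gini index.
    Assumption: a lower score means the attribute is more informative."""
    scores = np.array([gini_index(X[:, j], y) for j in range(X.shape[1])])
    return np.sort(np.argsort(scores, kind="stable")[:keep])


def kmeans(X, k, max_iter=100, seed=0):
    """Plain Lloyd-style K-Means with random initial centroids (an assumption).
    Returns assignments, centroids, iterations used, and the SSE."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for iteration in range(1, max_iter + 1):
        # Assign each record to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            X[assign == c].mean(axis=0) if np.any(assign == c) else centroids[c]
            for c in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # SSE: sum of squared distances from each record to its nearest centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    sse = float(np.sum(dists[np.arange(len(X)), assign] ** 2))
    return assign, centroids, iteration, sse
```

To compare the two settings in the abstract, one would call `kmeans(X, k)` on the full attribute matrix and `kmeans(X[:, select_attributes(X, y, keep)], k)` on the reduced one, then compare the returned iteration counts and SSE values. The values of `k` and `keep` are not stated in the abstract and would have to be taken from the full paper.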
Pages: 1309−1316
Copyright (c) 2022 Muhammad Imam Zarkasyi, Herman Mawengkang, Opim Salim Sitompul

This work is licensed under a Creative Commons Attribution 4.0 International License.