Performance Evaluation of Rmixmod and KAMILA in Clustering Higher Education Institutions in Indonesia Based on Mixed-Type Performance Achievement Data
Abstract
Clustering is a technique for grouping objects so that objects within a cluster are similar and objects in different clusters are dissimilar. In real-world settings, objects are often described by a combination of numerical and categorical variables, which requires clustering techniques that can process mixed-type data. Model-based clustering is one approach suited to such data. This study evaluates and compares two model-based clustering algorithms for mixed-type data: Rmixmod, which fits a mixture model by maximum likelihood estimation using the expectation-maximization algorithm, and KAMILA, which uses a semi-parametric approach. Both algorithms are applied to cluster Indonesian higher education institutions based on their performance. The optimal number of clusters is determined using the Bayesian Information Criterion and the Silhouette Coefficient. Algorithm performance is evaluated using the Silhouette Coefficient, the Calinski-Harabasz Index, and the Davies-Bouldin Index. The results show that Rmixmod outperformed KAMILA in clustering Indonesian higher education institutions, with a Silhouette Coefficient of 0.2878, a Calinski-Harabasz Index of 253.9433, and a Davies-Bouldin Index of 1.5321. The optimal number of clusters was five. Clusters are interpreted by analyzing the mean values of PC and the distribution of categorical variables within each cluster. The clustering results are expected to serve as a foundation for the government in formulating strategic policies that are both effective and differentiated according to the characteristics of each group of higher education institutions.
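To make the three evaluation indices named in the abstract concrete, the following is a minimal sketch of how each can be computed from a data matrix and a vector of cluster labels. It is not the authors' procedure: it assumes purely numeric data and Euclidean distance, and the toy points, label vectors, and function names are all hypothetical illustrations.

```python
import math

def dist(a, b):
    """Euclidean distance between two points (an assumption of this sketch)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def group_by_label(X, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = {}
    for p, l in zip(X, labels):
        clusters.setdefault(l, []).append(p)
    return clusters

def silhouette(X, labels):
    """Mean of (b - a) / max(a, b) over all points; higher is better."""
    clusters = group_by_label(X, labels)
    scores = []
    for p, l in zip(X, labels):
        own = clusters[l]
        if len(own) == 1:          # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def calinski_harabasz(X, labels):
    """Ratio of between- to within-cluster dispersion; higher is better."""
    clusters = group_by_label(X, labels)
    n, k = len(X), len(clusters)
    overall = centroid(X)
    between = sum(len(pts) * dist(centroid(pts), overall) ** 2
                  for pts in clusters.values())
    within = sum(dist(p, centroid(pts)) ** 2
                 for pts in clusters.values() for p in pts)
    return (between / (k - 1)) / (within / (n - k))

def davies_bouldin(X, labels):
    """Mean worst-case similarity between clusters; lower is better."""
    clusters = group_by_label(X, labels)
    cents = {k: centroid(pts) for k, pts in clusters.items()}
    # scatter: mean distance of a cluster's points to its centroid
    scatter = {k: sum(dist(p, cents[k]) for p in pts) / len(pts)
               for k, pts in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)

# Hypothetical toy data: two well-separated squares of four points each.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
good = [0, 0, 0, 0, 1, 1, 1, 1]   # labels that match the two squares
bad = [0, 1, 0, 1, 0, 1, 0, 1]    # labels that cut across both squares

for name, labels in (("good", good), ("bad", bad)):
    print(name,
          round(silhouette(X, labels), 4),
          round(calinski_harabasz(X, labels), 4),
          round(davies_bouldin(X, labels), 4))
```

A partition that matches the structure of the data should score a higher Silhouette Coefficient and Calinski-Harabasz Index and a lower Davies-Bouldin Index than one that does not, which is the direction in which the abstract's comparison of Rmixmod and KAMILA is read. For mixed-type data, the distance function would need to handle categorical variables as well, which this sketch does not attempt.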
Pages: 37-390
Copyright (c) 2025 Andrianto Santoso, Anang Kurnia, Aji Hamim Wigena

This work is licensed under a Creative Commons Attribution 4.0 International License.