Optimization in Time and Score using IID Algorithm for K-Modes Clustering
Abstract
Cluster analysis is one of many methods for analyzing data. Because most practical data contain categorical attributes, categorical data clustering has recently received considerable attention. Categorical data are commonly clustered with unsupervised, frequency-based machine learning techniques such as K-Modes clustering. The K-Modes algorithm works from the dissimilarity between data points, measured as the total number of mismatched attribute values. The lower the dissimilarity, the more similar the data points, and thus the better the cluster. This paper aims to improve the performance of K-Modes clustering by incorporating the intercluster and intracluster dissimilarity (IID) measure into the K-Modes algorithm, rather than relying only on the standard simple-matching method. The combined algorithm improves both the accuracy and the execution time of K-Modes. As a result, it can be used as an alternative for clustering categorical data more effectively.
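The simple-matching baseline described in the abstract can be illustrated with a minimal Python sketch. It covers only standard K-Modes building blocks: counting mismatches, taking the per-attribute mode, and assigning records to the nearest mode. The function names and the toy records are assumptions made for illustration. The IID measure is not defined in the abstract, so it is not implemented here.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Simple matching: number of attribute positions where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Most frequent category per attribute across the given records."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


def assign(records, modes):
    """Index of the nearest mode for each record, by simple matching."""
    return [
        min(range(len(modes)), key=lambda k: matching_dissimilarity(r, modes[k]))
        for r in records
    ]


# Hypothetical toy data: three categorical attributes per record.
records = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
]

# Assumed starting modes (here simply two of the records).
modes = [records[0], records[2]]
labels = assign(records, modes)

# Total cost: sum of mismatches between each record and its assigned mode.
cost = sum(matching_dissimilarity(r, modes[k]) for r, k in zip(records, labels))
```

A full K-Modes run would alternate `assign` and `cluster_mode` until the labels stop changing. A lower total cost indicates more homogeneous clusters under simple matching.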
Pages: 1705−1713
Copyright (c) 2023 Farah Yulianti, Tjong Wan Sen

This work is licensed under a Creative Commons Attribution 4.0 International License.