Comparative Analysis of Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) for Topic Modelling of the Identitas Kependudukan Digital (IKD) Application
Abstract
This study compares two topic modeling methods, Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA), for analyzing user reviews of the Digital Population Identity (IKD) application obtained from the Google Play Store. The reviews are numerous and cover diverse topics, which makes manual categorization impractical, so an automated method is needed to identify the main themes in the data. The research process began with scraping 5,000 recent reviews, followed by data preprocessing (punctuation removal, lowercasing, and tokenization) and vectorization using Bag of Words and doc2bow. Topic modeling was then performed with LSI and LDA, and the results were evaluated using the coherence score metric. LDA achieved a coherence score of 0.4163 compared with 0.3512 for LSI, indicating that LDA is more effective at identifying hidden topics within these user reviews. LDA is therefore the better of the two methods for topic modeling of IKD application reviews and can help developers understand user needs and issues, thereby improving the application's service quality.
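The preprocessing and vectorization steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the two sample reviews are made up for illustration, the helper names `preprocess`, `build_vocabulary`, and `doc2bow` are hypothetical, and whitespace tokenization is an assumption. The output mimics the list of `(token_id, count)` pairs that a doc2bow-style Bag-of-Words step produces, using only the Python standard library.

```python
import string
from collections import Counter


def preprocess(review):
    """Remove punctuation, lowercase, and tokenize on whitespace (assumed tokenizer)."""
    no_punct = review.translate(str.maketrans("", "", string.punctuation))
    return no_punct.lower().split()


def build_vocabulary(tokenized_docs):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for tokens in tokenized_docs:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def doc2bow(tokens, vocab):
    """Return a sorted list of (token_id, count) pairs for one document."""
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    return sorted(counts.items())


# Hypothetical reviews, invented for this example (not taken from the study's data).
reviews = [
    "Aplikasi error, tidak bisa login!",
    "Login gagal terus. Aplikasi lambat.",
]

tokenized = [preprocess(r) for r in reviews]
vocab = build_vocabulary(tokenized)
corpus = [doc2bow(tokens, vocab) for tokens in tokenized]

for bow in corpus:
    print(bow)
```

A corpus in this form is the kind of input that LSI and LDA implementations typically consume; the topic models themselves and the coherence evaluation are outside the scope of this sketch.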
Pages: 1638-1647
Copyright (c) 2024 Nuri Cahyono, Narwanto Nurcahyo, Akmal Fauzan Restu Agung

This work is licensed under a Creative Commons Attribution 4.0 International License.