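Application of the N-Gram Method and Cosine Similarity to Search in a Journal Publication Article Repository (Penerapan Metode N-Gram dan Cosine Similarity Dalam Pencarian Pada Repositori Artikel Jurnal Publikasi)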
Abstract
A digital repository is one source of data for meeting information needs, especially within an organization. It stores various digital documents that users can draw on; a publication journal repository is one example. The number of published articles in such a repository grows by hundreds or even thousands every day, and journal publications usually come in a variety of formats and languages. As a result, search results tend to have a relatively low level of relevance. Applying an information retrieval system to the repository is therefore important for optimizing search results. Preprocessing is one of the most important stages in developing a retrieval system, especially the choice of a stemming algorithm to produce the base words (terms) that are later used to determine the level of similarity between queries and documents during a search. N-Gram is a method that decomposes a string into character sequences; it can be used to identify which language a word or sentence is written in, which in turn determines the stemming algorithm to apply. Cosine Similarity is a method for determining the level of similarity by calculating the cosine of the angle between the query vector and the document vector. This study builds a repository that implements a retrieval system using N-Gram and Cosine Similarity and then measures the system's performance. Averaged over Indonesian-language and English-language queries, the accuracy is 0.967, the precision is 0.851, and the recall is 0.869.
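The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`char_ngrams`, `cosine_similarity`, `rank`), the trigram default, and the use of raw term counts without stemming or weighting are all assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Split a string into overlapping character n-grams (hypothetical helper)."""
    lowered = text.lower()
    return [lowered[i:i + n] for i in range(len(lowered) - n + 1)]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Score each document against the query and sort by descending similarity.

    Assumption: terms are whitespace-separated tokens with raw counts; the
    study's own preprocessing (language detection, stemming) is not reproduced.
    """
    q = Counter(query.lower().split())
    scored = [(cosine_similarity(q, Counter(d.lower().split())), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    print(char_ngrams("jurnal"))
    for score, doc in rank("cosine similarity",
                           ["n-gram language detection", "cosine similarity search"]):
        print(round(score, 3), doc)
```

In a pipeline like the one the abstract describes, character n-gram profiles would be used to guess the language of a query or document before a stemmer is chosen, and the cosine score would then be computed on the stemmed terms.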
Pages: 275-284
Copyright (c) 2021 Indra Gita Anugrah

This work is licensed under a Creative Commons Attribution 4.0 International License.