Recommendation System from Microsoft News Data using TF-IDF and Cosine Similarity Methods

  • Gisela Yunanda * Telkom University, Bandung, Indonesia
  • Dade Nurjanah Telkom University, Bandung, Indonesia
  • Selly Meliana Telkom University, Bandung, Indonesia
  • (*) Corresponding Author
Keywords: Recommendation System; News; Microsoft News; TF-IDF; Cosine Similarity


The rapid growth of online information leads to information overload, as news portals publish stories in massive volumes. Readers need time to search for and read news, yet news loses its relevance quickly. A recommendation system is therefore needed to suggest news that matches each reader's preferences. This study recommends news using the TF-IDF method: TF-IDF assigns a weight to each word in a news title, and cosine similarity is then used to measure the similarity between stories. To check whether the recommended stories were actually clicked by readers, the recommendations were matched against readers' news histories from the online news portal Microsoft News using hit rate. The hit rate obtained in this study was 80.77%.
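The pipeline described above (TF-IDF weights on titles, cosine similarity between stories, hit rate against reading histories) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sample titles, the tokenizer, the smoothed IDF variant, the helper names (`tfidf_vectors`, `cosine`, `recommend`, `hit_rate`), and the hit-rate definition (share of users with at least one recommended story in their history) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", title.lower())


def tfidf_vectors(titles):
    """Return one sparse TF-IDF vector (dict: word -> weight) per title."""
    docs = [tokenize(t) for t in titles]
    n_docs = len(docs)
    # Document frequency: in how many titles each word appears.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Relative term frequency times a smoothed IDF (assumed variant).
            word: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(word, 0.0) for word, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(seed_index, vectors, k=2):
    """Return indices of the k titles most similar to the seed title."""
    scores = [
        (cosine(vectors[seed_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != seed_index
    ]
    # Highest similarity first; ties broken by lower index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:k]]


def hit_rate(recommendations, histories):
    """Share of users with at least one recommended item in their history."""
    if not recommendations:
        return 0.0
    hits = sum(
        1
        for user, recs in recommendations.items()
        if set(recs) & set(histories.get(user, []))
    )
    return hits / len(recommendations)


# Hypothetical titles, not taken from the Microsoft News data.
titles = [
    "Stock markets rally as tech shares climb",
    "Tech shares lead stock markets higher",
    "Local team wins the football championship",
    "Football championship final draws record crowd",
]
vectors = tfidf_vectors(titles)

# Hypothetical users: each is recommended the story closest to one they read.
recommendations = {"u1": recommend(0, vectors, k=1),
                   "u2": recommend(2, vectors, k=1)}
histories = {"u1": [1, 2], "u2": [0]}
print(recommendations, hit_rate(recommendations, histories))
```

In this toy setup, a recommendation counts as a hit when it appears anywhere in the user's click history. The study may define and aggregate hit rate differently, so the number printed here is not comparable to the reported 80.77%.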





Article History
Submitted: 2022-06-11
Published: 2022-06-30
How to Cite
Yunanda, G., Nurjanah, D., & Meliana, S. (2022). Recommendation System from Microsoft News Data using TF-IDF and Cosine Similarity Methods. Building of Informatics, Technology and Science (BITS), 4(1), 277–284.