Sentiment Analysis on Movie Review from Rotten Tomatoes Using Logistic Regression and Information Gain Feature Selection
Abstract
Advances in technology have expanded the use of the internet and the spread of the information it carries, including information about the world of cinema. As a result, movie reviews are easy to obtain and have grown in both number and variety, and they are highly influential for movies across the various channels through which movies are available. Sentiment analysis is therefore needed. In this research, the classification method is Logistic Regression, chosen for its classification accuracy. Information Gain is used for feature selection because it performs well as a filter approach in classification, and TF-IDF is used for feature extraction because it can overcome data imbalance in the dataset. The best model in this research is built without stemming in the preprocessing stage, without Information Gain feature selection, and with parameter settings in Logistic Regression; it produces an F1-score of 76.50%.
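The filter-style feature selection mentioned in the abstract can be illustrated with a minimal sketch of Information Gain. This is not the authors' implementation: it assumes each term is treated as a binary present/absent feature, and the toy reviews, labels, and function names below are hypothetical, chosen only to show how a term that separates the classes scores higher than one that does not.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """IG(term) = H(labels) - H(labels | term present or absent)."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(present) / n) * entropy(present)
        + (len(absent) / n) * entropy(absent)
    )
    return entropy(labels) - conditional


# Hypothetical toy reviews (not from the paper's dataset), already tokenized.
docs = [
    {"great", "movie"},
    {"great", "acting"},
    {"boring", "movie"},
    {"boring", "plot"},
]
labels = ["pos", "pos", "neg", "neg"]

# Score every vocabulary term, then keep the two highest-scoring terms.
vocab = sorted(set().union(*docs))
scores = {t: information_gain(docs, labels, t) for t in vocab}
top_terms = sorted(vocab, key=lambda t: scores[t], reverse=True)[:2]
```

In this toy data, "great" and "boring" each split the classes perfectly and receive the maximum score, while "movie" appears equally in both classes and scores zero. A filter approach would keep the top-ranked terms before fitting the classifier; the number of terms to keep is a tuning choice the abstract does not specify.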
Pages: 162−170
Copyright (c) 2023 Arsenio Jusuf Abimanyu, Mahendra Dwifebri, Widi Astuti

This work is licensed under a Creative Commons Attribution 4.0 International License.