LSTM Optimization to Reduce Overfitting in Text Classification Using the Kaggle IMDB Movie Review Dataset
Abstract
This study develops and optimizes a Long Short-Term Memory (LSTM) model to reduce overfitting in text classification on the Kaggle IMDB movie review dataset. Overfitting is a common problem in machine learning in which a model fits the training data too closely and consequently performs poorly on unseen test data. To improve the generalization of the LSTM model, several optimization techniques are applied, including regularization, dropout, and careful training procedures. The results show that these overfitting-reduction techniques, in particular dropout and the RMSProp optimizer, significantly improve the performance of the LSTM model on IMDB movie review classification. The optimized LSTM model reaches an accuracy of 83.45%, an improvement of 2.07 percentage points over the standard model's 81.38%. Precision rises to 89.65% from 84.46% in the standard model, although recall is slightly lower (75.69% versus 76.91%). The F1-score of the optimized model is also higher, at 82.07% compared with 80.53% for the standard model. These experiments indicate that the applied techniques improve the accuracy and reliability of the text classification model, with better performance on the test data. The study contributes to understanding and mitigating overfitting in deep learning models for natural language processing, and offers insights into best practices for applying LSTM models to text classification.
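The abstract names dropout, regularization, the RMSProp optimizer, and careful training as the overfitting controls. As a rough illustration only, the Python/Keras sketch below shows how such a setup might look; the vocabulary size, sequence length, layer widths, dropout rates, and training schedule are assumptions for illustration, not the configuration reported in the paper.

import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

# Illustrative hyperparameters (assumed, not taken from the paper).
VOCAB_SIZE = 10_000   # keep only the 10,000 most frequent words
MAX_LEN = 200         # pad/truncate every review to 200 tokens

# Keras ships the IMDB reviews pre-encoded as word-index sequences.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(
    num_words=VOCAB_SIZE
)
x_train = tf.keras.utils.pad_sequences(x_train, maxlen=MAX_LEN)
x_test = tf.keras.utils.pad_sequences(x_test, maxlen=MAX_LEN)

# LSTM classifier with dropout inside and after the recurrent layer.
model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 64),
    layers.LSTM(64, dropout=0.3, recurrent_dropout=0.3),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer=optimizers.RMSprop(learning_rate=1e-3),
    loss="binary_crossentropy",
    metrics=["accuracy", tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
)

# Early stopping on a validation split is one form of careful training.
model.fit(
    x_train, y_train,
    validation_split=0.2,
    epochs=10,
    batch_size=128,
    callbacks=[tf.keras.callbacks.EarlyStopping(patience=2, restore_best_weights=True)],
)
model.evaluate(x_test, y_test)

Dropout and early stopping both limit how closely the network can memorize the training reviews, which is the kind of generalization improvement the abstract reports for the optimized model.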
Pages: 1142–1150
Copyright (c) 2024 Putrama Alkhairi, Agus Perdana Windarto, M.Kom, Muhamad Masjun Efendi
This work is licensed under a Creative Commons Attribution 4.0 International License.