The Effect of Data Preprocessing on Imbalanced Datasets in Waste Image Classification Using the Convolutional Neural Network Algorithm
Abstract
Waste is one of Indonesia's most significant problems, with the volume growing each year and reaching 187.2 million tonnes/year. Various efforts to reduce the amount of waste, such as Garbage Banks, have been encouraged. However, this program has not run well because some people have difficulty distinguishing the types of waste. One solution is a system that can classify waste by type. The deep learning approach with the Convolutional Neural Network (CNN) algorithm is currently widely used to solve classification problems. This method requires a large amount of data to achieve high accuracy. Obtaining a waste dataset poses a particular problem for training because the dataset is imbalanced. The dataset used consists of 2527 images in 6 classes. Several treatments, such as undersampling and image augmentation, are applied to overcome the imbalance. Other treatments, namely the type of input image channel and the use of filters, are combined with these into 24 experimental scenarios to find the highest accuracy. In the best scenario, the dataset is undersampled and then augmented with 5 geometric transformation parameters, the input image is RGB, and a sharpening filter is applied; this yields an accuracy of 0.9919 at 20 epochs.
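As a rough illustration of two of the treatments named in the abstract, the sketch below shows random undersampling to the size of the smallest class and a sharpening filter applied to an RGB image. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names `undersample_indices` and `sharpen_rgb`, the 3x3 kernel, and the edge padding are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np


def undersample_indices(labels, rng):
    """Return sorted indices that keep the same number of samples per class.

    Assumption: every class is reduced to the size of the smallest class
    by sampling without replacement.
    """
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    n_keep = int(counts.min())  # size of the smallest class
    kept = [
        rng.choice(np.flatnonzero(labels == c), size=n_keep, replace=False)
        for c in classes
    ]
    return np.sort(np.concatenate(kept))


# A common 3x3 sharpening kernel (assumed; the abstract does not state one).
# Its weights sum to 1, so uniform regions keep their brightness.
SHARPEN_KERNEL = np.array(
    [[0, -1, 0],
     [-1, 5, -1],
     [0, -1, 0]],
    dtype=np.float32,
)


def sharpen_rgb(image):
    """Apply SHARPEN_KERNEL to each channel of an HxWx3 uint8 image."""
    img = image.astype(np.float32)
    # Repeat border pixels so the output has the same size as the input.
    padded = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    # Accumulate the nine shifted, weighted copies of the image.
    for dy in range(3):
        for dx in range(3):
            out += SHARPEN_KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w, :]
    return np.clip(out, 0, 255).astype(np.uint8)
```

In a pipeline like the one the abstract describes, the undersampled indices would select the training images first, and augmentation and filtering would then be applied to that balanced subset.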
Pages: 1367−1375
Copyright (c) 2022 Muhammad Resa Arif Yudianto, Pristi Sukmasetya, Rofi Abul Hasani, Dimas Sasongko

This work is licensed under a Creative Commons Attribution 4.0 International License.