The Effect of Data Preprocessing on Imbalanced Datasets in Waste Image Classification Using the Convolutional Neural Network Algorithm


  • Muhammad Resa Arif Yudianto, Universitas Muhammadiyah Magelang, Magelang, Indonesia
  • Pristi Sukmasetya, Universitas Muhammadiyah Magelang, Magelang, Indonesia
  • Rofi Abul Hasani, Universitas Muhammadiyah Magelang, Magelang, Indonesia
  • Dimas Sasongko (*), Universitas Muhammadiyah Magelang, Magelang, Indonesia

(*) Corresponding Author
Keywords: Trash Classification; Imbalanced Dataset; Convolutional Neural Network; MobileNet Architecture; Deep Learning

Abstract

Waste is one of Indonesia's most significant problems, with the amount of waste increasing each year and reaching 187.2 million tonnes/year. Various efforts to reduce the amount of waste, such as Garbage Banks, have been encouraged. However, this program has not run well because some people have difficulty distinguishing between types of waste. One solution is a system that can classify the type of waste. Deep learning with the convolutional neural network (CNN) algorithm is currently widely used to solve classification problems. This method requires a large dataset to reach a high level of accuracy. Obtaining a waste dataset poses a particular problem for training because the dataset is imbalanced. The dataset used comprises 2527 images in 6 classes. Several treatments, such as undersampling and image augmentation, are applied to address the imbalance. Other treatments, namely the type of input image channel and the use of filters, are combined with these into 24 experimental scenarios to find the highest accuracy. In the best scenario, the dataset is undersampled and then augmented with 5 geometric transformation parameters, the input images are RGB, and a sharpening filter is applied, giving an accuracy of 0.9919 at 20 epochs.
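The undersampling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain random undersampling down to the size of the rarest class, which is one common variant and not necessarily the exact procedure used in the paper. The function name `undersample`, the file names, and the per-class counts in `toy_counts` are hypothetical and chosen only for illustration; they are not the real class distribution of the 2527-image dataset.

```python
import random
from collections import Counter, defaultdict


def undersample(samples, seed=0):
    """Randomly drop samples so every class keeps as many items as the rarest class.

    `samples` is a list of (file_name, label) pairs. This is an assumed,
    simplified form of random undersampling, not the authors' implementation.
    """
    by_class = defaultdict(list)
    for item in samples:
        by_class[item[1]].append(item)

    # The rarest class sets the number of samples kept per class.
    target = min(len(items) for items in by_class.values())

    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy counts for six classes (illustration only).
toy_counts = {"cardboard": 8, "glass": 10, "metal": 8,
              "paper": 12, "plastic": 9, "trash": 3}
toy_samples = [(f"{label}_{i}.jpg", label)
               for label, n in toy_counts.items()
               for i in range(n)]

balanced = undersample(toy_samples)
class_counts = Counter(label for _, label in balanced)
print(class_counts)
```

After this step every class contributes the same number of images, at the cost of discarding data from the larger classes. That loss is why the abstract pairs undersampling with image augmentation afterwards.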


Article History
Submitted: 2022-11-23
Published: 2022-12-26
Abstract View: 2148 times
PDF Download: 1310 times
How to Cite
Resa Arif Yudianto, M., Sukmasetya, P., Abul Hasani, R., & Sasongko, D. (2022). Pengaruh Data Preprocessing terhadap Imbalanced Dataset pada Klasifikasi Citra Sampah menggunakan Algoritma Convolutional Neural Network. Building of Informatics, Technology and Science (BITS), 4(3), 1367–1375. https://doi.org/10.47065/bits.v4i3.2575
