Milk Production Estimation Model for Cattle Based on Image Processing using Random Forest, XGBoost, and LightGBM


  • Za'imatun Niswati * IPB University, Bogor, Indonesia
  • Sri Nurdiati IPB University, Bogor, Indonesia
  • Agus Buono IPB University, Bogor, Indonesia
  • Cece Sumantri IPB University, Bogor, Indonesia
  • (*) Corresponding Author
Keywords: Milk Production; Random Forest; XGBoost; LightGBM

Abstract

Milk is a livestock product consumed by people of all ages, so increasing milk production in Indonesia is essential to meeting domestic demand. Growth in the dairy cattle population and in milk production has not kept pace with rising consumption, leaving the country reliant on imports for most dairy products and their derivatives, and import volumes have increased steadily over the years. Alternative solutions are therefore needed to raise milk production. One approach is to develop a milk production estimation model that helps farmers and livestock companies determine the optimal number of dairy cattle to raise to meet domestic demand. The objective of this study was to build a milk production estimation model for dairy cattle through image analysis using the Random Forest, XGBoost, and LightGBM algorithms. The model used CLAHE for contrast enhancement and VGG-16 for feature extraction. The results showed that XGBoost performed best, explaining 74% of the variation in the response variable (Y) with a relatively small estimation error of 0.92. After parameter tuning with Grid Search, XGBoost explained 86% of the variation in Y, and the estimation error decreased to 0.72. Image processing and machine learning are part of precision agriculture, which aims to improve the efficiency, productivity, and sustainability of livestock operations.
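
The two figures reported for each model, the share of variation in Y explained and the estimation error, can be illustrated with a minimal Python sketch. It assumes the explained variation is the coefficient of determination (R²) and that the error is a root mean squared error (RMSE); the abstract does not name the error metric, so RMSE is an assumption here. The function names and the sample yields are hypothetical and are not taken from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: share of the variance in y_true explained by y_pred."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error (assumed metric; the abstract only says 'estimation error')."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical milk yields and model estimates, for illustration only.
y_true = [10.0, 12.0, 14.0, 16.0, 18.0]
y_pred = [11.0, 12.0, 13.0, 17.0, 18.0]

print(f"R^2  = {r_squared(y_true, y_pred):.3f}")
print(f"RMSE = {rmse(y_true, y_pred):.3f}")
```

Under this reading, an R² of 0.86 means the estimates account for 86% of the variance in observed yield, while the error is expressed in the same units as the yield itself.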



Article History
Submitted: 2025-06-14
Published: 2025-09-05
How to Cite
Niswati, Z., Nurdiati, S., Buono, A., & Sumantri, C. (2025). Milk Production Estimation Model for Cattle Based on Image Processing using Random Forest, XGBoost, and LightGBM. Building of Informatics, Technology and Science (BITS), 7(2), 1302-1308. https://doi.org/10.47065/bits.v7i2.7585
Section
Articles