Milk Production Estimation Model for Cattle Based on Image Processing using Random Forest, XGBoost, and LightGBM
Abstract
Milk is a livestock product consumed by people of all ages, so increasing milk production in Indonesia is essential to meeting domestic demand. Growth in the dairy cattle population and in milk production has not kept pace with rising consumption, leaving the country reliant on imports for most dairy products and their derivatives, and those imports have increased steadily over the years. Alternative ways of raising milk production are therefore needed. One approach is to develop a milk production estimation model that helps farmers and livestock companies determine how many dairy cattle to raise to meet domestic demand. The objective of this study was to build a dairy milk production estimation model from image analysis using the Random Forest, XGBoost, and LightGBM algorithms. The model applies CLAHE (Contrast Limited Adaptive Histogram Equalization) for contrast enhancement and VGG-16 for feature extraction. XGBoost gave the best performance, explaining 74% of the variation in the response variable (Y) with a relatively small estimation error of 0.92. After hyperparameter tuning with Grid Search, XGBoost explained 86% of the variation in Y and the estimation error fell to 0.72. Image processing and machine learning are part of precision agriculture, which aims to improve the efficiency, productivity, and sustainability of livestock operations.
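The two figures reported above, the share of variation in Y explained by the model and the estimation error, can be illustrated with a minimal sketch. The share of explained variation is the coefficient of determination, R². The abstract does not name the error metric, so root mean squared error (RMSE) is assumed here purely for illustration. The function names and the yield values are hypothetical and are not taken from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as y (assumed metric)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical observed milk yields and model estimates (illustrative only).
y_true = [10.0, 12.0, 14.0, 16.0, 18.0]
y_pred = [11.0, 12.0, 13.0, 17.0, 18.0]

print(round(r_squared(y_true, y_pred), 3))  # share of variation in Y explained
print(round(rmse(y_true, y_pred), 3))       # typical size of an estimation error
```

Read this way, an R² of 0.86 would mean the residual sum of squares is 14% of the total sum of squares around the mean of Y, while the error figure is expressed in the same units as the yield measurements.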
Pages: 1302-1308
Copyright (c) 2025 Za'imatun Niswati, Sri Nurdiati, Agus Buono, Cece Sumantri

This work is licensed under a Creative Commons Attribution 4.0 International License.