Predicting Diabetes with Machine Learning: Evaluating Tree-Based and Ensemble Models with Custom Metrics and Statistical Validation
Abstract
This study investigates the predictive performance of machine learning models in diagnosing diabetes using the Pima Indians Diabetes Dataset. Seven models were evaluated: Logistic Regression, Random Forest, Gradient Boosting, XGBoost, LightGBM, a Stacking Classifier, and a Voting Classifier. Performance was assessed with 10-fold cross-validation. The evaluation used standard metrics (accuracy, precision, recall, F1 score, and ROC AUC) together with a custom metric designed to prioritize recall while maintaining precision, reflecting the clinical importance of minimizing false negatives. LightGBM and Random Forest were the top-performing individual models, with competitive scores across all metrics. The ensemble methods, particularly the Stacking Classifier, were robust because they combine the complementary strengths of their base models. A Friedman test confirmed significant differences in model rankings (test statistic 22.77, p-value 0.00088). However, pairwise Wilcoxon signed-rank tests showed that the differences between the top models, such as LightGBM and Random Forest, were not statistically significant. These results indicate that tree-based and ensemble models are effective for this clinical diagnostic task, and that a custom metric helps align model evaluation with clinical priorities. Future work should explore hybrid modeling approaches and larger datasets to improve predictive accuracy and generalizability in real-world healthcare applications.
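The evaluation protocol summarized above can be illustrated with a minimal Python sketch. It is not the study's implementation. Several parts are assumptions made only for illustration: the data are synthetic (generated with `make_classification` to match the 768-row, 8-feature shape of the Pima dataset), only three scikit-learn models are compared rather than seven, and the custom recall-prioritizing metric, which the abstract does not define, is replaced by the F-beta score with beta = 2.

```python
from scipy.stats import friedmanchisquare, wilcoxon
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import StratifiedKFold, cross_val_score

# ASSUMPTION: synthetic stand-in data, not the real Pima dataset.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5,
    weights=[0.65, 0.35], random_state=0,
)

# ASSUMPTION: F-beta with beta=2 (recall weighted above precision) stands in
# for the study's custom metric, whose exact formula is not given.
recall_weighted = make_scorer(fbeta_score, beta=2)

# ASSUMPTION: a reduced model set using only scikit-learn estimators.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# A fixed seed gives every model the same 10 folds, so the per-fold scores
# are paired across models, as the Friedman and Wilcoxon tests require.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
fold_scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring=recall_weighted)
    for name, model in models.items()
}

# Friedman test: do the models differ in rank across folds?
friedman_stat, friedman_p = friedmanchisquare(*fold_scores.values())

# Wilcoxon signed-rank test: pairwise comparison of two models.
wilcoxon_stat, wilcoxon_p = wilcoxon(
    fold_scores["logistic_regression"], fold_scores["random_forest"]
)

for name, scores in fold_scores.items():
    print(f"{name}: mean F2 = {scores.mean():.3f}")
print(f"Friedman: statistic = {friedman_stat:.2f}, p = {friedman_p:.5f}")
print(f"Wilcoxon (LR vs RF): statistic = {wilcoxon_stat:.2f}, p = {wilcoxon_p:.5f}")
```

Sharing one fold assignment across models is the key design choice here, since both tests treat each fold as a matched block. The numbers printed by this sketch come from synthetic data and are not comparable to the statistics reported in the abstract.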
Pages: 1818-1827
Copyright (c) 2024 Gregorius Airlangga

This work is licensed under a Creative Commons Attribution 4.0 International License.