Heart Disease Detection Using SVM and XGBoost with SHAP Interpretability and LLM Integration
Abstract
Cardiovascular disease remains the leading cause of death globally and demands accurate early detection, yet limited access to specialist medical personnel in developing countries often delays diagnosis. This study addresses the gap between the high accuracy that machine learning models achieve in academic research and their minimal adoption in practical clinical applications by developing a safe and trustworthy hybrid artificial intelligence system for heart disease triage. The proposed methodology uses a dual-model architecture in which a Support Vector Machine (SVM) serves as the primary prediction model and Extreme Gradient Boosting (XGBoost) as a second-opinion model. Both models are trained with the SMOTE oversampling technique to handle class imbalance, and SHAP (SHapley Additive exPlanations) is applied to make the decisions of these black-box models transparent. The system is further extended with Dynamic Prompt Engineering on the Mistral-7B Large Language Model, which translates numerical probabilities into safe, personalized, and empathetic medical narratives. Experimental results show that the SVM with an RBF kernel performs best, with an accuracy of 90.22% and a sensitivity of 94.12%, the latter being crucial for minimizing false negatives in medical screening; the XGBoost model recorded an accuracy of 88.04%. Interpretability analysis identified chest pain type, cholesterol level, and maximum heart rate as the primary risk indicators, which is consistent with standard cardiology guidelines. A dual safety validation mechanism, combining programmed risk thresholds with control of the language generation temperature, is designed to keep the system from producing harmful diagnostic hallucinations. In conclusion, the system, implemented as a FastAPI-based microservice, proved technically feasible with low latency and offers an accurate, transparent, and communicative early screening solution that supports the efficiency of healthcare services.
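To make the dual safety validation idea concrete, the following is a minimal sketch of how a probability from the primary model could be mapped to a programmed risk band, compared with the second-opinion model, and turned into a constrained prompt for the language model. It is an illustration under stated assumptions and not the authors' implementation: the cut-off values, the temperature, the prompt wording, and all names (`risk_band`, `build_triage`, `TriageResult`) are hypothetical, since the abstract does not report them.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical values: the abstract does not report the actual thresholds
# or the generation temperature used by the system.
LOW_RISK_MAX = 0.30      # below this probability the case is "low" risk
HIGH_RISK_MIN = 0.70     # at or above this probability the case is "high" risk
LLM_TEMPERATURE = 0.2    # a low temperature keeps the generated text conservative


@dataclass
class TriageResult:
    risk_band: str
    models_agree: bool
    temperature: float
    prompt: str


def risk_band(probability: float) -> str:
    """Map a predicted probability to a fixed, programmed risk band."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if probability < LOW_RISK_MAX:
        return "low"
    if probability < HIGH_RISK_MIN:
        return "moderate"
    return "high"


def build_triage(svm_prob: float, xgb_prob: float,
                 top_features: List[str]) -> TriageResult:
    """Combine the primary (SVM) and second-opinion (XGBoost) probabilities
    into a risk band and a constrained prompt for the language model."""
    band = risk_band(svm_prob)                 # the primary model decides the band
    agree = band == risk_band(xgb_prob)        # the second opinion only confirms
    prompt = (
        f"The screening model estimates a {band} risk of heart disease "
        f"(probability {svm_prob:.2f}). "
        f"Main contributing factors: {', '.join(top_features)}. "
        "Explain this result in plain, empathetic language. "
        "Do not state a diagnosis; recommend consulting a doctor."
    )
    if not agree:
        prompt += (" The second-opinion model disagrees, "
                   "so advise a follow-up examination.")
    return TriageResult(band, agree, LLM_TEMPERATURE, prompt)


result = build_triage(0.82, 0.55,
                      ["chest pain type", "cholesterol", "maximum heart rate"])
print(result.risk_band, result.models_agree, result.temperature)
```

In this sketch the risk band is computed by ordinary code before any text is generated, so the language model can only phrase a decision that has already been made; the low temperature is the second, independent control on the wording.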
Pages: 14-26
Copyright (c) 2026 Raihan Al Aziz, Egia Rosi Subhiyakto

This work is licensed under a Creative Commons Attribution 4.0 International License.