Comparative Analysis of Loss Functions for Predicting Autoimmunity from Molecular Descriptors Using Deep Learning

Candra Gunawan; Robet Robet; Hendri Hendri

doi:10.47065/bits.v7i3.8581

Candra Gunawan (*), STMIK Time, Medan, Indonesia
Robet Robet, STMIK Time, Medan, Indonesia
Hendri Hendri, STMIK Time, Medan, Indonesia

(*) Corresponding Author

Keywords: Autoimmunity; Molecular Descriptors; Deep Learning; Loss Function; Class Imbalance

Abstract

Drug-induced autoimmunity (DIA) is a difficult problem in pharmacological safety because it is rare and manifests unpredictably, and the resulting clinical datasets are typically class-imbalanced. This study investigates how three loss functions, Binary Cross-Entropy (BCE), Focal Loss, and Dice Loss, affect the performance of three deep learning architectures: a Multi-Layer Perceptron (MLP), a Convolutional Neural Network (CNN), and a 2-Layer Neural Network (SimpleNN). Models were trained on numerical molecular descriptors from the publicly available DIA dataset. The architectures were chosen for their complementary properties: the MLP is suited to high-dimensional tabular descriptor data, the CNN was included to test whether 1D convolutions can capture localized interactions among correlated descriptors, and SimpleNN served as a lightweight baseline. Stratified 5-fold cross-validation was used so that every fold preserves the class ratio of the full dataset. The MLP trained with Focal Loss consistently delivered the best classification performance, with average scores of 94% accuracy, 93% precision, 95% recall, 94% F1-score, and an AUC of 0.97. The CNN and SimpleNN architectures performed worse under the same loss configurations. These findings highlight the importance of matching the loss function to model complexity when working with imbalanced biomedical data. The results contribute to more reliable computational frameworks for early-phase immunogenicity screening and support precision pharmacovigilance strategies.
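
As a rough illustration of the three loss functions compared in the abstract, the following sketch computes each one for a batch of binary labels and predicted probabilities. It is not the authors' implementation: the function names, the clipping constant, the Dice smoothing term, and the Focal Loss settings (gamma = 2, alpha = 0.25, the commonly used defaults from the original Focal Loss paper) are assumptions, since the abstract does not report the hyperparameters used in the study.

```python
import math

EPS = 1e-7  # assumed clipping constant; keeps log() finite at p = 0 or p = 1


def _clip(p):
    return min(max(p, EPS), 1.0 - EPS)


def bce_loss(y_true, y_prob):
    """Mean binary cross-entropy over a batch."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = _clip(p)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def focal_loss(y_true, y_prob, gamma=2.0, alpha=0.25):
    """Mean binary focal loss; gamma and alpha are assumed values."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = _clip(p)
        p_t = p if y == 1 else 1 - p              # probability given to the true class
        alpha_t = alpha if y == 1 else 1 - alpha  # class-balancing weight
        # (1 - p_t) ** gamma shrinks the contribution of easy, well-classified samples
        total += -alpha_t * (1 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


def dice_loss(y_true, y_prob, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(y) + sum(p) + smooth)."""
    overlap = sum(y * p for y, p in zip(y_true, y_prob))
    denom = sum(y_true) + sum(y_prob)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)


if __name__ == "__main__":
    # Toy imbalanced batch: one positive among four samples (made-up values).
    labels = [1, 0, 0, 0]
    probs = [0.9, 0.1, 0.2, 0.05]
    print("BCE  :", bce_loss(labels, probs))
    print("Focal:", focal_loss(labels, probs))
    print("Dice :", dice_loss(labels, probs))
```

With gamma = 0 and alpha = 0.5, the focal loss in this sketch reduces to half the BCE value, which makes the role of the focusing term easy to see: raising gamma only reduces the weight of samples the model already classifies confidently.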
