Optimization of Support Vector Machine Using RandomizedSearchCV and SMOTE for Fitness Classification Based on Physiological Parameters


  • Gema Amran Nathansyach, Universitas Dian Nuswantoro, Semarang, Indonesia
  • Purwanto Purwanto *, Universitas Dian Nuswantoro, Semarang, Indonesia
  • (*) Corresponding Author
Keywords: Support Vector Machine; RandomizedSearchCV; SMOTE; Machine Learning; Fitness Classification

Abstract

This study aims to improve the accuracy of a Support Vector Machine (SVM) model that classifies fitness status (fit/unfit) from physiological and lifestyle parameters. It uses the Fitness Classification Dataset, a synthetic dataset designed to represent fitness indicators such as BMI, height, weight, heart rate, blood pressure, nutritional quality, sleep duration, and activity index. The dataset has an imbalanced class distribution and mixes numerical and categorical features, so it requires comprehensive preprocessing. Two optimization techniques are applied: RandomizedSearchCV for efficient hyperparameter tuning and SMOTE for handling class imbalance. In the experiments, the baseline SVM reaches an accuracy of 75.75%, while the combination of SVM + RandomizedSearchCV + SMOTE raises accuracy to 80%, a gain of 4.25 percentage points. The AUC also increases from 0.835 for the baseline to 0.850 for the optimized model. These findings indicate that integrating RandomizedSearchCV and SMOTE improves the model's ability to capture non-linear patterns and increases its sensitivity to the minority class. Overall, the results suggest that the optimized SVM pipeline provides more stable and accurate performance in fitness status classification and can serve as a reference for developing predictive models in other health domains.
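
To make the class-balancing step concrete, the sketch below shows the core idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. It is a minimal standard-library illustration and not the authors' implementation; the function name `smote`, the toy feature values, and the choice of k are assumptions made for this example. In practice a library implementation such as `SMOTE` from imbalanced-learn would normally be used, and it is generally applied to the training data only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch only (hypothetical helper, not a library API).
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of its k nearest minority neighbours (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        # Interpolate: base + gap * (other - base), with gap in [0, 1).
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data with two scaled numeric features per sample.
majority = [[0.1 * i, 0.2 * i] for i in range(8)]
minority = [[5.0, 5.0], [5.5, 4.8], [6.0, 5.2]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies between two existing minority samples, the oversampled class stays inside the region already covered by real minority data. Note that this interpolation only makes sense for numerical features; categorical features need to be encoded or handled by a SMOTE variant designed for mixed data.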

Article History
Submitted: 2025-11-22
Published: 2025-12-26
How to Cite
Nathansyach, G., & Purwanto, P. (2025). Optimasi Support Vector Machine Menggunakan RandomizedSearchCV dan SMOTE untuk Klasifikasi Kebugaran Berdasarkan Parameter Fisiologis. Building of Informatics, Technology and Science (BITS), 7(3), 1854-1865. https://doi.org/10.47065/bits.v7i3.8770