Optimization of Support Vector Machine Using RandomizedSearchCV and SMOTE for Fitness Classification Based on Physiological Parameters
Abstract
This study aims to improve the accuracy of a Support Vector Machine (SVM) model for classifying fitness status (fit/unfit) from physiological and lifestyle parameters. It uses the Fitness Classification Dataset, a synthetic dataset designed to represent fitness indicators such as BMI, height, weight, heart rate, blood pressure, nutritional quality, sleep duration, and activity index. The dataset has an imbalanced class distribution and combines numerical and categorical features, so it requires comprehensive preprocessing. Two optimization techniques are applied: RandomizedSearchCV for efficient hyperparameter tuning and SMOTE for handling class imbalance. In the experiments, the baseline SVM model reaches an accuracy of 75.75%, while the combination of SVM + RandomizedSearchCV + SMOTE reaches 80%, an increase of 4.25 percentage points. The AUC also rises from 0.835 for the baseline to 0.850 for the optimized model. These findings indicate that integrating RandomizedSearchCV and SMOTE improves the model's ability to capture non-linear patterns and increases its sensitivity to the minority class. Overall, the results show that the optimized SVM pipeline delivers more stable and accurate performance in fitness status classification and can serve as a reference for developing predictive models in other health domains.
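To make the class-balancing step concrete, the following is a minimal pure-Python sketch of the core SMOTE idea: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. It is an illustration only, not the authors' implementation. The function name `smote_oversample`, the two toy features (resting heart rate, sleep hours), and all data values are hypothetical. In practice a maintained library implementation would normally be used instead.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; it follows the basic SMOTE recipe
    of interpolating between a sample and one of its k nearest neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours and interpolate towards it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up data: (resting heart rate, sleep hours).
majority = [
    (62.0, 7.5), (65.0, 8.0), (60.0, 7.0), (68.0, 6.5),
    (64.0, 7.2), (66.0, 7.8), (61.0, 8.1), (63.0, 6.9),
]
minority = [(82.0, 5.0), (88.0, 4.5), (85.0, 5.5)]

# Generate enough synthetic points to match the majority class size.
new_points = smote_oversample(
    minority, n_new=len(majority) - len(minority), k=2, seed=42
)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints: 8 8
```

When oversampling is combined with cross-validated hyperparameter search, it is generally applied to the training folds only, so that synthetic points do not leak into the data used for scoring.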
Pages: 1854-1865
Copyright (c) 2025 Gema Amran Nathansyach, Purwanto Purwanto

This work is licensed under a Creative Commons Attribution 4.0 International License.