The Improvement of Android Malware Family Detection through System Call Feature Analysis and Machine Learning
Abstract
Malware poses a significant threat to cybersecurity, particularly for Android users. Malware is grouped into categories and families, each exhibiting distinct malicious capabilities. Accurately identifying these categories and families is crucial for developing effective prevention and mitigation strategies, so that threats can be contained before they worsen. Over the years, numerous techniques have been proposed for detecting malware families, with system calls emerging as a vital feature. Collected through dynamic analysis, system calls offer in-depth insight into the activities executed by malware, making them a powerful basis for classification. This study aims to enhance the detection of Android malware families and categories by analyzing system calls with a feature selection method. The Gain Ratio algorithm is used to identify significant system calls, improving detection accuracy and reducing the complexity of the feature set. The study assesses four machine learning algorithms: Random Forest, J48, Naïve Bayes, and Decision Table. The findings show that Random Forest consistently outperforms the other algorithms, achieving an accuracy of 88.01% for malware family detection and 89.65% for category detection, with high precision and recall in most cases. Applying Gain Ratio feature selection reduced the feature set by 68.83% and improved model-building speed by 50.26%. This integration of feature selection and machine learning provides a more effective approach to detecting malware families and categories, thus contributing to enhanced Android security.
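The Gain Ratio ranking mentioned above can be illustrated with a minimal sketch. It assumes each system call has already been reduced to a discrete feature (here, 1 if the call was observed in a sample's trace and 0 otherwise) and uses the standard definition of gain ratio: information gain divided by the split information of the feature. The toy data, the family labels, the `threshold` value, and the helper names `gain_ratio` and `select_features` are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Gain ratio = information gain / split information."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A constant feature carries no split information; score it as 0.
    return info_gain / split_info if split_info > 0 else 0.0


def select_features(features, labels, threshold=0.0):
    """Return the names of features whose gain ratio exceeds the threshold,
    ordered from highest to lowest score."""
    scores = {name: gain_ratio(col, labels) for name, col in features.items()}
    kept = [name for name, score in scores.items() if score > threshold]
    return sorted(kept, key=lambda name: scores[name], reverse=True)


# Hypothetical toy data: four samples from two made-up families.
labels = ["family_a", "family_a", "family_b", "family_b"]
features = {
    "setsid":  [1, 1, 0, 0],  # separates the two families perfectly
    "ppoll":   [1, 0, 1, 0],  # unrelated to the family label
    "getppid": [1, 1, 1, 1],  # constant, so it cannot discriminate
}

selected = select_features(features, labels)
print(selected)
```

In this toy setting only the first system call survives selection, which mirrors the idea of shrinking the feature set before training a classifier such as Random Forest. Real traces would normally use call counts or frequencies, which need discretization before a score of this kind can be computed.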
Pages: 349-358
Copyright (c) 2024 Rajif Agung Yunmar

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).






















