Stock Industry Sector Prediction Based on Financial Reports Using Random Forest


  • Kamil Elian Zhafran * Telkom University, Indonesia
  • Deni Saepudin Telkom University, Indonesia
  • (*) Corresponding Author
Keywords: Industrial Sector Predictions; Indonesia Stock Exchange; Financial Statements; Random Forest; SMOTE

Abstract

This study predicts the industry sector of stocks listed on the Indonesia Stock Exchange (IDX) from their financial reports using the Random Forest method. A machine learning approach is appropriate because financial data are complex and call for robust, adaptive methods. The dataset covers companies from 10 industrial sectors on the IDX over 2010-2022, with 17 features drawn from each financial report. The number of companies per sector is imbalanced: sector B represents 14.76% and sector G only 1.98%. Because this imbalance can bias predictions toward the larger sectors, the SMOTE oversampling method is applied to address it. The research process consists of data cleaning, splitting the data into 80% training and 20% testing sets, applying SMOTE, and comparing predictions from the imbalanced and balanced datasets. Random Forest is chosen for its ability to handle complex datasets in industrial sector classification. Without oversampling, the model achieves an accuracy of 73.57%, precision of 74.29%, recall of 73.57%, and an F1-score of 73.51%. With oversampling, these metrics improve to an accuracy of 80.21%, precision of 81.34%, recall of 80.21%, and an F1-score of 80.45%.
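The oversampling step described in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest same-class neighbours. This is not the authors' implementation, which the abstract does not specify. The function names `smote` and `balance`, the default `k=5`, and the toy data are assumptions for illustration only. Following the order given in the abstract, oversampling would be applied to the training split only, so that no synthetic points reach the test set.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length feature vectors (lists of floats).
    k: number of nearest same-class neighbours considered per sample.
    """
    if n_new <= 0:
        return []
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two samples in the class")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other same-class samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # from the base sample to the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_bal, y_bal = list(X), list(y)
    for label, rows in by_class.items():
        extra = smote(rows, target - len(rows), k=k, seed=seed)
        X_bal.extend(extra)
        y_bal.extend([label] * len(extra))
    return X_bal, y_bal


# Toy training data (hypothetical): four "B" companies, two "G" companies,
# each described by two made-up financial features.
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2], [5.0, 5.0], [5.5, 4.5]]
y = ["B", "B", "B", "B", "G", "G"]
X_bal, y_bal = balance(X, y)
print(y_bal.count("B"), y_bal.count("G"))
```

After balancing, both classes in the toy example have the same number of training rows, and every synthetic "G" row lies between the two original "G" rows. A classifier such as Random Forest would then be fitted on `X_bal` and `y_bal` and evaluated on the untouched test split.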


Article History
Submitted: 2024-08-06
Published: 2024-09-12
How to Cite
Zhafran, K., & Saepudin, D. (2024). Stock Industry Sector Prediction Based on Financial Reports Using Random Forest. Building of Informatics, Technology and Science (BITS), 6(2), 1002-1011. https://doi.org/10.47065/bits.v6i2.5743
Section
Articles