Clustering-Based Stock Return Prediction using K-Medoids and Long Short-Term Memory (LSTM)
Abstract
This research predicts stock returns by combining K-Medoids clustering with a Long Short-Term Memory (LSTM) model. The core task is to forecast stock prices, which are then converted into return predictions. Stocks are first grouped with K-Medoids on two features, average return and return standard deviation, so that stocks with similar price movements fall into the same cluster and the training data for the LSTM model can be prepared per cluster. Within each cluster, the LSTM model is trained on the average stock price of that cluster and used to predict daily returns for each stock in the cluster. Predictions made within specific stock clusters are intended to help investors make more informed investment decisions. The data is divided into training (2013-2019) and testing (2020-2022) datasets, and the model is evaluated using Root Mean Square Error (RMSE). Across clusters, Cluster 3 performed best, with a testing RMSE of 0.0300, while Cluster 4 performed worst, with an RMSE of 0.3995. An equal-weight portfolio, tested from May 2020 to January 2023, grew in value from 1 to 2.50, with an average return of 0.0014 and a return standard deviation of 0.0158, indicating potential gains with lower risk compared to the LQ45 index.
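The non-LSTM steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the simple-return formula, the Euclidean distance on unscaled features, the alternating medoid update, daily rebalancing of the equal-weight portfolio, and the synthetic demo data are all assumptions made for illustration. The LSTM forecaster itself is left out; any model that outputs next-day prices could feed `returns_from_price_forecast`.

```python
import numpy as np

def simple_returns(prices):
    """Daily simple returns from a (days, stocks) price array (assumed formula)."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1.0

def return_features(returns):
    """One row per stock: [average return, return standard deviation]."""
    returns = np.asarray(returns, dtype=float)
    return np.column_stack([returns.mean(axis=0), returns.std(axis=0)])

def k_medoids(points, k, n_iter=100, seed=0):
    """Alternating k-medoids on Euclidean distances; returns (medoid indices, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    medoids = rng.choice(len(points), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster ends up empty
            # New medoid: the member with the smallest total distance to its cluster
            costs = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels

def returns_from_price_forecast(predicted_prices, previous_prices):
    """Convert next-day price forecasts into return predictions (assumed conversion)."""
    predicted_prices = np.asarray(predicted_prices, dtype=float)
    previous_prices = np.asarray(previous_prices, dtype=float)
    return predicted_prices / previous_prices - 1.0

def rmse(actual, predicted):
    """Root Mean Square Error between two equally shaped arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def equal_weight_value(returns, start=1.0):
    """Value path of an equal-weight portfolio, assuming daily rebalancing."""
    portfolio_returns = np.asarray(returns, dtype=float).mean(axis=1)
    return start * np.cumprod(1.0 + portfolio_returns)

# Synthetic demo data (hypothetical, not the LQ45 data used in the paper)
rng = np.random.default_rng(42)
demo_prices = 100.0 * np.cumprod(1.0 + rng.normal(0.0005, 0.02, size=(250, 12)), axis=0)
demo_returns = simple_returns(demo_prices)
medoids, labels = k_medoids(return_features(demo_returns), k=3)
for c in range(3):
    # Average price series of the cluster, the kind of input the abstract describes for the LSTM
    cluster_avg_price = demo_prices[:, labels == c].mean(axis=1)
    print(f"cluster {c}: {np.sum(labels == c)} stocks, last average price {cluster_avg_price[-1]:.2f}")
print("final equal-weight portfolio value:", round(float(equal_weight_value(demo_returns)[-1]), 4))
```

K-Medoids differs from K-Means in that each cluster centre is an actual data point, here a real stock, which makes the centres less sensitive to outlying stocks with extreme return or volatility values.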
Pages: 1301-1312
Copyright (c) 2024 Denny Sofyan, Deni Saepudin

This work is licensed under a Creative Commons Attribution 4.0 International License.