Salary Range Classification of Job Vacancies on Glints in the Jabodetabek Region Using Logistic Regression and Random Forest Based on Web Scraping


  • Evander Banjarnahor (*), Universitas Pelita Harapan, Tangerang, Indonesia
  • Theodore Miracle Setiawan, Universitas Pelita Harapan, Tangerang, Indonesia
  • Wellson Antonio Charlest, Universitas Pelita Harapan, Tangerang, Indonesia
  • Ronald Belferik, Universitas Pelita Harapan, Tangerang, Indonesia
  • (*) Corresponding Author
Keywords: Machine Learning; Random Forest; Logistic Regression; Salary Classification

Abstract

Digital transformation has reshaped the labor market, and online platforms such as Glints now serve as large-scale data repositories that connect job seekers with employers. In the Greater Jakarta (Jabodetabek) region, salary information is a critical factor in career decision-making, yet salary-related information asymmetry remains a major challenge. This study first presents a descriptive analysis of 1,497 job vacancies collected through web scraping to examine salary distributions across locations and employment statuses. Salaries were classified into three categories: low (≤ IDR 5 million), medium (IDR 5–10 million), and high (≥ IDR 10 million). Most vacancies fall into the low-salary category (77.09%), followed by the medium-salary category (21.37%), while high-salary positions make up only 1.54% of the dataset. The study then develops salary category classification models by comparing two machine learning methods: Logistic Regression and Random Forest. Model performance was evaluated using accuracy, precision, recall, and F1-score under multiple training–testing split scenarios. Random Forest consistently outperforms Logistic Regression, reaching a peak accuracy of 98.00% compared with approximately 79% for Logistic Regression. These findings suggest that the relationship between job characteristics and salary categories is complex and non-linear, so it is captured more effectively by ensemble-based, non-linear algorithms such as Random Forest. This study contributes to improving salary transparency and supports the development of more accurate, data-driven salary prediction systems.
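
The three-way salary labeling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the sample salaries, and the handling of the boundary values are assumptions made for illustration. In particular, the abstract's ranges touch at IDR 5 million and IDR 10 million, so the sketch assumes that exactly IDR 5 million counts as low and exactly IDR 10 million counts as high.

```python
from collections import Counter

# Thresholds taken from the abstract; which class owns a boundary value
# is an assumption (5 million -> low, 10 million -> high).
LOW_MAX_IDR = 5_000_000
HIGH_MIN_IDR = 10_000_000


def salary_category(salary_idr: int) -> str:
    """Map a monthly salary in IDR to 'low', 'medium', or 'high'."""
    if salary_idr <= LOW_MAX_IDR:
        return "low"
    if salary_idr >= HIGH_MIN_IDR:
        return "high"
    return "medium"


def category_shares(salaries: list[int]) -> dict[str, float]:
    """Return the percentage of salaries in each category, rounded to 2 decimals."""
    if not salaries:
        raise ValueError("salaries must not be empty")
    counts = Counter(salary_category(s) for s in salaries)
    total = len(salaries)
    return {
        name: round(100 * counts.get(name, 0) / total, 2)
        for name in ("low", "medium", "high")
    }


# Hypothetical sample values, not taken from the scraped dataset.
sample = [4_500_000, 5_000_000, 7_250_000, 10_000_000, 3_000_000]
shares = category_shares(sample)
print(shares)
```

Applied to the scraped vacancies, a function like `category_shares` would yield the kind of class distribution reported in the abstract, and the resulting labels would serve as the target variable for the two classifiers.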



Article History
Published: 2026-01-30