Evaluation of the Montreal Forced Aligner and Goodness of Pronunciation for Sundanese Pronunciation Assessment


  • Abdul Fatahillah (*), Sekolah Tinggi Teknologi Terpadu Nurul Fikri, Depok, Indonesia
  • Sigit Puspito Wigati Jarot, Sekolah Tinggi Teknologi Terpadu Nurul Fikri, Depok, Indonesia
  • (*) Corresponding Author
Keywords: Sundanese Language; Forced Alignment; Goodness of Pronunciation; Low-Resource Language; Automatic Pronunciation Assessment; Single-Word Audio

Abstract

Sundanese is the second most widely spoken regional language in Indonesia, yet automated pronunciation assessment systems for it remain extremely scarce. This study presents a systematic evaluation of a Montreal Forced Aligner (MFA) and Goodness of Pronunciation (GOP) pipeline for Sundanese pronunciation assessment within a prototype voice-based learning application. The dataset comprises 2,500 valid utterance samples collected from 50 native Sundanese speakers, covering 10 basa loma vocabulary items that span 20 unique phonemes. The MFA evaluation revealed total, systemic alignment failure: all 2,500 files (100%) were flagged as problematic, and 17 of the 20 phonemes were consistently assigned durations of exactly 10 milliseconds. Three distinct parameter configurations produced the same 100% failure rate, indicating that the failures stem from MFA's limitations on very short single-word audio (mean duration 0.69 seconds) in a low-resource language rather than from parameter choice. The GOP evaluation yielded a global top-1 accuracy of only 26.1%, with /l/ anomalously returned as the top-1 prediction for 14 of the 20 phonemes. Functional testing showed that the system could not discriminate correct from incorrect utterances. On the technical side, the React Native and FastAPI prototype application was implemented successfully, with 6 of 8 black-box test scenarios passing.
This research makes three principal contributions: (1) empirically, the first quantitative evidence that the standard MFA-GOP pipeline cannot be applied directly to Sundanese as a low-resource language with short-duration single-word audio; (2) methodologically, an empirical baseline and a replicable evaluation framework applicable to other regional languages of Indonesia; and (3) practically, a React Native–FastAPI client-server prototype that serves as a starting point for developing Sundanese pronunciation assessment systems with alternative approaches.
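
The two headline metrics in the abstract, the share of phone intervals collapsed to 10 milliseconds and the top-1 accuracy of the GOP stage, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the input layout (tuples of phone label, start time, end time, and pairs of canonical and predicted phone), and the sample values are all hypothetical, chosen only to show how such figures might be computed.

```python
from collections import Counter

# Threshold taken from the 10-millisecond durations reported in the abstract.
MIN_DURATION_S = 0.010


def collapsed_fraction(intervals, min_dur=MIN_DURATION_S, tol=1e-6):
    """Fraction of phone intervals whose duration is at most min_dur.

    intervals: list of (phone, start_s, end_s) tuples (hypothetical layout).
    tol absorbs floating-point rounding in end_s - start_s.
    """
    if not intervals:
        return 0.0
    collapsed = sum(1 for _, start, end in intervals if end - start <= min_dur + tol)
    return collapsed / len(intervals)


def top1_accuracy(pairs):
    """Share of phones whose top-1 prediction equals the canonical phone.

    pairs: list of (canonical_phone, predicted_top1_phone) tuples.
    """
    if not pairs:
        return 0.0
    return sum(1 for canonical, predicted in pairs if canonical == predicted) / len(pairs)


def dominant_prediction(pairs):
    """Most frequent top-1 prediction and its count, to expose a single-phone bias."""
    return Counter(predicted for _, predicted in pairs).most_common(1)[0]


# Made-up alignment for one 0.69-second word: three phones collapsed to 10 ms.
intervals = [("l", 0.00, 0.01), ("i", 0.01, 0.02), ("m", 0.02, 0.03), ("a", 0.03, 0.69)]
# Made-up GOP output in which /l/ wins for most phones.
pairs = [("l", "l"), ("i", "l"), ("m", "l"), ("a", "a")]

print(collapsed_fraction(intervals))
print(top1_accuracy(pairs))
print(dominant_prediction(pairs))
```

Reporting the dominant prediction alongside the accuracy is a deliberate choice in this sketch: a low top-1 accuracy alone does not reveal whether errors are scattered or concentrated on one phone, which is the pattern the abstract describes for /l/.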




Article History
Published: 2026-05-31