Utilizing Transformers for Text Summarization: A Case Study on Learning Video Transcriptions

Muhammad Furqon Fadlilah; Aldy Rialdy Atmadja; Muhammad Deden Firdaus

doi:10.47065/bits.v6i3.6342

Muhammad Furqon Fadlilah * Universitas Islam Negeri Sunan Gunung Djati, Bandung, Indonesia
Aldy Rialdy Atmadja Universitas Islam Negeri Sunan Gunung Djati, Bandung, Indonesia https://orcid.org/0000-0001-7798-8078
Muhammad Deden Firdaus Universitas Islam Negeri Sunan Gunung Djati, Bandung, Indonesia

(*) Corresponding Author

Keywords: Whisper Model; Text Summary; Recall-Oriented Understudy for Gisting Evaluation; Text-to-Text Transfer Transformer; Video Transcription

Abstract

In the digital era, learning videos are increasingly used; however, they often contain irrelevant information that makes the content difficult to comprehend. This study proposes an approach based on the Whisper and T5 models to generate text summaries from YouTube educational video transcripts. Whisper is used for speech-to-text transcription, with a focus on model variants that offer a low Word Error Rate (WER) and short processing time. The T5 model is then fine-tuned to produce accurate text summaries, and the transcript is segmented to work around the model's input length limit. Text preprocessing is not applied, because omitting it yielded better evaluation scores. The results show that the combination of Whisper Turbo and the optimized T5 model performs best, with ROUGE F1-scores of 39.23 (ROUGE-1), 13.17 (ROUGE-2), and 23.84 (ROUGE-L). The approach generates more relevant and comprehensive text summaries, supporting more effective video-based learning. The study thereby contributes to the development of text summarization technology for learning videos.
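
The transcript segmentation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how segments are formed, so everything here is an assumption: the function name `segment_transcript`, the `max_words` and `overlap` parameters, and the use of whitespace-separated words as a stand-in for the subword tokens that actually determine a T5 model's input limit.

```python
def segment_transcript(text, max_words=300, overlap=0):
    """Split a transcript into chunks of at most `max_words` words.

    Hypothetical helper: word counts only approximate a model's
    token limit, and the study's real segmentation may differ.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this window has reached the end of the transcript,
        # so no trailing chunk made up only of overlap words is emitted.
        if start + max_words >= len(words):
            break
    return chunks


if __name__ == "__main__":
    transcript = " ".join(f"w{i}" for i in range(10))
    for chunk in segment_transcript(transcript, max_words=4, overlap=1):
        print(chunk)
```

Each chunk would then be summarized separately and the partial summaries joined; how the study combines them is not stated in the abstract.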
