Implementation of Tesseract OCR and Bounding Box for Text Extraction on Food Nutrition Labels


  • The Manuel Eric Saputra*, Universitas Dian Nuswantoro, Semarang, Indonesia
  • Ajib Susanto, Universitas Dian Nuswantoro, Semarang, Indonesia
  • Bastiaans Jessica Carmelita, Universitas Dian Nuswantoro, Semarang, Indonesia
  • (*) Corresponding Author
Keywords: OCR; Bounding Box; Preprocessing; Nutrition Labels; Word Error Rate; Character Error Rate; Postprocessing

Abstract

This study implements Optical Character Recognition (OCR) using the Tesseract engine, integrated with bounding box detection, to extract nutritional information from food nutrition labels. The research addresses the challenge of limited consumer access to and understanding of nutritional data, a factor contributing to health issues such as obesity and related metabolic disorders. Studies indicate that although Indonesian consumers generally have good knowledge of and positive attitudes toward nutrition labels, the actual behavior of reading and understanding these labels remains limited. Additionally, packaged foods consumed outside the home constitute a significant portion of daily caloric intake, which can lead to health complications if not properly managed. With adult obesity in Indonesia rising to concerning levels, this study underscores the importance of making nutritional data accessible. In this work, MobileNetV1 serves as the backbone model for bounding box detection, identifying and isolating label regions to enhance OCR accuracy. Tesseract OCR, with its LSTM-based recognition engine, is applied to recognize sequential text patterns such as the rows of text on nutrition labels. Preprocessing techniques, including grayscale conversion, brightness adjustment, CLAHE (Contrast Limited Adaptive Histogram Equalization), and denoising, are used to improve text clarity and further refine OCR accuracy. Post-processing involves rule-based and contextual error correction to handle common OCR inaccuracies. Evaluated on 10 different label images, the system achieved a maximum Word Error Rate (WER) of 10% and a Character Error Rate (CER) of 1.6%, demonstrating high accuracy in nutritional information extraction.
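To make the described pipeline concrete, the sketch below shows one way the preprocessing chain and Tesseract call could be wired together using OpenCV and pytesseract. The parameter values, the input file name, and the assumption that the MobileNetV1 detector has already cropped the label region are illustrative, not details taken from the paper:

```python
import cv2
import numpy as np
import pytesseract

def preprocess_label(image_path: str) -> np.ndarray:
    """Apply the paper's preprocessing chain: grayscale, brightness
    adjustment, CLAHE contrast enhancement, and denoising. Parameter
    values here are illustrative assumptions, not reported settings."""
    img = cv2.imread(image_path)  # label region, assumed already cropped by the detector
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)             # grayscale conversion
    bright = cv2.convertScaleAbs(gray, alpha=1.0, beta=30)   # simple brightness boost
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    contrasted = clahe.apply(bright)                         # CLAHE contrast enhancement
    denoised = cv2.fastNlMeansDenoising(contrasted, None, h=10)  # remove residual noise
    return denoised

def extract_text(image_path: str) -> str:
    """Run Tesseract's LSTM engine (--oem 1) on the preprocessed label.
    --psm 6 treats the label as a single uniform block of text lines."""
    processed = preprocess_label(image_path)
    return pytesseract.image_to_string(processed, config="--oem 1 --psm 6")

if __name__ == "__main__":
    print(extract_text("nutrition_label.jpg"))  # hypothetical input file
```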
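The post-processing stage can likewise be sketched as rule-based correction of common digit/letter confusions in numeric nutrition values. The confusion table and regex below are illustrative assumptions rather than the authors' actual rule set:

```python
import re

# Common OCR letter/digit confusions in numeric contexts (illustrative only).
OCR_FIXES = {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}

def fix_numeric_tokens(text: str) -> str:
    """Replace letter lookalikes inside tokens that contain at least one
    digit, e.g. '2O0 mg' -> '200 mg', while leaving ordinary words alone."""
    def fix_token(match: re.Match) -> str:
        return "".join(OCR_FIXES.get(ch, ch) for ch in match.group(0))
    # Match runs of digits and lookalike letters, but only when the token
    # actually contains a digit (the lookahead), so words are untouched.
    return re.sub(r"\b(?=\w*\d)[\dOolISB]+\b", fix_token, text)

print(fix_numeric_tokens("Natrium 2O0 mg, Gula l2 g"))  # -> "Natrium 200 mg, Gula 12 g"
```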
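Finally, the reported WER and CER are conventionally computed as length-normalized Levenshtein distances over words and characters, respectively. The following is the standard formulation, not code from the paper, applied to a hypothetical label line:

```python
def levenshtein(ref, hyp) -> int:
    """Edit distance between two sequences via dynamic programming."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference word count."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return levenshtein(ref_words, hyp_words) / len(ref_words)

def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edit distance / reference length."""
    return levenshtein(reference, hypothesis) / len(reference)

# Hypothetical example: one substituted word out of ten gives WER = 0.10.
ref = "Energi total 150 kkal Lemak total 5 g Natrium 200"
hyp = "Energi total 150 kkal Lemak total 5 g Natrium 2O0"
print(f"WER = {wer(ref, hyp):.2f}, CER = {cer(ref, hyp):.3f}")
```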




Article History
Submitted: 2024-10-21
Published: 2024-12-03
How to Cite
Saputra, T. M., Susanto, A., & Carmelita, B. (2024). Implementation of Tesseract OCR and Bounding Box for Text Extraction on Food Nutrition Labels. Building of Informatics, Technology and Science (BITS), 6(3), 1403-1412. https://doi.org/10.47065/bits.v6i3.6107