Implementation of Tesseract OCR and Bounding Box for Text Extraction on Food Nutrition Labels
Abstract
This study implements Optical Character Recognition (OCR) with the Tesseract engine, combined with bounding box detection, to extract nutritional information from food nutrition labels. The research addresses limited consumer access to and understanding of nutritional data, a factor contributing to health problems such as obesity and related metabolic disorders. Studies indicate that although Indonesian consumers generally have good knowledge of and positive attitudes toward nutrition labels, their actual reading and use of these labels remain limited. Moreover, packaged foods consumed outside the home account for a significant share of daily caloric intake, which can lead to health complications if not properly managed. With adult obesity in Indonesia rising to concerning levels, this study underscores the importance of making nutritional data accessible. In this work, MobileNetV1 serves as the backbone model for bounding box detection, identifying and isolating label regions to improve OCR accuracy. Tesseract OCR, whose recognition engine is LSTM-based, is applied to recognize sequential text patterns such as the rows of a nutrition table. Preprocessing techniques, including grayscale conversion, brightness adjustment, CLAHE (Contrast Limited Adaptive Histogram Equalization), and denoising, improve text clarity and further refine OCR output. Post-processing applies rule-based and contextual error correction to handle common OCR mistakes. Evaluated on 10 different label images, the system achieved a maximum Word Error Rate (WER) of 10% and a Character Error Rate (CER) of 1.6%, demonstrating high accuracy in nutritional information extraction.
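For illustration, the following is a minimal sketch of the preprocessing chain named in the abstract (grayscale conversion, brightness adjustment, CLAHE, denoising), assuming OpenCV; the gain, offset, clip limit, and filter-strength values are illustrative assumptions, not the settings used in the study.

```python
import cv2

def preprocess(image_bgr):
    """Illustrative preprocessing chain for a nutrition-label photo."""
    # Grayscale conversion: OCR operates on single-channel input.
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # Brightness adjustment: simple linear gain/offset (assumed values).
    bright = cv2.convertScaleAbs(gray, alpha=1.2, beta=15)
    # CLAHE: local contrast equalization so faint label text stands out.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    equalized = clahe.apply(bright)
    # Denoising: non-local means smoothing that preserves glyph edges.
    return cv2.fastNlMeansDenoising(equalized, h=10)
```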
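A sketch of how a detected label region might be cropped and passed to Tesseract, assuming pytesseract and an (x, y, w, h) pixel box from the MobileNetV1-based detector; the page segmentation mode shown is an assumption, not necessarily the study's configuration.

```python
import pytesseract

def ocr_label_region(image, box):
    # box is assumed to be (x, y, w, h) in pixels from the detector;
    # image is a NumPy array, e.g. the output of preprocess() above.
    x, y, w, h = box
    crop = image[y:y + h, x:x + w]
    # --psm 6 tells Tesseract to treat the crop as a single uniform block
    # of text, which suits the row-by-row layout of a nutrition table.
    return pytesseract.image_to_string(crop, config="--psm 6")
```

Cropping to the detected region before recognition keeps surrounding packaging artwork out of the OCR input, which is the accuracy benefit the abstract attributes to the bounding box stage.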
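The rule-based correction step could look like the sketch below; these particular substitutions are hypothetical examples of common OCR confusions on nutrition labels, not the paper's actual rule set.

```python
import re

# Hypothetical rules for common digit/letter confusions in OCR output;
# the study's actual rule set is not reproduced here.
RULES = [
    (re.compile(r"(?<=\d)[Oo]"), "0"),  # letter O inside a number -> zero
    (re.compile(r"(?<=\d)[lI]"), "1"),  # l or I inside a number -> one
    (re.compile(r"\bmq\b"), "mg"),      # unit 'mg' misread as 'mq'
]

def correct_line(line):
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line

print(correct_line("Natrium 12O mq"))  # -> "Natrium 120 mg"
```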
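Finally, the reported WER and CER are both edit-distance ratios; a self-contained sketch of how they are conventionally computed, with WER over whitespace-separated tokens and CER over characters:

```python
def levenshtein(ref, hyp):
    # Dynamic-programming edit distance between two sequences.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    ref = reference.split()
    return levenshtein(ref, hypothesis.split()) / len(ref)

def cer(reference, hypothesis):
    return levenshtein(reference, hypothesis) / len(reference)
```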