Automatic Information Extraction From Student ID Card Images Using DB and VietOCR: A Case Study at a Vietnamese University

Online First: 29/04/2026

Authors

Bao-Khanh Hoang, Van-Hai Ngo, Xuan-Truong Tran, Dung Nguyen, Duc-Phuc Nguyen

Corresponding author's email:

nguyendung@hueuni.edu.vn

DOI:

https://doi.org/10.54644/jte.2026.2101

Keywords:

MobileNetV3, Differentiable Binarization, Text Detection, Vietnamese Text Recognition, VietOCR

Abstract

Automatically extracting information from student ID card images is an important step in digitizing student management. This study proposes a two-stage processing framework that integrates computer vision and deep learning techniques: MobileNetV3-Small classifies student ID card images, while the Differentiable Binarization (DB) model and VietOCR handle Vietnamese text detection and recognition, respectively. Experimental results on a student ID card image dataset show that the classification model achieves an accuracy of 99.40% with an AUC of 0.9996, while the DB-based text detection model attains an Hmean of 89.81% after data augmentation. For text recognition, the proposed system achieves over 99% character-level accuracy and up to 98.90% full-sequence accuracy. These results demonstrate the effectiveness and practical feasibility of the proposed system, which is further validated through a proof-of-concept offline attendance application. The system is also designed for computational efficiency, so it can be deployed on resource-constrained devices without continuous internet connectivity. The proposed framework can be adapted to other types of identification documents, offering a scalable and cost-effective solution for automated data acquisition in educational institutions.
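The abstract reports three kinds of metrics: Hmean for detection, and character-level and full-sequence accuracy for recognition. The following is a minimal sketch of how such metrics are commonly computed, not the authors' evaluation code. It assumes that Hmean is the harmonic mean of detection precision and recall, that character-level accuracy is one minus the total edit distance divided by the total number of reference characters, and that strings are compared after Unicode NFC normalization; the function names are hypothetical.

```python
import unicodedata


def normalize(text: str) -> str:
    """Compose Vietnamese diacritics (NFC) so equivalent strings compare equal.
    Assumption: the abstract does not state which normalization was used."""
    return unicodedata.normalize("NFC", text)


def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of detection precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(preds: list[str], targets: list[str]) -> float:
    """Assumed definition: 1 - (total edit distance / total reference chars)."""
    pairs = [(normalize(p), normalize(t)) for p, t in zip(preds, targets)]
    total_chars = sum(len(t) for _, t in pairs)
    if total_chars == 0:
        return 0.0
    total_errors = sum(edit_distance(p, t) for p, t in pairs)
    return max(0.0, 1 - total_errors / total_chars)


def sequence_accuracy(preds: list[str], targets: list[str]) -> float:
    """Fraction of predictions that match the reference string exactly."""
    if not targets:
        return 0.0
    exact = sum(normalize(p) == normalize(t) for p, t in zip(preds, targets))
    return exact / len(targets)
```

Under these definitions a single wrong character lowers character-level accuracy only slightly but turns the whole line into a miss for full-sequence accuracy, which is why the second figure is typically the lower of the two.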

Author Biographies

Bao-Khanh Hoang, University of Sciences, Hue University, Vietnam

Bao-Khanh Hoang was born on November 15, 2004, in Hue. He is currently a fourth-year student in Information Technology, majoring in Computer Science at the University of Sciences, Hue University. Research areas: Artificial Intelligence, Computer Vision.

Email: 22T1020637@husc.edu.vn. ORCID: https://orcid.org/0009-0002-0269-9612

Van-Hai Ngo, University of Sciences, Hue University, Vietnam

Van-Hai Ngo was born on September 9, 2003, in Hue. In 2026, he graduated with a bachelor's degree in Computer Science from the University of Sciences, Hue University. Research areas: Artificial Intelligence, Computer Vision.

Email: 21T1020340@husc.edu.vn. ORCID: https://orcid.org/0009-0002-9568-6342

Xuan-Truong Tran, University of Sciences, Hue University, Vietnam

Xuan-Truong Tran was born on December 22, 2004, in Thua Thien Hue. He is currently a fourth-year student in Information Technology, majoring in Computer Science at the University of Sciences, Hue University. Research areas: Artificial Intelligence, Computer Vision.

Email: 22T1020784@husc.edu.vn. ORCID: https://orcid.org/0009-0001-0804-5946

Dung Nguyen, University of Sciences, Hue University, Vietnam

Dung Nguyen was born on June 13, 1988, in Thua Thien Hue. He graduated with a bachelor’s degree in Information Technology from the College of Sciences, Hue University in 2010, and with a master’s degree in Computer Science from the same institution in 2013. He currently works at the University of Sciences, Hue University. Research areas: Software Technology, Artificial Intelligence, Machine Learning, Deep Learning, Databases.

Email: nguyendung@hueuni.edu.vn. ORCID: https://orcid.org/0009-0000-4510-7504. Tel: 0905198887.

Duc-Phuc Nguyen, University of Arts, Hue University, Vietnam

Duc-Phuc Nguyen was born on January 2, 1987, in Hue City. He earned his bachelor's degree in Information Technology from the University of Sciences, Hue University in 2008, and his master's degree in Computer Science in 2017 from the same university. He has been working at the Department of Administration and Facilities of the University of Arts, Hue University since 2008. Research areas: Databases, Computer Networks.

Email: ndphuc.hufa@hueuni.edu.vn. ORCID: https://orcid.org/0009-0000-5916-9466

Published

29-04-2026

How to Cite

[1]
B.-K. Hoang, V.-H. Ngo, X.-T. Tran, D. Nguyen, and D.-P. Nguyen, “Automatic Information Extraction From Student ID Card Images Using DB and VietOCR: A Case Study at a Vietnamese University,” JTE, Apr. 2026.