Vietnamese character recognition based on CNN model with reduced character classes

Thi Ha Phan; Duc Chung Tran; Mohd Fadzil Hassan

doi:10.11591/eei.v10i2.2810

Abstract

This article details the steps to build and train a convolutional neural network (CNN) model for Vietnamese character recognition in educational books. Based on this model, a mobile application for extracting text content from images of Vietnamese textbooks was built using OpenCV and the Canny edge detection algorithm. Vietnamese with accents has 178 character classes. However, within the scope of character recognition in textbooks, some character classes differ only in actual size, such as "c" and "C" or "o" and "O". Therefore, the authors built the classification model for 138 Vietnamese character classes after filtering out these similar classes, to increase the model's effectiveness.
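
The class reduction described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that upper- and lowercase variants are merged whenever their unaccented base letter belongs to a set of letters whose two cases share the same shape. Only "c" and "o" in that set come from the abstract; the other entries, and the names SAME_SHAPE_BASES, base_letter, merged_label, and build_class_map, are hypothetical choices made for illustration.

```python
import unicodedata

# Base letters whose upper- and lowercase glyphs are assumed to differ only in size.
# "c" and "o" are taken from the abstract; the remaining letters are illustrative
# assumptions, not the authors' actual list.
SAME_SHAPE_BASES = {"c", "o", "s", "u", "v", "x"}


def base_letter(ch: str) -> str:
    """Return the lowercase base letter of a character, without combining marks."""
    decomposed = unicodedata.normalize("NFD", ch.lower())
    return decomposed[0]


def merged_label(ch: str) -> str:
    """Map a character to the label of the class it is merged into."""
    if base_letter(ch) in SAME_SHAPE_BASES:
        return ch.lower()  # case carries no shape information, so fold it
    return ch  # case changes the shape, so keep separate classes


def build_class_map(labels):
    """Assign one integer class index to every label, after merging."""
    classes = sorted({merged_label(ch) for ch in labels})
    index = {label: i for i, label in enumerate(classes)}
    return {ch: index[merged_label(ch)] for ch in labels}


if __name__ == "__main__":
    sample = ["c", "C", "o", "O", "a", "A"]
    mapping = build_class_map(sample)
    print(len(set(mapping.values())), "classes for", len(sample), "labels")
```

With a mapping of this kind, the output layer of the classifier needs one unit per merged class rather than one per original character. The original case would then have to be recovered in a later step, for example from the relative glyph height on the text line, which is outside the scope of this sketch.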

Keywords

Character; CNN; Deep learning; Recognition

DOI: https://doi.org/10.11591/eei.v10i2.2810

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Bulletin of Electrical Engineering and Informatics (BEEI)
ISSN: 2089-3191, e-ISSN: 2302-9285
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
