Vietnamese character recognition based on CNN model with reduced character classes
Thi Ha Phan, Duc Chung Tran, Mohd Fadzil Hassan
Abstract
This article details the steps to build and train a convolutional neural network (CNN) model for Vietnamese character recognition in educational books. Based on this model, a mobile application for extracting text content from images of Vietnamese textbooks was built using OpenCV and the Canny edge detection algorithm. Vietnamese has 178 character classes when accented characters are included. However, within the scope of character recognition in textbooks, some classes differ only in actual size, such as “c” and “C” or “o” and “O”. Therefore, the authors filtered out these similar classes and built the classification model for the remaining 138 Vietnamese character classes to increase the model's effectiveness.
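The class reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the `SIZE_ONLY_BASES` set are hypothetical, the set contains only the two examples the abstract names, and treating accented variants (for example "Ô" and "ô") as mergeable is an assumption made here for illustration. The sketch does not attempt to reproduce the 178 to 138 reduction.

```python
import unicodedata

# Hypothetical set of base letters whose upper- and lowercase glyphs differ
# mainly in size. Only "c" and "o" are named in the abstract; the full list
# used by the authors is not given there.
SIZE_ONLY_BASES = {"c", "o"}


def merged_label(ch: str) -> str:
    """Return the class label for a character after merging case pairs."""
    ch = unicodedata.normalize("NFC", ch)
    # NFD splits an accented letter into its base letter plus combining marks,
    # so the first code point is the unaccented base letter.
    base = unicodedata.normalize("NFD", ch)[0]
    if base.lower() in SIZE_ONLY_BASES:
        # Assumption: accented variants of a merged base letter merge as well.
        return ch.lower()
    return ch


def build_class_map(chars):
    """Map every character to the integer index of its merged class."""
    merged = sorted({merged_label(c) for c in chars})
    index = {label: i for i, label in enumerate(merged)}
    return {c: index[merged_label(c)] for c in chars}


if __name__ == "__main__":
    class_map = build_class_map(["c", "C", "o", "O", "a", "A"])
    print(len(set(class_map.values())), "classes for 6 input characters")
```

A map of this kind would be applied to the training labels before fitting the classifier, so that merged pairs share one output unit. Recovering the correct case after recognition then has to rely on context, such as glyph height relative to neighboring characters.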
Keywords
Character; CNN; Deep learning; Recognition
DOI:
https://doi.org/10.11591/eei.v10i2.2810
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Bulletin of Electrical Engineering and Informatics (BEEI), ISSN: 2089-3191, e-ISSN: 2302-9285. This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).