Chinese character recognition using neural network / Low Poh Tian

Bibliographic Details
Main Author: Low, Poh Tian
Format: Thesis
Published: 2003
Subjects:
Online Access: http://studentsrepo.um.edu.my/12628/1/Low_Poh_Tian.pdf
http://studentsrepo.um.edu.my/12628/
Other Bibliographic Details
Summary: This project explores the application of neural networks to the problem of identifying handwritten Chinese characters automatically. In particular, a backpropagation network is trained on five Chinese character fonts. The original scope was to recognize five characters; the network could easily be trained to recognize Chinese fonts. To build the training data, an original set of Chinese font images was first generated. From this set, four other sets were derived with different rotation angles, giving a total of 20 images for training. Each character is captured in a 320×240 pixel black-and-white bitmap file. During image processing, the bitmap is divided into 100 components of 32×24 pixels each. If the "on" pixels in a component make up more than 10% of its total pixels, the component is set to "1"; otherwise, it is set to "0". The input layer accordingly consists of 100 nodes and the output layer of 5 nodes, each representing one character. The hidden layer size was determined experimentally before being set to 25 nodes. The training parameters of the network, namely the learning rate and the momentum, are both fixed at 0.6. The extracted image data is then fed into the network, whose weights are modified by the error-backpropagation algorithm. The network was successfully trained to an error tolerance of 0.0001. As a result, it recognizes all the characters on which it was trained. However, other free-form (handwritten) characters cannot be recognized because of the limited training data.
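
The pipeline described in the summary can be illustrated with a minimal Python sketch. It is not the thesis code: the names `extract_features` and `BackpropNet` are hypothetical, and the summary does not state the activation function, the error measure, the weight initialisation, the update schedule, or the order in which components are scanned. The sketch assumes sigmoid units, squared error, uniform initial weights in [-0.5, 0.5], per-pattern updates, and a row-by-row scan. Reading the bitmap file itself is omitted; the image is assumed to be already available as a nested list of 0/1 pixels.

```python
import math
import random

WIDTH, HEIGHT = 320, 240      # bitmap size given in the summary
CELL_W, CELL_H = 32, 24       # component size given in the summary
THRESHOLD = 0.10              # fraction of "on" pixels needed for a 1


def extract_features(bitmap):
    """Reduce a HEIGHT x WIDTH grid of 0/1 pixels to 100 binary features."""
    features = []
    for top in range(0, HEIGHT, CELL_H):          # assumed scan order: row by row
        for left in range(0, WIDTH, CELL_W):
            on = sum(bitmap[y][x]
                     for y in range(top, top + CELL_H)
                     for x in range(left, left + CELL_W))
            features.append(1 if on > THRESHOLD * CELL_W * CELL_H else 0)
    return features


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BackpropNet:
    """One-hidden-layer sigmoid network trained with momentum (illustrative)."""

    def __init__(self, n_in=100, n_hidden=25, n_out=5,
                 rate=0.6, momentum=0.6, seed=0):
        rng = random.Random(seed)
        # One row of weights per unit; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]
        # Previous weight changes, needed for the momentum term.
        self.d1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.d2 = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
        self.rate, self.momentum = rate, momentum

    def forward(self, x):
        xb = list(x) + [1.0]                      # inputs plus bias input
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]                            # hidden outputs plus bias input
        o = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return xb, hb, o

    def train_pattern(self, x, target):
        """Apply one backpropagation update; return the squared error before it."""
        xb, hb, o = self.forward(x)
        # Error terms for sigmoid units under squared error.
        do = [(t - y) * y * (1 - y) for t, y in zip(target, o)]
        dh = [hb[j] * (1 - hb[j]) *
              sum(do[k] * self.w2[k][j] for k in range(len(do)))
              for j in range(len(hb) - 1)]
        self._update(self.w2, self.d2, do, hb)
        self._update(self.w1, self.d1, dh, xb)
        return sum((t - y) ** 2 for t, y in zip(target, o)) / 2

    def _update(self, weights, prev, deltas, inputs):
        for k, delta in enumerate(deltas):
            for j, v in enumerate(inputs):
                step = self.rate * delta * v + self.momentum * prev[k][j]
                weights[k][j] += step
                prev[k][j] = step
```

Under these assumptions, training would call `train_pattern` on the feature vector of each of the 20 images in turn, with a one-hot target of length 5, and repeat until the error falls below the 0.0001 tolerance quoted in the summary.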