Robust generic Structured Document Classification System / Hamam M.Ibrahim Mokayed

Bibliographic Details
Main Author: M.Ibrahim Mokayed, Hamam
Format: Book Section
Language: English
Published: Institute of Graduate Studies, UiTM 2017
Online Access: http://ir.uitm.edu.my/id/eprint/19740/1/ABS_HAMAM%20M.IBRAHIM%20MOKAYED%20TDRA%20VOL%2011%20IGS%2017.pdf
http://ir.uitm.edu.my/id/eprint/19740/
Description
Summary: The Structured Document Classification System (SDCS) is an industry-driven technology that efficiently classifies the piles of structured documents collected every day in different places. Although SDCS technology has advanced tremendously, one of the most challenging tasks is to propose a classifier that supports various layouts for different categories and different script languages with high accuracy and in efficient time. To address this issue, a Robust Generic Structured Document Classifier (RGSDC) has been proposed. RGSDC starts by finding the best objects that can be used to fit the target and solve the issue. A detailed study of previous thresholding techniques is conducted to introduce a new categorization method based on the transformation value of input images. This study provides a sound basis for finding a reliable thresholding algorithm. A new thresholding technique based on ordinal structure fuzzy logic (OSFM) is proposed to provide a robust generic image thresholding technique (RGT) that is able to extract clear, mixed, predefined objects for different languages and multi-layout problems. Two different sets of features that distinguish structured documents in different languages and with multiple layouts are proposed…