An improved associative classification model using fuzzy parameterized soft set-based decision for text classification

Bibliographic Details
Main Author: Rohidin, Dede
Format: Thesis
Language: English
Published: 2023
Online Access: http://eprints.uthm.edu.my/10825/1/24p%20DEDE%20ROHIDIN.pdf
http://eprints.uthm.edu.my/10825/2/DEDE%20ROHIDIN%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/10825/3/DEDE%20ROHIDIN%20WATERMARK.pdf
http://eprints.uthm.edu.my/10825/
Description
Summary: Text classification is applicable in various problem domains, including marketing, security, and biomedicine. One promising text classifier is the well-known associative classification approach. However, existing associative classification approaches still have limitations, especially the very large number of rules generated in text classification problems. Some of the rules generated from textual data may be irrelevant or redundant, resulting in low performance on imbalanced and class-overlapping data. This research therefore proposes an improved associative classification approach that enhances the performance and efficiency of text classification by removing irrelevant rules, reducing redundant rules, and handling imbalance and class overlap in textual data. The proposed approach consists of three stages: pre-processing, fuzzification, and classification. For the classification stage in particular, the study integrates principles of fuzzy soft set theory into association rules; the resulting method is referred to as the Class-Based Fuzzy Soft Associative (CBFSA) method. The experiments used the 20 Newsgroups (balanced) and Reuters-21578 (imbalanced) datasets to evaluate the proposed model. The results show that CBFSA succeeds in removing irrelevant rules and reducing redundant ones. The CBFSA classifier applies fewer rules than Class Based Associative (CBA) and Class Based of Predictive Association Rule (CPAR). CBFSA also handles imbalanced and class-overlapping data successfully, and it achieves higher performance and shorter processing time than CBA and CPAR. Compared with several non-associative classifiers, CBFSA may improve the F1-measure by between 6% and 32%. Its processing time is shorter than that of RNN and CNN but slightly longer than that of Decision Tree, k-NN, Naïve Bayes, Rocchio, Bagging, and Boosting.
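
As a rough illustration of the fuzzification and classification stages named in the summary, the sketch below shows a generic fuzzy soft set style text classifier in Python. It is not the thesis's CBFSA method: it omits pre-processing and association rule mining entirely, and the membership function (term count divided by the largest count in the document), the class representation (mean membership per term), the similarity measure, and all function names are assumptions chosen only to make the idea concrete.

```python
from collections import Counter, defaultdict


def fuzzify(tokens, vocab):
    """Map term counts to memberships in [0, 1] (assumed: count / largest count)."""
    counts = Counter(t for t in tokens if t in vocab)
    peak = max(counts.values(), default=0)
    return {t: (counts[t] / peak if peak else 0.0) for t in vocab}


def train(docs, labels):
    """Build one fuzzy soft set per class: the mean membership of every term."""
    vocab = sorted({t for d in docs for t in d})
    sums = defaultdict(lambda: dict.fromkeys(vocab, 0.0))
    sizes = Counter(labels)
    for tokens, label in zip(docs, labels):
        for term, m in fuzzify(tokens, vocab).items():
            sums[label][term] += m
    model = {y: {t: s / sizes[y] for t, s in row.items()} for y, row in sums.items()}
    return vocab, model


def similarity(a, b, vocab):
    """Assumed measure: 1 minus the normalized L1 distance between memberships."""
    num = sum(abs(a[t] - b[t]) for t in vocab)
    den = sum(a[t] + b[t] for t in vocab)
    return 1.0 - num / den if den else 0.0


def classify(tokens, vocab, model):
    """Return the class whose fuzzy soft set is most similar to the document."""
    x = fuzzify(tokens, vocab)
    return max(model, key=lambda y: similarity(x, model[y], vocab))


# Toy data, invented for illustration only.
docs = [
    ["price", "market", "sale"],
    ["market", "price", "price"],
    ["virus", "cell", "gene"],
    ["gene", "cell", "cell"],
]
labels = ["marketing", "marketing", "biomedical", "biomedical"]
vocab, model = train(docs, labels)
print(classify(["price", "sale", "market"], vocab, model))
print(classify(["cell", "virus"], vocab, model))
```

Representing each class by a single membership vector keeps the model small, which loosely mirrors the summary's emphasis on using fewer rules; how CBFSA actually prunes and applies rules is described in the thesis itself.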