Published 2008
“…Theoretically, a good set of knowledge should provide good accuracy when dealing with new cases. Besides accuracy, a good rule set must also have a minimal number of rules, and each rule should be as short as possible. Often, a rule set contains fewer rules, but those rules usually have more conditions. An ideal model should be able to produce fewer, shorter rules and classify new data with good accuracy. Consequently, high-quality, compact knowledge gives managers a good decision model. For that reason, the search for an appropriate data mining approach that can provide quality knowledge is important. Rough classifier (RC) and decision tree classifier (DTC) are categorized as RBC. The purpose of this study is to investigate the capability of RC and DTC in generating quality knowledge that leads to good accuracy. To that end, the two classifiers are compared on four measurements: classification accuracy, number of rules, rule length, and rule coverage. Five datasets from the UCI Machine Learning Repository, namely United States Congressional Voting Records, Credit Approval, Wisconsin Diagnostic Breast Cancer, Pima Indians Diabetes Database, and Vehicle Silhouettes, were chosen as experimental data. All datasets were mined using the RC toolkit ROSETTA, while the C4.5 algorithm in the WEKA application was chosen as the DTC rule generator. The experimental results indicated that both classifiers produced good classification results and generated quality rules in different respects: higher accuracy, fewer rules, shorter rules, and higher coverage. In terms of accuracy, RC obtained higher accuracy on average, while DTC generated significantly fewer rules than RC. In terms of rule length, RC produced more compact, shorter rules than DTC, although the difference in length was not significant. Meanwhile, RC had better coverage than DTC. The final conclusion is as follows: “If the user is interested in a variety of rule patterns with good accuracy and the number of rules is not important, RC is the best solution, whereas if the user looks for fewer rules, DTC might be the best choice”…”
Monograph