Combining sampling and ensemble classifier for multiclass imbalance data learning

Bibliographic Details
Main Authors: Sainin, Mohd Shamrie; Alfred, Rayner; Adnan, Fairuz; Ahmad, Faudziah
Format: Book Section
Published: Springer 2018
Online Access: http://repo.uum.edu.my/25566/
http://doi.org/10.1007/978-981-10-8276-4_25
Description
Summary: This paper investigates how combining various sampling methods with ensemble classifiers affects prediction performance in multiclass imbalanced data learning. The study uses Malaysian medicinal leaf image shape data and three other large benchmark datasets. Seven ensemble methods from the Weka machine learning tool were selected to perform the classification task: AdaBoostM1, Bagging, Decorate, END, MultiBoostAB, RotationForest, and Stacking. Five base classifiers (Naïve Bayes, SMO, J48, Random Forest, and Random Tree) were used to examine the performance of the ensemble methods. Sampling and ensemble classifiers were combined in two ways: Resample with an ensemble classifier, and SMOTE with an ensemble classifier. The experimental results show that no single configuration is a “one design that fits all”. However, they also show that coupling the sampling and ensemble classifier with Random Forest can improve prediction performance on multiclass imbalanced datasets.
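
The idea of pairing a sampling step with an ensemble classifier can be illustrated with a minimal Python sketch. It is not the authors' Weka setup: the SMOTE-style oversampler, the nearest-centroid base learner (a stand-in for the base classifiers named above), the bagging loop, the toy three-class data, and every function name are assumptions made for illustration only.

```python
import random
from collections import Counter


def smote_like_oversample(X, y, k=3, seed=0):
    """Grow each class to the majority-class size by interpolating between
    a sample and one of its k nearest same-class neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        for _ in range(target - len(rows)):
            base = rng.choice(rows)
            # Same-class neighbours sorted by squared distance, base excluded.
            others = sorted(
                (r for r in rows if r is not base),
                key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
            )
            neighbour = rng.choice(others[:k]) if others else base
            gap = rng.random()
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


def fit_centroids(X, y):
    """Hypothetical base learner: one mean vector per class."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_centroid(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, model[c])))


def fit_bagging(X, y, n_models=11, seed=0):
    """Train one base learner per bootstrap sample (bagging)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Majority vote over the base learners."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


def make_toy_data(seed=0):
    """Imbalanced three-class toy data (30 / 6 / 3 samples), assumed for the demo."""
    rng = random.Random(seed)
    spec = {"a": ((0.0, 0.0), 30), "b": ((5.0, 5.0), 6), "c": ((0.0, 6.0), 3)}
    X, y = [], []
    for label, (centre, n) in spec.items():
        for _ in range(n):
            X.append([centre[0] + rng.gauss(0, 0.5), centre[1] + rng.gauss(0, 0.5)])
            y.append(label)
    return X, y


X, y = make_toy_data()
X_bal, y_bal = smote_like_oversample(X, y)   # sampling step
models = fit_bagging(X_bal, y_bal)           # ensemble step
print(Counter(y_bal), predict_bagging(models, [0.0, 6.0]))
```

The sampling step runs before the ensemble is trained, so every bootstrap sample is drawn from the balanced data. Replacing the interpolation with plain random duplication would give a Resample-style variant of the same pipeline.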