Ensemble meta classifier with sampling and feature selection for data with multiclass imbalance problem

Bibliographic Details
Main Authors: Mohd Shamrie Sainin, Rayner Alfred, Faudziah Ahmad
Format: Article
Language: English
Published: Universiti Utara Malaysia Press 2021
Online Access: https://eprints.ums.edu.my/id/eprint/36465/1/ABSTRACT.pdf
https://eprints.ums.edu.my/id/eprint/36465/2/FULL%20TEXT.pdf
https://eprints.ums.edu.my/id/eprint/36465/
https://doi.org/10.32890/jict2021.20.2.1
Description
Summary: Ensemble learning, which combines several single classifiers or other ensemble classifiers, is one approach to the imbalance problem in multiclass data. However, it remains unclear how ensemble methods achieve their higher performance. This paper investigates the design of an ensemble meta classifier with sampling and feature selection for imbalanced multiclass data. The specific objectives are 1) to improve the ensemble classifier through a data-level approach (sampling and feature selection); 2) to perform experiments on sampling, feature selection, and the ensemble classifier model; and 3) to evaluate the performance of the ensemble classifier. To fulfill these objectives, a preliminary dataset of Malaysian plant leaf images was prepared and used in experiments, and the results were compared. The ensemble design was also tested on three further benchmark datasets with high imbalance ratios. The design that combines sampling, feature selection, and an ensemble classifier using AdaBoostM1 with Random Forest (itself an ensemble classifier) gave improved performance throughout the investigation. The result of this study is relevant to the ongoing problem of multiclass imbalance, where a specific ensemble structure can be improved in terms of processing time and accuracy.
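The three stages named in the summary (sampling, feature selection, and a boosted Random Forest) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, whose `AdaBoostClassifier` is a multiclass boosting variant standing in for AdaBoostM1. The synthetic data, the choice of random oversampling, the `f_classif` scoring function, `k=8`, and the helper name `oversample_to_majority` are all assumptions made for illustration.

```python
from collections import Counter

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample


def oversample_to_majority(X, y, seed=0):
    """Hypothetical helper: duplicate minority samples at random until
    every class has as many samples as the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    X_parts, y_parts = [], []
    for i, label in enumerate(sorted(counts)):
        X_c = X[y == label]
        missing = target - len(X_c)
        if missing > 0:
            # Draw the missing samples with replacement from this class.
            extra = resample(X_c, replace=True, n_samples=missing,
                             random_state=seed + i)
            X_c = np.vstack([X_c, extra])
        X_parts.append(X_c)
        y_parts.append(np.full(target, label))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Synthetic 3-class data with a skewed class distribution (assumed, not
# the leaf-image or benchmark data from the paper).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, n_redundant=2,
    n_classes=3, n_clusters_per_class=1, weights=[0.8, 0.15, 0.05],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 1) Sampling: balance the training split only, so the test split keeps
#    its original class distribution.
X_bal, y_bal = oversample_to_majority(X_train, y_train)

# 2) Feature selection: keep the 8 features with the highest ANOVA F-score.
selector = SelectKBest(f_classif, k=8).fit(X_bal, y_bal)

# 3) Ensemble of ensembles: boosting with a Random Forest as base learner.
#    The base estimator is passed positionally because its keyword name
#    differs between scikit-learn versions.
model = AdaBoostClassifier(
    RandomForestClassifier(n_estimators=25, random_state=0),
    n_estimators=10,
    random_state=0,
)
model.fit(selector.transform(X_bal), y_bal)

y_pred = model.predict(selector.transform(X_test))
print("class counts before:", dict(Counter(y_train)))
print("class counts after: ", dict(Counter(y_bal)))
print("balanced accuracy:  ", balanced_accuracy_score(y_test, y_pred))
```

Balanced accuracy is used here because plain accuracy can look high on imbalanced data even when minority classes are mostly misclassified. The record does not state which metrics the paper reports.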