A neighbourhood undersampling stacked ensemble with H-measure maximising meta-learner for imbalanced classification / Seng Zian

Stacked ensemble formulates an ensemble using a meta-learner to combine (stack) the predictions of multiple base classifiers. It suffers from the problem of suboptimal performance in imbalanced classification. Several underlying difficulty factors are reported to be responsible for performance de...

Bibliographic Details
Main Author: Seng, Zian
Format: Thesis
Published: 2021
Subjects:
Online Access: http://studentsrepo.um.edu.my/14776/1/Seng_Zian.pdf
http://studentsrepo.um.edu.my/14776/2/Seng_Zian.pdf
http://studentsrepo.um.edu.my/14776/
id my.um.stud.14776
record_format eprints
spelling my.um.stud.14776 2024-02-17T17:45:18Z A neighbourhood undersampling stacked ensemble with H-measure maximising meta-learner for imbalanced classification / Seng Zian Seng, Zian QA75 Electronic computers. Computer science T Technology (General) 2021-08 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/14776/1/Seng_Zian.pdf application/pdf http://studentsrepo.um.edu.my/14776/2/Seng_Zian.pdf Seng, Zian (2021) A neighbourhood undersampling stacked ensemble with H-measure maximising meta-learner for imbalanced classification / Seng Zian. PhD thesis, Universiti Malaya. http://studentsrepo.um.edu.my/14776/
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Student Repository
url_provider http://studentsrepo.um.edu.my/
topic QA75 Electronic computers. Computer science
T Technology (General)
description A stacked ensemble builds an ensemble by using a meta-learner to combine (stack) the predictions of multiple base classifiers. It suffers from suboptimal performance in imbalanced classification. Several underlying difficulty factors are reported to be responsible for this performance degradation. This research aims to improve the classification performance of the stacked ensemble on imbalanced datasets by investigating the stacked ensemble’s meta-learner and the underlying difficulty factors (i.e., class imbalance, class overlapping, and class noise). Since the stacked ensemble’s imbalanced classification performance depends on the configuration of its meta-learner, Experiment 1 was conducted to identify the best-performing type of meta-learner. Its results showed that weighted combination-based meta-learners outperformed other types of meta-learners, and that the ‘AUC-maximising meta-learner’ is one of the best-performing weighted combination-based meta-learners. Inspired by the superior performance of the AUC-maximising meta-learner (in Experiment 1) and the importance of the H-measure (in the literature), a new weighted combination-based meta-learner that maximises the H-measure (the H-measure maximising meta-learner) was proposed. Experiment 2 evaluated the proposed meta-learner by benchmarking it against the top 3 meta-learners from Experiment 1, and superior classification performance of the proposed meta-learner was observed.
This research then investigated the stacked ensemble’s degradation problem from the perspective of the underlying difficulty factors in imbalanced datasets. A stacked ensemble named Neighbourhood Undersampling Stacked Ensemble (NUS-SE) was proposed. The NUS-SE consists of two proposed components: the US-SE framework and Neighbourhood Undersampling. Experiment 3 was performed to evaluate the proposed NUS-SE. Since NUS-SE can be integrated with any meta-learner, the top 3 meta-learners from Experiment 1 and the proposed H-measure maximising meta-learner were used as the meta-learners of NUS-SE in Experiment 3. Based on Experiment 3’s results, the NUS-SE with the H-measure maximising meta-learner (NUS-SE-H) outperformed all the original unmodified stacked ensembles with different meta-learners, as well as the proposed NUS-SE with other top-performing meta-learners (i.e., NUS-SE-AUC, NUS-SE-CCLL, NUS-SE-NNLS).
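The description refers to weighted combination-based meta-learners that maximise a ranking metric such as AUC. The sketch below illustrates that general idea only and is not the thesis's implementation: it grid-searches non-negative weights that sum to 1 over the base classifiers' scores and keeps the weights with the highest AUC. The function names (`auc`, `simplex_grid`, `fit_weights`), the grid search itself, and the synthetic data are all assumptions made for illustration. In a real stacked ensemble the base scores would normally be out-of-fold predictions, and a function computing the H-measure could be passed as `metric` in place of `auc`.

```python
import itertools
import random


def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simplex_grid(k, steps):
    """Yield weight vectors of length k with entries in {0, 1/steps, ..., 1} that sum to 1."""
    for combo in itertools.product(range(steps + 1), repeat=k - 1):
        if sum(combo) <= steps:
            yield [c / steps for c in combo] + [(steps - sum(combo)) / steps]


def fit_weights(base_preds, labels, metric=auc, steps=10):
    """Grid-search convex weights that maximise `metric` of the weighted score combination.

    base_preds: one list of scores per base classifier (hypothetical input layout).
    """
    best_w, best_val = None, float("-inf")
    for w in simplex_grid(len(base_preds), steps):
        combined = [
            sum(wi * p[i] for wi, p in zip(w, base_preds))
            for i in range(len(labels))
        ]
        val = metric(combined, labels)
        if val > best_val:
            best_w, best_val = w, val
    return best_w, best_val


# Synthetic imbalanced data (assumption): 20 positives, 180 negatives,
# and three base classifiers with increasing amounts of score noise.
random.seed(0)
labels = [1] * 20 + [0] * 180
base_preds = [[y + random.gauss(0.0, s) for y in labels] for s in (0.8, 1.0, 1.2)]

weights, best = fit_weights(base_preds, labels)
print("weights:", weights, "AUC of combination:", round(best, 3))
```

Because the grid includes the corner points where a single base classifier receives all the weight, the combined AUC on the fitting data is never lower than that of the best individual base classifier. A grid search scales poorly with the number of base classifiers, so a numerical optimiser would be a more practical choice beyond a handful of them.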
format Thesis
author Seng, Zian
title A neighbourhood undersampling stacked ensemble with H-measure maximising meta-learner for imbalanced classification / Seng Zian
publishDate 2021
url http://studentsrepo.um.edu.my/14776/1/Seng_Zian.pdf
http://studentsrepo.um.edu.my/14776/2/Seng_Zian.pdf
http://studentsrepo.um.edu.my/14776/