Validation assessments on resampling method in imbalanced binary classification for linear discriminant analysis

The curse of class imbalance affects the performance of many conventional classification algorithms including linear discriminant analysis (LDA). The data pre-processing approach through some resampling methods such as random oversampling (ROS) and random undersampling (RUS) is one of the treatments...

Bibliographic Details
Main Authors: Jamaluddin, Ahmad Hakiim, Mahat, Nor Idayu
Format: Article
Language: English
Published: Universiti Utara Malaysia 2021
Subjects: QA75 Electronic computers. Computer science
Online Access: https://repo.uum.edu.my/id/eprint/28128/1/JICT%2020%201%202021%2083-102.pdf
https://doi.org/10.32890/jict.20.1.2021.6358
https://repo.uum.edu.my/id/eprint/28128/
http://jict.uum.edu.my/index.php/currentissue
id my.uum.repo.28128
record_format eprints
spelling my.uum.repo.28128 2023-06-19T09:57:02Z https://repo.uum.edu.my/id/eprint/28128/
Jamaluddin, Ahmad Hakiim and Mahat, Nor Idayu (2021) Validation assessments on resampling method in imbalanced binary classification for linear discriminant analysis. Journal of Information and Communication Technology (JICT), 20 (1). pp. 83-102. ISSN 1675-414X
Article, PeerReviewed, application/pdf, en, cc4_by
http://jict.uum.edu.my/index.php/currentissue
https://doi.org/10.32890/jict.20.1.2021.6358
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
language English
topic QA75 Electronic computers. Computer science
description The curse of class imbalance affects the performance of many conventional classification algorithms, including linear discriminant analysis (LDA). Data pre-processing through resampling methods such as random oversampling (ROS) and random undersampling (RUS) is one of the treatments used to alleviate this curse. Previous studies have attempted to address the effect of a resampling method on the performance of LDA. However, some studies contradicted each other, having relied on different performance measures as well as different validation strategies. This manuscript attempted to shed more light on the effect of a resampling method (ROS or RUS) on the performance of LDA, based on true positive rate and true negative rate, through five validation strategies, i.e. leave-one-out cross-validation, k-fold cross-validation, repeated k-fold cross-validation, naive bootstrap, and .632+ bootstrap. One hundred simulated two-group bivariate normally distributed data sets and four real data sets, all with severe class imbalance ratios, were utilised. The analysis of the location and dispersion statistics of the performance measures shed further light on: (i) the effect of a resampling method on the performance of LDA, and (ii) the enhancement in the learning fairness of LDA on objects regardless of sample size, hence reducing the effect of the curse of class imbalance.
format Article
author Jamaluddin, Ahmad Hakiim
Mahat, Nor Idayu
title Validation assessments on resampling method in imbalanced binary classification for linear discriminant analysis
publisher Universiti Utara Malaysia
publishDate 2021
url https://repo.uum.edu.my/id/eprint/28128/1/JICT%2020%201%202021%2083-102.pdf
https://doi.org/10.32890/jict.20.1.2021.6358
https://repo.uum.edu.my/id/eprint/28128/
http://jict.uum.edu.my/index.php/currentissue
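
The procedure summarised in the description above (resample the data with ROS or RUS, fit LDA, then estimate the true positive rate and true negative rate by k-fold cross-validation) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. The group means, the sample sizes, the choice of the minority group as the positive class, the stratified fold assignment, and the decision to resample the training folds only are all assumptions made for the example, and the function names are hypothetical.

```python
import math
import random


def simulate(n_major, n_minor, rng):
    """Two-group bivariate normal data; the group means (0,0) and (2,2) are assumed."""
    major = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(n_major)]
    minor = [((rng.gauss(2, 1), rng.gauss(2, 1)), 1) for _ in range(n_minor)]
    return major + minor


def resample(data, method, rng):
    """ROS duplicates random minority objects; RUS keeps a random subset of the majority."""
    groups = [[d for d in data if d[1] == g] for g in (0, 1)]
    small, large = sorted(groups, key=len)
    if method == "ROS":
        small = small + [rng.choice(small) for _ in range(len(large) - len(small))]
    elif method == "RUS":
        large = rng.sample(large, len(small))
    return small + large


def fit_lda(data):
    """Two-class LDA with a pooled covariance matrix and priors from training proportions."""
    n = len(data)
    sxx = sxy = syy = 0.0
    stats = {}
    for g in (0, 1):
        pts = [x for x, y in data if y == g]
        mx = sum(p[0] for p in pts) / len(pts)
        my = sum(p[1] for p in pts) / len(pts)
        sxx += sum((p[0] - mx) ** 2 for p in pts)
        sxy += sum((p[0] - mx) * (p[1] - my) for p in pts)
        syy += sum((p[1] - my) ** 2 for p in pts)
        stats[g] = (mx, my, len(pts) / n)
    sxx, sxy, syy = sxx / (n - 2), sxy / (n - 2), syy / (n - 2)
    det = sxx * syy - sxy ** 2
    (m0x, m0y, p0), (m1x, m1y, p1) = stats[0], stats[1]
    dx, dy = m1x - m0x, m1y - m0y
    # w = inverse(pooled covariance) * (mean_1 - mean_0), using the 2x2 inverse formula
    w0 = (syy * dx - sxy * dy) / det
    w1 = (-sxy * dx + sxx * dy) / det
    c = -(w0 * (m0x + m1x) + w1 * (m0y + m1y)) / 2 + math.log(p1 / p0)
    return lambda x: 1 if w0 * x[0] + w1 * x[1] + c > 0 else 0


def kfold_rates(data, k, method, rng):
    """Stratified k-fold estimate of (TPR, TNR); only the training folds are resampled."""
    folds = [[] for _ in range(k)]
    for g in (0, 1):
        members = [d for d in data if d[1] == g]
        rng.shuffle(members)
        for i, d in enumerate(members):
            folds[i % k].append(d)
    tp = fn = tn = fp = 0
    for i in range(k):
        train = [d for j, fold in enumerate(folds) if j != i for d in fold]
        if method is not None:
            train = resample(train, method, rng)
        predict = fit_lda(train)
        for x, y in folds[i]:
            pred = predict(x)
            if y == 1:
                tp += pred == 1
                fn += pred == 0
            else:
                tn += pred == 0
                fp += pred == 1
    return tp / (tp + fn), tn / (tn + fp)


rng = random.Random(0)
data = simulate(380, 20, rng)  # assumed 19:1 imbalance, for illustration only
for method in (None, "ROS", "RUS"):
    tpr, tnr = kfold_rates(data, 5, method, rng)
    print(method, round(tpr, 3), round(tnr, 3))
```

Resampling inside each training fold keeps duplicated or discarded objects out of the held-out fold. Whether the study resampled before or after splitting is not stated in this record, so that choice should be read as one possible design rather than the published protocol.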