Search-Based Wrapper Feature Selection Methods in Software Defect Prediction: An Empirical Analysis

High dimensionality is a data quality problem that negatively influences the predictive capabilities of prediction models in software defect prediction (SDP). As a viable solution, feature selection (FS) has been used to address the high dimensionality problem in SDP. From existing studies, Filter-b...

Bibliographic Details
Main Authors: Balogun, A.O., Basri, S., Jadid, S.A., Mahamad, S., Al-momani, M.A., Bajeh, A.O., Alazzawi, A.K.
Format: Article
Published: Springer 2020
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85089715754&doi=10.1007%2f978-3-030-51965-0_43&partnerID=40&md5=43e46e25f07ab051697ddf22a1bf6b90
http://eprints.utp.edu.my/24728/
id my.utp.eprints.24728
record_format eprints
Springer 2020 Article NonPeerReviewed https://www.scopus.com/inward/record.uri?eid=2-s2.0-85089715754&doi=10.1007%2f978-3-030-51965-0_43&partnerID=40&md5=43e46e25f07ab051697ddf22a1bf6b90 Balogun, A.O. and Basri, S. and Jadid, S.A. and Mahamad, S. and Al-momani, M.A. and Bajeh, A.O. and Alazzawi, A.K. (2020) Search-Based Wrapper Feature Selection Methods in Software Defect Prediction: An Empirical Analysis. Advances in Intelligent Systems and Computing, 1224 A . pp. 492-503. http://eprints.utp.edu.my/24728/
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description High dimensionality is a data quality problem that degrades the predictive capability of models in software defect prediction (SDP). Feature selection (FS) has been used as a viable solution to the high dimensionality problem in SDP. In existing studies, filter-based feature selection (FFS) and wrapper feature selection (WFS) are the two basic types of FS methods, and WFS methods are regarded as the better performing of the two. However, WFS methods are known to have a high computational cost, because the number of executions required for feature subset search, evaluation and selection is not known in advance. This often leads to overfitting of prediction models due to easy trapping in local maxima. Applying an appropriate search method in the WFS subset evaluator phase can resolve this trapping in local maxima. Best First Search (BFS) and Greedy Step-wise Search (GSS) have been used extensively and conventionally as search methods in WFS, with positive impact. However, metaheuristic search methods can be as effective as BFS and GSS. Consequently, this study conducts an empirical comparative analysis of 13 search methods (11 state-of-the-art metaheuristic search methods and 2 conventional search methods) in WFS methods for SDP. The experimental results showed that metaheuristic search methods (AS, BS, BAT, CS, ES, FS, FLS, GS, NSGA-II, PSOS, RS) in WFS performed better than the conventional search methods (BFS and GSS), although the average computational time of metaheuristic-based WFS methods is relatively high. We recommend that metaheuristic search methods can be used as alternative search methods for WFS methods in SDP. © 2020, Springer Nature Switzerland AG.
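As an illustration of the wrapper idea in the description above, the following is a minimal sketch of WFS driven by a greedy step-wise (forward) search. It is not the authors' implementation: the record does not describe their tooling, so the nearest-centroid classifier, the 5-fold evaluation, the toy data, and every function name here (`make_toy_data`, `nearest_centroid_accuracy`, `greedy_stepwise_forward`) are assumptions chosen only to keep the example self-contained.

```python
import random


def make_toy_data(n=120, seed=0):
    """Hypothetical dataset: feature 0 is strongly informative,
    feature 3 is weakly informative, the remaining features are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate the two classes (e.g. clean / defective)
        row = [rng.gauss(0.0, 1.0) for _ in range(6)]
        row[0] = 4.0 * label + rng.gauss(0.0, 0.5)
        row[3] = 1.0 * label + rng.gauss(0.0, 1.0)
        X.append(row)
        y.append(label)
    return X, y


def nearest_centroid_accuracy(X, y, features, folds=5):
    """Subset evaluator: cross-validated accuracy of a nearest-centroid
    classifier that only sees the given feature indices."""
    if not features:
        return 0.0
    n = len(X)
    correct = 0
    for k in range(folds):
        test_idx = [i for i in range(n) if i % folds == k]
        train_idx = [i for i in range(n) if i % folds != k]
        # Fit: one centroid per class, computed on the training fold only.
        centroids = {}
        for label in set(y[i] for i in train_idx):
            rows = [X[i] for i in train_idx if y[i] == label]
            centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
        # Predict: assign each test point to the closest centroid.
        for i in test_idx:
            point = [X[i][f] for f in features]
            pred = min(
                centroids,
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            correct += int(pred == y[i])
    return correct / n


def greedy_stepwise_forward(X, y, evaluate):
    """Greedy forward search: repeatedly add the single feature that most
    improves the evaluator score, and stop when no addition improves it."""
    n_features = len(X[0])
    selected, best_score = [], 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        score, feature = max((evaluate(X, y, selected + [f]), f) for f in candidates)
        if score <= best_score:
            break  # no strict improvement: the search stops here
        selected.append(feature)
        best_score = score
    return selected, best_score


X, y = make_toy_data()
selected, score = greedy_stepwise_forward(X, y, nearest_centroid_accuracy)
print("selected features:", selected, "cv accuracy:", round(score, 3))
```

The stopping rule in `greedy_stepwise_forward` is what makes a greedy search prone to the local-maxima trapping the description mentions: once no single added feature improves the score, the search ends, even if a larger change to the subset would score higher. A metaheuristic search would replace only this function while reusing the same subset evaluator.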
format Article
author Balogun, A.O.
Basri, S.
Jadid, S.A.
Mahamad, S.
Al-momani, M.A.
Bajeh, A.O.
Alazzawi, A.K.
title Search-Based Wrapper Feature Selection Methods in Software Defect Prediction: An Empirical Analysis
publisher Springer
publishDate 2020
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-85089715754&doi=10.1007%2f978-3-030-51965-0_43&partnerID=40&md5=43e46e25f07ab051697ddf22a1bf6b90
http://eprints.utp.edu.my/24728/