A Novel Wrapper-Based Optimization Algorithm for the Feature Selection and Classification

Machine learning (ML) practices such as classification have played a very important role in classifying diseases in medical science. Since medical science is a sensitive field, the pre-processing of medical data requires careful handling to make quality clinical decisions. Generally, medical data is...

Bibliographic Details
Main Authors: Talpur, N., Abdulkadir, S.J., Hasan, M.H., Alhussian, H., Alwadain, A.
Format: Article
Published: Tech Science Press 2023
Online Access: http://scholars.utp.edu.my/id/eprint/34293/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85145353810&doi=10.32604%2fcmc.2023.034025&partnerID=40&md5=c865ec168ffae3e91f8ad847e7421d50
id oai:scholars.utp.edu.my:34293
record_format eprints
datestamp 2023-01-17T13:35:25Z
citation Talpur, N. and Abdulkadir, S.J. and Hasan, M.H. and Alhussian, H. and Alwadain, A. (2023) A Novel Wrapper-Based Optimization Algorithm for the Feature Selection and Classification. Computers, Materials and Continua, 74 (3). pp. 5799-5820. ISSN 15462218
status NonPeerReviewed
doi 10.32604/cmc.2023.034025
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description Machine learning (ML) practices such as classification have played a very important role in classifying diseases in medical science. Since medical science is a sensitive field, the pre-processing of medical data requires careful handling to make quality clinical decisions. Generally, medical data is considered high-dimensional and complex data that contains many irrelevant and redundant features. These factors indirectly degrade the disease prediction and classification accuracy of any ML model. To address this issue, various data pre-processing methods called Feature Selection (FS) techniques have been presented in the literature. However, the majority of such techniques frequently suffer from local minima issues due to the large solution space. Thus, this study proposes a novel wrapper-based Sand Cat Swarm Optimization (SCSO) technique as an FS approach to find optimum features from ten benchmark medical datasets. The SCSO algorithm replicates the hunting and searching strategies of the sand cat while having the advantage of avoiding local optima and finding the ideal solution with minimal control variables. Moreover, a K-Nearest Neighbor (KNN) classifier was used to evaluate the effectiveness of the features identified by the proposed SCSO algorithm. The performance of the proposed SCSO algorithm was compared with six state-of-the-art and recent wrapper-based optimization algorithms using the validation metrics of classification accuracy, optimum feature size, and computational cost in seconds. The simulation results on the benchmark medical datasets revealed that the proposed SCSO-KNN approach outperformed the comparative algorithms with an average classification accuracy of 93.96% by selecting 14.2 features within 1.91 s. Additionally, the Wilcoxon rank test was used to perform the significance analysis between the proposed SCSO-KNN method and the six other algorithms at a significance level of 5.00E-02. The findings revealed that the proposed algorithm produces better outcomes with an average p-value of 1.82E-02. Moreover, potential future directions are also suggested as a result of the study's promising findings. © 2023 Tech Science Press. All rights reserved.
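The wrapper idea summarized in the description (a search procedure proposes feature subsets, and a KNN classifier scores each one) can be illustrated with a minimal sketch. This is not the authors' SCSO implementation: the search step below is a plain random search used as a stand-in, and the function names, the toy data, the leave-one-out evaluation, and the `alpha` weighting between error rate and subset size are all assumptions made for illustration.

```python
import random


def loo_knn_accuracy(X, y, mask, k=1):
    """Leave-one-out accuracy of k-NN using only the features where mask is 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from sample i to every other sample.
        dists = sorted(
            (sum((X[i][j] - X[t][j]) ** 2 for j in cols), y[t])
            for t in range(len(X)) if t != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == y[i]
    return correct / len(X)


def fitness(X, y, mask, alpha=0.99):
    """Lower is better. alpha is an assumed trade-off, not a value from the paper."""
    error = 1.0 - loo_knn_accuracy(X, y, mask)
    ratio = sum(mask) / len(mask)
    return alpha * error + (1 - alpha) * ratio


def random_search(X, y, n_iter=200, seed=0):
    """Stand-in optimizer: SCSO would replace this loop with its own update rules."""
    rng = random.Random(seed)
    n = len(X[0])
    best_mask = [1] * n
    best_fit = fitness(X, y, best_mask)
    for _ in range(n_iter):
        mask = [rng.randint(0, 1) for _ in range(n)]
        f = fitness(X, y, mask)
        if f < best_fit:
            best_mask, best_fit = mask, f
    return best_mask, best_fit


def make_toy_data(n=40, seed=1):
    """Hypothetical data: feature 0 separates the classes, features 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + rng.uniform(-0.2, 0.2),
                  rng.uniform(-5, 5),
                  rng.uniform(-5, 5)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, fit = random_search(X, y)
    print("selected mask:", mask, "fitness:", round(fit, 4))
```

The fitness combines classification error with the fraction of features kept, so a subset is rewarded both for accuracy and for being small; any population-based optimizer can be plugged in where the random search sits.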