Penalized logistic regression with the adaptive LASSO for gene selection in high-dimensional cancer classification

An important application of DNA microarray data is cancer classification. Because microarray data are high-dimensional, gene selection approaches are often employed to support expert systems in diagnosing cancer with high classification accuracy. Penalized logistic...

Bibliographic Details
Main Authors: Algamal, Zakariya Yahya; Lee, Muhammad Hisyam
Format: Article
Published: Elsevier Ltd 2015
Subjects: QA Mathematics
Online Access: http://eprints.utm.my/id/eprint/58770/
http://dx.doi.org/10.1016/j.eswa.2015.08.016
id my.utm.58770
record_format eprints
spelling my.utm.58770 2021-09-26T15:17:25Z
PeerReviewed
Algamal, Zakariya Yahya and Lee, Muhammad Hisyam (2015) Penalized logistic regression with the adaptive LASSO for gene selection in high-dimensional cancer classification. Expert Systems with Applications, 42 (23). pp. 9326-9332. ISSN 0957-4174 DOI:10.1016/j.eswa.2015.08.016
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic QA Mathematics
description An important application of DNA microarray data is cancer classification. Because microarray data are high-dimensional, gene selection approaches are often employed to support expert systems in diagnosing cancer with high classification accuracy. Penalized logistic regression using the least absolute shrinkage and selection operator (LASSO) is one of the key methods in high-dimensional cancer classification, as it performs gene coefficient estimation and gene selection simultaneously. However, the LASSO has been criticized for being biased in gene selection. The adaptive LASSO (APLR) was originally proposed to overcome this selection bias by assigning a consistent weight to each gene. In high-dimensional data, however, the adaptive LASSO faces the practical problem of choosing the type of initial weight. In practice, the LASSO estimator itself has been used as the initial weight, but this may not be preferable because the LASSO is itself inconsistent. To address this issue, an alternative initial weight for adaptive penalized logistic regression (CBPLR) is proposed. The effectiveness of the CBPLR is examined on three well-known high-dimensional cancer classification datasets using the number of selected genes, the area under the curve, and the misclassification rate. The experimental results show that the proposed CBPLR is efficient and feasible for cancer classification. Additionally, the proposed weight is compared with APLR and LASSO and exhibits competitive performance in both classification accuracy and gene selection. The proposed CBPLR benefits penalized logistic regression by selecting fewer genes while achieving a high area under the curve and a low misclassification rate. Thus, the proposed weight could be used in other research that implements gene selection in high-dimensional cancer classification.
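The adaptive LASSO idea summarized in the description can be illustrated with a minimal sketch: fit an initial estimator, turn its coefficients into per-gene weights w_j = 1 / (|b_j| + eps)^gamma, then minimize the logistic loss plus lam * sum_j w_j * |beta_j|. This is not the authors' implementation. The ridge initial estimator is an assumption chosen for illustration only and is not the CBPLR weight proposed in the article, which the record does not specify. All function names, the toy data, the step size, and the tuning values are hypothetical, and the solver is a basic proximal gradient loop in plain Python.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrink toward zero, exactly zero inside [-t, t].
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def gradient(X, y, beta, intercept):
    # Gradient of the average logistic negative log-likelihood.
    n, p = len(X), len(beta)
    g, g0 = [0.0] * p, 0.0
    for xi, yi in zip(X, y):
        r = sigmoid(intercept + sum(b * x for b, x in zip(beta, xi))) - yi
        g0 += r / n
        for j in range(p):
            g[j] += r * xi[j] / n
    return g, g0


def fit_ridge_logistic(X, y, lam, step=0.1, n_iter=500):
    # Gradient descent on: loss(beta) + (lam / 2) * sum_j beta_j^2.
    # Used here only as an assumed initial estimator.
    beta, intercept = [0.0] * len(X[0]), 0.0
    for _ in range(n_iter):
        g, g0 = gradient(X, y, beta, intercept)
        intercept -= step * g0
        beta = [b - step * (gj + lam * b) for b, gj in zip(beta, g)]
    return beta, intercept


def adaptive_weights(init_beta, gamma=1.0, eps=1e-6):
    # Small initial coefficients receive large penalties, and vice versa.
    return [1.0 / (abs(b) + eps) ** gamma for b in init_beta]


def fit_weighted_l1_logistic(X, y, lam, weights, step=0.1, n_iter=500):
    # Proximal gradient (ISTA) on: loss(beta) + lam * sum_j weights[j] * |beta_j|.
    # The step size is an assumption and may need tuning for other data.
    beta, intercept = [0.0] * len(X[0]), 0.0
    for _ in range(n_iter):
        g, g0 = gradient(X, y, beta, intercept)
        intercept -= step * g0  # the intercept is not penalized
        beta = [soft_threshold(b - step * gj, step * lam * w)
                for b, gj, w in zip(beta, g, weights)]
    return beta, intercept


def adaptive_lasso_logistic(X, y, lam, init_lam=1.0, gamma=1.0):
    # Two stages: initial estimate -> weights -> weighted L1 fit.
    init_beta, _ = fit_ridge_logistic(X, y, init_lam)
    return fit_weighted_l1_logistic(X, y, lam, adaptive_weights(init_beta, gamma))


def make_toy_data(n=40, p=10, seed=0):
    # Hypothetical data: only the first two "genes" carry signal.
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(2.0 * xi[0] - 2.0 * xi[1]) else 0 for xi in X]
    return X, y


X, y = make_toy_data()
beta, intercept = adaptive_lasso_logistic(X, y, lam=0.05)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected feature indices:", selected)
```

Swapping `fit_ridge_logistic` for another initial estimator changes only the weights passed to the second stage, which is the design point the description raises about the choice of initial weight. In practice `lam` would be chosen by cross-validation rather than fixed as it is here.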
format Article
author Algamal, Zakariya Yahya
Lee, Muhammad Hisyam
title Penalized logistic regression with the adaptive LASSO for gene selection in high-dimensional cancer classification
publisher Elsevier Ltd
publishDate 2015