Remote protein homology detection and fold recognition using two-layer support vector machine classifiers

Remote protein homology detection and fold recognition refer to the detection of structural homology in proteins whose sequences show little or no similarity. To detect protein structural classes from primary sequence information, homology-based methods have been developed, which can...

Bibliographic Details
Main Authors: Muda, Hilmi M.; Saad, Puteh; Othman, Razib M.
Format: Article
Published: Elsevier Ltd. 2011
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/29630/
http://dx.doi.org/10.1016/j.compbiomed.2011.06.004
id my.utm.29630
record_format eprints
citation Muda, Hilmi M. and Saad, Puteh and Othman, Razib M. (2011) Remote protein homology detection and fold recognition using two-layer support vector machine classifiers. Computers in Biology and Medicine, 41 (8). pp. 687-699. ISSN 0010-4825. Published 2011-08 by Elsevier Ltd. Peer reviewed. DOI:10.1016/j.compbiomed.2011.06.004
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic QA75 Electronic computers. Computer science
description Remote protein homology detection and fold recognition refer to the detection of structural homology in proteins whose sequences show little or no similarity. To detect protein structural classes from primary sequence information, homology-based methods have been developed, which can be divided into three types: discriminative classifiers, generative models for protein families, and pairwise sequence comparisons. Support Vector Machines (SVM) and Neural Networks (NN) are two popular discriminative methods. Recent studies have shown that SVM trains faster and is more accurate and efficient than NN. We present a comprehensive method based on two-layer classifiers. The first layer detects homology up to the superfamily and family levels of the SCOP hierarchy using optimized binary SVM classification rules. It uses a kernel function known as the Bio-kernel, which incorporates biological information into the classification process. The second layer uses a discriminative SVM algorithm with a string kernel to detect homology up to the protein fold level of the SCOP hierarchy. The results were evaluated using mean ROC and mean MRFP scores, and their significance was tested with a pairwise t-test. Experimental results show that our approach significantly improves the performance of remote protein homology detection and fold recognition on all three versions of the SCOP dataset (1.53, 1.67 and 1.73). Compared with the results of well-known methods, we achieved improvements in mean ROC of 4.19% on SCOP 1.53, 4.75% on SCOP 1.67 and 4.03% on SCOP 1.73. The combination of the first and second layers of BioSVM-2L performs well in remote homology detection and fold recognition across all three dataset versions.
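The description mentions an SVM with a string kernel but does not say which string kernel is used. As a rough illustration only, the sketch below assumes a k-spectrum kernel, one well-known string kernel that compares two sequences by their shared k-mers. The function names, the choice of k = 3 and the example sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every contiguous k-mer in a sequence (empty if len < k)."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> float:
    """Inner product of the k-mer count vectors of two sequences."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Iterate over the smaller counter; k-mers missing on either side add 0.
    if len(ca) > len(cb):
        ca, cb = cb, ca
    return float(sum(n * cb[kmer] for kmer, n in ca.items()))


def normalized_spectrum_kernel(a: str, b: str, k: int = 3) -> float:
    """Cosine-normalized kernel, so that K(x, x) = 1 for any usable x."""
    kaa = spectrum_kernel(a, a, k)
    kbb = spectrum_kernel(b, b, k)
    if kaa == 0 or kbb == 0:
        # At least one sequence is shorter than k and has no k-mers.
        return 0.0
    return spectrum_kernel(a, b, k) / sqrt(kaa * kbb)


def gram_matrix(sequences: list[str], k: int = 3) -> list[list[float]]:
    """Pairwise kernel matrix over a list of sequences."""
    return [[normalized_spectrum_kernel(s, t, k) for t in sequences]
            for s in sequences]


if __name__ == "__main__":
    # Made-up toy sequences, not SCOP data.
    toy = ["MKVLAAGIV", "MKVLSAGIV", "WWPPQQ"]
    for row in gram_matrix(toy):
        print(["%.3f" % value for value in row])
```

A Gram matrix of this kind could then be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. Whether BioSVM-2L is built this way is not stated in the record.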