Protein secondary structure prediction from amino acid sequence using artificial intelligence technique
Large genome sequencing projects generate huge numbers of protein sequences in their primary structures, and it is difficult for conventional biological techniques to determine their corresponding 3D structures and then their functions. Protein secondary structure prediction is a prerequisite step in de...
Main Authors: | Deris, Safaai; Md. Illias, Rosli; Arjunan, Satya Nanda Vel |
---|---|
Format: | Monograph |
Language: | English |
Published: | Faculty of Computer Science and Information System, 2005 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://eprints.utm.my/id/eprint/4265/1/74017.pdf http://eprints.utm.my/id/eprint/4265/ |
id |
my.utm.4265 |
---|---|
record_format |
eprints |
spelling |
my.utm.4265 2010-06-01T03:16:06Z http://eprints.utm.my/id/eprint/4265/ Protein secondary structure prediction from amino acid sequence using artificial intelligence technique Deris, Safaai Md. Illias, Rosli Arjunan, Satya Nanda Vel QA75 Electronic computers. Computer science Faculty of Computer Science and Information System 2005-07-31 Monograph NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/4265/1/74017.pdf Deris, Safaai and Md. Illias, Rosli and Arjunan, Satya Nanda Vel (2005) Protein secondary structure prediction from amino acid sequence using artificial intelligence technique. Project Report. Faculty of Computer Science and Information System, Skudai, Johor. (Unpublished) |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
language |
English |
topic |
QA75 Electronic computers. Computer science |
description |
Large genome sequencing projects generate huge numbers of protein sequences in their primary structures, and it is difficult for conventional biological techniques to determine their corresponding 3D structures and then their functions. Protein secondary structure prediction is a prerequisite step in determining the 3D structure of a protein. In this research, a method for predicting protein secondary structure is proposed and implemented together with other known accurate methods in this domain. The methods are presented as a progressive comparative analysis to allow easy comparison and clear conclusions. A benchmark data set is used to train and test the methods under the same hardware, platforms, and environments. The newly developed method combines the knowledge of the GOR V information theory with the power of a neural network to assign the residues of a novel protein sequence to one of three secondary structure classes. NN-GORV-I is developed and implemented to predict protein secondary structure using the biological information conserved in neighboring residues and related sequences. The method is further improved by a filtering mechanism for the searched sequences, giving its advanced version, NN-GORV-II. The newly developed method is rigorously tested together with the other methods and is observed to reach an accuracy above 80%. Its prediction accuracy and quality are superior to those of all six methods developed or examined in this research work or reported in this domain. The Matthews correlation coefficients (MCC) show that the secondary structure states predicted by NN-GORV-II are highly correlated with the observed secondary structure states. NN-GORV-II is further tested using five DSSP reduction schemes and found to be stable and reliable in its prediction ability. An additional blind test on sequences that were not used in the training and testing procedures shows that the NN-GORV-II prediction is of high accuracy, quality, and stability. The Receiver Operating Characteristic (ROC) curve and the area under the curve (AUC) are applied as novel procedures to assess a multi-class classifier in which exactly one class has a probability of approximately 0.5. The ROC and AUC results show that NN-GORV-II successfully discriminates between two classes: coils and not-coils. |
format |
Monograph |
author |
Deris, Safaai; Md. Illias, Rosli; Arjunan, Satya Nanda Vel |
title |
Protein secondary structure prediction from amino acid sequence using artificial intelligence technique |
publisher |
Faculty of Computer Science and Information System |
publishDate |
2005 |
url |
http://eprints.utm.my/id/eprint/4265/1/74017.pdf http://eprints.utm.my/id/eprint/4265/ |
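The description above reports three-state accuracy and Matthews correlation coefficients after reducing DSSP states to three classes. The following is a minimal sketch of how such an evaluation is commonly computed; it is not the authors' code. The 8-to-3 mapping shown (H, G, I to helix; E, B to strand; everything else to coil) is one widely used reduction and is an assumption here, since the record does not list the five schemes tested. The function names `reduce_dssp`, `q3`, and `mcc` are hypothetical.

```python
from math import sqrt

# One commonly used 8-to-3 reduction (an assumption; the record does not
# say which five reduction schemes the report actually tested).
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(states: str) -> str:
    """Map a DSSP state string to helix (H), strand (E), or coil (C)."""
    return "".join(DSSP_TO_3.get(s, "C") for s in states)


def q3(observed: str, predicted: str) -> float:
    """Fraction of residues whose three-state class is predicted correctly."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def mcc(observed: str, predicted: str, cls: str) -> float:
    """One-vs-rest Matthews correlation coefficient for a single class."""
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        if p == cls and o == cls:
            tp += 1
        elif p == cls:
            fp += 1
        elif o == cls:
            fn += 1
        else:
            tn += 1
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical strings for illustration only, not data from the report.
    observed = reduce_dssp("HGIEBTS-")
    predicted = "HHHEECCC"
    print(q3(observed, predicted))
    for cls in "HEC":
        print(cls, mcc(observed, predicted, cls))
```

Computing the coefficient one class at a time mirrors the coil versus not-coil framing in the description: each class is scored against the other two combined.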