Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators

The completion of the Human Genome Project in the last decade has generated a strong demand for computational analysis techniques in order to fully exploit the acquired human genome database. The Human Genome Project generated a perplexing mass of genetic data which necessitates automatic genome anno...

Bibliographic Details
Main Authors: Htike@Muhammad Yusof, Zaw Zaw; Win, Shoon Lei
Format: Article
Language: English
Published: Elsevier Ltd. 2013
Subjects: Q Science (General)
Online Access: http://irep.iium.edu.my/34337/4/Recognition_of_promoters_in_DNA_sequences_%28Paper_2%29.pdf
http://irep.iium.edu.my/34337/
http://www.sciencedirect.com/science/article/pii/S1877050913011447
id my.iium.irep.34337
record_format dspace
spelling my.iium.irep.34337
2015-06-01T03:29:28Z
http://irep.iium.edu.my/34337/
Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators
Htike@Muhammad Yusof, Zaw Zaw
Win, Shoon Lei
Q Science (General)
Elsevier Ltd.
2013-11-09
Article
REM
application/pdf
en
http://irep.iium.edu.my/34337/4/Recognition_of_promoters_in_DNA_sequences_%28Paper_2%29.pdf
Htike@Muhammad Yusof, Zaw Zaw and Win, Shoon Lei (2013) Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators. Procedia Computer Science, 23. pp. 60-67. ISSN 1877-0509
http://www.sciencedirect.com/science/article/pii/S1877050913011447
10.1016/j.procs.2013.10.009
institution Universiti Islam Antarabangsa Malaysia
building IIUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider International Islamic University Malaysia
content_source IIUM Repository (IREP)
url_provider http://irep.iium.edu.my/
language English
topic Q Science (General)
description The completion of the Human Genome Project in the last decade has generated a strong demand for computational analysis techniques in order to fully exploit the acquired human genome database. The Human Genome Project generated a perplexing mass of genetic data which necessitates automatic genome annotation. There is a growing interest in the process of gene finding and gene recognition from DNA sequences. In genetics, a promoter is a segment of DNA that marks the starting point of transcription of a particular gene. Therefore, recognizing promoters is one step towards gene finding in DNA sequences. Promoters also play a fundamental role in many other vital cellular processes. Aberrant promoters can cause a wide range of diseases including cancers. This paper describes a state-of-the-art machine-learning-based approach called weightily averaged one-dependence estimators to tackle the problem of recognizing promoters in genetic sequences. To lower the computational complexity and to increase the generalization capability of the system, we employ an entropy-based feature extraction approach to select relevant nucleotides that are directly responsible for promoter recognition. We carried out experiments on a dataset extracted from the biological literature as a proof of concept. The proposed system achieved an accuracy of 97.17% in classifying promoters. The experimental results demonstrate the efficacy of our framework and encourage us to extend the framework to recognize promoter sequences in various species of higher eukaryotes.
format Article
author Htike@Muhammad Yusof, Zaw Zaw
Win, Shoon Lei
title Recognition of promoters in DNA sequences using weightily averaged one-dependence estimators
publisher Elsevier Ltd.
publishDate 2013
url http://irep.iium.edu.my/34337/4/Recognition_of_promoters_in_DNA_sequences_%28Paper_2%29.pdf
http://irep.iium.edu.my/34337/
http://www.sciencedirect.com/science/article/pii/S1877050913011447