Data Mining for Building Neural Protein Sequence Classification Systems with Improved Performance


Bibliographic Details
Main Authors: Wang, Dianhui, Lee, Nung Kion, Dillon, Tharam S.
Format: Conference or Workshop Item
Language: English
Published: IEEE 2003
Online Access: http://ir.unimas.my/id/eprint/11927/1/Data%20Mining_abstract.pdf
http://ir.unimas.my/id/eprint/11927/
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1223671
Description
Summary: Traditionally, two protein sequences are classified into the same class if their feature patterns have high homology. These feature patterns were originally extracted by sequence alignment algorithms, which measure similarity between an unseen protein sequence and identified protein sequences. Neural network approaches, while reasonably accurate at classification, give no information about the relationship between the unseen case and the classified items that is useful to biologists. In contrast, in this paper we use a generalized radial basis function (GRBF) neural network architecture that generates fuzzy classification rules that could be used for further knowledge discovery. Our proposed techniques were evaluated using protein sequences with ten classes of super-families downloaded from a public domain database, and the results compared favorably with other standard machine learning techniques.
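
Illustrative note: the summary does not give the details of the GRBF architecture, so the following is only a minimal sketch of the general idea that a radial basis unit can be read as a fuzzy rule. It assumes Gaussian units with one width per feature, and a rule of the form "IF each feature is near its center THEN class k". The function names, the toy rule set, and the two-feature inputs are hypothetical and are not taken from the paper.

```python
import math

def rule_strength(x, center, width):
    """Firing strength of one rule: the product of per-feature Gaussian
    memberships, written as a single exponential of the summed squared
    scaled distances (assumed form, not the paper's exact definition)."""
    return math.exp(-0.5 * sum(((xi - ci) / wi) ** 2
                               for xi, ci, wi in zip(x, center, width)))

def classify(x, rules, n_classes):
    """rules is a list of (center, width, class_index) tuples.
    Returns the winning class and the normalized per-class scores."""
    scores = [0.0] * n_classes
    for center, width, label in rules:
        scores[label] += rule_strength(x, center, width)
    total = sum(scores)
    if total > 0:
        scores = [s / total for s in scores]
    best = max(range(n_classes), key=lambda k: scores[k])
    return best, scores

# Hypothetical toy rules over two numeric features (not real protein data).
RULES = [
    ((0.0, 0.0), (1.0, 1.0), 0),  # IF x is near (0, 0) THEN class 0
    ((1.0, 1.0), (1.0, 1.0), 1),  # IF x is near (1, 1) THEN class 1
]

label, scores = classify((0.1, 0.0), RULES, n_classes=2)
```

Because each unit is tied to a center and a set of widths, the per-rule firing strengths indicate which stored pattern an unseen input most resembles, which is the kind of relationship the summary says a plain neural classifier does not expose.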