CMD: a database to store the bonding states of cysteine motifs with secondary structures

Bibliographic Details
Main Authors: Bostan, Hamed; Salim, Naomie; Hussein, Zeti Azura; Klappa, Peter; Shamsir, Mohd. Shahir
Format: Article
Language: English
Published: 2012
Online Access: http://eprints.utm.my/id/eprint/46694/1/HamedBostan_2012_CMD%20A%20Database%20to%20Store%20the%20Bonding%20States%20of%20Cysteine%20Motifs.pdf
http://eprints.utm.my/id/eprint/46694/
http://dx.doi.org/10.1155/2012/849830
Description
Summary: Computational approaches to predicting the disulphide bonding state and its connectivity pattern are based on various descriptors. One such descriptor is the amino acid sequence motif flanking each cysteine residue. Although disulphide bonding information exists in many databases and applications, no complete reference with motif query is currently available. The Cysteine Motif Database (CMD) is the first online resource that stores all cysteine residues and their flanking motifs, together with secondary structure and propensity value assignments derived from laboratory data. We extracted more than 3 million cysteine motifs from PDB and UniProt data and annotated them with secondary structure assignment, propensity value assignment, frequency of occurrence, and coefficiency of their bonding status. Removing redundancies yielded 15875 unique flanking motifs that are always bonded and 41577 unique patterns that are always nonbonded. Queries can be made by protein ID, FASTA sequence, sequence motif, or secondary structure, individually or in batch format, using the provided APIs, which allow remote users to query the database from third-party software and to run high-throughput screening or querying. The CMD offers extensive information about bonded and free cysteine residues and their motifs, allowing in-depth characterization of sequence motif composition.
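
The motif extraction and redundancy removal described in the summary can be illustrated with a minimal sketch. This is not the CMD implementation or its API: the function names, the flank width of 2 residues, and the "-" padding character are all assumptions made for illustration, and the record does not state which window size the authors used.

```python
from collections import defaultdict


def flanking_motifs(sequence, flank=2, pad="-"):
    """Yield (1-based position, motif) for every cysteine in a sequence.

    The motif is the cysteine plus `flank` residues on each side.
    `flank` and `pad` are illustrative assumptions, not CMD settings.
    """
    seq = sequence.upper()
    # Pad both ends so cysteines near the termini still get a full window.
    padded = pad * flank + seq + pad * flank
    for i, residue in enumerate(seq):
        if residue == "C":
            yield i + 1, padded[i : i + 2 * flank + 1]


def split_by_bonding(observations):
    """Group unique motifs by the bonding states observed for them.

    `observations` is an iterable of (motif, is_bonded) pairs.
    Returns (always_bonded, always_free) as two sets of motifs;
    motifs seen in both states appear in neither set.
    """
    states = defaultdict(set)
    for motif, bonded in observations:
        states[motif].add(bool(bonded))
    always_bonded = {m for m, s in states.items() if s == {True}}
    always_free = {m for m, s in states.items() if s == {False}}
    return always_bonded, always_free
```

Calling `list(flanking_motifs("ACDCG"))` would give one padded five-residue window per cysteine, and passing annotated windows to `split_by_bonding` separates motifs that were only ever observed bonded from those only ever observed free, which mirrors the two unique-motif counts reported above.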