DTLM-DBP: deep transfer learning models for DNA binding proteins identification

Bibliographic Details
Main Authors: Saber, S., Khairuddin, U., Yusof, R., Madani, A.
Format: Article
Language: English
Published: Tech Science Press 2021
Online Access: http://eprints.utm.my/id/eprint/94887/1/UswahKhairuddin2021_DTLMDBPDeepTransferLearningModels.pdf
http://eprints.utm.my/id/eprint/94887/
http://dx.doi.org/10.32604/cmc.2021.017769
Description
Summary: The identification of DNA binding proteins (DNABPs) is considered a major challenge in genome annotation because they are linked to several important applied and research applications of cellular functions, e.g., in the study of the biological, biophysical, and biochemical effects of antibiotics, drugs, and steroids on DNA. This paper presents an efficient approach for DNABP identification based on deep transfer learning, named "DTLM-DBP." Two transfer learning methods are used in the identification process. The first uses a pre-trained deep learning model as both feature extractor and classifier. Two different pre-trained Convolutional Neural Networks (CNNs), AlexNet 8 and VGG 16, are tested and compared. The second method uses the deep learning model as a feature extractor only and passes the extracted features to a separate classifier. Two classifiers, Support Vector Machine (SVM) and Random Forest (RF), are tested and compared. The performance of the identification process is evaluated in terms of identification accuracy, sensitivity, specificity, and Matthews correlation coefficient (MCC) on four available DNA protein datasets: PDB1075, PDB186, PDNA-543, and PDNA-316. The results show that the RF classifier, with VGG-Net pre-trained deep transfer learning features, gives the highest performance. DTLM-DBP was compared with other published methods, and it provides a considerable improvement in the performance of DNABP identification.
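
The four evaluation measures named in the summary have standard definitions for a binary classifier, which can be sketched as follows. This is a minimal illustration, not code from the paper: the function name `binary_metrics` and the confusion-matrix counts in the usage lines are hypothetical and are not results reported by the authors.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn: binding proteins predicted correctly / missed
    tn/fp: non-binding proteins predicted correctly / wrongly flagged
    (hypothetical helper, not taken from the paper)
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity: fraction of true binding proteins that were found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-binding proteins correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC ranges from -1 to 1; by convention 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
scores = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

MCC is useful alongside accuracy because it takes all four confusion-matrix cells into account, so it stays informative when the binding and non-binding classes are imbalanced.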