Parallel model and scheduling technique for spaces complexity and synchronization problems in sequences alignment

Bibliographic Details
Main Authors: Eltayeeb, Manhal Elfadil; Abd. Latiff, Muhammad Shafie; Isnin, Ismail Fauzi
Format: Article
Language: English
Published: Asian Research Publishing Network (ARPN) 2014
Online Access: http://eprints.utm.my/id/eprint/54398/1/ManhalElfadilEltayeeb2014_Parallelmodelandschedulingtechnique.pdf
http://eprints.utm.my/id/eprint/54398/
http://www.jatit.org/volumes/sixtythird_2_2014.php
Description
Summary: Biologists are overwhelmed by the huge amount of data resulting from conformations of DNA and protein sequences. Early on, the dot-plot method was used to identify new sequences. It compares sequences graphically to detect similar regions. However, this method is impractical for long sequences. An improvement came with the sequential Needleman-Wunsch (NW) and Smith-Waterman (SW) algorithms, which place the sequences in a matrix with a scoring system and obtain the optimal alignment via dynamic programming. Unfortunately, these algorithms suffer from high time and space complexity. An alternative approach is needed to compare long sequences in a reasonable time while respecting memory restrictions. In this paper, we develop a new parallel model that implements the scheduler-worker paradigm together with a scheduling technique. Our model is based on the Bulk Synchronous Parallelism (BSP) model, where each worker has its own distributed memory and processes a selected number of blocks. Using an x86-based PC with eight logical processors, we are able to compare sequences ranging from 411 KBP to 4 MBP in o(m+n/w/w) space and with linear communication complexity.
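
The dynamic programming step that the summary attributes to Needleman-Wunsch can be illustrated with a minimal sequential sketch. It is not the paper's parallel scheduler-worker model. The function name `nw_score` and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration. Only two rows of the scoring matrix are kept, which shows why the full matrix becomes a memory problem for long sequences and how the score alone can be computed in space proportional to one sequence.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of a and b (Needleman-Wunsch recurrence).

    Scoring values are illustrative assumptions, not taken from the paper.
    Only the previous and current rows of the matrix are stored.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a prefix of a against the empty prefix of b.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(nw_score("ACGT", "AGT"))
```

The running time is still proportional to the product of the two sequence lengths, which is the cost a parallel model aims to distribute across workers.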