Neighbour replica affirmative adaptive failure detection and autonomous recovery

High availability is an important property for current distributed systems. The trends of current distributed systems such as grid computing and cloud computing are the delivery of computing as a service rather than a product. Thus, current distributed systems rely more on the highly available...

Bibliographic Details
Main Author: Mohd Noor, Ahmad Shukri
Format: Thesis
Language: English
Published: 2012
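Subjects: TK Electrical engineering. Electronics Nuclear engineering; TK1001-1841 Production of electric energy or power. Powerplants. Central stations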
Online Access: http://eprints.uthm.edu.my/2475/1/24p%20AHMAD%20SHUKRI%20MOHD%20NOOR.pdf
http://eprints.uthm.edu.my/2475/2/AHMAD%20SHUKRI%20MOHD%20NOOR%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/2475/3/AHMAD%20SHUKRI%20MOHD%20NOOR%20WATERMARK.pdf
http://eprints.uthm.edu.my/2475/
id my.uthm.eprints.2475
record_format eprints
spelling my.uthm.eprints.2475 2021-11-01T02:13:49Z http://eprints.uthm.edu.my/2475/ 2012-11 Thesis NonPeerReviewed Mohd Noor, Ahmad Shukri (2012) Neighbour replica affirmative adaptive failure detection and autonomous recovery. Doctoral thesis, Universiti Tun Hussein Onn Malaysia.
institution Universiti Tun Hussein Onn Malaysia
building UTHM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Tun Hussein Onn Malaysia
content_source UTHM Institutional Repository
url_provider http://eprints.uthm.edu.my/
language English
topic TK Electrical engineering. Electronics Nuclear engineering
TK1001-1841 Production of electric energy or power. Powerplants. Central stations
description High availability is an important property of current distributed systems. The trend in such systems, for example grid computing and cloud computing, is to deliver computing as a service rather than as a product, so they increasingly rely on highly available systems. Fail-stop failures are a significant disruptive factor for high-availability distributed systems. Hence, a new failure detection approach for distributed systems, Affirmative Adaptive Failure Detection (AAFD), is introduced. AAFD uses heartbeats for node monitoring and can be classified as an adaptive failure detector, since it adapts to unpredictable network conditions and CPU loads. Subsequently, Neighbour Replica Failure Recovery (NRFR) is proposed for autonomous recovery in distributed systems. NRFR combines the advantages of the neighbour replica distributed technique (NRDT) with weighted priority selection in order to achieve high availability, since automatic failure recovery through continuous monitoring is essential in current high-availability distributed systems. AAFD continuously monitors the environment, while NRFR manages the automatic reconfiguration needed to automate failure recovery. NRFR and AAFD are evaluated through a virtualisation-based implementation. The results showed that AAFD performed 30% better than the other detection techniques. For recovery performance, NRFR outperformed the other techniques, with the sole exception of recovery in the two-replica distributed technique (TRDT). Subsequently, a realistic logical structure is modelled for NRDT and TRDT in a complex and interdependent distributed environment. The model predicted that NRDT availability is 38.8% better than that of TRDT. Thus, the model showed that NRDT is the ideal replication environment for practical failure recovery in complex distributed systems. Hence, with the ability to significantly minimise the Mean Time To Repair (MTTR) and maximise the Mean Time Between Failures (MTBF), this research has accomplished its goal of providing a highly available, self-sustaining distributed system.
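The closing claim about MTTR and MTBF can be illustrated with the standard steady-state availability formula, A = MTBF / (MTBF + MTTR). The sketch below is not taken from the thesis: the function name and the figures are hypothetical, chosen only to show why shortening repair time raises availability even when the failure rate is unchanged.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: expected fraction of time the system is up.

    Uses the textbook formula A = MTBF / (MTBF + MTTR). This is a general
    reliability-engineering definition, not the thesis's own model.
    """
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    if mttr_hours < 0:
        raise ValueError("MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures for illustration only (not results from the thesis).
baseline = availability(mtbf_hours=1000.0, mttr_hours=10.0)      # slow, manual repair
faster_repair = availability(mtbf_hours=1000.0, mttr_hours=1.0)  # quicker, automated recovery

print(f"baseline availability:      {baseline:.4%}")
print(f"with faster recovery:       {faster_repair:.4%}")
```

With MTBF held constant, reducing MTTR moves availability closer to 1, which is the effect that faster detection and automated recovery are meant to produce.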
format Thesis
author Mohd Noor, Ahmad Shukri
title Neighbour replica affirmative adaptive failure detection and autonomous recovery
publishDate 2012
url http://eprints.uthm.edu.my/2475/1/24p%20AHMAD%20SHUKRI%20MOHD%20NOOR.pdf
http://eprints.uthm.edu.my/2475/2/AHMAD%20SHUKRI%20MOHD%20NOOR%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/2475/3/AHMAD%20SHUKRI%20MOHD%20NOOR%20WATERMARK.pdf
http://eprints.uthm.edu.my/2475/