Enhancement data integrity checking using combination MD5 and SHA1 algorithm in Hadoop architecture

Bibliographic Details
Main Authors: Idris, Yaakub, Ismail, Saiful Adli, Mohd. Azmi, Nurulhuda Firdaus, Azmi, Azri, Azizan, Azizul
Format: Article
Published: SANDKRS Sdn Bhd 2017
Online Access: http://eprints.utm.my/id/eprint/84669/
http://dx.doi.org/10.20967/jcscm.2017.03.007
Description
Summary: Big Data plays a critical role in decision-making as the volume of stored data, both online and offline, keeps growing. However, only a few software applications, such as Hadoop, are capable of processing data at this scale. Hadoop is open-source software for Big Data processing made up of several components, one of the main ones being Hadoop User Experience (Hue). Hue is used to upload data into Hadoop databases through a graphical user interface (GUI). However, Hue has no function for checking whether the downloaded data has changed, so incorrect data may be processed and lead to false decisions. This study therefore aims to extend Hue with data verification based on the MD5 and SHA1 cryptographic hash functions. These functions were chosen because they are accepted worldwide. With data verification added to Hue, data can be validated during the upload process, which prevents users from processing erroneous data. The result of this study will ensure data integrity by detecting any change to the data before it is stored in Hadoop in offline mode.