Data quality in big data: A review

Bibliographic Details
Main Authors: Abdullah, Noraini, Ismail, Saiful Azmi, Yuhaniz, Siti Sophiayati, Mohd. Sam, Suriani
Format: Article
Published: International Center for Scientific Research and Studies 2015
Subjects: T Technology (General)
Online Access: http://eprints.utm.my/id/eprint/58206/
https://link.springer.com/chapter/10.1007/978-3-319-99007-1_11
id my.utm.58206
record_format eprints
citation Abdullah, Noraini and Ismail, Saiful Azmi and Yuhaniz, Siti Sophiayati and Mohd. Sam, Suriani (2015) Data quality in big data: A review. International Journal Of Advances In Soft Computing And Its Applications, 7 (Specia). pp. 16-27. ISSN 2074-8523. Peer reviewed.
last_modified 2021-08-17T02:08:13Z
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic T Technology (General)
description The Data Warehousing Institute (TDWI) estimates that data quality problems cost U.S. businesses more than $600 billion a year. The problem with data is that its quality degenerates quickly over time. Experts say 2 percent of records in a customer file become obsolete in one month because customers die, divorce, marry, and move. In addition, data entry errors, system migrations, and changes in source systems, among other things, generate large numbers of errors. To complicate matters further, as organizations fragment into different divisions and units, interpretations of data elements change to meet local business needs. However, there are several measures a company should take: treat data as a strategic corporate resource; develop a program for managing data quality with a commitment from the top; and hire, train, or outsource experienced data quality professionals to oversee and carry out the program. By using commercial data quality tools, organizations can sustain a commitment to managing data quality over time and adjust monitoring and cleansing processes to changes in the business and underlying systems. Data is a vital resource. Companies that invest proportionally to manage this resource will stand a stronger chance of succeeding in today's competitive global economy than those that squander this critical resource by neglecting to ensure adequate levels of quality. This paper reviews the characteristics of big data quality and the processes involved in managing it.