Slicing-based enhanced method for privacy-preserving in publishing big data

Bibliographic Details
Main Authors: BinJubeir, Mohammed Ma., Mohd Arfian, Ismail, Ali Ahmed, Abdulghani, Sadiq, Ali Safaa
Format: Article
Language: English
Published: Tech Science Press 2022
Online Access: http://umpir.ump.edu.my/id/eprint/33598/1/Slicing%20based%20enhanced%20method%20for%20privacy%20preserving.pdf
http://umpir.ump.edu.my/id/eprint/33598/
https://doi.org/10.32604/cmc.2022.024663
Description
Summary: Publishing big data and making it accessible to researchers is important for knowledge building, as it helps in applying highly efficient methods to plan, conduct, and assess scientific research. However, publishing and processing big data pose a privacy concern: individuals' sensitive information must be protected while the published data remains usable. Several anonymization methods, such as slicing and merging, have been designed to address the privacy concerns of publishing big data. However, the major drawback of merging and slicing is the random permutation procedure, which does not always guarantee complete protection against attribute or membership disclosure. Moreover, merging procedures may generate many fake tuples, leading to a loss of data utility and subsequent erroneous knowledge extraction. This study therefore proposes an enhanced slicing-based method for privacy-preserving big data publishing that maintains data utility. In particular, the proposed method distributes the data into horizontal and vertical partitions. The lower and upper protection levels are then used to identify the unique and identical attribute values. The unique and identical attribute values are swapped to ensure that the published big data is protected from disclosure risks. The experimental results demonstrate that the proposed method can maintain data utility and provide stronger privacy preservation.
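For orientation, the following is a minimal sketch of generic slicing as the summary characterizes it: the table is split vertically into column groups and horizontally into buckets, and each column group is randomly permuted within its bucket. It illustrates the baseline random permutation step that the summary criticizes, not the authors' enhanced method; the swapping of unique and identical values under lower and upper protection levels is not reproduced here. The function name `slice_table`, its parameters, and the sample records are all hypothetical.

```python
import random


def slice_table(rows, column_groups, bucket_size, seed=0):
    """Generic slicing sketch (an assumption, not the paper's algorithm).

    rows          : list of dicts, one per record
    column_groups : list of tuples of column names (vertical partition)
    bucket_size   : number of records per bucket (horizontal partition)
    """
    rng = random.Random(seed)
    published = []
    for start in range(0, len(rows), bucket_size):
        bucket = rows[start:start + bucket_size]
        pieces = []
        for group in column_groups:
            # Project the bucket onto one column group.
            piece = [tuple(row[col] for col in group) for row in bucket]
            # Permute independently to break the link between groups.
            rng.shuffle(piece)
            pieces.append(piece)
        # Reassemble published rows from the permuted pieces.
        published.extend(zip(*pieces))
    return published


# Hypothetical sample records, for illustration only.
sample_rows = [
    {"age": 23, "zip": "47906", "disease": "flu"},
    {"age": 35, "zip": "47905", "disease": "asthma"},
    {"age": 52, "zip": "47302", "disease": "flu"},
    {"age": 61, "zip": "47304", "disease": "diabetes"},
]
sample_groups = [("age", "zip"), ("disease",)]
sliced = slice_table(sample_rows, sample_groups, bucket_size=2)
```

Within each bucket the multiset of values in every column group is unchanged, so per-bucket statistics survive, but the pairing between column groups is left to chance. A random permutation can happen to reproduce the original pairing, which is consistent with the summary's point that this step does not always guarantee protection against disclosure.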