Quasi-identifier recognition algorithm for privacy preservation of cloud data based on risk reidentification

Bibliographic Details
Main Authors: Mansour, Huda O., M. Siraj, Maheyzah, Abdoh Ghaleb, Fuad Abdulgaleel, Saeed, Faisal, Alkhammash, Eman H., Maarof, Mohd. A.
Format: Article
Language: English
Published: Hindawi Limited 2021
Online Access: http://eprints.utm.my/id/eprint/93964/1/FuadAbdulgaleel2021_QuasiIdentifierRecognitionAlgorithm.pdf
http://eprints.utm.my/id/eprint/93964/
http://dx.doi.org/10.1155/2021/7154705
Description
Summary: Cloud computing plays an essential role as a platform for outsourcing data mining and other data processing, especially for data owners who lack the resources or experience to run data mining techniques themselves. However, the privacy of outsourced data is a serious concern. Most data owners use anonymization-based techniques to prevent identity and attribute disclosure, and thus privacy leakage, before outsourcing data for mining over the cloud. In addition, data collection and dissemination in a resource-limited network such as a sensor cloud require efficient methods to reduce privacy leakage. The main cause of identity disclosure is quasi-identifier (QID) linking, yet most research on anonymization methods ignores the identification of proper QIDs. This reduces the validity of the anonymization methods used and may lead to failure of the anonymization process. This paper introduces a new quasi-identifier recognition algorithm that reduces the identity disclosure resulting from QID linking. The proposed algorithm comprises two main stages: (1) attribute classification (or QID recognition) and (2) QID dimension identification. The algorithm is based on the reidentification risk rate of all attributes and on the dimension of the QIDs, from which it determines the proper QIDs and their suitable dimensions. The proposed algorithm was tested on a real dataset. The results demonstrated that the proposed algorithm significantly reduces privacy leakage and maintains data utility compared to recent related algorithms.
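
The two stages named in the summary can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the summary does not give their risk formula or thresholds, so everything here is an assumption made for illustration. Reidentification risk is taken to be the number of distinct value combinations divided by the number of records (equivalent to averaging 1/class size over all records), and the function names, the thresholds `low`, `high`, and `target`, and the sample records are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def reidentification_risk(records, attrs):
    """Assumed risk measure: distinct value combinations / number of records.

    Equals the average of 1 / (equivalence class size) over all records.
    """
    keys = [tuple(r[a] for a in attrs) for r in records]
    return len(Counter(keys)) / len(keys)


def classify_attributes(records, attributes, low, high):
    """Stage 1 (hypothetical rule): split attributes by single-attribute risk.

    risk >= high       -> treated as a direct identifier
    low <= risk < high -> treated as a quasi-identifier (QID) candidate
    risk < low         -> treated as neither
    """
    identifiers, qids, others = [], [], []
    for attr in attributes:
        risk = reidentification_risk(records, [attr])
        if risk >= high:
            identifiers.append(attr)
        elif risk >= low:
            qids.append(attr)
        else:
            others.append(attr)
    return identifiers, qids, others


def qid_dimension(records, qids, target):
    """Stage 2 (hypothetical rule): smallest number of QIDs that, combined,
    reach the target risk. Returns (size, combination) or (None, None)."""
    for size in range(1, len(qids) + 1):
        for combo in combinations(qids, size):
            if reidentification_risk(records, list(combo)) >= target:
                return size, combo
    return None, None


# Made-up sample data, for illustration only.
records = [
    {"name": "A", "age": 30, "zip": "111", "disease": "flu"},
    {"name": "B", "age": 30, "zip": "222", "disease": "flu"},
    {"name": "C", "age": 40, "zip": "111", "disease": "flu"},
    {"name": "D", "age": 40, "zip": "222", "disease": "flu"},
]

identifiers, qids, others = classify_attributes(
    records, ["name", "age", "zip", "disease"], low=0.4, high=1.0
)
dimension, combo = qid_dimension(records, qids, target=0.9)
```

On this toy data, `age` and `zip` each distinguish only half of the records on their own, but together they single out every record, which is the kind of QID linking the summary describes.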