Definition, approaches, and analysis of code duplication detection (2006-2020): a critical review

Bibliographic Details
Main Authors: Chen, Chang Feng; Mohd. Zain, Azlan; Zhou, Kai Qing
Format: Article
Published: Springer Science and Business Media Deutschland GmbH 2022
Online Access: http://eprints.utm.my/103386/
http://dx.doi.org/10.1007/s00521-022-07707-2
Description
Summary: Code duplication detection is the act of finding similar code in software development. It is important for software engineers to address the issues of code duplication detection. In this paper, a critical review of previous works on code duplication for code clone and plagiarism detection is performed. The review involves five main parts. First, a systematic literature review is conducted to confirm the selection of articles. Second, a critical review of different code duplication approaches is conducted based on three phases: processing, detection, and decision. Third, a statistical analysis of the number of reviewed articles is performed to show the trends and hot topics of code duplication research; in addition, a quantitative analysis of different code duplication approaches is presented to show their effectiveness. Fourth, the advantages and disadvantages of different approaches and techniques are summarized and discussed. Finally, the conclusions of the review are summarized and future research directions for code duplication are described.