Generic code clone detection model for java applications

A code clone is a code fragment that is repeated multiple times in a program. Code clones are classified as Type 1, Type 2, Type 3 and Type 4. Various approaches and models have been used to detect code clones. However, a major challenge faced in detecting code clones using...

Bibliographic Details
Main Authors: Mubarak-Ali, Al-Fahim, Sulaiman, Shahida
Format: Conference or Workshop Item
Language: English
Published: 2020
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/92334/1/ShahidaSulaiman2020_GenericCodeCloneDetectionModelforJavaApplications.pdf
http://eprints.utm.my/id/eprint/92334/
http://dx.doi.org/10.1088/1757-899X/769/1/012023
id my.utm.92334
record_format eprints
spelling my.utm.92334 2021-09-28T07:36:50Z http://eprints.utm.my/id/eprint/92334/ Generic code clone detection model for java applications Mubarak-Ali, Al-Fahim Sulaiman, Shahida QA75 Electronic computers. Computer science
2020 Conference or Workshop Item PeerReviewed application/pdf en http://eprints.utm.my/id/eprint/92334/1/ShahidaSulaiman2020_GenericCodeCloneDetectionModelforJavaApplications.pdf Mubarak-Ali, Al-Fahim and Sulaiman, Shahida (2020) Generic code clone detection model for java applications. In: 6th International Conference on Software Engineering and Computer Systems, ICSECS 2019, 25 - 27 September 2019, Kuantan, Pahang. http://dx.doi.org/10.1088/1757-899X/769/1/012023
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
spellingShingle QA75 Electronic computers. Computer science
Mubarak-Ali, Al-Fahim
Sulaiman, Shahida
Generic code clone detection model for java applications
description A code clone is a code fragment that is repeated multiple times in a program. Code clones are classified as Type 1, Type 2, Type 3 and Type 4. Various approaches and models have been used to detect code clones. However, a major challenge in using these models is their lack of generality in detecting all clone types. To address this problem, the Generic Code Clone Detection (GCCD) model is proposed. It consists of five processes: Preprocessing, Transformation, Parameterization, Categorization and Match Detection. First, the preprocessing process produces source units by applying five combinatorial rules. The transformation process then produces transformed source units based on a letter-to-number substitution concept. Next, the parameterization process produces the parameters used in the categorization and match detection processes. The categorization process then groups the source units into pools. Finally, the match detection process uses hybrid exact matching with Euclidean distance to detect the clones. Based on these processes, a prototype of the GCCD was developed using NetBeans 8.0. The model was compared with the Generic Pipeline Model (GPM). The comparison showed that the GCCD was able to detect clone pairs of Type 1 through Type 4, while the GPM was able to detect clone pairs of Type 1 only. Furthermore, the GCCD prototype was empirically tested with Bellon's benchmark data and was able to detect clones in Java applications with up to 203,000 lines of code. In conclusion, the GCCD model overcomes the lack of generality in detecting all code clone types by detecting Type 1, Type 2, Type 3 and Type 4 clones.
format Conference or Workshop Item
author Mubarak-Ali, Al-Fahim
Sulaiman, Shahida
author_facet Mubarak-Ali, Al-Fahim
Sulaiman, Shahida
author_sort Mubarak-Ali, Al-Fahim
title Generic code clone detection model for java applications
title_sort generic code clone detection model for java applications
publishDate 2020
url http://eprints.utm.my/id/eprint/92334/1/ShahidaSulaiman2020_GenericCodeCloneDetectionModelforJavaApplications.pdf
http://eprints.utm.my/id/eprint/92334/
http://dx.doi.org/10.1088/1757-899X/769/1/012023
_version_ 1712285079841013760
score 13.164666
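
Illustrative sketch: the description above mentions a letter-to-number substitution followed by hybrid exact matching with Euclidean distance. This record does not give the paper's actual substitution table, combinatorial rules, parameters, or thresholds, so the following Java sketch is only one possible reading. The class name GccdSketch, the mapping (a..z to 1..26, digits to their value, everything else to 0), the zero-padding of shorter vectors, and the threshold argument are all assumptions made for illustration, not the authors' implementation. The preprocessing, parameterization, and categorization (pooling) steps are not shown.

```java
import java.util.Arrays;

public class GccdSketch {

    // Assumed transformation: drop whitespace, then map each character to a number.
    // Letters a..z (case-insensitive) become 1..26, digits keep their value,
    // and any other character becomes 0. This mapping is hypothetical.
    static int[] transform(String sourceUnit) {
        String compact = sourceUnit.replaceAll("\\s+", "");
        int[] codes = new int[compact.length()];
        for (int i = 0; i < compact.length(); i++) {
            char c = Character.toLowerCase(compact.charAt(i));
            if (c >= 'a' && c <= 'z') {
                codes[i] = c - 'a' + 1;
            } else if (c >= '0' && c <= '9') {
                codes[i] = c - '0';
            } else {
                codes[i] = 0;
            }
        }
        return codes;
    }

    // Euclidean distance between two vectors; the shorter one is treated
    // as if it were padded with zeros (an assumption of this sketch).
    static double euclidean(int[] a, int[] b) {
        int n = Math.max(a.length, b.length);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            double d = x - y;
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Hybrid check: exact match on the transformed units first,
    // otherwise fall back to a distance threshold (hypothetical parameter).
    static boolean isClonePair(String unit1, String unit2, double threshold) {
        int[] a = transform(unit1);
        int[] b = transform(unit2);
        if (Arrays.equals(a, b)) {
            return true;
        }
        return euclidean(a, b) <= threshold;
    }

    public static void main(String[] args) {
        // Differs only in whitespace: caught by the exact-match branch.
        System.out.println(isClonePair("int a = 1;", "int  a=1;", 0.0));
        // Differs in one identifier: decided by the distance threshold.
        System.out.println(isClonePair("int a = 1;", "int b = 1;", 1.0));
    }
}
```

In this sketch the exact-match branch covers fragments that differ only in layout, while the distance branch tolerates small differences such as a renamed identifier. How the published model separates Type 1 through Type 4 clones is not stated in this record.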