MARC view: Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong

Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong

Bibliographic Details
Main Author:	Chong, Chun Yong
Format:	Thesis
Published:	2016
Subjects:	QA76 Computer software
Online Access:	http://studentsrepo.um.edu.my/6606/4/chun_yong.pdf http://studentsrepo.um.edu.my/6606/

id	my.um.stud.6606
record_format	eprints
institution	Universiti Malaya
building	UM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaya
content_source	UM Student Repository
url_provider	http://studentsrepo.um.edu.my/
topic	QA76 Computer software
spellingShingle	QA76 Computer software Chong, Chun Yong Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong
description	Effective execution of software maintenance requires knowledge of the detailed workings of the software. The structure of a software system, however, may not be clear to its maintainers because it is poorly designed or, worse, because no up-to-date documentation exists. To address this issue, researchers have proposed applying software clustering to recover a high-level semantic representation of the software design by grouping sets of collaborating software components into meaningful subsystems. This high-level semantic representation helps bridge the gap between the software design as perceived by the maintainers and the actual code structure. However, software clustering is typically conducted in an unsupervised and rigid manner, where maintainers have no influence on the clustering results and only a single solution is produced for any given dataset. Even if maintainers possess additional information that could guide and improve the clustering results, traditional clustering algorithms have no way to take advantage of it. These practical concerns have led the researcher to propose integrating domain knowledge into traditional unsupervised clustering algorithms. The result, hereafter referred to as constrained clustering, is a semi-supervised clustering technique in which domain experts express their opinions as explicit clustering constraints that state whether a pair of software components should or should not be clustered into the same subsystem. Apart from the explicit clustering constraints supplied by domain experts, other information to guide and improve clustering results can be derived implicitly from the source code itself.
To help maintainers identify and interpret the implicit information hidden in the source code, this study proposes representing software as a weighted complex network and using graph theory to understand and analyse the structure, behaviour, and complexity of the software components and their relationships. The results of the analysis can subsequently be converted into implicit clustering constraints. Hence, maintainers can use both the explicit and the implicit constraints to create a high-level semantic representation of the software design that is coherent and consistent with the actual code structure. This thesis proposes a constrained clustering approach to aid in the remodularisation of poorly designed or poorly documented object-oriented software systems. The source code of an object-oriented software system is first converted into UML class diagrams. Next, information is extracted from the class diagrams to measure the strength of cohesion among related classes and their relationships, and is then transformed into a weighted complex network whose nodes and edges carry the measured weights. Graph theory metrics are subsequently applied to the constructed weighted complex network so that the structure, behaviour, and complexity of the software components and their relationships can be analysed. The results are then converted into sets of clustering constraints. Guided by the explicit and implicit clustering constraints, sets of cohesive clusters are progressively derived to act as a high-level semantic representation of the software design. This research follows an empirical research methodology, in which the proposed approach is validated using 40 object-oriented open-source software systems written in Java.
Using MoJoFM, a well-established technique for comparing the similarity between multiple clustering results, the proposed approach achieves an aggregated average accuracy of 80.33% when compared against the original package diagrams of the 40 software systems, considerably outperforming the conventional unconstrained clustering approach. The clustering results serve as supplementary information that helps software maintainers make critical decisions when re-engineering, maintaining, and evolving software systems. Ultimately, this research helps reduce the cost of software maintenance through better comprehension of the recovered software design.
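The pairwise constraints described in the abstract (a pair of components should or should not share a subsystem) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the greedy average-linkage merge, the function name `constrained_clustering`, the class names, and the edge weights below are all hypothetical and chosen only to show how must-link and cannot-link constraints can steer an otherwise unsupervised clustering of a weighted class graph.

```python
from itertools import combinations


def constrained_clustering(nodes, weights, must_link, cannot_link, k):
    """Greedy agglomerative clustering that honours pairwise constraints.

    weights maps frozenset({a, b}) to a cohesion strength (higher = more
    related); missing pairs count as 0. Illustrative sketch only: it does
    not check whether the must-link and cannot-link sets contradict each other.
    """
    clusters = [{n} for n in nodes]  # start from singleton clusters

    def find(node):
        return next(c for c in clusters if node in c)

    def violates(ca, cb):
        # Merging ca and cb is forbidden if it would join a cannot-link pair.
        return any((a in ca and b in cb) or (a in cb and b in ca)
                   for a, b in cannot_link)

    def linkage(ca, cb):
        # Average edge weight between the two clusters.
        total = sum(weights.get(frozenset((x, y)), 0.0) for x in ca for y in cb)
        return total / (len(ca) * len(cb))

    # Must-link constraints are applied first, before any weight-based merge.
    for a, b in must_link:
        ca, cb = find(a), find(b)
        if ca is not cb:
            clusters.remove(cb)
            ca |= cb

    # Repeatedly merge the most cohesive pair that violates no constraint.
    while len(clusters) > k:
        candidates = [
            (linkage(ca, cb), i, j)
            for (i, ca), (j, cb) in combinations(enumerate(clusters), 2)
            if not violates(ca, cb)
        ]
        if not candidates:
            break  # every remaining merge is forbidden
        _, i, j = max(candidates)
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical classes and cohesion weights, for illustration only.
classes = ["Order", "OrderItem", "Invoice", "Logger", "FileUtil"]
cohesion = {
    frozenset(("Order", "OrderItem")): 0.9,
    frozenset(("Order", "Invoice")): 0.6,
    frozenset(("Invoice", "Logger")): 0.5,
    frozenset(("Logger", "FileUtil")): 0.8,
}
clusters_found = constrained_clustering(
    classes, cohesion,
    must_link=[("Order", "Invoice")],
    cannot_link=[("Invoice", "Logger")],
    k=2,
)
print(sorted(sorted(c) for c in clusters_found))
```

In this toy input the cannot-link pair keeps `Invoice` and `Logger` in separate subsystems even though an edge connects them, which is the kind of expert influence the abstract says conventional clustering lacks.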
format	Thesis
author	Chong, Chun Yong
author_facet	Chong, Chun Yong
author_sort	Chong, Chun Yong
title	Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong
title_short	Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong
title_full	Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong
title_fullStr	Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong
title_full_unstemmed	Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong
title_sort	constrained clustering approach to aid in remodularisation of object-oriented software systems / chong chun yong
publishDate	2016
url	http://studentsrepo.um.edu.my/6606/4/chun_yong.pdf http://studentsrepo.um.edu.my/6606/
_version_	1738505937251991552
spelling	my.um.stud.6606 2020-01-18T03:01:04Z Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong Chong, Chun Yong QA76 Computer software 2016 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/6606/4/chun_yong.pdf Chong, Chun Yong (2016) Constrained clustering approach to aid in remodularisation of object-oriented software systems / Chong Chun Yong. PhD thesis, University of Malaya. http://studentsrepo.um.edu.my/6606/
score	13.153044
