Enhance efficiency of answering XML keyword query using incompact structure of MCCTree

People today live much of their lives online, where tasks are completed by typing at a keyboard and letting a system carry out the process. Because this interaction happens online, data sharing is an essential service for sending and delivering information. Extensible Markup Language (XML) has been chosen...

Bibliographic Details
Main Author: Sazaly, Ummu Sulaim
Format: Thesis
Language: English
Published: 2012
Online Access: http://psasir.upm.edu.my/id/eprint/38635/1/FSKTM%202013%203R.pdf
http://psasir.upm.edu.my/id/eprint/38635/
id my.upm.eprints.38635
record_format eprints
spelling my.upm.eprints.38635 2016-01-13T09:21:50Z http://psasir.upm.edu.my/id/eprint/38635/ Enhance efficiency of answering XML keyword query using incompact structure of MCCTree Sazaly, Ummu Sulaim
People today live much of their lives online, where tasks are completed by typing at a keyboard and letting a system carry out the process. Because this interaction happens online, data sharing is an essential service for sending and delivering information. Extensible Markup Language (XML) has been widely adopted as a data sharing medium because it is easy for both humans and machines to interpret. Given its importance, many studies have sought to improve the effectiveness of retrieving information from XML files, and many notions and techniques have been introduced, especially for query processing. Compact Lowest Common Ancestor (CLCA) and Maximal Compact Lowest Common Ancestor (MCLCA), implemented in algorithms named CGTreeGenerator and MCCTreeGenerator, have been shown to return accurate results when answering XML keyword queries. CGTreeGenerator compacts the XML tree by eliminating irrelevant nodes based on the CLCA notion, producing a Compact Global Tree (CGTree). MCCTreeGenerator then uses the CGTree to select a subtree, called a Maximal Compact Connected Tree (MCCTree), as the query result based on the MCLCA notion. However, the MCCTree cannot be used directly by its ranking method, because the ranking calculation uses the structure of the subtree as it was before compaction. If the result cannot be used directly by the ranking method, the algorithm contains an ineffective process. Moreover, if that process requires re-examining the original tree, the efficiency of the algorithm is reduced. This study responds to these weaknesses by proposing a new algorithm, XMCCTreeGenerator, to enhance the efficiency of CGTree-MCCTreeGenerator.
This study identifies the effective processes needed to produce an XML query result using the MCLCA notion without compacting the tree. Those processes make up the XMCCTreeGenerator algorithm, which produces the same subtree as the MCCTree but with a different structure. The returned subtree, called the Extended MCCTree (XMCCTree), can be used directly by the ranking method because it keeps an incompact (uncompacted) structure. An experiment is run using XML datasets available from the XML Data Repository on the University of Washington's website. Two files with different data structures are selected and divided into three size ranges. Keywords are selected manually at random from the files, and queries are executed with three to five keywords. Two prototypes are developed, one implementing CGTree-MCCTreeGenerator and one implementing XMCCTreeGenerator. Since this study focuses on the efficiency of the algorithm, the elapsed time of each execution is collected from the experiment. In conclusion, the proposed XMCCTreeGenerator is more efficient than the previous CGTree-MCCTreeGenerator in answering XML keyword queries using MCLCA.
2012-11 Thesis NonPeerReviewed application/pdf en http://psasir.upm.edu.my/id/eprint/38635/1/FSKTM%202013%203R.pdf Sazaly, Ummu Sulaim (2012) Enhance efficiency of answering XML keyword query using incompact structure of MCCTree. Masters thesis, Universiti Putra Malaysia.
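As a rough illustration of the lowest-common-ancestor idea that the abstract builds on, the sketch below finds the deepest element of a small XML tree whose subtree contains every query keyword. It is not the thesis's CGTreeGenerator, MCCTreeGenerator, or XMCCTreeGenerator: the abstract does not define CLCA or MCLCA, so a plain deepest-common-ancestor search is swapped in. The function name keyword_lca, the sample document, and the rule of matching keywords against whole words in tag names and element text are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def keyword_lca(root, keywords):
    """Return the deepest element whose subtree contains every keyword.

    Hypothetical helper for illustration only; this is a plain
    common-ancestor search, not the CLCA/MCLCA notions of the thesis.
    """
    wanted = {k.lower() for k in keywords}
    best = None  # (depth, element) of the deepest covering element so far

    def visit(node, depth):
        nonlocal best
        # Words contributed by this node's own tag and text.
        own = (node.tag + " " + (node.text or "")).lower().split()
        found = wanted.intersection(own)
        # Add the keywords matched anywhere below this node.
        for child in node:
            found |= visit(child, depth + 1)
        if found == wanted and (best is None or depth > best[0]):
            best = (depth, node)
        return found

    visit(root, 0)
    return None if best is None else best[1]


# Made-up sample data, not one of the datasets used in the thesis.
SAMPLE = """
<bib>
  <book><title>XML basics</title><author>Lee</author></book>
  <book><title>Keyword search</title><author>Tan</author></book>
</bib>
"""

root = ET.fromstring(SAMPLE)
hit = keyword_lca(root, ["keyword", "tan"])
print(hit.tag, hit.find("title").text)
```

Because the search walks the original tree and returns an element of it, the result keeps its full, uncompacted subtree, which is loosely the property the abstract attributes to the XMCCTree.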
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
format Thesis
author Sazaly, Ummu Sulaim
title Enhance efficiency of answering XML keyword query using incompact structure of MCCTree