Keyword query processing interface model of ontological natural language manipulation

Bibliographic Details
Main Author: Hasany, Syed Muhammad Noman
Format: Thesis
Language: English
Published: 2010
Online Access: http://psasir.upm.edu.my/id/eprint/40907/1/FK%202010%2037R.pdf
http://psasir.upm.edu.my/id/eprint/40907/
id my.upm.eprints.40907
record_format eprints
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description Querying structured information through keyword queries provides an easy way to reach the information without knowing the structural details of the underlying data needed to formulate formal queries, and without having to pose grammatically correct questions to the user interface. Despite its obvious advantages, keyword querying lacks expressiveness in comparison with syntactic questions. The problem is that processing is restricted to the posed keywords; additional connecting words and relations among keywords are ignored. In semi-structured data such as RDF, relations are formally defined as properties among concepts, which helps keyword querying find connections among concepts in the underlying data. Despite this facility, the results of natural language interfaces (NLIs) lack precision and relevance. One major reason is that more work has been done on increasing efficiency with respect to data storage, data indexing and reporting results using top-k strategies, and less on enhancing expressiveness, supporting lengthy queries and answering queries with relevance-oriented ranking. We are concerned with enhancing the keyword query processing model so that it handles expressive keyword queries and syntactic questions that incorporate quantifier restrictions and AND-OR semantics on RDF knowledge bases. The manipulation of both types of natural language (NL) queries is supported by ontologies. These NL queries are converted to target queries for result retrieval from RDF. The generated target queries must be ranked so that the results are reported in order of their relevance to the user query. For large keyword queries, graph representation and processing is considered a bottleneck. We preprocessed the RDF graph so that it is stored in a distributed manner after the elimination of single chain productions, in order to increase the efficiency of the conversion process.
We call shortest path algorithms on certain resources to explore connectivity, which reduces the complexity of the search. For generality of the target query representation, and to incorporate quantifiers, subclasses and sub-class unions, we define an extended representation of the conjunctive query, termed the extended conjunctive query. For the implementation of user query AND-OR semantics and semantic ranking, we define an efficient representation, termed the compact Boolean query (CBQ). Empty result conditions reported by some approaches are also handled with the CBQ. For the problem of conversion, techniques with fixed templates face scalability problems, while graph-only techniques are processing intensive. We propose a variable-template-based conversion with inexpensive graph techniques to handle lengthy queries and to explore indirect connectivity among elements. For the ranking problem, a relevance ranking comprising co-occurrence and Boolean semantics is proposed to help in understanding keyword queries and syntactic questions for precise answering. Experimental results on LUBM, Mooney and self-developed ontologies show that our technique can handle queries of 19 keywords within acceptable time limits. The CBQ provides a complete solution to the empty-results condition for correctly transformed queries. The coverage of queries is extended to queries originating from syntactic questions, with improved precision. The improvement in the values of MRR and TQP reflects the potential of our co-occurrence and AND-OR ranking strategies to place the most relevant target queries at the top positions.
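The idea of using a shortest path search to connect the resources matched by two keywords can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes a plain breadth-first search over an undirected view of a toy triple set, and the triples, concept names and the function names `build_adjacency` and `shortest_connection` are all hypothetical, chosen only for illustration.

```python
from collections import deque

# Hypothetical toy triples (subject, property, object); not taken from the thesis.
TRIPLES = [
    ("Student", "takesCourse", "Course"),
    ("Professor", "teacherOf", "Course"),
    ("Professor", "worksFor", "Department"),
    ("Department", "subOrganizationOf", "University"),
]


def build_adjacency(triples):
    """Map each concept to (property, neighbour) pairs, ignoring edge direction."""
    adjacency = {}
    for subject, prop, obj in triples:
        adjacency.setdefault(subject, []).append((prop, obj))
        adjacency.setdefault(obj, []).append((prop, subject))
    return adjacency


def shortest_connection(triples, source, target):
    """Return the shortest chain of hops linking source to target, or None.

    Each hop is (from_node, property, to_node) in traversal order, which may
    be the reverse of the direction stored in the original triple.
    """
    if source == target:
        return []
    adjacency = build_adjacency(triples)
    previous = {source: None}  # node -> (parent, property) used to reach it
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for prop, neighbour in adjacency.get(node, []):
            if neighbour in previous:
                continue  # already reached by a path that is at least as short
            previous[neighbour] = (node, prop)
            if neighbour == target:
                # Walk the parent links back to the source, then reverse.
                path = []
                current = target
                while previous[current] is not None:
                    parent, used_prop = previous[current]
                    path.append((parent, used_prop, current))
                    current = parent
                path.reverse()
                return path
            queue.append(neighbour)
    return None  # the two concepts are not connected


if __name__ == "__main__":
    for hop in shortest_connection(TRIPLES, "Student", "Department"):
        print(hop)
```

In this sketch the properties along the returned chain are the candidates for the relations that a keyword query leaves unstated; how the thesis selects the resources to search from, and how it turns paths into target queries, is not specified in the abstract.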
format Thesis
author Hasany, Syed Muhammad Noman
title Keyword query processing interface model of ontological natural language manipulation
publishDate 2010
url http://psasir.upm.edu.my/id/eprint/40907/1/FK%202010%2037R.pdf
http://psasir.upm.edu.my/id/eprint/40907/
spelling my.upm.eprints.40907 2015-10-06T04:52:38Z http://psasir.upm.edu.my/id/eprint/40907/ Keyword query processing interface model of ontological natural language manipulation Hasany, Syed Muhammad Noman
2010-10 Thesis NonPeerReviewed application/pdf en http://psasir.upm.edu.my/id/eprint/40907/1/FK%202010%2037R.pdf Hasany, Syed Muhammad Noman (2010) Keyword query processing interface model of ontological natural language manipulation. PhD thesis, Universiti Putra Malaysia.