Comparing two corpus-based methods for extracting paraphrases to dictionary-based method

Paraphrase extraction plays an increasingly important role in language-related research and applications in areas such as information retrieval, question answering and automatic machine evaluation. Most of the existing methods extract paraphrases from different types of corpora by using syntactic-ba...

Bibliographic Details
Main Authors: Ho, Chuk Fong, Azmi Murad, Masrah Azrifah, Abdul Kadir, Rabiah, C. Doraisamy, Shyamala
Format: Article
Language: English
Published: World Scientific Publishing 2011
Online Access: http://psasir.upm.edu.my/id/eprint/22466/1/Comparing%20two%20corpus-based%20methods%20for%20extracting%20paraphrases%20to%20dictionary-based%20method.pdf
http://psasir.upm.edu.my/id/eprint/22466/
http://www.worldscientific.com/doi/abs/10.1142/S1793351X11001225
id my.upm.eprints.22466
record_format eprints
spelling my.upm.eprints.22466 2016-06-08T09:00:40Z Article, PeerReviewed, application/pdf, en. Ho, Chuk Fong and Azmi Murad, Masrah Azrifah and Abdul Kadir, Rabiah and C. Doraisamy, Shyamala (2011) Comparing two corpus-based methods for extracting paraphrases to dictionary-based method. International Journal of Semantic Computing, 5 (2). pp. 133-178. ISSN 1793-351X; ESSN: 1793-7108. DOI: 10.1142/S1793351X11001225
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
description Paraphrase extraction plays an increasingly important role in language-related research and in applications such as information retrieval, question answering and automatic machine evaluation. Most existing methods extract paraphrases from different types of corpora using syntactic-based approaches. Because a syntactic-based approach relies on similarity of context to identify and capture paraphrases, it also tends to extract other terms that appear in similar contexts, such as loosely related terms and functionally similar yet unrelated terms. In addition, different types of corpora suffer from different problems, such as limited availability and domain bias. This paper presents a solely semantic-based paraphrase extraction model. The model collects paraphrases from multiple lexical resources and validates them semantically in three ways: by computing domain similarity, definition similarity and word similarity. The model is benchmarked against two outstanding syntactic-based approaches. Experimental results from a manual evaluation show that the proposed model outperforms the benchmarks. The results indicate that a semantic-based approach should be applied in paraphrase extraction instead of a syntactic-based approach, and further suggest that a hybrid of the two approaches should be applied if one targets strictly precise paraphrases.