Cost-sensitive structured perceptron incorporating category hierarchy for named entity recognition

Named Entity Recognition (NER) is a fundamental natural language processing task for the identification and classification of expressions into predefined categories, such as person and organization. Existing NER systems usually target about 10 categories and do not incorporate analysis of category re...

Bibliographic Details
Main Authors: Higashiyama, Shohei; Mathieu, Blondel; Seki, Kazuhiro; Uehara, Kuniaki
Format: Article
Published: Universiti Utara Malaysia Press 2015
Subjects: QA75 Electronic computers. Computer science
Online Access: http://repo.uum.edu.my/24079/
http://jict.uum.edu.my/index.php/previous-issues/143-vol-14-2015
id my.uum.repo.24079
record_format eprints
spelling my.uum.repo.24079 2018-04-29T01:43:54Z http://repo.uum.edu.my/24079/ Universiti Utara Malaysia Press 2015 Article PeerReviewed Higashiyama, Shohei and Mathieu, Blondel and Seki, Kazuhiro and Uehara, Kuniaki (2015) Cost-sensitive structured perceptron incorporating category hierarchy for named entity recognition. Journal of Information and Communication Technology, 14. pp. 1-20. ISSN 2180-3862 http://jict.uum.edu.my/index.php/previous-issues/143-vol-14-2015
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
topic QA75 Electronic computers. Computer science
description Named Entity Recognition (NER) is a fundamental natural language processing task for the identification and classification of expressions into predefined categories, such as person and organization. Existing NER systems usually target about 10 categories and do not incorporate analysis of category relations. However, categories often belong naturally to some predefined hierarchy. In such cases, the distance between categories in the hierarchy becomes a rich source of information that can be exploited. This is intuitively useful, particularly when the categories are numerous. On that account, this paper proposes an NER approach that can leverage category hierarchy information by introducing, in the structured perceptron framework, a cost function that more strongly penalizes category predictions that are more distant from the correct category in the hierarchy. Experimental results on the GENIA biomedical text corpus indicate the effectiveness of the proposed approach compared with the case where no cost function is used. In addition, the proposed approach demonstrates superior performance over a representative work using multi-class support vector machines on the same corpus. A possible direction for further improving the proposed approach is to investigate more elaborate cost functions than the simple additive cost adopted in this work.
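The idea in the description can be illustrated with a small sketch. It is not the authors' implementation: the toy hierarchy, the label names, the use of edge-count tree distance as the per-token cost, and the cost-augmented Viterbi decoder are all assumptions made for illustration. The only details taken from the record are that the cost grows with hierarchy distance from the correct category and that the cost is additive.

```python
from typing import Dict, List, Sequence

# Toy category hierarchy (hypothetical, NOT the GENIA ontology): child -> parent.
PARENT: Dict[str, str] = {
    "protein": "root", "dna": "root", "O": "root",
    "protein_molecule": "protein", "protein_complex": "protein",
    "dna_domain": "dna",
}

def path_to_root(label: str) -> List[str]:
    """Return the chain of nodes from `label` up to the root."""
    path = [label]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def tree_distance(a: str, b: str) -> int:
    """Number of edges between two labels in the hierarchy (assumed cost)."""
    pa, pb = path_to_root(a), path_to_root(b)
    depth_in_pa = {node: i for i, node in enumerate(pa)}
    for j, node in enumerate(pb):
        if node in depth_in_pa:  # first common ancestor
            return depth_in_pa[node] + j
    raise ValueError("labels do not share a root")

def sequence_cost(gold: Sequence[str], pred: Sequence[str]) -> int:
    """Additive cost: sum of per-token hierarchy distances."""
    return sum(tree_distance(g, p) for g, p in zip(gold, pred))

def cost_augmented_viterbi(emit, trans, labels, gold) -> List[str]:
    """Find argmax_y [score(y) + cost(gold, y)] for a first-order chain.

    emit[t][y] and trans[y_prev][y] are model scores (hypothetical layout).
    Because the cost is additive over tokens, it folds into the emissions.
    """
    best = [{y: emit[0][y] + tree_distance(gold[0], y) for y in labels}]
    back = []
    for t in range(1, len(emit)):
        scores, ptrs = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + trans[p][y])
            scores[y] = (best[-1][prev] + trans[prev][y]
                         + emit[t][y] + tree_distance(gold[t], y))
            ptrs[y] = prev
        best.append(scores)
        back.append(ptrs)
    y = max(labels, key=lambda l: best[-1][l])
    out = [y]
    for ptrs in reversed(back):  # follow back-pointers to recover the path
        y = ptrs[y]
        out.append(y)
    return out[::-1]
```

In a cost-sensitive perceptron of this kind, training would decode with the cost added and, when the result differs from the gold sequence, move the weights toward the gold features and away from the decoded ones, so that distant mistakes are pushed down harder than near ones.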
format Article
author Higashiyama, Shohei
Mathieu, Blondel
Seki, Kazuhiro
Uehara, Kuniaki
title Cost-sensitive structured perceptron incorporating category hierarchy for named entity recognition
publisher Universiti Utara Malaysia Press
publishDate 2015
url http://repo.uum.edu.my/24079/
http://jict.uum.edu.my/index.php/previous-issues/143-vol-14-2015