Feature selection for location metonymy using augmented bag-of-words

Bibliographic Details
Main Authors: Meguellati, Muhammad Elyas; Mahmud, Rohana; Abdul Kareem, Sameem; Zeghina, Assaad Oussama; Saadi, Younes
Format: Article
Published: Institute of Electrical and Electronics Engineers 2022
Online Access: http://eprints.um.edu.my/41789/
Description
Summary: Location metonymy resolution deals with location names used in a non-literal way, which creates problems in several natural language processing tasks such as named entity recognition and geographical parsing. Many studies have attempted to classify accurately whether a location is used literally or metonymically. However, most of the approaches that performed well employed a considerable amount of resources along with complex machine learning models, while those that reduced the resources suffered a decline in performance due to data sparseness. This study proposes a novel feature selection approach that uses bag-of-words and augments it with GloVe embeddings to obtain features that can be recognized based on the context of the sentence. We then implement a minimalist deep learning model, making the entire classification task as light as possible. The study found that relying solely on the given datasets to identify features, without depending on other external resources, can achieve remarkable results despite the small size of the datasets. Compared with state-of-the-art methods, our method, which eliminates noise based on context while using only low-cost resources, outperformed all previous methods, with an accuracy of 99.2% on the WIMCOR dataset.
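
The general idea of augmenting a bag-of-words with embedding similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give their procedure, so the function names (build_bag, select_features), the similarity threshold, and the tiny hand-made vectors standing in for pretrained GloVe embeddings are all assumptions made for illustration. The sketch keeps a context word if it is in the bag built from the training sentences, or if its vector is close to the vector of some bag word.

```python
import math
from collections import Counter

# Hypothetical 3-d vectors standing in for pretrained GloVe embeddings.
# Real GloVe vectors have 50-300 dimensions and are loaded from a file.
TOY_EMBEDDINGS = {
    "visited": [0.9, 0.1, 0.0],
    "travelled": [0.85, 0.15, 0.05],
    "signed": [0.1, 0.9, 0.0],
    "announced": [0.15, 0.85, 0.1],
    "weather": [0.7, 0.0, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_bag(sentences, min_count=1):
    """Collect the set of lowercased words seen at least min_count times."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    return {w for w, c in counts.items() if c >= min_count}


def select_features(sentence, bag, embeddings, threshold=0.95):
    """Keep words that are in the bag, or whose embedding is similar
    (cosine >= threshold, an assumed cut-off) to some bag word."""
    selected = []
    for word in sentence.lower().split():
        if word in bag:
            selected.append(word)
        elif word in embeddings and any(
            b in embeddings
            and cosine(embeddings[word], embeddings[b]) >= threshold
            for b in bag
        ):
            selected.append(word)
    return selected


bag = build_bag(["Paris signed the treaty", "We visited Paris"])
features = select_features(
    "They travelled to London after Berlin announced sanctions",
    bag,
    TOY_EMBEDDINGS,
)
print(features)
```

In this toy setup, unseen words such as "travelled" and "announced" are retained because their vectors lie close to the bag words "visited" and "signed", while words with no nearby bag word are dropped as noise. A real pipeline would load pretrained GloVe vectors and tune the threshold on held-out data.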