Study of keyword extraction techniques for electric double-layer capacitor domain using text similarity indexes: An experimental analysis

Keywords play a significant role in finding topic-related documents. Topics or keywords assigned by humans or experts provide accurate information. However, this practice is expensive in terms of time and resources. Hence, it is preferable to use a...

Bibliographic Details
Main Authors: Miah, M. Saef Ullah; Junaida, Sulaiman; Talha, Sarwar; Kamal Z., Zamli; Jose, Rajan
Format: Article
Language: English
Published: Hindawi Limited 2021
Subjects: QA76 Computer software; QD Chemistry; T Technology (General)
Online Access: http://umpir.ump.edu.my/id/eprint/33202/1/Study%20of%20keyword%20extraction%20techniques%20for%20electric%20double-layer%20capacitor%20domain.pdf
http://umpir.ump.edu.my/id/eprint/33202/
https://doi.org/10.1155/2021/8192320
id my.ump.umpir.33202
record_format eprints
spelling my.ump.umpir.33202 2022-01-13T03:17:37Z http://umpir.ump.edu.my/id/eprint/33202/ Hindawi Limited 2021-12-02 Article PeerReviewed pdf en cc_by_4 Miah, M. Saef Ullah and Junaida, Sulaiman and Talha, Sarwar and Kamal Z., Zamli and Jose, Rajan (2021) Study of keyword extraction techniques for electric double-layer capacitor domain using text similarity indexes: An experimental analysis. Complexity, 2021 (8192320). pp. 1-12. ISSN 1076-2787 https://doi.org/10.1155/2021/8192320
institution Universiti Malaysia Pahang
building UMP Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Pahang
content_source UMP Institutional Repository
url_provider http://umpir.ump.edu.my/
language English
topic QA76 Computer software
QD Chemistry
T Technology (General)
description Keywords play a significant role in finding topic-related documents. Keywords or topics assigned by human experts are accurate, but assigning them is expensive in time and resources, so automated keyword extraction is an attractive alternative. Before relying on an automated process, however, it is necessary to check how similar algorithm-generated keywords are to expert-provided ones. This paper presents an experimental analysis of similarity scores between expert-provided keywords from the electric double-layer capacitor (EDLC) domain and keywords generated by supervised and unsupervised automated keyword extraction algorithms. The paper also analyses which text yields better keywords: only the positive sentences or all sentences of the document. The unsupervised algorithms evaluated are YAKE, TopicRank, MultipartiteRank, and KPMiner; the supervised algorithms are KEA and WINGNUS. Similarity between extracted and expert-provided keywords is measured with the Jaccard index, cosine similarity, and cosine similarity with word vectors. The experiment shows that MultipartiteRank, measured with cosine similarity with word vectors, produces the best result, with 92% similarity to expert-provided keywords. The study can help NLP researchers working on the EDLC domain or on recommender systems select more suitable keyword extraction and similarity index calculation techniques.
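The Jaccard and cosine indexes named in the description can be illustrated with a minimal Python sketch. It assumes a simple lower-cased, whitespace-split bag of words; the record does not state the paper's actual tokenization, and the two keyword lists are invented for illustration, not taken from the study. The word-vector variant of cosine similarity requires pretrained embeddings and is not shown.

```python
import math
from collections import Counter


def tokens(keywords):
    """Lower-case each keyword phrase and split it into word tokens."""
    return [word for phrase in keywords for word in phrase.lower().split()]


def jaccard(a, b):
    """Jaccard index: shared distinct tokens divided by all distinct tokens."""
    set_a, set_b = set(tokens(a)), set(tokens(b))
    if not set_a and not set_b:
        return 1.0  # convention chosen here: two empty lists count as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    """Cosine similarity between token-count vectors of the two lists."""
    count_a, count_b = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(count_a[w] * count_b[w] for w in count_a)
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical keyword lists, for illustration only.
expert = ["activated carbon", "specific capacitance"]
extracted = ["activated carbon", "capacitance retention"]

print(jaccard(expert, extracted))  # 3 shared tokens out of 5 distinct
print(cosine(expert, extracted))   # dot product 3 over norms 2 * 2
```

Because Jaccard divides by the union of tokens while cosine divides by the product of vector norms, the same pair of lists generally scores lower under Jaccard, which is one reason a comparison across indexes is informative.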
format Article
author Miah, M. Saef Ullah
Junaida, Sulaiman
Talha, Sarwar
Kamal Z., Zamli
Jose, Rajan
author_sort Miah, M. Saef Ullah
title Study of keyword extraction techniques for electric double-layer capacitor domain using text similarity indexes: An experimental analysis
publisher Hindawi Limited
publishDate 2021