Staff View: Review of feature extraction approaches on biomedical text classification

Review of feature extraction approaches on biomedical text classification

The overwhelming volume of online biomedical literature causes data congestion and makes it difficult to organize these documents and to retrieve the required documents from databases, especially the Medline database. One solution to this overload of documents is to apply...

Bibliographic Details
Main Authors:	Dollah, R., Jafni, T. I., Hashim, H., Othman, M. S., Rasib, A. W.
Format:	Article
Published:	Inst Advanced Science Extension 2020
Subjects:	QA Mathematics
Online Access:	http://eprints.utm.my/id/eprint/87028/ http://www.dx.doi.org/10.21833/ijaas.2020.04.001

id	my.utm.87028
record_format	eprints
spelling	my.utm.87028 2020-10-31T12:16:41Z http://eprints.utm.my/id/eprint/87028/ Review of feature extraction approaches on biomedical text classification Dollah, R. Jafni, T. I. Hashim, H. Othman, M. S. Rasib, A. W. QA Mathematics The overwhelming volume of online biomedical literature causes data congestion and makes it difficult to organize these documents and to retrieve the required documents from databases, especially the Medline database. One solution to this overload of documents is to apply classification. However, each document must be represented by a set of terms or feature vectors. Identifying terms or features in biomedical literature is one of the most important and challenging tasks in text classification, because a large number of new features and entities appear in the biomedical domain. In addition, combining sets of features from different terminological resources leads to naming conflicts, such as homonymous use of names and terminological ambiguities. Therefore, the purpose of this research is to investigate and evaluate effective ways of extracting relevant and meaningful features in order to increase classification accuracy and improve the performance of web searches. To this end, we conduct several classification experiments to evaluate and compare the effectiveness of feature extraction approaches for extracting relevant and informative features from the biomedical literature. For our experiments, we use two different sets of features: a set extracted using the Genia Tagger tool and a set extracted by medical experts from Pusat Perubatan Universiti Kebangsaan Malaysia (PPUKM). The results show that classification using features extracted by medical experts outperforms classification using features extracted by the Genia Tagger tool when a feature selection method is applied.
Inst Advanced Science Extension 2020-04 Article PeerReviewed Dollah, R. and Jafni, T. I. and Hashim, H. and Othman, M. S. and Rasib, A. W. (2020) Review of feature extraction approaches on biomedical text classification. International Journal of Advanced And Applied Sciences, 7 (4). pp. 1-8. http://www.dx.doi.org/10.21833/ijaas.2020.04.001 DOI:10.21833/ijaas.2020.04.001
institution	Universiti Teknologi Malaysia
building	UTM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Malaysia
content_source	UTM Institutional Repository
url_provider	http://eprints.utm.my/
topic	QA Mathematics
spellingShingle	QA Mathematics Dollah, R. Jafni, T. I. Hashim, H. Othman, M. S. Rasib, A. W. Review of feature extraction approaches on biomedical text classification
description	The overwhelming volume of online biomedical literature causes data congestion and makes it difficult to organize these documents and to retrieve the required documents from databases, especially the Medline database. One solution to this overload of documents is to apply classification. However, each document must be represented by a set of terms or feature vectors. Identifying terms or features in biomedical literature is one of the most important and challenging tasks in text classification, because a large number of new features and entities appear in the biomedical domain. In addition, combining sets of features from different terminological resources leads to naming conflicts, such as homonymous use of names and terminological ambiguities. Therefore, the purpose of this research is to investigate and evaluate effective ways of extracting relevant and meaningful features in order to increase classification accuracy and improve the performance of web searches. To this end, we conduct several classification experiments to evaluate and compare the effectiveness of feature extraction approaches for extracting relevant and informative features from the biomedical literature. For our experiments, we use two different sets of features: a set extracted using the Genia Tagger tool and a set extracted by medical experts from Pusat Perubatan Universiti Kebangsaan Malaysia (PPUKM). The results show that classification using features extracted by medical experts outperforms classification using features extracted by the Genia Tagger tool when a feature selection method is applied.
format	Article
author	Dollah, R. Jafni, T. I. Hashim, H. Othman, M. S. Rasib, A. W.
author_facet	Dollah, R. Jafni, T. I. Hashim, H. Othman, M. S. Rasib, A. W.
author_sort	Dollah, R.
title	Review of feature extraction approaches on biomedical text classification
title_short	Review of feature extraction approaches on biomedical text classification
title_full	Review of feature extraction approaches on biomedical text classification
title_fullStr	Review of feature extraction approaches on biomedical text classification
title_full_unstemmed	Review of feature extraction approaches on biomedical text classification
title_sort	review of feature extraction approaches on biomedical text classification
publisher	Inst Advanced Science Extension
publishDate	2020
url	http://eprints.utm.my/id/eprint/87028/ http://www.dx.doi.org/10.21833/ijaas.2020.04.001
_version_	1683230692621680640
score	13.19449

Similar Items