A comparative study of evolving fuzzy grammar and machine learning techniques for text categorization

Several methods have been studied for text categorization, and most draw on the statistical distribution of features in the texts, as Machine Learning (ML) methods do. However, no available work compares the performance of ML-based methods against the t...

Bibliographic Details
Main Authors: Mohd Sharef, Nurfadhlina, Martin, Trevor, Kasmiran, Khairul Azhar, Mustapha, Aida, Sulaiman, Md. Nasir, Azmi Murad, Masrah Azrifah
Format: Article
Language: English
Published: Springer-Verlag Berlin Heidelberg 2015
Online Access: http://psasir.upm.edu.my/id/eprint/43473/1/abstract00.pdf
http://psasir.upm.edu.my/id/eprint/43473/
id my.upm.eprints.43473
record_format eprints
spelling my.upm.eprints.43473 2016-06-28T03:46:14Z http://psasir.upm.edu.my/id/eprint/43473/ Article PeerReviewed application/pdf en
Mohd Sharef, Nurfadhlina and Martin, Trevor and Kasmiran, Khairul Azhar and Mustapha, Aida and Sulaiman, Md. Nasir and Azmi Murad, Masrah Azrifah (2015) A comparative study of evolving fuzzy grammar and machine learning techniques for text categorization. Soft Computing, 19 (6). pp. 1701-1714. ISSN 1432-7643; ESSN: 1433-7479. DOI: 10.1007/s00500-014-1358-x
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description Several methods have been studied for text categorization, and most draw on the statistical distribution of features in the texts, as Machine Learning (ML) methods do. However, no available work compares the performance of ML-based methods against a text expression-based method, especially for incident and medical case categorization. Meanwhile, these two domains are becoming ever more popular, owing to a growing interest in automation in security intelligence and health services. This paper presents a text expression-based method called Evolving Fuzzy Grammar (EFG) and evaluates its performance against the conventional ML methods of Naïve Bayes, support vector machine, k-nearest neighbour, adaptive boosting, and decision tree. The incident dataset is a real dataset taken from the World Incidents Tracking System, while ImageCLEF 2009 was the source for the radiology case reports. The results, measured with the standard evaluation metrics of recall, precision, and F-measure, showed that each method has different strengths and weaknesses in the two categorization tasks. In both domains, the SMO (support vector machine) and IBk (k-nearest neighbour) methods were the best, while AdaBoost was the worst. The medical dataset was also easier to categorize than the incident dataset. Although EFG was ranked second lowest, it obtained the highest precision in bombing categorization and the highest recall in armed attack categorization, and on average ranked in the top three for medical case categorization. The text expression-based representation used in EFG was also the most verbose and expressive when compared with the ML methods. This indicates that EFG is a viable method for text categorization and may serve as an alternative approach to such a task.
format Article
author Mohd Sharef, Nurfadhlina
Martin, Trevor
Kasmiran, Khairul Azhar
Mustapha, Aida
Sulaiman, Md. Nasir
Azmi Murad, Masrah Azrifah
title A comparative study of evolving fuzzy grammar and machine learning techniques for text categorization
publisher Springer-Verlag Berlin Heidelberg
publishDate 2015
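
The description in this record reports per-category recall, precision, and F-measure. As a minimal sketch of how such scores are usually computed, the Python below derives them from a list of gold labels and a list of predicted labels. The function name `per_category_scores` and the sample labels are hypothetical illustrations, not taken from the paper, and the paper's own evaluation setup may differ.

```python
from collections import Counter


def per_category_scores(gold, predicted):
    """Return {category: (precision, recall, f_measure)} for two label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct assignment to category g
        else:
            fp[p] += 1   # p was assigned but is wrong
            fn[g] += 1   # g was the true category but was missed

    scores = {}
    for cat in sorted(set(gold) | set(predicted)):
        assigned = tp[cat] + fp[cat]   # items labelled cat by the classifier
        actual = tp[cat] + fn[cat]     # items whose true label is cat
        precision = tp[cat] / assigned if assigned else 0.0
        recall = tp[cat] / actual if actual else 0.0
        total = precision + recall
        # F-measure here is the balanced F1: harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[cat] = (precision, recall, f_measure)
    return scores


# Hypothetical labels, for illustration only (not data from the paper).
gold = ["bombing", "bombing", "armed attack", "armed attack", "bombing"]
predicted = ["bombing", "armed attack", "armed attack", "armed attack", "bombing"]
scores = per_category_scores(gold, predicted)
for category, (p, r, f) in scores.items():
    print(f"{category}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

A category can score high on one measure and lower on another, which is how a method ranked low overall can still lead on precision or recall for a single category.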