Evolving fuzzy grammar for crime texts categorization

Bibliographic Details
Main Authors: Mohd Sharef, Nurfadhlina, Martin, Trevor
Format: Article
Language: English
Published: Elsevier 2015
Online Access: http://psasir.upm.edu.my/id/eprint/44705/1/FUZZY.pdf
http://psasir.upm.edu.my/id/eprint/44705/
https://www.sciencedirect.com/science/article/pii/S1568494614006000
id my.upm.eprints.44705
record_format eprints
spelling my.upm.eprints.44705 2021-04-20T02:55:36Z Mohd Sharef, Nurfadhlina and Martin, Trevor (2015) Evolving fuzzy grammar for crime texts categorization. Applied Soft Computing, 28. pp. 175-187. ISSN 1568-4946. Elsevier, 2015-03. Article, peer reviewed, en. doi:10.1016/j.asoc.2014.11.038
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description Text mining refers to the activity of identifying useful information from natural language text. This is one of the tasks carried out in automated text categorization. Machine learning (ML) based methods are a popular solution for this problem. However, the developed models typically offer low expressivity and lack a human-understandable representation. Although highly efficient, ML-based methods are built in a train–test setting; when the existing model is found insufficient, the whole process must be repeated (train, test, retrain), which is typically time consuming. Furthermore, retraining the model is usually not a practical or feasible option when there is continuous change. This paper introduces the evolving fuzzy grammar (EFG) method for crime texts categorization. In this method, the learning model is built from a set of selected text fragments, which are then transformed into their underlying structures, called fuzzy grammars. The fuzzy notion is used because grammar matching, parsing and derivation involve uncertainty. A fuzzy union operator is also used to combine and transform individual text fragment grammars into more general representations of the learned text fragments. The set of learned fuzzy grammars evolves with the patterns seen; the learned model is adapted incrementally through small changes, without the conventional redevelopment. The performance of EFG in crime texts categorization is evaluated against expert-tagged summaries of real incidents and compared against C4.5, support vector machines, naïve Bayes, boosting, and k-nearest neighbour methods. Results show that the EFG algorithm achieves performance close to that of the other ML methods while being highly interpretable, easily integrated into a more comprehensive grammar system, and requiring less time for model retraining and adaptation.