An enhanced Malay named entity recognition using clustering and classification approach for crime textual data analysis

Named Entity Recognition (NER) is one of the tasks in information extraction. NER extracts and classifies words or entities that belong to the proper noun category in text data, such as person names, locations, organizations and dates. As seen in today's ge...

Bibliographic Details
Main Author: Salleh, Muhammad Sharilazlan
Format: Thesis
Language: English
Published: 2018
Subjects: Q Science (General); QA76 Computer software
Online Access: http://eprints.utem.edu.my/id/eprint/23326/1/An%20Enhanced%20Malay%20Named%20Entity%20Recognition%20Using%20Clustering%20and%20Classification%20Approach%20For%20Crime%20Textual%20Data%20Analysis.pdf
http://eprints.utem.edu.my/id/eprint/23326/2/An%20enhanced%20malay%20named%20entity%20recognition%20using%20clustering%20and%20classification%20approach%20for%20crime%20textual%20data%20analysis.pdf
http://eprints.utem.edu.my/id/eprint/23326/
https://plh.utem.edu.my/cgi-bin/koha/opac-detail.pl?biblionumber=112736
id my.utem.eprints.23326
record_format eprints
spelling my.utem.eprints.23326 2022-04-20T12:25:25Z Salleh, Muhammad Sharilazlan (2018) An enhanced Malay named entity recognition using clustering and classification approach for crime textual data analysis. Masters thesis, Universiti Teknikal Malaysia Melaka. NonPeerReviewed.
institution Universiti Teknikal Malaysia Melaka
building UTEM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknikal Malaysia Melaka
content_source UTEM Institutional Repository
url_provider http://eprints.utem.edu.my/
language English
topic Q Science (General)
QA76 Computer software
description Named Entity Recognition (NER) is one of the tasks in information extraction. NER extracts and classifies words or entities that belong to the proper noun category in text data, such as person names, locations, organizations and dates. Today, social media and other online sources such as web pages, blogs, Facebook, Twitter, Instagram and online newspapers are among the major contributors to information extraction. These resources contain various types of unstructured data, such as text. However, little work has been done to process this type of data for Malay Named Entity Recognition (MNER). The lack of Malay text analytics has led to difficulties in extracting information for decision making. This research presents an MNER technique that focuses on crime data analysis in the Malay language, using text extracted from the Polis Diraja Malaysia (PDRM) news web page. The proposed technique uses multiple stages of clustering and classification, namely Fuzzy C-Means and the K-Nearest Neighbors algorithm. The methods involve multi-layer feature extraction to recognize entities such as person name, location, organization, date and crime type. The multi-staged technique obtained 95.24% accuracy in recognizing named entities for text analysis, particularly in Malay. The proposed technique can improve the accuracy of named entity recognition on crime data, based on the selection of features suitable for the Malay language.