An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents

Event Detection (ED) works on identifying events from various types of data. Building an ED model for news text documents greatly helps decision-makers in various disciplines in improving their strategies. However, identifying and summarizing events from such data is a non-trivial task due to the la...

Bibliographic Details
Main Author: Al-Dyani, Wafa Zubair Abdullah
Format: Thesis
Language: English
Published: 2022
Subjects: QA Mathematics
Online Access: https://etd.uum.edu.my/10228/1/s901775_01.pdf
https://etd.uum.edu.my/10228/2/s901775_02.pdf
https://etd.uum.edu.my/10228/
id my.uum.etd.10228
record_format eprints
spelling my.uum.etd.10228 2023-01-16T03:51:34Z https://etd.uum.edu.my/10228/ 2022 Thesis NonPeerReviewed Al-Dyani, Wafa Zubair Abdullah (2022) An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents. Doctoral thesis, Universiti Utara Malaysia.
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Electronic Theses
url_provider http://etd.uum.edu.my/
language English
topic QA Mathematics
description Event Detection (ED) identifies events in various types of data. Building an ED model for news text documents helps decision-makers in various disciplines improve their strategies. However, identifying and summarizing events from such data is a non-trivial task because of the large volume of published heterogeneous news text documents. Such documents create a high-dimensional feature space that affects the overall performance of the baseline methods in the ED model. To address this problem, this research presents an enhanced ED model that includes improved methods for the crucial phases of the ED model, such as Feature Selection (FS), ED, and summarization. This work focuses on the FS problem by automatically detecting events through a novel wrapper FS method based on the Adapted Binary Bat Algorithm (ABBA) and the Adapted Markov Clustering Algorithm (AMCL), termed ABBA-AMCL. These adaptive techniques were developed to overcome the premature convergence of BBA and the fast convergence rate of MCL. Furthermore, this study proposes four summarization methods to generate informative summaries. The enhanced ED model was tested on 10 benchmark datasets and 2 Facebook news datasets. The effectiveness of ABBA-AMCL was compared with 8 FS methods based on meta-heuristic algorithms and 6 graph-based ED methods. The empirical and statistical results showed that ABBA-AMCL surpassed the other methods on most datasets. The key representative features demonstrated that the ABBA-AMCL method successfully detects real-world events from the Facebook news datasets, with a Precision of 0.96 and a Recall of 1 for dataset 11, and a Precision of 1 and a Recall of 0.76 for dataset 12. To conclude, the novel ABBA-AMCL presented in this research has successfully bridged the research gap and resolved the curse of dimensionality in the high-dimensional feature space of heterogeneous news text documents. Hence, the enhanced ED model can organize news documents into distinct events and provide policymakers with valuable information for decision making.
format Thesis
author Al-Dyani, Wafa Zubair Abdullah
title An enhanced binary bat and Markov clustering algorithms to improve event detection for heterogeneous news text documents
title_sort enhanced binary bat and markov clustering algorithms to improve event detection for heterogeneous news text documents
publishDate 2022
url https://etd.uum.edu.my/10228/1/s901775_01.pdf
https://etd.uum.edu.my/10228/2/s901775_02.pdf
https://etd.uum.edu.my/10228/