Analysing machine learning models to detect disaster events using social media

Bibliographic Details
Main Author: Faris Azni Azlan, Mr.
Format: Thesis
Language: English
Published: 2023
Description
Summary: Disasters are instabilities that occur at the interface between society and the environment. During disasters, people communicate to share information and to request support for themselves or their community. Social media serves as a medium for this communication because of its wide reach and global audience. Messages posted during a disaster may concern similar or different types of emergencies in the same general location. Interpreting and validating these messages while a disaster is unfolding costs significant time and incurs losses. Therefore, this study compares three algorithms, K-Nearest Neighbour (KNN), Naive Bayes (NB), and Support Vector Machine (SVM), for classifying and sorting messages so that the process of examining them can be simplified and accelerated. To simulate the examination process further, a fuzzy algorithm is developed to automatically rate the severity of the disaster described in each message in a disaster environment. The results are gauged using four statistical metrics (accuracy, precision, recall and F1-score) and a time constraint. For accuracy, KNN and SVM tied with a score of 0.79 (79%). The same trend holds for precision, recall and F1-score, where KNN and SVM perform equally. In the timing results, KNN produces output faster than SVM but slower than NB. Despite being the fastest of the three at producing output, NB has the lowest scores on the statistical metrics. The study finds KNN to be the most suitable algorithm for sorting messages in a disaster situation, as it offers the best combination of accuracy and speed among the three models.
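
The four metrics and the timing measurement named in the summary can be illustrated with a minimal Python sketch. It is not the thesis's code: the binary labelling (1 = disaster-related, 0 = not), the helper names binary_metrics and timed_predict, the sample messages, and the keyword-based stand-in classifier are all assumptions made for illustration. In the study itself, the classifier would be a trained KNN, NB, or SVM model.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = disaster-related)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def timed_predict(predict: Callable[[str], int],
                  messages: Sequence[str]) -> Tuple[List[int], float]:
    """Run a classifier over all messages and return (predictions, elapsed seconds)."""
    start = time.perf_counter()
    predictions = [predict(m) for m in messages]
    return predictions, time.perf_counter() - start


def keyword_classifier(message: str) -> int:
    """Hypothetical stand-in for a trained model: flags a few disaster keywords."""
    keywords = ("flood", "earthquake", "fire")
    return int(any(word in message.lower() for word in keywords))


# Made-up sample messages with hand-assigned labels (illustration only).
messages = [
    "Flood waters rising near the bridge, need help",
    "Great concert last night",
    "My mixtape is fire",
    "Buildings collapsed, people trapped",
    "Earthquake felt downtown",
]
labels = [1, 0, 0, 1, 1]

predictions, seconds = timed_predict(keyword_classifier, messages)
scores = binary_metrics(labels, predictions)
print(scores, f"elapsed: {seconds:.6f}s")
```

Running each candidate model through the same two helpers would give one score dictionary and one elapsed time per model, which is the kind of accuracy-versus-speed comparison the summary describes.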