Tracking dengue on twitter using hybrid filtration-polarity and Apache Flume

The World Health Organization (WHO) describes dengue as a serious illness that affects almost half of the world's population and has no specific treatment. Early and accurate detection of its spread in affected regions can save lives. Despite the severity of the disease, a few noticeable w...

Bibliographic Details
Main Authors: Ghani, Norjihan Binti Abdul; Hamid, Suraya; Ahmad, Muneer; Saadi, Younes; Jhanjhi, N. Z.; Alzain, Mohammed A.; Masud, Mehedi
Format: Article
Published: Tech Science Press 2022
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.um.edu.my/33543/
id my.um.eprints.33543
record_format eprints
spelling my.um.eprints.33543 2022-08-04T02:58:17Z http://eprints.um.edu.my/33543/ Ghani, Norjihan Binti Abdul and Hamid, Suraya and Ahmad, Muneer and Saadi, Younes and Jhanjhi, N. Z. and Alzain, Mohammed A. and Masud, Mehedi (2022) Tracking dengue on twitter using hybrid filtration-polarity and Apache Flume. Computer Systems Science and Engineering, 40 (3). pp. 913-926. ISSN 0267-6192, DOI https://doi.org/10.32604/csse.2022.018467. Tech Science Press 2022. Article. PeerReviewed. QA75 Electronic computers. Computer science
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/
topic QA75 Electronic computers. Computer science
description The World Health Organization (WHO) describes dengue as a serious illness that affects almost half of the world's population and has no specific treatment. Early and accurate detection of its spread in affected regions can save lives. Despite the severity of the disease, only a few notable works apply sentiment analysis to mine accurate insights from social media text streams. In addition, the massive data explosion of recent years has made it difficult to store and process large amounts of data: reliable mechanisms are required to gather the data, and suitable techniques are required to extract meaningful insights from it. This study proposes a sentiment analysis polarity approach for collecting data and extracting relevant information about dengue via Apache Hadoop. The method consists of two main parts: the first collects data from social media using Apache Flume, while the second queries and extracts relevant information via the hybrid filtration-polarity algorithm using Apache Hive. To overcome the noisy and unstructured nature of the data, information extraction is split into pre-filtration and post-filtration phases. As a result, only by integrating Flume and Hive with filtration and polarity analysis can a reliable sentiment analysis technique be offered for collecting and processing large-scale data from the social network. We show how two components of the Apache Hadoop ecosystem, Flume and Hive, can provide a sentiment analysis capability by storing and processing large amounts of data. An important finding of this paper is that sentiment analysis applications for detecting diseases can be developed more reliably with Hadoop ecosystem components than with conventional machines.
format Article
author Ghani, Norjihan Binti Abdul
Hamid, Suraya
Ahmad, Muneer
Saadi, Younes
Jhanjhi, N. Z.
Alzain, Mohammed A.
Masud, Mehedi
author_sort Ghani, Norjihan Binti Abdul
title Tracking dengue on twitter using hybrid filtration-polarity and Apache Flume
publisher Tech Science Press
publishDate 2022
url http://eprints.um.edu.my/33543/
_version_ 1740826041779224576
score 13.160551
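
The description in this record outlines a pipeline in which collected tweets pass through a pre-filtration phase, a polarity analysis, and a post-filtration phase. The record does not give the algorithm itself, so the following is only a minimal Python sketch of that general shape, not the authors' Hive-based implementation. The keyword set, the toy polarity lexicon, the threshold, and all function names are assumptions made for illustration.

```python
import re

# Hypothetical tracking terms and toy polarity lexicon (assumptions, not from the paper).
DENGUE_TERMS = {"dengue", "aedes"}
LEXICON = {
    "fever": -1, "sick": -1, "hospital": -1, "died": -2,
    "recovered": 1, "better": 1, "safe": 1,
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def pre_filter(tweets):
    """Pre-filtration: drop retweets, off-topic tweets, and exact duplicates."""
    seen = set()
    kept = []
    for tweet in tweets:
        if tweet.startswith("RT "):
            continue  # retweets repeat content already counted
        tokens = tokenize(tweet)
        if not DENGUE_TERMS.intersection(tokens):
            continue  # no tracked term, so treat as noise
        key = " ".join(tokens)
        if key in seen:
            continue  # duplicate after normalization
        seen.add(key)
        kept.append(tweet)
    return kept


def polarity(text):
    """Sum the lexicon scores of the tokens; unknown words score 0."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))


def post_filter(scored, threshold=1):
    """Post-filtration: keep only tweets whose polarity magnitude reaches the threshold."""
    return [(tweet, score) for tweet, score in scored if abs(score) >= threshold]


def filtration_polarity(tweets):
    """Run pre-filtration, polarity scoring, and post-filtration in sequence."""
    scored = [(tweet, polarity(tweet)) for tweet in pre_filter(tweets)]
    return post_filter(scored)
```

In the setting the description refers to, the input would be tweets gathered by Apache Flume into Hadoop storage and the filtering and scoring would be expressed as Apache Hive queries; the in-memory list used here only stands in for that data.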