Exploration of COVID‑19 data in Malaysia through mapper graph
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Springer Nature, 2024 |
Subjects: | |
Online Access: | http://ir.unimas.my/id/eprint/45351/3/Exploration%20of%20COVID%E2%80%9119%20data%20-%20Copy.pdf http://ir.unimas.my/id/eprint/45351/ https://link.springer.com/article/10.1007/s13721-024-00472-3 https://doi.org/10.1007/s13721-024-00472-3 |
Summary: | Huge amounts of data have been collected from various sources during the COVID-19 pandemic, providing a unique opportunity for analysis, data-driven modelling, and machine learning to understand the complexity of COVID-19 more effectively and to support informed decisions. To keep pace with the expanding quantity and complexity of data while employing minimal assumptions, a topological data analysis tool known as the Mapper algorithm is used to explore Malaysia’s daily confirmed cases, deaths, and vaccination data from the onset of the pandemic to June 2022 via data visualization and clustering. Support vector-based feature selection and a heuristic approach for fine-tuning the algorithm's internal parameters are applied. Two anomalous groups of nodes with exceptionally high case numbers emerged in the Mapper graphs for daily data, one for the Delta-dominant period and one for the Omicron-dominant period. Selangor's cumulative cases were found to be numerically dissimilar from those of other states from August 2021 onwards. The evolution of the Mapper graphs revealed unique early COVID-19 progression in Johor, Negeri Sembilan, and Kuala Lumpur in the first half of 2020, followed by a significant increase in confirmed cases in Sabah in September 2020. Clusters identified by the Mapper algorithm are comparable with those obtained from principal component analysis and hierarchical clustering. However, hierarchical clustering does not subdivide the Selangor data into two to three separate clusters as the Mapper algorithm does. This research provides valuable insights for comprehending the pandemic timeline in Malaysia via the Mapper algorithm, which serves as a highly compact data visualization technique. |
---|
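
For readers unfamiliar with the Mapper algorithm named in the summary, the following is a minimal sketch of its general idea, not the authors' implementation. It assumes a user-supplied lens (filter) function, a cover of the lens range by equal-length overlapping intervals, and single-linkage clustering with a distance threshold inside each interval. The names `interval_cover`, `single_linkage`, `mapper_graph`, and the parameters `n_intervals`, `overlap`, and `eps` are hypothetical choices made for illustration; the article's support vector-based feature selection and parameter-tuning heuristic are not reproduced here.

```python
from itertools import combinations
from math import dist


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; neighbours overlap by
    the given fraction of their length (0 <= overlap < 1)."""
    # Interval length L solves: L + (n - 1) * L * (1 - overlap) = hi - lo.
    length = (hi - lo) / (1 + (n_intervals - 1) * (1 - overlap))
    step = length * (1 - overlap)
    cover = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    # Pin the final endpoint to hi so rounding cannot exclude the maximum.
    cover[-1] = (cover[-1][0], hi)
    return cover


def single_linkage(points, indices, eps):
    """Split indices into connected components of the graph that joins
    two points whenever their Euclidean distance is at most eps."""
    parent = {i: i for i in indices}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for a, b in combinations(indices, 2):
        if dist(points[a], points[b]) <= eps:
            parent[find(a)] = find(b)

    clusters = {}
    for i in indices:
        clusters.setdefault(find(i), set()).add(i)
    return [frozenset(c) for c in clusters.values()]


def mapper_graph(points, lens, n_intervals=4, overlap=0.3, eps=1.0):
    """Return (nodes, edges): nodes are sets of point indices, and an edge
    joins two nodes that share at least one point."""
    values = [lens(p) for p in points]
    nodes = []
    for a, b in interval_cover(min(values), max(values), n_intervals, overlap):
        members = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(single_linkage(points, members, eps))
    edges = {
        (i, j)
        for i, j in combinations(range(len(nodes)), 2)
        if nodes[i] & nodes[j]
    }
    return nodes, edges


# Toy usage: ten evenly spaced points on a line, with the first
# coordinate as the lens. The result is three nodes joined in a chain.
toy_points = [(float(x), 0.0) for x in range(10)]
toy_nodes, toy_edges = mapper_graph(
    toy_points, lens=lambda p: p[0], n_intervals=3, overlap=0.5, eps=1.0
)
```

In a graph built this way, a group of records that is far from the rest in feature space tends to appear as its own node or component, which is the kind of structure the summary describes for the Delta and Omicron periods and for Selangor. Real analyses would typically rely on an established library rather than this toy version.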