Staff View: Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia

Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia

Southeast Asia (SEA) is a hotspot region for atmospheric pollution and haze conditions, due to extensive forest, agricultural and peat fires. This study aims to estimate the PM2.5 concentrations across Malaysia using machine-learning (ML) models like Random Forest (RF) and Support Vector Regression...

Bibliographic Details
Main Authors:	Kamarul Zaman, N. A. F., Kanniah, K. D., Kaskaoutis, D. G., Latif, M. T.
Format:	Article
Language:	English
Published:	MDPI AG 2021
Subjects:	TH Building construction
Online Access:	http://eprints.utm.my/id/eprint/95447/1/NurulAmalinFatihah2021_EvaluationofMachineLearningModels.pdf http://eprints.utm.my/id/eprint/95447/ http://dx.doi.org/10.3390/app11167326

id	my.utm.95447
record_format	eprints
spelling	my.utm.95447 2022-05-31T12:44:50Z http://eprints.utm.my/id/eprint/95447/ Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia Kamarul Zaman, N. A. F. Kanniah, K. D. Kaskaoutis, D. G. Latif, M. T. TH Building construction Southeast Asia (SEA) is a hotspot region for atmospheric pollution and haze conditions, due to extensive forest, agricultural and peat fires. This study aims to estimate the PM2.5 concentrations across Malaysia using machine-learning (ML) models like Random Forest (RF) and Support Vector Regression (SVR), based on satellite AOD (aerosol optical depth) observations, ground-measured air pollutants (NO2, SO2, CO, O3) and meteorological parameters (air temperature, relative humidity, wind speed and direction). The estimated PM2.5 concentrations for a two-year period (2018–2019) are evaluated against measurements performed at 65 air-quality monitoring stations located at urban, industrial, suburban and rural sites. PM2.5 concentrations varied widely between the stations, with higher values (mean of 24.2 ± 21.6 µg m⁻³) at urban/industrial stations and lower (mean of 21.3 ± 18.4 µg m⁻³) at suburban/rural sites. Furthermore, pronounced seasonal variability in PM2.5 is recorded across Malaysia, with highest concentrations during the dry season (June–September). Seven models were developed for PM2.5 predictions, i.e., separately for urban/industrial and suburban/rural sites, for the four dominant seasons (dry, wet and two inter-monsoon), and an overall model, which displayed accuracies in the order of R² = 0.46–0.76. The validation analysis reveals that the RF model (R² = 0.53–0.76) exhibits slightly better performance than SVR, except for the overall model. This is the first study conducted in Malaysia for PM2.5 estimations at a national scale combining satellite aerosol retrievals with ground-based pollutants, meteorological factors and ML techniques. The satisfactory prediction of PM2.5 concentrations across Malaysia allows a continuous monitoring of the pollution levels at remote areas with absence of measurement networks. MDPI AG 2021-07 Article PeerReviewed application/pdf en http://eprints.utm.my/id/eprint/95447/1/NurulAmalinFatihah2021_EvaluationofMachineLearningModels.pdf Kamarul Zaman, N. A. F. and Kanniah, K. D. and Kaskaoutis, D. G. and Latif, M. T. (2021) Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia. Applied Sciences (Switzerland), 11 (16). ISSN 2076-3417 http://dx.doi.org/10.3390/app11167326 DOI: 10.3390/app11167326
institution	Universiti Teknologi Malaysia
building	UTM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Malaysia
content_source	UTM Institutional Repository
url_provider	http://eprints.utm.my/
language	English
topic	TH Building construction
spellingShingle	TH Building construction Kamarul Zaman, N. A. F. Kanniah, K. D. Kaskaoutis, D. G. Latif, M. T. Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia
description	Southeast Asia (SEA) is a hotspot region for atmospheric pollution and haze conditions, due to extensive forest, agricultural and peat fires. This study aims to estimate the PM2.5 concentrations across Malaysia using machine-learning (ML) models like Random Forest (RF) and Support Vector Regression (SVR), based on satellite AOD (aerosol optical depth) observations, ground-measured air pollutants (NO2, SO2, CO, O3) and meteorological parameters (air temperature, relative humidity, wind speed and direction). The estimated PM2.5 concentrations for a two-year period (2018–2019) are evaluated against measurements performed at 65 air-quality monitoring stations located at urban, industrial, suburban and rural sites. PM2.5 concentrations varied widely between the stations, with higher values (mean of 24.2 ± 21.6 µg m⁻³) at urban/industrial stations and lower (mean of 21.3 ± 18.4 µg m⁻³) at suburban/rural sites. Furthermore, pronounced seasonal variability in PM2.5 is recorded across Malaysia, with highest concentrations during the dry season (June–September). Seven models were developed for PM2.5 predictions, i.e., separately for urban/industrial and suburban/rural sites, for the four dominant seasons (dry, wet and two inter-monsoon), and an overall model, which displayed accuracies in the order of R² = 0.46–0.76. The validation analysis reveals that the RF model (R² = 0.53–0.76) exhibits slightly better performance than SVR, except for the overall model. This is the first study conducted in Malaysia for PM2.5 estimations at a national scale combining satellite aerosol retrievals with ground-based pollutants, meteorological factors and ML techniques. The satisfactory prediction of PM2.5 concentrations across Malaysia allows a continuous monitoring of the pollution levels at remote areas with absence of measurement networks.
format	Article
author	Kamarul Zaman, N. A. F. Kanniah, K. D. Kaskaoutis, D. G. Latif, M. T.
author_facet	Kamarul Zaman, N. A. F. Kanniah, K. D. Kaskaoutis, D. G. Latif, M. T.
author_sort	Kamarul Zaman, N. A. F.
title	Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia
title_short	Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia
title_full	Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia
title_fullStr	Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia
title_full_unstemmed	Evaluation of machine learning models for estimating pm2.5 concentrations across Malaysia
title_sort	evaluation of machine learning models for estimating pm2.5 concentrations across malaysia
publisher	MDPI AG
publishDate	2021
url	http://eprints.utm.my/id/eprint/95447/1/NurulAmalinFatihah2021_EvaluationofMachineLearningModels.pdf http://eprints.utm.my/id/eprint/95447/ http://dx.doi.org/10.3390/app11167326
_version_	1735386805236137984
score	13.18916
