Imputing missing values in modelling the PM10 concentrations

Missing values have always been a problem in data analysis. Most analyses exclude the missing values, which may lead to biased parameter estimates. This paper considers several imputation methods: a simulation study is conducted to compare three of them, namely mean subst...
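
The abstract describes a simulation study that scores imputation methods by RMSE. The sketch below illustrates that kind of comparison for two of the named methods, mean substitution and hot deck. It is not the authors' procedure: the lognormal data generator and its parameters, the missing-completely-at-random mechanism, the random-donor form of hot deck, and all function names are assumptions made for illustration. EM imputation is omitted because it needs a multivariate model that the record does not describe.

```python
import numpy as np


def rmse(true_vals, imputed_vals):
    """Root mean squared error between true and imputed values."""
    return float(np.sqrt(np.mean((true_vals - imputed_vals) ** 2)))


def mean_substitution(x):
    """Replace every NaN with the mean of the observed values."""
    filled = x.copy()
    filled[np.isnan(x)] = np.nanmean(x)
    return filled


def hot_deck(x, rng):
    """Replace every NaN with a randomly drawn observed value (donor).

    Assumption: a simple random hot deck; other donor rules exist.
    """
    filled = x.copy()
    missing = np.isnan(x)
    donors = x[~missing]
    filled[missing] = rng.choice(donors, size=int(missing.sum()), replace=True)
    return filled


def compare(n=1000, missing_rate=0.2, seed=0):
    """Simulate data, delete values at random, impute, and score by RMSE."""
    rng = np.random.default_rng(seed)
    # Assumption: lognormal stand-in for concentration data, arbitrary parameters.
    complete = rng.lognormal(mean=3.8, sigma=0.4, size=n)
    # Assumption: values are missing completely at random.
    mask = rng.random(n) < missing_rate
    observed = complete.copy()
    observed[mask] = np.nan
    return {
        "mean": rmse(complete[mask], mean_substitution(observed)[mask]),
        "hot_deck": rmse(complete[mask], hot_deck(observed, rng)[mask]),
    }


if __name__ == "__main__":
    for rate in (0.05, 0.20, 0.40):
        print(rate, compare(missing_rate=rate))
```

RMSE is computed only at the deleted positions, so each score reflects how well a method recovers the values it had to fill in rather than how well it reproduces values that were observed all along.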

Bibliographic Details
Main Authors: Nuradhiathy Abd Razak, Yong Zulina Zubairi, Rossita M. Yunus
Format: Article
Language: English
Published: Universiti Kebangsaan Malaysia 2014
Online Access: http://journalarticle.ukm.my/7824/1/18_Nuradhiathy.pdf
http://journalarticle.ukm.my/7824/
http://www.ukm.my/jsm/
id my-ukm.journal.7824
record_format eprints
spelling my-ukm.journal.7824 2016-12-14T06:45:19Z http://journalarticle.ukm.my/7824/ Universiti Kebangsaan Malaysia 2014-10 Article PeerReviewed application/pdf en http://journalarticle.ukm.my/7824/1/18_Nuradhiathy.pdf Nuradhiathy Abd Razak, Yong Zulina Zubairi and Rossita M. Yunus (2014) Imputing missing values in modelling the PM10 concentrations. Sains Malaysiana, 43 (10). pp. 1599-1607. ISSN 0126-6039 http://www.ukm.my/jsm/
institution Universiti Kebangsaan Malaysia
building Perpustakaan Tun Sri Lanang Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Kebangsaan Malaysia
content_source UKM Journal Article Repository
url_provider http://journalarticle.ukm.my/
language English
description Missing values have always been a problem in data analysis. Most analyses exclude the missing values, which may lead to biased parameter estimates. This paper considers several imputation methods: a simulation study is conducted to compare three of them, namely mean substitution, hot deck and expectation maximization (EM) imputation. The EM imputation is found to be superior, especially when the percentage of missing values is high, as it consistently gives a lower RMSE than the other two methods. The EM imputation method is then applied to the PM10 concentration data sets for the southwest and northeast monsoons in Petaling Jaya and Seberang Perai, Malaysia, which contain missing values. Four distributions, namely the Weibull, lognormal, gamma and Gumbel distributions, are considered to describe the PM10 concentrations. The Weibull distribution gives the best fit for the southwest monsoon data for Petaling Jaya. The lognormal distribution outperforms the others in describing the southwest monsoon data for Seberang Perai. Meanwhile, for the northeast monsoon in both locations, the gamma distribution best describes the data.
format Article
author Nuradhiathy Abd Razak
Yong Zulina Zubairi
Rossita M. Yunus
title Imputing missing values in modelling the PM10 concentrations
publisher Universiti Kebangsaan Malaysia
publishDate 2014
url http://journalarticle.ukm.my/7824/1/18_Nuradhiathy.pdf
http://journalarticle.ukm.my/7824/
http://www.ukm.my/jsm/