A review of the current publication trends on missing data imputation over three decades: direction and future research

Bibliographic Details
Main Authors: Adnan, Farah Adibah; Jamaludin, Khairur Rijal; Wan Muhamad, Wan Zuki Azman; Miskon, Suraya
Format: Article
Published: Springer Science and Business Media Deutschland GmbH 2022
Online Access: http://eprints.utm.my/103382/
http://dx.doi.org/10.1007/s00521-022-07702-7
Description
Summary: Studies on missing data have increased over the past few decades. Missing data is an uncontrollable phenomenon that can occur during data collection in practically any research field. Numerous missing data imputation techniques are well documented in the literature. However, very few studies have systematically examined how a specific area has evolved while offering insight into the emerging imputation methods in that field. The primary objective of this paper is to provide a comprehensive review of studies concerning missing data imputation methods in classification problems from several viewpoints: (a) publication trends (by year, subject area, country, document language, and author), (b) keyword analysis, (c) the most cited documents, and (d) the most influential authors. A bibliometric analysis was conducted using VOSviewer and Harzing's Publish or Perish software, covering 430 journal articles indexed in Scopus from 1991 to June 2021. One finding reveals an emerging trend toward missing data imputation methods based on random forest and nearest neighbor. Overall, this review is a resource for gaining an at-a-glance overview of the available imputation techniques.