A comparative analysis of missing data imputation techniques on sedimentation data

Bibliographic Details
Main Authors: Loh, Wing Son, Lloyd, Ling, Chin, Ren Jie, Lai, Sai Hin, Loo, Kar Kuan, Seah, Choon Sen
Format: Article
Language: English
Published: Elsevier Ltd. 2024
Online Access: http://ir.unimas.my/id/eprint/44864/2/A%20comparative%20analysi.pdf
http://ir.unimas.my/id/eprint/44864/
https://www.sciencedirect.com/science/article/pii/S2090447924000923#:~:text=A%20comparative%20analysis%20on%20the,imputation%20(SI)%20and%20multiple%20imputation
https://doi.org/10.1016/j.asej.2024.102717
Description
Summary: Sediment data pertains to various hydrological variables with complex sediment hydrodynamics, such as sedimentation rates, which are often incompletely presented. Thus, the availability of sedimentation data is of utmost necessity for data accessibility. A comparative analysis of the imputation performance for missing fine sediment data was made based on four different techniques, namely the k-Nearest Neighbourhood (k-NN), Support Vector Regression (SVR), Multiple Regression (MR), and Artificial Neural Network (ANN), under the single imputation (SI) and multiple imputation (MI) regimes. Across different missing data proportions (10%-50%), the ANN demonstrated optimal results, with consistent performance metrics recorded over both the SI and MI regimes. For the highest missing data proportion (50%), the ANN presented the best imputation performance, with a reported root mean squared error (RMSE) of 0.000882, mean absolute error (MAE) of 0.000595, coefficient of determination (R²) of 71%, and Kling-Gupta Efficiency (KGE) of 72%. The imputation performance ranking, from best to worst, is as follows: ANN, SVR, MR, and k-NN.
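
The four performance metrics named in the summary can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions, not the authors' code: the function names are hypothetical, R² is assumed to be 1 - SS_res/SS_tot (the article may instead use the squared correlation coefficient), and KGE is assumed to follow the usual form 1 - sqrt((r-1)² + (α-1)² + (β-1)²), where r is the Pearson correlation, α the ratio of standard deviations, and β the ratio of means. The sample values are invented for illustration only and are not taken from the article.

```python
import math

def rmse(obs, sim):
    """Root mean squared error between observed and imputed values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error between observed and imputed values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def r_squared(obs, sim):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def kge(obs, sim):
    """Kling-Gupta Efficiency, assumed here in its common three-component form."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # Pearson correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Invented values: "observed" entries that were masked, and their imputed estimates.
observed = [1.0, 2.0, 3.0]
imputed = [2.0, 3.0, 4.0]

print(rmse(observed, imputed))
print(mae(observed, imputed))
print(r_squared(observed, imputed))
print(kge(observed, imputed))
```

In an imputation study of this kind, such metrics would typically be evaluated only on the entries that were deliberately masked, so that the imputed values can be compared against known ground truth.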