Imputation Analysis of Time-Series Data Using a Random Forest Algorithm
Missing data poses a significant challenge in extensive datasets, particularly those containing time-series information, leading to potential inaccuracies in data analysis and machine learning model development. To address the issue, this paper compared and evaluated four imputation methods: MissFor...
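Main Authors: | Nur Najmiyah, Jaafar; Muhammad Nur Ajmal, Rosdi; Khairur Rijal, Jamaludin; Faizir, Ramlie; Habibah, Abdul Talib |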
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | Springer Singapore 2024 |
Subjects: | TS Manufactures |
Online Access: | http://umpir.ump.edu.my/id/eprint/41147/1/Imputation%20Analysis%20of%20Time-Series%20Data.pdf http://umpir.ump.edu.my/id/eprint/41147/2/Imputation%20Analysis%20of%20Time-Series%20Data%20Using%20a%20Random%20Forest%20Algorithm.pdf http://umpir.ump.edu.my/id/eprint/41147/ https://doi.org/10.1007/978-981-99-8819-8_4 |
id |
my.ump.umpir.41147 |
---|---|
record_format |
eprints |
spelling |
my.ump.umpir.41147 2024-05-16T04:24:57Z http://umpir.ump.edu.my/id/eprint/41147/
Imputation Analysis of Time-Series Data Using a Random Forest Algorithm
Nur Najmiyah, Jaafar; Muhammad Nur Ajmal, Rosdi; Khairur Rijal, Jamaludin; Faizir, Ramlie; Habibah, Abdul Talib
TS Manufactures
Missing data poses a significant challenge in large datasets, particularly those containing time-series information, and can lead to inaccuracies in data analysis and machine learning model development. To address the issue, this paper compares and evaluates four imputation methods: MissForest, MICE, Simplefill, and Softimpute, which utilized the Random Forest algorithm. The research examines the impact of missing ratios and temporal variations on the performance of the imputation methods. The results indicate that MissForest consistently outperformed the other methods, exhibiting the lowest RMSE values and a high coefficient of determination (R²), which reflects its accuracy and its ability to explain the variation in the data. Furthermore, graphical analyses demonstrated the stability of MissForest over time, while MICE and Simplefill showed higher sensitivity to date changes. Softimpute was relatively consistent but performed slightly worse than MissForest. Overall, this study highlights the effectiveness of MissForest as the preferred imputation method for AVL time-series data.
Springer Singapore 2024 Conference or Workshop Item PeerReviewed
pdf en http://umpir.ump.edu.my/id/eprint/41147/1/Imputation%20Analysis%20of%20Time-Series%20Data.pdf
pdf en http://umpir.ump.edu.my/id/eprint/41147/2/Imputation%20Analysis%20of%20Time-Series%20Data%20Using%20a%20Random%20Forest%20Algorithm.pdf
Nur Najmiyah, Jaafar and Muhammad Nur Ajmal, Rosdi and Khairur Rijal, Jamaludin and Faizir, Ramlie and Habibah, Abdul Talib (2024) Imputation Analysis of Time-Series Data Using a Random Forest Algorithm. In: Intelligent Manufacturing and Mechatronics, Lecture Notes in Networks and Systems. 4th International conference on Innovative Manufacturing, Mechatronics and Materials Forum, iM3F2023, 07 – 08 August 2023, Pekan, Malaysia. pp. 51-60., 850. ISSN 2367-3389 ISBN 978-981-99-8819-8 https://doi.org/10.1007/978-981-99-8819-8_4 |
institution |
Universiti Malaysia Pahang Al-Sultan Abdullah |
building |
UMPSA Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Pahang Al-Sultan Abdullah |
content_source |
UMPSA Institutional Repository |
url_provider |
http://umpir.ump.edu.my/ |
language |
English |
topic |
TS Manufactures |
description |
Missing data poses a significant challenge in large datasets, particularly those containing time-series information, and can lead to inaccuracies in data analysis and machine learning model development. To address the issue, this paper compares and evaluates four imputation methods: MissForest, MICE, Simplefill, and Softimpute, which utilized the Random Forest algorithm. The research examines the impact of missing ratios and temporal variations on the performance of the imputation methods. The results indicate that MissForest consistently outperformed the other methods, exhibiting the lowest RMSE values and a high coefficient of determination (R²), which reflects its accuracy and its ability to explain the variation in the data. Furthermore, graphical analyses demonstrated the stability of MissForest over time, while MICE and Simplefill showed higher sensitivity to date changes. Softimpute was relatively consistent but performed slightly worse than MissForest. Overall, this study highlights the effectiveness of MissForest as the preferred imputation method for AVL time-series data. |
format |
Conference or Workshop Item |
author |
Nur Najmiyah, Jaafar; Muhammad Nur Ajmal, Rosdi; Khairur Rijal, Jamaludin; Faizir, Ramlie; Habibah, Abdul Talib |
author_sort |
Nur Najmiyah, Jaafar |
title |
Imputation Analysis of Time-Series Data Using a Random Forest Algorithm |
publisher |
Springer Singapore |
publishDate |
2024 |
url |
http://umpir.ump.edu.my/id/eprint/41147/1/Imputation%20Analysis%20of%20Time-Series%20Data.pdf http://umpir.ump.edu.my/id/eprint/41147/2/Imputation%20Analysis%20of%20Time-Series%20Data%20Using%20a%20Random%20Forest%20Algorithm.pdf http://umpir.ump.edu.my/id/eprint/41147/ https://doi.org/10.1007/978-981-99-8819-8_4 |
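The description field above reports that imputation methods were compared by RMSE and R² across different missing ratios. A minimal sketch of that kind of evaluation loop is given here for illustration only. It is not the authors' code: the synthetic series, the function names, and the two imputers (`mean_fill` and `linear_fill`) are hypothetical stand-ins, and neither is MissForest, MICE, Simplefill, or Softimpute. The idea is to hide known values at a chosen ratio, impute them, and score the imputed values against the hidden truth.

```python
import math
import random


def rmse(truth, pred):
    """Root mean squared error over paired values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


def r2(truth, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, pred))
    ss_tot = sum((t - mean_t) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


def mask_values(series, ratio, rng):
    """Hide a fraction `ratio` of interior points; endpoints stay observed."""
    n_hidden = max(1, int(ratio * len(series)))
    hidden = sorted(rng.sample(range(1, len(series) - 1), n_hidden))
    masked = list(series)
    for i in hidden:
        masked[i] = None
    return masked, hidden


def mean_fill(masked):
    """Stand-in imputer 1: replace every gap with the mean of observed values."""
    observed = [v for v in masked if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in masked]


def linear_fill(masked):
    """Stand-in imputer 2: interpolate between the nearest observed neighbours."""
    filled = list(masked)
    for i, v in enumerate(masked):
        if v is None:
            lo = i - 1
            while masked[lo] is None:
                lo -= 1
            hi = i + 1
            while masked[hi] is None:
                hi += 1
            w = (i - lo) / (hi - lo)
            filled[i] = masked[lo] + w * (masked[hi] - masked[lo])
    return filled


def evaluate(series, ratios, imputers, seed=0):
    """Score each imputer at each missing ratio on the hidden positions only."""
    rng = random.Random(seed)
    results = {}
    for ratio in ratios:
        masked, hidden = mask_values(series, ratio, rng)
        truth = [series[i] for i in hidden]
        for name, impute in imputers.items():
            filled = impute(masked)
            pred = [filled[i] for i in hidden]
            results[(name, ratio)] = (rmse(truth, pred), r2(truth, pred))
    return results


# Synthetic signal (hypothetical; not the AVL data described in the record).
series = [math.sin(2 * math.pi * t / 50) + 0.01 * t for t in range(200)]
imputers = {"mean_fill": mean_fill, "linear_fill": linear_fill}
results = evaluate(series, [0.1, 0.3, 0.5], imputers)
for (name, ratio), (err, score) in sorted(results.items()):
    print(f"{name:12s} missing={ratio:.0%} RMSE={err:.4f} R2={score:.4f}")
```

Scoring only the hidden positions keeps the metrics from being inflated by values that were never missing. A real comparison would replace the stand-in imputers with the methods named in the record and use the actual dataset.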