Forecasting Using K-means Clustering and RNN Methods with PCA Feature Selection

Bibliographic Details
Main Authors: Ferna, Marestiani; Sugiyarto, Surono
Format: Article
Language: English
Published: INTI International University 2022
Subjects: QA75 Electronic computers. Computer science; QC Physics; T Technology (General)
Online Access: http://eprints.intimal.edu.my/1632/1/jods2022_04.pdf
http://eprints.intimal.edu.my/1632/
http://ipublishing.intimal.edu.my/jods.html
id my-inti-eprints.1632
record_format eprints
spelling my-inti-eprints.1632
2024-05-07T09:37:17Z
http://eprints.intimal.edu.my/1632/
INTI International University 2022-06 Article PeerReviewed text en cc_by_4
Ferna, Marestiani and Sugiyarto, Surono (2022) Forecasting Using K-means Clustering and RNN Methods with PCA Feature Selection. Journal of Data Science, 2022 (04). pp. 1-14. ISSN 2805-5160
institution INTI International University
building INTI Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider INTI International University
content_source INTI Institutional Repository
url_provider http://eprints.intimal.edu.my
language English
topic QA75 Electronic computers. Computer science
QC Physics
T Technology (General)
description Artificial neural networks are computing systems inspired by the human nervous system, and the field continues to grow rapidly. Like the nervous system, an artificial neural network learns from existing data in order to produce outputs for new data. The Recurrent Neural Network (RNN) is one of the most popular models in use today, especially for forecasting. In simple terms, forecasting with an RNN consists of splitting the data into training and test sets, a forward pass, a backward pass, an optimization step, and an evaluation of the forecasting model. The main obstacle of the RNN method is the vanishing gradient, which can cause poor forecasting results. In this study, the authors propose Principal Component Analysis (PCA) as a dimension reduction method to obtain the most influential variables, which become the inputs to the prediction model, in order to minimize error. The authors also use K-means clustering to group data with similar trend variations. To improve the clustering, similarity is computed using Euclidean distance. To build optimal predictions, the time series variables with the most influence are first selected using PCA. The data are then grouped using K-means and fed into the prediction model. The RNN is trained using Backpropagation Through Time (BPTT), with Stochastic Gradient Descent (SGD) as the optimizer. Forecasting with the RNN and PCA achieves an accuracy of 93%, while forecasting with the RNN without PCA achieves 82%. The experimental results show that the RNN method with PCA achieves higher predictive accuracy and flexibility than the RNN without PCA.
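The pipeline described above (PCA, then K-means with Euclidean distance, then an RNN trained with BPTT and SGD) can be sketched roughly as follows. This is a minimal illustration in NumPy, not the authors' implementation. The synthetic sine-wave data, the helper names (`pca_reduce`, `kmeans`, `train_rnn`), the hyperparameters, and the choice to pass the cluster label to the RNN as an extra input feature are all assumptions made for the example. The record does not say how the clusters enter the prediction model, and the sketch does not attempt to reproduce the reported accuracies.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm with Euclidean distance; returns labels and centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centroid unchanged
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

def train_rnn(seqs, targets, hidden=8, lr=0.01, epochs=30, seed=0):
    """Elman RNN with a linear read-out, trained by BPTT with per-sample SGD."""
    rng = np.random.default_rng(seed)
    d = seqs.shape[2]
    Wx = rng.normal(0, 0.3, (hidden, d))
    Wh = rng.normal(0, 0.3, (hidden, hidden))
    b = np.zeros(hidden)
    wo = rng.normal(0, 0.3, hidden)
    bo = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for i in rng.permutation(len(seqs)):
            x, y = seqs[i], targets[i]
            # Forward pass: hs[t + 1] is the hidden state after input x[t].
            hs = [np.zeros(hidden)]
            for t in range(len(x)):
                hs.append(np.tanh(Wx @ x[t] + Wh @ hs[-1] + b))
            err = wo @ hs[-1] + bo - y
            total += 0.5 * err ** 2
            # Backward pass: propagate the error back through every time step.
            gWx, gWh, gb = np.zeros_like(Wx), np.zeros_like(Wh), np.zeros_like(b)
            gwo, gbo = err * hs[-1], err
            dh = err * wo
            for t in reversed(range(len(x))):
                da = dh * (1.0 - hs[t + 1] ** 2)  # derivative of tanh
                gWx += np.outer(da, x[t])
                gWh += np.outer(da, hs[t])
                gb += da
                dh = Wh.T @ da
            # SGD update after each sample.
            Wx -= lr * gWx
            Wh -= lr * gWh
            b -= lr * gb
            wo -= lr * gwo
            bo -= lr * gbo
        losses.append(total / len(seqs))
    return losses

# Synthetic example (assumed data): five noisy copies of one sine wave.
rng = np.random.default_rng(0)
t = np.arange(300)
signal = np.sin(2 * np.pi * t / 25)
X = np.column_stack([signal + 0.1 * rng.normal(size=300) for _ in range(5)])

Z = pca_reduce(X, n_components=2)           # keep two components
labels, _ = kmeans(Z, k=3)                  # group similar observations
feats = np.column_stack([Z, labels / 2.0])  # cluster id as a feature (assumption)

window = 10
seqs = np.stack([feats[i:i + window] for i in range(len(feats) - window)])
targets = signal[window:]                   # value right after each window
losses = train_rnn(seqs, targets)
print(f"mean training loss: first epoch {losses[0]:.4f}, last epoch {losses[-1]:.4f}")
```

The backward loop multiplies the error signal by the transposed recurrent weights and the tanh derivative at every step, which is where the vanishing gradient mentioned in the abstract comes from: with long windows the product can shrink toward zero.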
format Article
author Ferna, Marestiani
Sugiyarto, Surono
title Forecasting Using K-means Clustering and RNN Methods with PCA Feature Selection
publisher INTI International University
publishDate 2022
url http://eprints.intimal.edu.my/1632/1/jods2022_04.pdf
http://eprints.intimal.edu.my/1632/
http://ipublishing.intimal.edu.my/jods.html