Comparative Analysis of ML-Based Outlier Detection Techniques for IoT-Based Smart Energy Management Systems


Bibliographic Details
Main Authors: Parh Yong Wong; Nayef A. M. Alduais; Omar, Nurul Aswa; Salama A. Mostafa; Abdul-Malik H. Y. Saad; Antar Shaddad H. Abdul-Qawy; Nasser, Abdullah; Waheed Ali H. M. Ghanem
Format: Article
Language: English
Published: aspg 2024
Subjects: T Technology (General)
Online Access: http://eprints.uthm.edu.my/11171/1/J17656_80b6f91c92b238eb2b089aeba84ca04e.pdf
http://eprints.uthm.edu.my/11171/
https://doi.org/10.54216/JISIoT.120204
id my.uthm.eprints.11171
record_format eprints
citation Parh Yong Wong; Nayef A. M. Alduais; Omar, Nurul Aswa; Salama A. Mostafa; Abdul-Malik H. Y. Saad; Antar Shaddad H. Abdul-Qawy; Nasser, Abdullah; Waheed Ali H. M. Ghanem (2024) Comparative Analysis of ML-Based Outlier Detection Techniques for IoT-Based Smart Energy Management Systems. Journal of Intelligent Systems and Internet of Things, 12 (2). pp. 44-64. Article, peer reviewed. https://doi.org/10.54216/JISIoT.120204
institution Universiti Tun Hussein Onn Malaysia
building UTHM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Tun Hussein Onn Malaysia
content_source UTHM Institutional Repository
url_provider http://eprints.uthm.edu.my/
language English
topic T Technology (General)
description With the advancement of ICST, data-driven technologies such as the Internet of Things (IoT) and smart technologies, including Smart Energy Management Systems (SEMS), have become a trend in many regions around the globe. Data quality is among the most vital issues to address for a successful application of IoT-based SEMS. Poor data in such major yet delicate systems will affect the quality of life (QoL) of millions, and can even cause destruction and disruption to a country. This paper aims to tackle this problem by searching for suitable outlier detection techniques among the many ML-based outlier detection methods that have been developed. Three methods are chosen and analyzed for their performance: K-Nearest Neighbour (KNN) + Mahalanobis Distance (MD), Minimum Covariance Determinant (MCD), and Local Outlier Factor (LOF). Three sensor-collected datasets related to SEMS, each with different data types, are used in this research. They are pre-processed and split into training and testing datasets with manually injected outliers. The models are trained on the training datasets to learn the patterns in the data, and the trained models are then applied to the testing datasets to identify and label the outliers. All three models identify the outliers accurately, with average accuracies over 95%. However, the average execution time varies by model: KNN+MD is the slowest at 12.99 seconds, MCD takes 3.98 seconds, and LOF is the fastest at 0.60 seconds.
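The train/test workflow summarized in the description can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic rather than the paper's SEMS datasets, the contamination level and neighbour counts are arbitrary, and the `KnnMahalanobis` class is a hypothetical stand-in for the KNN+MD model, whose exact construction the record does not give. MCD is represented here by scikit-learn's `EllipticEnvelope`, which is built on an MCD covariance estimate, and LOF by `LocalOutlierFactor` in novelty mode.

```python
import time

import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors

# Synthetic stand-in for sensor data (assumption: the paper's datasets differ).
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
X_train = rng.multivariate_normal([0.0, 0.0], cov, size=500)
X_inliers = rng.multivariate_normal([0.0, 0.0], cov, size=200)
# Manually injected outliers: points far from the bulk of the data.
X_outliers = rng.uniform(6.0, 10.0, size=(20, 2)) * rng.choice([-1.0, 1.0], size=(20, 2))
X_test = np.vstack([X_inliers, X_outliers])
y_test = np.r_[np.ones(200), -np.ones(20)]  # +1 = inlier, -1 = outlier

CONTAMINATION = 0.05  # assumed share of training points flagged as outliers


class KnnMahalanobis:
    """Hypothetical KNN+MD detector: mean Mahalanobis distance to k neighbours."""

    def __init__(self, k=5, contamination=CONTAMINATION):
        self.k = k
        self.contamination = contamination

    def _whiten(self, X):
        # With cov = L L^T, Euclidean distance between L^{-1}(x - mean) vectors
        # equals the Mahalanobis distance between the original points.
        return np.linalg.solve(self.chol_, (X - self.mean_).T).T

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        self.chol_ = np.linalg.cholesky(np.cov(X, rowvar=False))
        self.nn_ = NearestNeighbors(n_neighbors=self.k).fit(self._whiten(X))
        dist, _ = self.nn_.kneighbors()  # excludes each point itself
        self.threshold_ = np.quantile(dist.mean(axis=1), 1 - self.contamination)
        return self

    def predict(self, X):
        dist, _ = self.nn_.kneighbors(self._whiten(X))
        return np.where(dist.mean(axis=1) > self.threshold_, -1, 1)


models = {
    "KNN+MD": KnnMahalanobis(),
    "MCD": EllipticEnvelope(contamination=CONTAMINATION, random_state=0),
    "LOF": LocalOutlierFactor(n_neighbors=20, novelty=True, contamination=CONTAMINATION),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train)               # learn the pattern of normal data
    pred = model.predict(X_test)     # label test points as +1 / -1
    elapsed = time.perf_counter() - start
    results[name] = (float(np.mean(pred == y_test)), elapsed)
    print(f"{name}: accuracy={results[name][0]:.3f}, time={elapsed:.4f}s")
```

Timings from a toy run like this say nothing about the execution times reported in the description; they depend on dataset size, dimensionality, and hardware.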
format Article
author Parh Yong Wong
Nayef A. M. Alduais
Omar, Nurul Aswa
Salama A. Mostafa
Abdul-Malik H. Y. Saad
Antar Shaddad H. Abdul-Qawy
Nasser, Abdullah
Waheed Ali H. M. Ghanem
title Comparative Analysis of ML-Based Outlier Detection Techniques for IoT-Based Smart Energy Management Systems
publisher aspg
publishDate 2024
url http://eprints.uthm.edu.my/11171/1/J17656_80b6f91c92b238eb2b089aeba84ca04e.pdf
http://eprints.uthm.edu.my/11171/
https://doi.org/10.54216/JISIoT.120204