A buffer-based online clustering for evolving data stream

Bibliographic Details
Main Authors: Islam, Md. Kamrul; Ahmed, Md. Manjur; Kamal Z., Zamli
Format: Article
Language: English
Published: Elsevier Ltd 2019
Subjects: QA75 Electronic computers. Computer science
Online Access: http://umpir.ump.edu.my/id/eprint/24676/1/A%20buffer-based%20online%20clustering%20for%20evolving%20data%20stream.pdf
http://umpir.ump.edu.my/id/eprint/24676/
https://doi.org/10.1016/j.ins.2019.03.022
id my.ump.umpir.24676
record_format eprints
spelling my.ump.umpir.24676 2019-04-02T07:34:52Z http://umpir.ump.edu.my/id/eprint/24676/ A buffer-based online clustering for evolving data stream. Islam, Md. Kamrul; Ahmed, Md. Manjur; Kamal Z., Zamli. QA75 Electronic computers. Computer science. Elsevier Ltd 2019. Article. PeerReviewed. pdf. en. http://umpir.ump.edu.my/id/eprint/24676/1/A%20buffer-based%20online%20clustering%20for%20evolving%20data%20stream.pdf Islam, Md. Kamrul and Ahmed, Md. Manjur and Kamal Z., Zamli (2019) A buffer-based online clustering for evolving data stream. Information Sciences, 489. pp. 113-135. ISSN 0020-0255 https://doi.org/10.1016/j.ins.2019.03.022
institution Universiti Malaysia Pahang
building UMP Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Pahang
content_source UMP Institutional Repository
url_provider http://umpir.ump.edu.my/
language English
topic QA75 Electronic computers. Computer science
spellingShingle QA75 Electronic computers. Computer science
Islam, Md. Kamrul
Ahmed, Md. Manjur
Kamal Z., Zamli
A buffer-based online clustering for evolving data stream
description Data stream clustering plays an important role in data stream mining for knowledge extraction. Density-based clustering algorithms have recently attracted considerable research attention because they can generate arbitrarily shaped clusters. However, most of these algorithms are either fully offline or hybrid online/offline, or they cannot handle the evolving nature of data streams. Recently, a fully online clustering algorithm for evolving data streams, called CEDAS, was proposed. However, like other density-based clustering algorithms, CEDAS requires the globally optimal micro-cluster radius to be defined in advance, which is a difficult task, and an erroneous choice degrades clustering performance. Moreover, CEDAS ignores temporarily irrelevant micro-clusters, which may become relevant in the future. In this study, we present a fully online density-based clustering algorithm called buffer-based online clustering for evolving data stream (BOCEDS). The algorithm recursively updates the micro-cluster radius toward its local optimum. It also introduces a buffer for storing irrelevant micro-clusters and a fully online pruning method for extracting temporarily irrelevant micro-clusters from the buffer. In addition, BOCEDS proposes an online micro-cluster energy-updating function based on the spatial information of the data stream. Experimental results are compared with those of CEDAS and of alternative hybrid online/offline density-based clustering algorithms, and BOCEDS outperforms these algorithms in the comparison. The sensitivity of the clustering parameters is also measured. The proposed algorithm is then applied to real-world weather data streams to demonstrate its capability to detect changes in the data stream and discover arbitrarily shaped clusters. BOCEDS is available at https://sites.google.com/view/md-manjur-ahmed and https://sites.google.com/view/kamrul-just.
format Article
author Islam, Md. Kamrul
Ahmed, Md. Manjur
Kamal Z., Zamli
author_facet Islam, Md. Kamrul
Ahmed, Md. Manjur
Kamal Z., Zamli
author_sort Islam, Md. Kamrul
title A buffer-based online clustering for evolving data stream
title_short A buffer-based online clustering for evolving data stream
title_full A buffer-based online clustering for evolving data stream
title_fullStr A buffer-based online clustering for evolving data stream
title_full_unstemmed A buffer-based online clustering for evolving data stream
title_sort buffer-based online clustering for evolving data stream
publisher Elsevier Ltd
publishDate 2019
url http://umpir.ump.edu.my/id/eprint/24676/1/A%20buffer-based%20online%20clustering%20for%20evolving%20data%20stream.pdf
http://umpir.ump.edu.my/id/eprint/24676/
https://doi.org/10.1016/j.ins.2019.03.022
_version_ 1643669885918642176
score 13.19449