An Improved Pheromone-Based Kohonen Self-Organising Map in Clustering and Visualising Balanced and Imbalanced Datasets
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Universiti Utara Malaysia Press, 2021 |
Subjects: | |
Online Access: | https://repo.uum.edu.my/id/eprint/28765/1/JICT%2020%2004%202021%20651-676.pdf https://doi.org/10.32890/jict2021.20.4.8 https://repo.uum.edu.my/id/eprint/28765/ https://e-journal.uum.edu.my/index.php/jict/article/view/13835 |
Summary: | The data distribution issue remains an unsolved clustering problem in data mining, especially in dealing with imbalanced datasets. The Kohonen Self-Organising Map (KSOM) is one of the well-known clustering algorithms that can solve various problems without a pre-defined number of clusters. However, like other clustering algorithms, it requires sufficient data for its unsupervised learning process. An inadequate amount of class label data in a dataset significantly affects the clustering learning process, leading to inefficient and unreliable results. Numerous studies have hybridised and optimised the KSOM algorithm with various optimisation techniques. Unfortunately, the problems remain unsolved, especially those of separation boundaries and overlapping clusters. Therefore, this research proposed an improved pheromone-based KSOM algorithm, known as iPKSOM, to address these problems. Six datasets (Iris, Seed, Glass, Titanic, WDBC, and Tropical Wood) were chosen to investigate the effectiveness of the iPKSOM algorithm. Results on all datasets were compared with those of the original KSOM. The modification significantly improved the clustering process by refining the scatter of the clustered data and reducing overlap between clusters. The proposed algorithm can therefore be applied to clustering other complex datasets, such as high-dimensional and streaming data. |