Improved clustering using robust and classical principal component

The k-means algorithm is a popular data clustering algorithm. It aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. Finding the appropriate number of clusters for a given data set...

Bibliographic Details
Main Author: Hassn, Ahmed Kadom
Format: Thesis
Language: English
Published: 2017
Subjects: Algorithms
Online Access: http://psasir.upm.edu.my/id/eprint/70922/1/FS%202017%2047%20UPM.pdf
http://psasir.upm.edu.my/id/eprint/70922/
id my.upm.eprints.70922
record_format eprints
spelling my.upm.eprints.70922 2022-07-07T03:07:15Z http://psasir.upm.edu.my/id/eprint/70922/ Improved clustering using robust and classical principal component Hassn, Ahmed Kadom 2017-06 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/70922/1/FS%202017%2047%20UPM.pdf Hassn, Ahmed Kadom (2017) Improved clustering using robust and classical principal component. Masters thesis, Universiti Putra Malaysia. Algorithms
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
topic Algorithms
description The k-means algorithm is a popular data clustering algorithm. It aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. Finding the appropriate number of clusters for a given data set is generally a trial-and-error process, which is made more difficult by the subjective nature of deciding what constitutes ‘correct’ clustering. When the dimension of the data is large, it is often difficult to apply the k-means clustering algorithm because it requires a lot of computation time. To remedy this problem, we propose to integrate principal component analysis (PCA), which is useful for reducing the dimensionality of a dataset, with the k-means clustering algorithm. We call our proposed method k-means by principal components (pc1). In this study, the kernels created by the k-means method are replaced with kernels created by the PCA method, which reduces the dimensionality of the data. The results of the study show that k-means by PCA is faster and more efficient than the classical k-means algorithm. Both the classical k-means algorithm and the k-means by PCA algorithm are very sensitive to the presence of outliers. Hence, k-means by robust PCA is developed to rectify the problem of outliers in the dataset. The findings indicate that, in the absence of outliers, the two methods, k-means by PCA and k-means by robust PCA, perform equally well. Nonetheless, k-means by robust PCA is much less affected by outliers than k-means by classical PCA.
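The general idea in the description, reducing dimensionality with PCA and then clustering with k-means, can be illustrated with a minimal sketch. This is not the thesis's implementation: projecting onto only the leading principal component, the power-iteration eigenvector estimate, the quantile-based initial centers, and all function names are assumptions made for illustration, and the robust PCA variant and the "kernels" mentioned in the description are not reproduced here.

```python
import math
import random


def first_principal_component(rows, iterations=200):
    """Return (column means, unit vector of the leading principal component)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: converges to the eigenvector with the largest
    # eigenvalue, provided the start vector is not orthogonal to it.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, v):
    """Project each centered observation onto the direction v (1-D scores)."""
    return [sum((r[j] - means[j]) * v[j] for j in range(len(v))) for r in rows]


def kmeans_1d(scores, k, iterations=100):
    """Lloyd's algorithm on 1-D scores; initial centers are spread quantiles."""
    ordered = sorted(scores)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(scores)
    for _ in range(iterations):
        # Assignment step: each score goes to the nearest center.
        labels = [min(range(k), key=lambda c: abs(s - centers[c])) for s in scores]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [s for s, lab in zip(scores, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Synthetic data (hypothetical): two well-separated groups in 5 dimensions.
random.seed(0)
blob_a = [[random.gauss(0.0, 0.5) for _ in range(5)] for _ in range(30)]
blob_b = [[random.gauss(4.0, 0.5) for _ in range(5)] for _ in range(30)]
data = blob_a + blob_b

means, pc1 = first_principal_component(data)
scores = project(data, means, pc1)   # 5-D observations reduced to 1-D
labels, centers = kmeans_1d(scores, k=2)
```

In this sketch k-means only ever computes distances between 1-D scores rather than 5-D observations, which is where a speed-up of the kind the description reports would come from. Because the covariance matrix is built from plain means and squared deviations, a few extreme observations can tilt the leading component, which is consistent with the outlier sensitivity the description attributes to classical PCA.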
format Thesis
author Hassn, Ahmed Kadom
title Improved clustering using robust and classical principal component
publishDate 2017
url http://psasir.upm.edu.my/id/eprint/70922/1/FS%202017%2047%20UPM.pdf
http://psasir.upm.edu.my/id/eprint/70922/