Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis
Clustering is one of the primary data mining tools. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose main aim is partitioning, so that objects in the same cluster are similar, and ob...
Main Authors: | Dalatu, Paul Inuwa; Midi, Habshah |
---|---|
Format: | Article |
Language: | English |
Published: | Universiti Putra Malaysia Press, 2018 |
Online Access: | http://psasir.upm.edu.my/id/eprint/66312/1/17%20JST-1003-2017.pdf http://psasir.upm.edu.my/id/eprint/66312/ http://www.pertanika.upm.edu.my/Pertanika%20PAPERS/JST%20Vol.%2026%20(4)%20Oct.%202018/17%20JST-1003-2017.pdf |
id |
my.upm.eprints.66312 |
---|---|
record_format |
eprints |
spelling |
Dalatu, Paul Inuwa and Midi, Habshah (2018) Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis. Pertanika Journal of Science & Technology, 26 (4). pp. 1823-1836. ISSN 0128-7680; ESSN: 2231-8526. Peer reviewed. Record my.upm.eprints.66312, last modified 2019-02-12T07:04:42Z. |
institution |
Universiti Putra Malaysia |
building |
UPM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Putra Malaysia |
content_source |
UPM Institutional Repository |
url_provider |
http://psasir.upm.edu.my/ |
language |
English |
description |
Clustering is one of the primary data mining tools. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose main aim is partitioning, so that objects in the same cluster are similar and objects in different clusters differ significantly with respect to their attributes. However, the classical standardized Euclidean distance, which uses the standard deviation to down-weight the maximum points of the ith features in distance-based clustering, has been criticized by many scholars on the grounds that the method produces outliers, lacks robustness, and has a 0% breakdown point. It also has low efficiency under the normal distribution. To remedy these problems, we suggest two statistical estimators with a 50% breakdown point, namely the Sn and Qn estimators, which have 58% and 82% efficiency, respectively. The proposed methods outperformed the existing methods in down-weighting the maximum points of the ith features in distance-based cluster analysis. |
format |
Article |
author |
Dalatu, Paul Inuwa; Midi, Habshah |
title |
Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis |
publisher |
Universiti Putra Malaysia Press |
publishDate |
2018 |
url |
http://psasir.upm.edu.my/id/eprint/66312/1/17%20JST-1003-2017.pdf http://psasir.upm.edu.my/id/eprint/66312/ http://www.pertanika.upm.edu.my/Pertanika%20PAPERS/JST%20Vol.%2026%20(4)%20Oct.%202018/17%20JST-1003-2017.pdf |
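
The idea summarized in the description, replacing the standard deviation with the Sn or Qn estimator as the per-feature scale in a weighted Euclidean distance, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the estimators are the naive O(n^2) textbook forms with plain medians and without finite-sample corrections, the consistency constants 1.1926 and 2.2219 are the commonly quoted values for the normal model, and the distance assumes the usual standardized form sqrt(sum_i ((a_i - b_i) / scale_i)^2).

```python
import numpy as np

# Commonly quoted consistency factors at the normal model (assumed here).
SN_CONSTANT = 1.1926
QN_CONSTANT = 2.2219


def sn_scale(x):
    """Naive Sn: constant * median_i( median_j |x_i - x_j| )."""
    x = np.asarray(x, dtype=float)
    diffs = np.abs(x[:, None] - x[None, :])  # all pairwise absolute differences
    return SN_CONSTANT * np.median(np.median(diffs, axis=1))


def qn_scale(x):
    """Naive Qn: constant * k-th smallest |x_i - x_j| over pairs i < j."""
    x = np.asarray(x, dtype=float)
    n = x.size
    h = n // 2 + 1
    k = h * (h - 1) // 2                     # k = C(h, 2)
    i, j = np.triu_indices(n, k=1)           # index pairs with i < j
    pair_diffs = np.sort(np.abs(x[i] - x[j]))
    return QN_CONSTANT * pair_diffs[k - 1]


def weighted_euclidean(a, b, scales):
    """Euclidean distance with each feature divided by its scale estimate."""
    a, b, scales = (np.asarray(v, dtype=float) for v in (a, b, scales))
    return float(np.sqrt(np.sum(((a - b) / scales) ** 2)))


# Hypothetical single feature with one extreme value.
feature = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 1000], dtype=float)
print("standard deviation:", np.std(feature, ddof=1))
print("Sn:", sn_scale(feature))
print("Qn:", qn_scale(feature))
```

On data like the hypothetical feature above, the standard deviation is inflated by the single extreme value, while Sn and Qn stay close to the spread of the remaining points. Passing a vector of per-feature Sn or Qn values as `scales` gives the robustly weighted distance.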