Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis

Clustering is one of the primary tools of data mining. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose main aim is partitioning, such that objects in the same cluster are similar, and ob...

Bibliographic Details
Main Authors: Dalatu, Paul Inuwa, Midi, Habshah
Format: Article
Language: English
Published: Universiti Putra Malaysia Press 2018
Online Access: http://psasir.upm.edu.my/id/eprint/66312/1/17%20JST-1003-2017.pdf
http://psasir.upm.edu.my/id/eprint/66312/
http://www.pertanika.upm.edu.my/Pertanika%20PAPERS/JST%20Vol.%2026%20(4)%20Oct.%202018/17%20JST-1003-2017.pdf
id my.upm.eprints.66312
record_format eprints
citation Dalatu, Paul Inuwa and Midi, Habshah (2018) Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis. Pertanika Journal of Science & Technology, 26 (4). pp. 1823-1836. ISSN 0128-7680; ESSN: 2231-8526
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description Clustering is one of the primary tools of data mining. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose main aim is partitioning, such that objects in the same cluster are similar and objects in different clusters differ significantly with respect to their attributes. However, the classical Standardized Euclidean distance, which uses the standard deviation to down-weight the maximum points of the ith feature in the distance computation, has been criticized by many scholars on the grounds that it produces outliers, lacks robustness, and has a 0% breakdown point. It also has low efficiency under the normal distribution. To remedy the problem, we suggest two statistical estimators with 50% breakdown points, namely the Sn and Qn estimators, which have 58% and 82% efficiency, respectively. The proposed methods evidently outperformed the existing methods in down-weighting the maximum points of the ith feature in distance-based clustering analysis.
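The idea in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the Sn and Qn estimators are the Rousseeuw and Croux scale estimators, uses their commonly quoted consistency constants (1.1926 and 2.2219) without finite-sample corrections, and computes them by a naive O(n^2) method. The function names `sn_scale`, `qn_scale`, and `weighted_euclidean` are hypothetical, chosen only for this example.

```python
import numpy as np


def sn_scale(x, c=1.1926):
    """Naive Sn: c * lomed_i( himed_j |x_i - x_j| ). Assumed constant, no small-sample correction."""
    x = np.asarray(x, dtype=float)
    n = x.size
    diffs = np.abs(x[:, None] - x[None, :])        # all pairwise |x_i - x_j|
    inner = np.sort(diffs, axis=1)[:, n // 2]      # high median over j, for each i
    return c * np.sort(inner)[(n + 1) // 2 - 1]    # low median over i


def qn_scale(x, d=2.2219):
    """Naive Qn: d * k-th smallest of |x_i - x_j| for i < j, with k = C(h, 2), h = n // 2 + 1."""
    x = np.asarray(x, dtype=float)
    n = x.size
    i, j = np.triu_indices(n, k=1)                 # index pairs with i < j
    pair = np.sort(np.abs(x[i] - x[j]))
    h = n // 2 + 1
    k = h * (h - 1) // 2
    return d * pair[k - 1]


def weighted_euclidean(a, b, scale):
    """Euclidean distance with each feature difference divided by a per-feature scale."""
    a, b, scale = (np.asarray(v, dtype=float) for v in (a, b, scale))
    return float(np.sqrt(np.sum(((a - b) / scale) ** 2)))


# Made-up feature with one extreme value: the standard deviation is inflated
# by the extreme point, while Sn and Qn stay close to the spread of the bulk.
feature = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
sd = feature.std(ddof=1)
sn = sn_scale(feature)
qn = qn_scale(feature)
print(sd, sn, qn)
```

In a clustering workflow, one would compute a scale per feature (column) with `sn_scale` or `qn_scale` and pass the resulting vector as `scale` to `weighted_euclidean` in place of the per-feature standard deviations.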
format Article
author Dalatu, Paul Inuwa
Midi, Habshah
title Statistical estimators as an alternative to standard deviation in weighted Euclidean distance cluster analysis
publisher Universiti Putra Malaysia Press
publishDate 2018