Robust correlation coefficient based on robust scale and location estimator

The correlation coefficient is a common statistical measure of the relationship between two variables. The most frequently used correlation coefficient is the Pearson correlation coefficient. This coefficient is powerful when the assumptions of linearity between two...

Bibliographic Details
Main Author: Nur Amira, Zakaria
Format: Thesis
Language: English
Published: 2018
Subjects: QA273-280 Probabilities. Mathematical statistics
Online Access: https://etd.uum.edu.my/9137/1/s818475_01.pdf
https://etd.uum.edu.my/9137/2/s818475_02.pdf
https://etd.uum.edu.my/9137/3/s818475_references.docx
https://etd.uum.edu.my/9137/
id my.uum.etd.9137
record_format eprints
spelling my.uum.etd.9137 2022-03-28T00:41:33Z https://etd.uum.edu.my/9137/ Robust correlation coefficient based on robust scale and location estimator Nur Amira, Zakaria QA273-280 Probabilities. Mathematical statistics 2018 Thesis NonPeerReviewed text en https://etd.uum.edu.my/9137/1/s818475_01.pdf text en https://etd.uum.edu.my/9137/2/s818475_02.pdf text en https://etd.uum.edu.my/9137/3/s818475_references.docx Nur Amira, Zakaria (2018) Robust correlation coefficient based on robust scale and location estimator. Masters thesis, Universiti Utara Malaysia.
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Electronic Theses
url_provider http://etd.uum.edu.my/
language English
topic QA273-280 Probabilities. Mathematical statistics
description The correlation coefficient is a common statistical measure of the relationship between two variables. The most frequently used is the Pearson correlation coefficient, which is powerful when the assumptions of linearity between the two variables and normality of the distribution are fulfilled. However, it does not perform well in the presence of outliers: its calculation uses the mean, which is known to be very sensitive to outliers. The Spearman rank correlation coefficient and Kendall’s tau correlation coefficient are alternative solutions to this problem, but because they are calculated from ranks instead of the original observations, useful information is lost. For that reason, this study focuses on a robust correlation approach based on the median. The existing median-based correlation coefficient uses the Median Absolute Deviation (MAD) as its scale estimator. Nevertheless, the MAD has low efficiency under the Gaussian distribution and is suited to measuring dispersion only in symmetric distributions. Thus, this study modified the median-based correlation using two approaches. First, keeping the same median-based correlation, it proposed other robust scale estimators, namely MADn, Sn, and Qn. Second, it changed the median-based correlation to a Hodges-Lehmann-based correlation and employed all of the robust scale estimators, namely median, MAD, MADn, Sn, and Qn. The performance of the proposed procedures was evaluated under two conditions of simulated data: perfect and contaminated. Three indicators were used to evaluate performance: the correlation coefficient value, the average bias, and the standard error. The proposed procedures were validated using a real dataset. The simulation results show that the Qn correlation coefficient and the Hodges-Lehmann-Qn correlation coefficient performed better under contaminated data than the Pearson correlation coefficient and the other existing robust correlation coefficients. In conclusion, the Qn correlation coefficient and the Hodges-Lehmann-Qn correlation coefficient are good alternatives to the Pearson correlation coefficient when there are outliers in the data.
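The estimators named in the description can be illustrated with a short sketch. This record does not give the thesis's formulas, so everything below is an assumption made for illustration: the scale estimators use their usual textbook definitions in naive O(n^2) form with plain medians, the correlation uses a generic Gnanadesikan-Kettenring-style plug-in form that may differ from the thesis's median-based and Hodges-Lehmann-based coefficients, and the names `plug_in_corr` and `simulate`, the settings (rho = 0.8, n = 50, 10% of points replaced by (8, -8)), and the reading of "standard error" as the standard deviation across replicates are all hypothetical.

```python
import numpy as np

def mad(x):
    """Raw median absolute deviation about the median."""
    x = np.asarray(x, dtype=float)
    return np.median(np.abs(x - np.median(x)))

def madn(x):
    """MAD rescaled by 1.4826 for consistency at the normal distribution."""
    return 1.4826 * mad(x)

def sn(x):
    """Naive Sn: 1.1926 * med_i med_j |x_i - x_j| (plain medians, a simplification)."""
    x = np.asarray(x, dtype=float)
    diffs = np.abs(x[:, None] - x[None, :])
    return 1.1926 * np.median(np.median(diffs, axis=1))

def qn(x):
    """Naive Qn: 2.2219 * k-th smallest |x_i - x_j|, i < j, k = C(h, 2), h = n//2 + 1."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size, k=1)
    d = np.sort(np.abs(x[i] - x[j]))
    h = x.size // 2 + 1
    return 2.2219 * d[h * (h - 1) // 2 - 1]

def hodges_lehmann(x):
    """Hodges-Lehmann location: median of the Walsh averages (x_i + x_j) / 2, i <= j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size)
    return np.median((x[i] + x[j]) / 2.0)

def plug_in_corr(x, y, scale):
    """ASSUMED form, not taken from the thesis: with u = x/S(x) + y/S(y) and
    v = x/S(x) - y/S(y), return (S(u)^2 - S(v)^2) / (S(u)^2 + S(v)^2).
    With scale=np.std this reproduces the Pearson coefficient."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xs, ys = x / scale(x), y / scale(y)
    su2, sv2 = scale(xs + ys) ** 2, scale(xs - ys) ** 2
    return (su2 - sv2) / (su2 + sv2)

def simulate(rho=0.8, n=50, eps=0.1, reps=200, seed=0):
    """Return {name: (mean estimate, average bias, std. dev. across replicates)}."""
    rng = np.random.default_rng(seed)
    scales = {"pearson": np.std, "madn": madn, "sn": sn, "qn": qn}
    est = {name: [] for name in scales}
    k = int(eps * n)  # number of contaminated points per replicate
    for _ in range(reps):
        z1, z2 = rng.standard_normal((2, n))
        x, y = z1, rho * z1 + np.sqrt(1 - rho**2) * z2
        x[:k], y[:k] = 8.0, -8.0  # hypothetical gross outliers
        for name, s in scales.items():
            est[name].append(plug_in_corr(x, y, s))
    return {name: (np.mean(v), np.mean(v) - rho, np.std(v, ddof=1))
            for name, v in est.items()}

for name, (mean, bias, se) in simulate().items():
    print(f"{name:8s} mean={mean:+.3f} bias={bias:+.3f} se={se:.3f}")
print("Hodges-Lehmann location of [1, 2, 3, 50]:", hodges_lehmann([1, 2, 3, 50]))
```

Setting `eps=0` in `simulate` gives the uncontaminated case for comparison. In this particular plug-in form any constant multiplier of the scale cancels in the ratio, and a centering step has no effect when the scale estimator is shift-invariant, so the location estimator is shown separately here.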
format Thesis
author Nur Amira, Zakaria
title Robust correlation coefficient based on robust scale and location estimator
publishDate 2018
url https://etd.uum.edu.my/9137/1/s818475_01.pdf
https://etd.uum.edu.my/9137/2/s818475_02.pdf
https://etd.uum.edu.my/9137/3/s818475_references.docx
https://etd.uum.edu.my/9137/