Some problems of outliers in circular data / Ali H.M. Abuzaid

Bibliographic Details
Main Author: Abuzaid, Ali H.M.
Format: Thesis
Published: 2010
Online Access: http://studentsrepo.um.edu.my/4277/1/Some_Problems_of_Outliers_in_Circular_Data.pdf
http://pendeta.um.edu.my/client/default/search/detailnonmodal/ent:$002f$002fSD_ILS$002f796$002fSD_ILS:796645/one?qu=Some+problems+of+outliers+in+circular+data
http://studentsrepo.um.edu.my/4277/
Description
Summary: This study considers three problems of outliers in circular statistics. The first problem is an attempt to apply standard outlier detection procedures for linear data sets by approximating circular variables with linear variables. This is possible for large values of the concentration parameter. A series of simulation studies is carried out to specify the acceptable values of the concentration parameter for which the von Mises distribution can be approximated by the normal distribution. The second is the problem of outliers in circular samples. Two numerical tests of discordancy are proposed to identify outliers. The test statistics are based on the sums of circular distances and of chord lengths, respectively, from the point of interest to all other observations on the circumference of a unit circle. The approximate distributions of the test statistics are derived. Simulation studies show that both statistics perform better than other known discordancy tests. In addition, a boxplot version for circular data sets is proposed. Via simulation studies, we show that the resistant criterion depends strongly on the concentration of the circular sample. The third problem is the existence of outliers in the circular regression model. First, we propose a new definition of circular residuals that can be used to identify outliers using various graphical and numerical tests. Second, three numerical tests are developed to detect influential observations based on a row deletion approach. The first two are defined using the circular distance between the observed and fitted values, and their approximate distributions are derived. The third test extends the COVRATIO statistic from linear regression to the circular case. In general, the three numerical tests perform well in detecting influential observations. For illustration, we consider two real circular data sets, namely the frogs’ data set and the wind direction data set.
In conclusion, the statistics proposed by this study are able to solve some problems of outliers in circular data.
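
The two distance measures named in the summary can be illustrated with a minimal Python sketch. It only shows the raw quantities: for each observation, the sum of circular distances and the sum of chord lengths to all other observations on a unit circle. The function names and the sample angles are hypothetical, and the thesis's actual test statistics, normalisation, and critical values are not given in this record, so none are assumed here.

```python
import math


def circular_distance(a, b):
    """Shorter arc between two angles (radians) on a unit circle, in [0, pi]."""
    d = abs(a - b) % (2 * math.pi)
    return math.pi - abs(math.pi - d)


def chord_length(a, b):
    """Straight-line distance between two points on a unit circle, in [0, 2]."""
    return 2.0 * math.sin(circular_distance(a, b) / 2.0)


def distance_sums(angles, dist):
    """For each observation, sum dist(...) to every other observation."""
    return [
        sum(dist(a, b) for j, b in enumerate(angles) if j != i)
        for i, a in enumerate(angles)
    ]


# Hypothetical sample: four angles near 0 rad and one far from the rest.
angles = [0.1, 0.2, 0.15, 0.05, 3.0]
arc_sums = distance_sums(angles, circular_distance)
chord_sums = distance_sums(angles, chord_length)

# The observation with the largest sum is the natural candidate to examine.
candidate = max(range(len(angles)), key=arc_sums.__getitem__)
print(candidate, arc_sums[candidate], chord_sums[candidate])
```

In this made-up sample the isolated angle has the largest sum under both measures. Whether such an observation is actually discordant would depend on the approximate distributions derived in the thesis.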