Contributions to outlier detection methods: some theory and applications

dc.contributor Adams, Benjamin Michael
dc.contributor Barrett, Bruce E.
dc.contributor D'Souza, Giles
dc.contributor Lee, Junsoo
dc.contributor.advisor Chakraborti, Subhabrata
dc.contributor.author Dovoedo, Yinaze Herve
dc.date.accessioned 2017-03-01T14:46:10Z
dc.date.available 2017-03-01T14:46:10Z
dc.date.issued 2011
dc.identifier.other u0015_0000001_0000693
dc.identifier.other Dovoedo_alatus_0004D_10779
dc.description Electronic Thesis or Dissertation
dc.description.abstract Tukey's traditional boxplot (Tukey, 1977) is a widely used Exploratory Data Analysis (EDA) tool, often applied to outlier detection in univariate data. In this dissertation, a modification of Tukey's boxplot is proposed in which the probability of at least one false alarm is controlled, as in Sim et al. (2005). The exact expression for that probability is derived and used to find the fence constants for observations from any specified location-scale distribution. The proposed procedure is compared with that of Sim et al. (2005) in a simulation study. Outlier detection and control charting are closely related. Using the preceding procedure, one- and two-sided boxplot-based Phase I control charts for individual observations are proposed for data from an exponential distribution, while controlling the overall false alarm rate. The proposed charts are compared with the charts of Jones and Champ (2002) in a simulation study. Sometimes the practitioner is unable or unwilling to make an assumption about the form of the underlying distribution but is confident that the distribution is skewed. In that case, it is well documented that applying Tukey's boxplot for outlier detection results in an increased number of false alarms. To address this, a modification of the so-called adjusted boxplot for skewed distributions of Hubert and Vandervieren (2008) is proposed. The proposed procedure is compared with the adjusted boxplot and Tukey's procedure in a simulation study. In practice, the data are often multivariate. The concept of a (statistical) depth (or, equivalently, outlyingness) function provides a natural, nonparametric, "center-outward" ordering of multivariate data points with respect to a data cloud. The deeper a point, the less outlying it is. It is then natural to use some outlyingness functions as outlier identifiers.
A simulation study is performed to compare the outlier detection capabilities of selected outlyingness functions available in the literature for multivariate skewed data. Recommendations are provided.
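As a rough illustration of the quantity the abstract refers to, the sketch below computes Tukey's boxplot fences and estimates by simulation the probability that at least one observation in an uncontaminated sample is flagged. It is not the dissertation's procedure: the dissertation derives an exact expression, whereas this is a Monte Carlo approximation. The function names, the standard normal model, the quartile definition, and the fence constants 1.5 and 3.0 are all assumptions made for the example.

```python
import random
import statistics


def tukey_fences(sample, k=1.5):
    """Return (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR.

    Assumption: quartiles use the 'inclusive' method of the standard
    library; other quartile definitions give slightly different fences.
    """
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(sample, k=1.5):
    """Return the observations that fall outside the fences."""
    lower, upper = tukey_fences(sample, k)
    return [x for x in sample if x < lower or x > upper]


def some_false_alarm_rate(n, k=1.5, reps=2000, seed=0):
    """Monte Carlo estimate of P(at least one point is flagged) when
    every observation is a clean draw from N(0, 1) (assumed model)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if flag_outliers(sample, k):
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Compare the conventional constant with a wider, hypothetical one.
    for k in (1.5, 3.0):
        rate = some_false_alarm_rate(n=50, k=k)
        print(f"k={k}: estimated P(at least one false alarm) = {rate:.3f}")
```

Widening the fences (larger k) can only reduce the estimated rate for the same simulated samples, which is the idea behind choosing fence constants to hold that probability at a target level.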
dc.format.extent 192 p.
dc.format.medium electronic
dc.format.mimetype application/pdf
dc.language English
dc.language.iso en_US
dc.publisher University of Alabama Libraries
dc.relation.ispartof The University of Alabama Electronic Theses and Dissertations
dc.relation.ispartof The University of Alabama Libraries Digital Collections
dc.relation.hasversion born digital
dc.rights All rights reserved by the author unless otherwise indicated.
dc.subject.other Statistics
dc.title Contributions to outlier detection methods: some theory and applications
dc.type thesis
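dc.type text
etdms.degree.department University of Alabama. Dept. of Information Systems, Statistics, and Management Science
etdms.degree.discipline Applied Statistics
etdms.degree.grantor The University of Alabama
etdms.degree.level doctoral
etdms.degree.name Ph.D.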