Contributions to outlier detection methods: some theory and applications

dc.contributor: Adams, Benjamin Michael
dc.contributor: Barrett, Bruce E.
dc.contributor: D'Souza, Giles
dc.contributor: Lee, Junsoo
dc.contributor.advisor: Chakraborti, Subhabrata
dc.contributor.author: Dovoedo, Yinaze Herve
dc.contributor.other: University of Alabama Tuscaloosa
dc.date.accessioned: 2017-03-01T14:46:10Z
dc.date.available: 2017-03-01T14:46:10Z
dc.date.issued: 2011
dc.description: Electronic Thesis or Dissertation (en_US)
dc.description.abstract: Tukey's traditional boxplot (Tukey, 1977) is a widely used Exploratory Data Analysis (EDA) tool, often applied to outlier detection with univariate data. In this dissertation, a modification of Tukey's boxplot is proposed in which the probability of at least one false alarm is controlled, as in Sim et al. (2005). The exact expression for that probability is derived and is used to find the fence constants for observations from any specified location-scale distribution. The proposed procedure is compared with that of Sim et al. (2005) in a simulation study. Outlier detection and control charting are closely related. Using the preceding procedure, one- and two-sided boxplot-based Phase I control charts for individual observations are proposed for data from an exponential distribution, while controlling the overall false alarm rate. The proposed charts are compared with the charts of Jones and Champ (2002) in a simulation study. Sometimes the practitioner is unable or unwilling to make an assumption about the form of the underlying distribution but is confident that the distribution is skewed. In that case, it is well documented that applying Tukey's boxplot for outlier detection results in an increased number of false alarms. To address this, a modification of the so-called adjusted boxplot for skewed distributions by Hubert and Vandervieren (2008) is proposed. The proposed procedure is compared to the adjusted boxplot and Tukey's procedure in a simulation study. In practice, the data are often multivariate. The concept of a (statistical) depth (or, equivalently, outlyingness) function provides a natural, nonparametric, "center-outward" ordering of multivariate data points with respect to a data cloud. The deeper a point, the less outlying it is. It is then natural to use some outlyingness functions as outlier identifiers.
A simulation study is performed to compare the outlier detection capabilities of selected outlyingness functions available in the literature for multivariate skewed data. Recommendations are provided. (en_US)
dc.format.extent: 192 p.
dc.format.medium: electronic
dc.format.mimetype: application/pdf
dc.identifier.other: u0015_0000001_0000693
dc.identifier.other: Dovoedo_alatus_0004D_10779
dc.identifier.uri: https://ir.ua.edu/handle/123456789/1198
dc.language: English
dc.language.iso: en_US
dc.publisher: University of Alabama Libraries
dc.relation.hasversion: born digital
dc.relation.ispartof: The University of Alabama Electronic Theses and Dissertations
dc.relation.ispartof: The University of Alabama Libraries Digital Collections
dc.rights: All rights reserved by the author unless otherwise indicated. (en_US)
dc.subject: Statistics
dc.title: Contributions to outlier detection methods: some theory and applications (en_US)
dc.type: thesis
dc.type: text
etdms.degree.department: University of Alabama. Department of Information Systems, Statistics, and Management Science
etdms.degree.discipline: Applied Statistics
etdms.degree.grantor: The University of Alabama
etdms.degree.level: doctoral
etdms.degree.name: Ph.D.
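
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of Tukey's traditional boxplot rule for flagging univariate outliers. It is not the dissertation's procedure: the fence constant k = 1.5 is Tukey's conventional choice, whereas the dissertation derives fence constants that control the probability of at least one false alarm, and those constants are not reproduced here. The function names `tukey_fences` and `flag_outliers` are hypothetical, and the quartile definition (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since several quartile conventions are in use.

```python
from statistics import quantiles


def tukey_fences(data, k=1.5):
    """Return (lower, upper) boxplot fences: Q1 - k*IQR and Q3 + k*IQR.

    k = 1.5 is Tukey's conventional constant (an assumption here, not the
    dissertation's derived fence constant).
    """
    # Quartiles by linear interpolation, treating the data as the whole population.
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(data, k=1.5):
    """Return the observations that fall strictly outside the fences."""
    lower, upper = tukey_fences(data, k)
    return [x for x in data if x < lower or x > upper]


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(tukey_fences(sample))
print(flag_outliers(sample))
```

Because k is fixed regardless of sample size or distribution, the chance of at least one false alarm under this rule changes with both, which is the behavior the abstract's modified procedure is designed to control.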

Files

Original bundle
Name: file_1.pdf
Size: 805.45 KB
Format: Adobe Portable Document Format