
# Department of Information Systems, Statistics & Management Science



### Browsing Department of Information Systems, Statistics & Management Science by Author "Chakraborti, Subhabrata"

Now showing 1–9 of 9


#### Advances in mixture modeling and model based clustering (University of Alabama Libraries, 2015)

Michael, Semhar K.; Melnykov, Volodymyr; University of Alabama Tuscaloosa

Cluster analysis is part of unsupervised learning that deals with finding groups of similar observations in heterogeneous data. There are several clustering approaches, all aiming to minimize the within-cluster variance while maximizing the variance between clusters. K-means and hierarchical clustering with different linkages can be thought of as distance-based approaches; another approach, model-based clustering, relies on the idea of finite mixture models. This dissertation proposes new advances in clustering, mostly related to model-based clustering and its extension to the K-means algorithm. The dissertation has five chapters. The first chapter is a literature review of recent advances in model-based clustering and finite mixture modeling: the main advances and challenges are described in the methodology section, and some interesting and diverse applications of model-based clustering are presented in the application section. The second chapter presents a simulation study of the factors that affect the complexity of model-based clustering. In the third chapter, we develop a methodology for model-based clustering of regression time series data and apply it to annual tree rings. In the fourth chapter, we exploit the relationship between model-based clustering and the K-means algorithm to develop a methodology for merging clusters formed by K-means into meaningful groupings. The final chapter is dedicated to the problem of initialization in model-based clustering: it is a well-known fact that the performance of model-based clustering depends heavily on the initialization of the EM algorithm, and so far no method works comprehensively in all situations.
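One common initialization strategy, and the core of the emEM idea, is to run several short EM passes from random starts and then continue only the best of them. A minimal sketch for a two-component univariate Gaussian mixture follows; the data, component count, and run lengths are illustrative, not taken from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative data: two well-separated groups.
x = np.concatenate([rng.normal(0, 1, 150), rng.normal(6, 1, 150)])

def em_step(x, w, mu, var):
    # E-step: responsibilities under the current 2-component model.
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: update weights, means, and variances.
    nk = resp.sum(axis=0)
    return nk / len(x), (resp * x[:, None]).sum(axis=0) / nk, \
        (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

def loglik(x, w, mu, var):
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return np.log(dens.sum(axis=1)).sum()

def em(x, mu0, n_iter):
    w, mu, var = np.ones(2) / 2, np.asarray(mu0, dtype=float), np.ones(2) * x.var()
    for _ in range(n_iter):
        w, mu, var = em_step(x, w, mu, var)
    return w, mu, var

# Many short EM runs from random starts; keep the one with the best likelihood...
starts = [rng.choice(x, size=2, replace=False) for _ in range(10)]
short = [em(x, mu0, n_iter=3) for mu0 in starts]
best = max(short, key=lambda p: loglik(x, *p))
# ...then one long run continued from the best short run's means.
w, mu, var = em(x, best[1], n_iter=100)
print(sorted(mu.round(2)))
```

With well-separated groups, the long run recovers component means near the two generating centers.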
In this project, we use the idea of model averaging and initialization via the emEM algorithm to address this problem.

#### Contributions to joint monitoring of location and scale parameters: some theory and applications (University of Alabama Libraries, 2012)

McCracken, Amanda Kaye; Chakraborti, Subhabrata; University of Alabama Tuscaloosa

Since their invention in the 1920s, control charts have been popular tools for monitoring processes in fields as varied as manufacturing and healthcare. Most of these charts are designed to monitor a single process parameter, but recently a number of charts and schemes have been developed for jointly monitoring the location and scale of processes that follow two-parameter distributions. These joint monitoring charts are particularly relevant for processes in which special causes may shift the location and scale parameters simultaneously. Among the available schemes for jointly monitoring location and scale, the vast majority are designed for normally distributed processes whose in-control mean and variance are known rather than estimated from data. When the process data are non-normally distributed or the process parameters are unknown, alternative control charts are needed. This dissertation presents and compares several control schemes for jointly monitoring data from Laplace and shifted exponential distributions with known parameters, as well as a pair of charts for monitoring data from normal distributions with unknown mean and variance.
The normal-theory charts are adaptations of two existing procedures for the known-parameter case, Razmy's (2005) Distance chart and Chen and Cheng's (1998) Max chart, while the Laplace and shifted exponential charts are designed using an appropriate statistic for each parameter, such as the maximum likelihood estimators.

#### Contributions to multivariate control charting: studies of the Z chart and four nonparametric charts (University of Alabama Libraries, 2010)

Boone, Jeffrey Michael; Chakraborti, Subhabrata; University of Alabama Tuscaloosa

Autocorrelated data are common in today's process control applications. Many of these applications involve two or more related variables, so multivariate statistical process control (SPC) methods should be used in process monitoring to account for the relationship among the variables. Dealing with multivariate autocorrelated data poses many challenges. Even though no one chart is best for multivariate data, the Z chart proposed by Kalgonda and Kulkarni (2004) is fairly easy to implement and is particularly useful for its diagnostic ability, that is, its ability to pinpoint the variable(s) that are out of control when the chart signals. In this dissertation, the performance of the Z chart is compared to the chi-square chart and the multivariate EWMA (MEWMA) chart in a number of simulation studies. Simulations are also performed to study the effects of parameter estimation and non-normality (using the multivariate t and multivariate gamma distributions) on the performance of the Z chart. Beyond the problem of autocorrelation in multivariate quality control, in many applications the distributional assumption on the data is not met, or there is not enough evidence that it is met. In such situations, a control chart that does not require a strict distributional assumption, called a nonparametric or distribution-free chart, may be desirable.
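The chi-square chart used as a benchmark above plots T²ₜ = (xₜ − μ)′Σ⁻¹(xₜ − μ) against an upper control limit taken from the chi-square distribution with p degrees of freedom. A minimal sketch for p = 2 with a known in-control mean and covariance follows; the data, covariance, and injected shift are illustrative, not from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([0.0, 0.0])                    # known in-control mean vector
sigma = np.array([[1.0, 0.5], [0.5, 1.0]])   # known in-control covariance
sigma_inv = np.linalg.inv(sigma)

# UCL: chi-square(1 - alpha) quantile with p = 2 df.
# For 2 df the quantile has the closed form -2*ln(alpha); alpha = 0.0027.
ucl = -2 * np.log(0.0027)

x = rng.multivariate_normal(mu, sigma, size=100)  # in-control observations
x[60] = [4.0, -3.0]                               # injected shift at t = 60
d = x - mu
t2 = np.einsum("ij,jk,ik->i", d, sigma_inv, d)    # T2 statistic per period
signals = np.flatnonzero(t2 > ucl)
print(signals)
```

The injected point at t = 60 yields T² ≈ 49, well above the UCL of about 11.83, so the chart signals there.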
In this dissertation, four new multivariate nonparametric Shewhart control charts are proposed. They are relatively simple to use and are based on the multivariate forms of the sign and Wilcoxon signed-rank statistics, and on the maximum of multiple univariate sign and Wilcoxon signed-rank statistics. The performance of these charts is studied, and illustrations and applications are demonstrated.

#### Contributions to outlier detection methods: some theory and applications (University of Alabama Libraries, 2011)

Dovoedo, Yinaze Herve; Chakraborti, Subhabrata; University of Alabama Tuscaloosa

Tukey's traditional boxplot (Tukey, 1977) is a widely used Exploratory Data Analysis (EDA) tool, often used for outlier detection with univariate data. In this dissertation, a modification of Tukey's boxplot is proposed in which the probability of at least one false alarm is controlled, as in Sim et al. (2005). The exact expression for that probability is derived and is used to find the fence constants for observations from any specified location-scale distribution. The proposed procedure is compared with that of Sim et al. (2005) in a simulation study. Outlier detection and control charting are closely related. Using the preceding procedure, one- and two-sided boxplot-based Phase I control charts for individual observations are proposed for data from an exponential distribution, while controlling the overall false alarm rate. The proposed charts are compared with the charts of Jones and Champ (2002) in a simulation study. Sometimes the practitioner is unable or unwilling to make an assumption about the form of the underlying distribution but is confident that the distribution is skewed. In that case, it is well documented that applying Tukey's boxplot for outlier detection results in an increased number of false alarms.
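The fence-based flagging rule behind Tukey's boxplot can be sketched as follows. This shows the classic k = 1.5 fences; the modified procedures above instead choose the fence constants to control the false-alarm probability, and the data here are illustrative.

```python
import numpy as np

def tukey_fences(x, k=1.5):
    """Classic boxplot fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

x = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 15.0])  # one gross outlier
lo, hi = tukey_fences(x)
outliers = x[(x < lo) | (x > hi)]
print(outliers)  # flags the observation at 15.0
```

Any observation beyond a fence is flagged; with a skewed parent distribution these symmetric fences flag too many points, which motivates the adjusted constants studied in the dissertation.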
To this end, in this dissertation, a modification of the adjusted boxplot for skewed distributions of Hubert and Vandervieren (2008) is proposed. The proposed procedure is compared to the adjusted boxplot and to Tukey's procedure in a simulation study. In practice, the data are often multivariate. The concept of a (statistical) depth (or, equivalently, outlyingness) function provides a natural, nonparametric, "center-outward" ordering of a multivariate data point with respect to the data cloud: the deeper a point, the less outlying it is. It is then natural to use some outlyingness functions as outlier identifiers. A simulation study is performed to compare the outlier detection capabilities of selected outlyingness functions available in the literature for multivariate skewed data, and recommendations are provided.

#### On the detection and estimation of changes in a process mean based on kernel estimators (University of Alabama Libraries, 2012)

Mercado Velasco, Gary Ricardo; Perry, Marcus B.; University of Alabama Tuscaloosa

Parametric control charts are very attractive and have long been used in industry. However, in many applications the underlying process distribution is not known well enough to assume a specific distribution function, and when the distributional assumptions underlying a parametric control chart are violated, its performance can suffer. Since robustness to departures from normality is a desirable property for control charts, this dissertation reports three separate papers on the development and evaluation of robust Shewhart-type control charts for both the univariate and multivariate cases. In addition, a statistical procedure is developed for detecting step changes in the mean of the underlying process, given that Shewhart-type control charts are not very sensitive to smaller changes in the process mean.
The estimator is intended to be applied following a control chart signal to aid in diagnosing the root cause of the change. Results indicate that the methodologies proposed throughout this dissertation provide robust in-control average run length, better detection performance than the traditional Shewhart control chart and/or Hotelling's control chart, and meaningful change-point diagnostic statistics to aid in the search for the special cause.

#### Reduced bias prediction regions and estimators of the original response when using data transformations (University of Alabama Libraries, 2015)

Walker, Michael; Perry, Marcus B.; University of Alabama Tuscaloosa

Initially motivated by electron microscopy experiments, we develop an approximate prediction interval on the univariate response variable Y, where it is assumed that a normal-theory linear model is fit using a transformed version of Y, and the transformation type is contained in the Box-Cox family. Further motivated by A-10 single-engine climb experiments, we then develop an approximate prediction interval on the univariate response Y, in which a linear model is fit using a transformed version of Y contained in the Manly exponential family. For each case, we derive a closed-form approximation to the kth moment of the original response variable Y, which is then used to estimate the mean and variance of Y, given parameter estimates obtained from fitting the model in the transformed domain. Chebyshev's inequality is then used to construct a 100(1 − α)% prediction interval estimator for Y based on these mean and variance estimators. Extended data from the A-10 single-engine climb experiments motivate the development of prediction regions in the original domain of a q-variate response vector Y through multivariate extensions of both the Box-Cox power transformation and the Manly exponential transformation.
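The Chebyshev step above is short enough to show in full: since P(|Y − μ| ≥ kσ) ≤ 1/k², setting k = 1/√α gives a distribution-free 100(1 − α)% prediction interval from any mean and variance estimates. The moment estimates plugged in below are hypothetical placeholders for the values one would recover from the transformed-domain fit.

```python
import math

def chebyshev_pi(mean, var, alpha=0.05):
    """100(1 - alpha)% prediction interval via Chebyshev's inequality:
    P(|Y - mu| >= k*sigma) <= 1/k**2, so take k = 1/sqrt(alpha)."""
    k = 1.0 / math.sqrt(alpha)
    half = k * math.sqrt(var)
    return mean - half, mean + half

# Hypothetical moment estimates for the original-scale response Y.
lo, hi = chebyshev_pi(mean=12.0, var=4.0, alpha=0.05)
print(round(lo, 3), round(hi, 3))  # roughly (3.056, 20.944)
```

The price of making no distributional assumption is width: here k ≈ 4.47 standard deviations, versus about 1.96 under normality.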
For each transformation, we derive closed-form approximations to the kth moment of each original response Y_i, as well as a closed-form approximation to E(Y_i Y_i'), which are used to estimate the mean and variance of each Y_i and the covariance between them, given parameter estimates obtained from fitting the model in the transformed domain. Exploiting two multivariate analogs of Chebyshev's inequality, we construct an approximate 100(1 − α)% prediction sphere and ellipsoid for the original response vector Y.

#### Some Contributions to Tolerance Intervals and Statistical Process Control (University of Alabama Libraries, 2021)

Alqurashi, Mosab; Chakraborti, Subhabrata; University of Alabama Tuscaloosa

Tolerance intervals play an important role in statistical process control, along with control charts. When constructing a tolerance interval or a control chart for the mean of a quality characteristic, the normality assumption can be justifiable, at least in an approximate sense. However, in applications where individual observations are to be monitored or controlled, the normality assumption is not always satisfied, and for high-dimensional data normality is rarely, if ever, satisfied. The existing tolerance intervals for exponential random variables and for sample variances are constructed under the assumption of a known parameter, leading to unbalanced tolerance intervals. Moreover, the existing multivariate distribution-free control charts in the literature cannot identify the out-of-control variables directly from the chart signal, and the scale of the original variables is often lost. In this dissertation, new tolerance intervals for exponential random variables and for sample variances, and a multivariate distribution-free control chart, are developed. The dissertation consists of three chapters, summarized below.
In the first chapter, we introduce a tolerance interval for exponential random variables that gives the practitioner control over the ratio of the two tail probabilities without assuming that the parameter of the distribution, the mean, is known. The second chapter develops a tolerance interval and a guaranteed-performance control chart for sample variances without assuming that the population variance is known. The third chapter introduces a multivariate distribution-free control chart based on order statistics that can identify out-of-control variables and preserve the original scale.

#### Some contributions to univariate nonparametric tests and control charts (University of Alabama Libraries, 2017)

Zheng, Rong; Chakraborti, Subhabrata; University of Alabama Tuscaloosa

In general, statistical methods fall into two categories: parametric and nonparametric. Parametric analysis is based on information about the probability distribution of the random variable, whereas a nonparametric method, also referred to as a distribution-free procedure, does not require prior knowledge of that distribution. In reality, practitioners rarely have full knowledge of a random variable and can seldom state its probability distribution with certainty. Hence, there are two choices. One can still use parametric methods, justified by scientific evaluation or by a simplification of the situation, under an assumed parametric distribution; alternatively, one can apply nonparametric methods directly, without much knowledge of the distribution. The conclusions from parametric methods are valid as long as the assumptions are substantiated; such assumptions help in solving problems, but they are also risky, because a wrong assumption can be dangerous. Hence, nonparametric techniques can be a preferable alternative.
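As a concrete instance of the distribution-free procedures discussed above, the two-sided sign test for a hypothesized median assumes nothing about the parent distribution beyond continuity: under H0, the number of observations above the hypothesized median is Binomial(n, 1/2). This sketch is illustrative of the general idea, not a method from the dissertation, and the data are made up.

```python
import math

def sign_test(x, median0):
    """Two-sided sign test: under H0 the count of observations above
    median0 is Binomial(n, 1/2); ties with median0 are dropped."""
    signs = [v for v in x if v != median0]
    n = len(signs)
    s = sum(v > median0 for v in signs)
    # Two-sided p-value: double the smaller tail probability (capped at 1).
    tail = sum(math.comb(n, i) for i in range(0, min(s, n - s) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

x = [10.2, 9.8, 11.5, 12.1, 10.9, 11.8, 12.4, 10.7]
p = sign_test(x, median0=9.0)  # all eight observations exceed 9.0
print(p)
```

With all eight observations above 9.0, the two-sided p-value is 2 × (1/2)⁸ ≈ 0.0078, so the hypothesized median is rejected at the 5% level without any normality assumption.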
One chief advantage of nonparametric methods lies in their relaxation of assumptions about the shape of the distribution, namely, the distribution-free property. Hence, from a research point of view, new methodology built on nonparametric techniques, or further investigation of existing nonparametric techniques, can be interesting, informative, and valuable. All research in this dissertation contributes to univariate nonparametric tests and control charts.

#### Three inventory models for non-traditional supply chains (University of Alabama Libraries, 2011)

Neve, Benjamin V.; Schmidt, Charles P.; University of Alabama Tuscaloosa

This work considers three different non-traditional supply chain structures with similar demand and replenishment parameters and similar solution techniques. In the first article, we develop an inventory model that addresses inventory rationing based on customer priority. We use the framework of a multi-echelon inventory system to describe the physics of a critical-level policy. Extending previous research, we allow multiple demand classes while minimizing a cost objective. We assume a continuous-review, base stock replenishment policy and allow full backordering. Simulation is used to estimate total expected cost, applying variance reduction to reduce sampling error. First differences are estimated using a perturbation analysis unique to the inventory rationing literature, and heuristics are used to minimize costs. In the second article, we consider a stockless hospital supply chain with inaccurate inventory records. The model presented here is conditional on the level of accuracy in a particular hospital department, or point of use (POU). Similar to previous research on inventory inaccuracy, we consider both actual net inventory and recorded inventory in deriving the performance measures. The resultant model is a periodic-review, cost-minimization inventory model with full backordering that is centered at the POU.
Similar to the previous article, we assume a base stock ordering policy, but in addition to choosing the optimal order-up-to level, we seek the optimal frequency of inventory counts to reconcile inaccurate records. We present both a service-level model and a shortage-cost model under this framework. In the final article, we consider a hybrid hospital supply chain with both regular and emergency ordering when inventory records are inaccurate. The resultant model extends the previous article by allowing both regular and emergency replenishments. We seek an optimal solution to an approximate cost model and then compare the results to a simulation-optimization approach.
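The base stock (order-up-to) policy with full backordering that recurs in these models can be illustrated with a small simulation under deliberately simple assumptions: zero lead time, a single demand class, and a hypothetical discrete demand distribution; none of the numbers come from the dissertation.

```python
import random

def avg_cost(s, periods=20000, h=1.0, b=9.0, seed=7):
    """Average per-period holding + backorder cost for an order-up-to-s
    policy with full backordering and zero lead time: start each period
    at level s, observe demand d, pay h per unit left over and b per
    unit backordered, then reorder up to s."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(periods):
        d = rng.randint(0, 8)  # hypothetical demand, mean 4
        total += h * max(s - d, 0) + b * max(d - s, 0)
    return total / periods

# Grid search for the cost-minimizing base stock level; reusing the same
# seed across candidates applies common random numbers, a basic variance
# reduction device.
costs = {s: avg_cost(s) for s in range(0, 9)}
s_star = min(costs, key=costs.get)
print(s_star, round(costs[s_star], 2))
```

With holding cost 1 and backorder cost 9, the critical ratio b/(b + h) = 0.9 pushes the optimal level to the upper end of the demand range, which the grid search recovers.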