Some Contributions to Modern Mixture Modeling and Model-Based Clustering

dc.contributor: Melnykov, Volodymyr
dc.contributor: Zhu, Xuwen
dc.contributor: Melnykov, Yana
dc.contributor: Wang, Qin
dc.contributor: George, Mugoya
dc.contributor: Lee, Danhyang
dc.contributor.advisor: Melnykov, Volodymyr
dc.contributor.author: Wang, Yang
dc.contributor.other: University of Alabama Tuscaloosa
dc.date.accessioned: 2021-11-23T14:33:59Z
dc.date.available: 2021-11-23T14:33:59Z
dc.date.issued: 2021
dc.description: Electronic Thesis or Dissertation [en_US]
dc.description.abstract: Cluster analysis is a technique for recognizing groups of similar objects. Model-based clustering, which is based on finite mixture models, is one of the most popular methods due to its flexibility and interpretability in modeling heterogeneous data. In this setting, a one-to-one correspondence between mixture components and groups is assumed, and the clustering process can be viewed as model estimation by means of an optimization algorithm. The age of big data poses new challenges. Due to a potentially high number of parameters, finite mixture models are often at risk of being overparameterized. Overparameterization in model-based clustering often results in underestimation of the mixture order. Because the field is growing fast, developing simulation studies to validate mixture models has become another crucial topic. This thesis contributes to modern mixture modeling and model-based clustering, focusing mainly on approaches for solving overparameterization issues in this context. In addition, algorithms for simulating various types of clusters are created, which can be used to evaluate and improve clustering techniques. In each chapter, the expectation-maximization (EM) algorithm for the proposed mixture is developed, expressions for the model parameter estimates are provided, and corresponding parsimonious procedures are proposed. The utility of the methodologies is tested on both synthetic and well-known classification datasets. The thesis is organized as follows. In the first chapter, a variable selection procedure is developed and applied in matrix mixture modeling. The second chapter develops a novel mixture modeling approach called conditional mixture modeling and its corresponding parsimonious procedure. The third chapter provides an extension for simulating heterogeneous data for studying the systematic performance of clustering algorithms. Finally, the fourth chapter describes the functionality of an R package, cmbClust, developed for clustering multivariate data using the methodology proposed in chapter two. [en_US]
dc.format.medium: electronic
dc.format.mimetype: application/pdf
dc.identifier.other: http://purl.lib.ua.edu/181474
dc.identifier.other: u0015_0000001_0003913
dc.identifier.other: Wang_alatus_0004D_14540
dc.identifier.uri: http://ir.ua.edu/handle/123456789/8145
dc.language: English
dc.language.iso: en_US
dc.publisher: University of Alabama Libraries
dc.relation.hasversion: born digital
dc.relation.ispartof: The University of Alabama Electronic Theses and Dissertations
dc.relation.ispartof: The University of Alabama Libraries Digital Collections
dc.rights: All rights reserved by the author unless otherwise indicated. [en_US]
dc.title: Some Contributions to Modern Mixture Modeling and Model-Based Clustering [en_US]
dc.type: thesis
dc.type: text
etdms.degree.department: University of Alabama. Department of Information Systems, Statistics, and Management Science
etdms.degree.discipline: Statistics
etdms.degree.grantor: The University of Alabama
etdms.degree.level: doctoral
etdms.degree.name: Ph.D.

Files

Original bundle
Name: u0015_0000001_0003913.pdf
Size: 13.65 MB
Format: Adobe Portable Document Format