Application of Hidden Markov Model in Finite Mixture Modeling of High-Dimensional Data
Abstract
Finite mixture models (FMMs) are widely used in practice and are well known for modeling heterogeneous data in a convenient and effective way. Owing to their flexibility, finite mixtures have been applied to a wide range of problems in diverse fields, including image analysis, medicine, agriculture, and many others. FMMs are particularly useful for model-based clustering, where each group is assumed to be represented by one of the mixture components. Although finite mixtures can adopt various functional forms, Gaussian densities are among the most widely used representations. Specifically, multivariate Gaussian densities based on vector-valued data have received considerable research attention owing to their wide range of applications. However, as datasets with higher dimensions become increasingly prevalent due to rapid improvements in computational power and data storage capabilities, well-known issues such as overparameterization may emerge in the FMM framework, leading to underestimation of the correct mixture order. This issue motivates the current dissertation, which proposes an FMM framework that can be applied to data with higher dimensions. The organization of this dissertation is as follows. In the first chapter, a hidden Markov model (HMM) for matrix-variate time series data is developed, followed by a finite mixture of HMMs aimed at tensor-variate time series data in the second chapter. Finally, in the third chapter, we develop an extension of matrix-variate time series modeling with HMMs observed over multiple time points; this extension will be completed later and applied to the US institutions enrollment data in the context of a hypothesis testing problem.