On the use of transformations for modeling multidimensional heterogeneous data

Date

2019

Publisher

University of Alabama Libraries

Abstract

The objective of cluster analysis is to find distinct groups of similar observations. Many algorithms in the literature perform this task, and model-based clustering is among the most flexible of them. Mixture components are commonly assumed to follow a Gaussian density because of its convenient form, but this assumption is not always valid. This thesis explores the use of various transformations for finding clusters in heterogeneous data and, in the process, addresses several data structures: vector-, matrix-, tensor-, and network-valued data. In the first chapter, linear and non-linear transformations are used to model heterogeneous vector-valued observations when the data suffer from measurement inconsistency. The second chapter discusses an extensive set of parsimonious models for matrix-valued data. In the third chapter, a methodology for clustering skewed tensor-valued data is developed and applied to the analysis of the remuneration of professors in American universities. The fourth chapter focuses on network-valued data and proposes a novel finite mixture model that addresses the dependence structure of network data. Finally, the fifth chapter describes the functionality of an R package, “netClust”, developed by the author for clustering unilayer and multilayer networks following the methodology proposed in the fourth chapter.

Description

Electronic Thesis or Dissertation

Keywords

Statistics, Computer science, Mathematics
