The statistical detection of clusters in networks

Date

2018

Publisher

University of Alabama Libraries

Abstract

A network consists of vertices and the edges that connect them. A network is clustered by assigning each of its N vertices to one of k groups, usually so as to optimize a given objective function. This dissertation proposes statistical likelihood as an objective function for clustering both undirected networks, in which edges have no direction, and directed networks, in which they do. Clustering by optimizing an objective function is computationally expensive and quickly becomes prohibitive as the number of vertices grows. To address this, theorems are developed that make likelihood parameter estimation more efficient during the optimization, and a significant decrease in time-to-solution is demonstrated. When the clustering performance of likelihood is rigorously compared with that of a competing objective function, modularity, using Monte Carlo simulation, likelihood is frequently found to be superior. A novel statistical significance test is also derived for clusters identified with the likelihood objective function. Both the clustering and the subsequent significance testing are demonstrated on real-world undirected and directed networks.
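
To illustrate what it means to score a clustering with an objective function, the sketch below evaluates two candidate partitions of a small undirected network under both modularity and a likelihood. The modularity function follows the standard Newman-Girvan definition. The likelihood is an assumption made for this example only: a simple Bernoulli model with one edge probability for within-group vertex pairs and another for between-group pairs, each set to its maximum-likelihood estimate. The abstract does not state the form of the dissertation's likelihood, so this should not be read as the author's method, and the function names and toy network are hypothetical.

```python
import math
from itertools import combinations


def modularity(edges, labels):
    """Newman-Girvan modularity of a partition of an undirected simple graph."""
    m = len(edges)
    degree = {v: 0 for v in labels}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    deg_sum = {}   # total degree of each group
    inside = {}    # number of edges with both ends in the group
    for v, g in labels.items():
        deg_sum[g] = deg_sum.get(g, 0) + degree[v]
    for u, v in edges:
        if labels[u] == labels[v]:
            inside[labels[u]] = inside.get(labels[u], 0) + 1
    # Observed within-group edge fraction minus its expectation, summed over groups.
    return sum(inside.get(g, 0) / m - (deg_sum[g] / (2 * m)) ** 2 for g in deg_sum)


def _bernoulli_loglik(successes, trials):
    """Log-likelihood of Bernoulli trials evaluated at the MLE p = successes/trials."""
    if trials == 0 or successes == 0 or successes == trials:
        return 0.0  # the MLE is 0 or 1, so the likelihood equals 1
    p = successes / trials
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


def log_likelihood(edges, labels):
    """Maximized log-likelihood under a two-rate Bernoulli model.

    ASSUMPTION: this model is illustrative only, not the dissertation's likelihood.
    """
    edge_set = {frozenset(e) for e in edges}
    in_pairs = in_edges = out_pairs = out_edges = 0
    for u, v in combinations(labels, 2):
        has_edge = frozenset((u, v)) in edge_set
        if labels[u] == labels[v]:
            in_pairs += 1
            in_edges += has_edge
        else:
            out_pairs += 1
            out_edges += has_edge
    return _bernoulli_loglik(in_edges, in_pairs) + _bernoulli_loglik(out_edges, out_pairs)


# Hypothetical toy network: two triangles joined by a single bridging edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
good = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}  # one group per triangle
bad = {0: "a", 1: "a", 3: "a", 2: "b", 4: "b", 5: "b"}   # triangles split up

for name, part in (("good", good), ("bad", bad)):
    print(name, modularity(edges, part), log_likelihood(edges, part))
```

On this toy network both objectives score the partition that matches the two triangles higher than the one that splits them. An optimizer would search over many such assignments, which is why the cost of evaluating the objective matters as the number of vertices grows.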

Description

Electronic Thesis or Dissertation

Keywords

Statistics, Operations research, Applied mathematics