On the identification of statistically significant network topology

dc.contributor Visscher, Pieter B.
dc.contributor Gray, J. Brian
dc.contributor McManus, Denise J.
dc.contributor Conerly, Michael D.
dc.contributor.advisor Perry, Marcus B.
dc.contributor.author Michaelson, Gregory Vincent
dc.date.accessioned 2017-02-28T22:25:34Z
dc.date.available 2017-02-28T22:25:34Z
dc.date.issued 2010
dc.identifier.other u0015_0000001_0000254
dc.identifier.other Michaelson_alatus_0004D_10281
dc.identifier.uri https://ir.ua.edu/handle/123456789/760
dc.description Electronic Thesis or Dissertation
dc.description.abstract Determining the structure of large and complex networks is a problem that has stirred great interest in many fields, including mathematics, computer science, sociology, biomedical research, and epidemiology. Despite this interest, there is still no procedure for formal hypothesis testing to measure the significance of community structure detected in an observed network. First, this work proposes three more general alternatives to modularity, the most common measure of community structure, which allow for the detection of more general structure in networks. An approach based upon the likelihood ratio test is shown not only to be as effective as modularity in detecting modular structure but also to be able to detect a wide variety of other network topologies. Second, this work proposes a general and novel test, the Likelihood Ratio Cluster (LRC) test, for assessing the statistical significance of the output of clustering algorithms. This technique is demonstrated by applying it to the sample partitions generated by both network and conventional clustering algorithms. Finally, a method is developed for evaluating the capability of heuristic clustering techniques to detect the optimal sample partition. This method is used to evaluate several common community detection algorithms. Surprisingly, the most popular community detection algorithm is found to be largely ineffective at detecting the optimal partition of a random network. Equally surprising, Clauset's fast algorithm (Clauset et al. 2004), which is commonly thought to be fast but inaccurate, is found to be the most effective of the algorithms examined at detecting the optimal partition in random networks.
dc.format.extent 130 p.
dc.format.medium electronic
dc.format.mimetype application/pdf
dc.language English
dc.language.iso en_US
dc.publisher University of Alabama Libraries
dc.relation.ispartof The University of Alabama Electronic Theses and Dissertations
dc.relation.ispartof The University of Alabama Libraries Digital Collections
dc.relation.hasversion born digital
dc.rights All rights reserved by the author unless otherwise indicated.
dc.subject.other Statistics
dc.title On the identification of statistically significant network topology
dc.type thesis
dc.type text
etdms.degree.department University of Alabama. Dept. of Information Systems, Statistics, and Management Science
etdms.degree.discipline Applied Statistics
etdms.degree.grantor The University of Alabama
etdms.degree.level doctoral
etdms.degree.name Ph.D.

