On the identification of statistically significant network topology

dc.contributor: Visscher, Pieter B.
dc.contributor: Gray, J. Brian
dc.contributor: McManus, Denise J.
dc.contributor: Conerly, Michael D.
dc.contributor.advisor: Perry, Marcus B.
dc.contributor.author: Michaelson, Gregory Vincent
dc.contributor.other: University of Alabama Tuscaloosa
dc.date.accessioned: 2017-02-28T22:25:34Z
dc.date.available: 2017-02-28T22:25:34Z
dc.date.issued: 2010
dc.description: Electronic Thesis or Dissertation [en_US]
dc.description.abstract: Determining the structure of large and complex networks is a problem that has stirred great interest in many fields, including mathematics, computer science, sociology, biomedical research, and epidemiology. Despite this high level of interest, there still exists no procedure for formal hypothesis testing to measure the significance of detected community structure in an observed network. First, this work proposes three more general alternatives to modularity, the most common measure of community structure, which allow for the detection of more general structure in networks. An approach based upon the likelihood ratio test is shown not only to be as effective as modularity in detecting modular structure but also to be able to detect a wide variety of other network topologies. Second, this work proposes a general and novel test, the Likelihood Ratio Cluster (LRC) test, for assessing the statistical significance of the output of clustering algorithms. This technique is demonstrated by applying it to the sample partitions generated by both network and conventional clustering algorithms. Finally, a method for evaluating the capability of heuristic clustering techniques to detect the optimal sample partition is developed. This technique is used to evaluate several common community detection algorithms. Surprisingly, the most popular community detection algorithm is found to be largely ineffective at detecting the optimal partition of a random network. Also surprisingly, Clauset's fast algorithm (Clauset et al. 2004), which is commonly thought to be fast but inaccurate, is found to be the most effective of the algorithms examined at detecting the optimal partition in random networks. [en_US]
dc.format.extent: 130 p.
dc.format.medium: electronic
dc.format.mimetype: application/pdf
dc.identifier.other: u0015_0000001_0000254
dc.identifier.other: Michaelson_alatus_0004D_10281
dc.identifier.uri: https://ir.ua.edu/handle/123456789/760
dc.language: English
dc.language.iso: en_US
dc.publisher: University of Alabama Libraries
dc.relation.hasversion: born digital
dc.relation.ispartof: The University of Alabama Electronic Theses and Dissertations
dc.relation.ispartof: The University of Alabama Libraries Digital Collections
dc.rights: All rights reserved by the author unless otherwise indicated. [en_US]
dc.subject: Statistics
dc.title: On the identification of statistically significant network topology [en_US]
dc.type: thesis
dc.type: text
etdms.degree.department: University of Alabama. Department of Information Systems, Statistics, and Management Science
etdms.degree.discipline: Applied Statistics
etdms.degree.grantor: The University of Alabama
etdms.degree.level: doctoral
etdms.degree.name: Ph.D.

Files

Original bundle
Name: file_1.pdf
Size: 7.16 MB
Format: Adobe Portable Document Format