Statistical Methods for Modeling Complex Dependency Structures in Zero-Inflated Metagenomic Sequencing Data / Rebecca Ann Deek.

Dissertations & Theses @ University of Pennsylvania Available online

Format:
Book
Thesis/Dissertation
Author/Creator:
Deek, Rebecca Ann, author.
Contributor:
University of Pennsylvania. Epidemiology and Biostatistics, degree granting institution.
Language:
English
Subjects (All):
Biostatistics.
Statistics.
Epidemiology.
Epidemiology and Biostatistics--Penn dissertations.
Penn dissertations--Epidemiology and Biostatistics.
Local Subjects:
Biostatistics.
Statistics.
Epidemiology.
Epidemiology and Biostatistics--Penn dissertations.
Penn dissertations--Epidemiology and Biostatistics.
Physical Description:
1 online resource (115 pages)
Distribution:
Ann Arbor : ProQuest Dissertations & Theses, 2023
Contained In:
Dissertation Abstracts International 84-12B.
Place of Publication:
[Philadelphia, Pennsylvania] : University of Pennsylvania, 2022.
Language Note:
English
Summary:
Advances in high-throughput sequencing technologies have enabled large-scale metagenomic sequencing studies of microbial compositions. As a result, there is growing scientific interest in understanding the human microbiome, defined as all the microorganisms, and their genes, in or on the body. Of particular interest is its functional role in human-host health. Nevertheless, there remains a statistical and computational bottleneck in effectively analyzing data from 16S rRNA and metagenomic sequencing studies, owing to the excessive zeros, sequencing depth constraints, and high dimensionality characteristic of such data. Motivated by numerous microbiome studies, this dissertation aims to narrow this gap by developing novel statistical methods specifically designed to capture the excessive zeros in the data. The specific aims are to develop statistical models, inference procedures, and computationally fast algorithms to (1) identify distinct microbial communities in a given data set, as well as each community's important bacterial taxa, and (2) build microbial covariation networks based upon the estimated covariation between a pair of zero-inflated variables. To this end, three methodological advances are proposed. The first is a generative latent mixture model of microbial counts that distinguishes between structural and sampling zeros. The second is a mixture margin copula model and two-stage inference procedure for microbial covariation networks in cross-sectional studies. The third is an extension to random-effects mixture margin copula models, together with a corresponding Monte Carlo EM algorithm and likelihood ratio test, to build temporally conserved covariation networks from longitudinal data. The performance and utility of these methods are demonstrated using simulations and several publicly available microbiome data sets.
Notes:
Source: Dissertation Abstracts International, Volume: 84-12, Section: B.
Advisors: Li, Hongzhe; Committee members: Li, Mingyao; Huang, Jing; Collman, Ronald G.
Department: Epidemiology and Biostatistics.
Ph.D. University of Pennsylvania 2023.
Local Notes:
School code: 0175
ISBN:
9798379758516
Access Restriction:
Restricted for use by site license.
