Scalable machine learning methods for the analysis of single-cell transcriptomics and multiomics data / Justin Lakkis.

Dissertations & Theses @ University of Pennsylvania Available online

Format:
Book
Thesis/Dissertation
Author/Creator:
Lakkis, Justin, author.
Contributor:
Li, Mingyao, degree supervisor.
University of Pennsylvania. Department of Epidemiology and Biostatistics, degree granting institution.
Language:
English
Subjects (All):
Biostatistics.
Epidemiology.
Artificial intelligence.
Epidemiology and Biostatistics--Penn dissertations.
Penn dissertations--Epidemiology and Biostatistics.
Local Subjects:
Biostatistics.
Epidemiology.
Artificial intelligence.
Epidemiology and Biostatistics--Penn dissertations.
Penn dissertations--Epidemiology and Biostatistics.
Physical Description:
1 online resource (192 pages)
Contained In:
Dissertations Abstracts International 83-08B.
Place of Publication:
[Philadelphia, Pennsylvania] : University of Pennsylvania ; Ann Arbor : ProQuest Dissertations & Theses, 2021.
Language Note:
English
System Details:
Mode of access: World Wide Web.
Summary:
Transcriptomics and proteomics-based expression profiling technologies have become increasingly popular, more affordable, and more accurate in recent years. Expression profiling at single-cell resolution allows investigators to identify rare cell subtypes in human tissue that would otherwise be confounded in lower-resolution, bulk sequencing technologies. Previously, investigators studied human cell populations by profiling RNA expression in single cells using single-cell RNA sequencing (scRNA-seq) technologies. More recently, multi-modality sequencing technologies such as Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq) have emerged, which allow investigators to profile multiple forms of biological expression (in this case RNA and protein expression) simultaneously in the same cells. Investigators can now study human biology in greater detail than ever before, but challenges remain. (1) Cell subpopulations are not always neatly separated from one another, which makes cell type classification difficult. (2) Technical batch effects often plague scRNA-seq studies and confound real biological signals. (3) Multi-modality technologies are excellent but remain expensive to use at scale. In this work, we seek to address these challenges associated with scRNA-seq and CITE-seq analyses.
To address challenge (1), we propose a smooth pseudotemporal modeling approach which characterizes a cell's identity as a mixture of two discrete identities, allowing for a continuous sliding-scale cell type rather than requiring cells to separate into discrete types. To address challenge (2), we propose an augmented autoencoder which uses a self-supervised Kullback-Leibler divergence, along with a specialized branching architecture, to correct for batch effects in the full gene expression feature space. Lastly, to address challenge (3), we develop a hybrid feedforward-recurrent neural network approach which supports protein prediction, imputation, embedding, uncertainty quantification, and cell type label transfer, allowing the user to use reference CITE-seq datasets to predict and study protein expression in larger single-modality, RNA-only data. We validate the utility of each of our approaches using real datasets with gold standard true expression and experimentally validated cell type labels. We also demonstrate real use cases for our methods, such as improving downstream pseudotime analyses using batch correction and identifying immune response biomarkers to an H1N1 vaccine.
Notes:
Source: Dissertations Abstracts International, Volume: 83-08, Section: B.
Advisors: Li, Mingyao; Committee members: Xiao, Rui; Barnett, Ian; Lee, Edward; Morris, Jeffrey; Ungar, Lyle.
Department: Epidemiology and Biostatistics.
Ph.D. University of Pennsylvania 2021.
Local Notes:
School code: 0175
ISBN:
9798780646310
Access Restriction:
Restricted for use by site license.
This item must not be sold to any third party vendors.
