Privacy-preserving distributed regression algorithms for analysis of multi-site real-world data / Mackenzie John Edmondson.

Dissertations & Theses @ University of Pennsylvania Available online

Format:
Book
Thesis/Dissertation
Author/Creator:
Edmondson, Mackenzie John, author.
Contributor:
Chen, Yong, degree supervisor.
University of Pennsylvania. Department of Epidemiology and Biostatistics, degree granting institution.
Language:
English
Subjects (All):
Biostatistics.
Statistics.
Datasets.
Collaboration.
Regression analysis.
Estimates.
Mortality.
Data analysis.
Privacy.
Hospitalization.
Patients.
Simulation.
Medical research.
Databases.
Primary care.
Methods.
Algorithms.
Meta-analysis.
Epidemiology and Biostatistics--Penn dissertations.
Penn dissertations--Epidemiology and Biostatistics.
Genre:
Academic theses.
Physical Description:
1 online resource (97 pages)
Contained In:
Dissertations Abstracts International 83-03B.
Place of Publication:
[Philadelphia, Pennsylvania] : University of Pennsylvania ; Ann Arbor : ProQuest Dissertations & Theses, 2021.
Language Note:
English
System Details:
Mode of access: World Wide Web.
text file
Summary:
Real-world data, including electronic health records and administrative claims data, are widely used in modern healthcare research to generate real-world evidence for improving patient care. The widespread availability of observational data from a variety of institutions has prompted many large-scale, multi-site studies in recent years. Studies incorporating data from multiple institutions often attain results more generalizable than those from single-site studies and offer improved power for studying rare outcomes or exposures. Various challenges concerning patient-level data sharing, primarily those related to data privacy, have made distributed data analysis a practical alternative to analyzing centralized data in multi-site studies. Under a distributed data analysis framework, patient-level data are not shared across institutions. Instead, aggregated data are shared and communicated to a coordinating site to obtain analysis results. While methods for performing distributed analyses are increasingly available, analytical methods for analyzing binary and count outcomes are limited. In this work, we propose two distributed regression algorithms for modeling count outcomes in multi-site studies. The first algorithm uses distributed quasi-Poisson regression to model counts while accounting for institution-specific heterogeneity in the outcome. The second uses distributed hurdle regression to model counts subject to zero-inflation. Both algorithms are communication-efficient and highly accurate, requiring at most two or three rounds of communication among participating institutions and achieving results close to those obtained using pooled regression of all patient-level data, a method usable only if data are centralized. We evaluate the performance of each method through simulations and applications to real-world clinical research networks. Finally, we illustrate a novel application of a distributed generalized linear mixed modeling algorithm with binary outcomes to study the effect of admitting hospital on racial disparities in mortality for patients hospitalized with COVID-19 via counterfactual modeling.
Notes:
Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Advisors: Chen, Yong; Committee members: Shults, Justine; Boland, Mary; Forrest, Christopher; Ryan, Patrick.
Department: Epidemiology and Biostatistics.
Ph.D. University of Pennsylvania 2021.
Local Notes:
School code: 0175
ISBN:
9798535570310
Access Restriction:
Restricted for use by site license.
This item is not available from ProQuest Dissertations & Theses.
This item must not be sold to any third party vendors.
