Estimation and Prediction Problems in Missing Data / Yachong Yang.

Dissertations & Theses @ University of Pennsylvania Available online

Format:
Book
Thesis/Dissertation
Author/Creator:
Yang, Yachong, author.
Contributor:
University of Pennsylvania. Statistics and Data Science, degree granting institution.
Language:
English
Subjects (All):
Statistics.
Statistics and Data Science--Penn dissertations.
Penn dissertations--Statistics and Data Science.
Local Subjects:
Statistics.
Statistics and Data Science--Penn dissertations.
Penn dissertations--Statistics and Data Science.
Physical Description:
1 online resource (257 pages)
Contained In:
Dissertations Abstracts International 85-12B.
Place of Publication:
[Philadelphia, Pennsylvania] : University of Pennsylvania, 2022.
Ann Arbor : ProQuest Dissertations & Theses, 2024
Language Note:
English
Summary:
In recent years, conformal prediction has emerged as a robust methodology for making finite-sample valid, distribution-free predictions, attracting significant attention across various statistical and machine learning domains. This technique is particularly valued for its versatility: it can be applied alongside any machine learning algorithm to produce valid prediction regions, with the efficiency of these regions closely tied to the underlying algorithm's performance. Despite the wealth of research on optimizing point predictions through methods such as cross-validation, the literature offers little discussion of how to select the most efficient machine learning algorithm for producing conformal prediction regions. We aim to address this gap by introducing selection algorithms designed to minimize the conformal prediction region's width while considering both coverage and efficiency. While classic conformal prediction requires the underlying data, comprising both the training data and the points at which predictions are made (hereafter, test data), to be exchangeable (as when all points are drawn independently from the same distribution), we seek to relax this condition by allowing for some covariate shift between the training and test data. This relaxation can be used to address challenges in many areas, including missing data and causal inference. We reveal and leverage deep connections between modern semiparametric efficiency theory, missing data and causal inference, and emerging methods in conformal prediction for well-calibrated predictive inference. We propose a novel framework that leverages efficient influence functions, allowing for the adaptive calibration of prediction regions under covariate shift, akin to the missing at random assumption. This advancement unlocks the potential for more effective prediction intervals without sacrificing coverage accuracy. Indeed, we show that our framework attains large-sample efficiency and validity for any collection of machine learning techniques and their respective tuning parameters, and that it is doubly robust in the sense that it requires only that at least one of two estimated nuisance functions be consistent, without necessarily requiring fast convergence rates for the latter.
Complementing these developments, we also delve into series regression, a cornerstone of nonparametric regression techniques, by introducing a new estimator inspired by the Forster-Warmuth (FW) learner. This estimator not only relaxes the stringent conditions required by traditional series estimators but also extends the FW learner's utility to a broader array of counterfactual nonparametric regression problems, in which the response variable of interest may not be directly observed on all sampled units. By focusing on a unified pseudo-outcome approach, we offer a comprehensive solution to counterfactual regression, achieving minimax rate optimality under less restrictive conditions and demonstrating its application in missing data and causal inference scenarios. Through these innovations, we aim to bridge gaps in the current literature and introduce tools that promise greater precision and adaptability in statistical prediction and inference.
Notes:
Source: Dissertations Abstracts International, Volume: 85-12, Section: B.
Advisors: Tchetgen Tchetgen, Eric Joel; Committee members: Kuchibhotla, Arun Kumar; Su, Weijie; Small, Dylan.
Department: Statistics and Data Science.
Ph.D. University of Pennsylvania 2024.
Local Notes:
School code: 0175
ISBN:
9798382830001
Access Restriction:
Restricted for use by site license.
