
Safeguarding AI Systems Against Unexpected Inputs / Yahan Yang

Dissertations & Theses @ University of Pennsylvania Available online

Format:: Book; Thesis/Dissertation
Author/Creator:: Yang, Yahan, author.
Contributor:: University of Pennsylvania. Electrical and Systems Engineering., degree granting institution.
Language:: English
Subjects (All):: 0464.; 0723.; 0800.; 0984.
Local Subjects:: 0464.; 0723.; 0800.; 0984.
Physical Description:: 1 electronic resource (155 pages)
Contained In:: Dissertations Abstracts International 87-07A
Place of Publication:: Ann Arbor : ProQuest Dissertations and Theses, 2025
Language Note:: English
Summary:: Artificial intelligence systems powered by deep neural networks have achieved remarkable success across a broad range of applications. However, perturbations such as natural image corruptions or crafted malicious queries can cause significant performance degradation. This poses severe risks in safety-critical applications such as autonomous driving and clinical decision-making. A key vulnerability of machine learning models is their inability to handle data outside their training distribution or knowledge. When facing unseen or otherwise challenging inputs, models often make incorrect decisions without warning users. This thesis improves the safety of machine learning systems by building three stages for handling challenging or unexpected inputs: (1) rejecting unexpected inputs with an explanation, (2) providing statistical guarantees on rejection, and (3) enabling models to adapt to challenging inputs. We consider two distinct scenarios: models with known training distributions (e.g., in cyber-physical systems), where the challenges are out-of-distribution data, and models with unknown training distributions (e.g., large language models in a multilingual context), where the challenges are defined by standards such as harmful content across languages. For cyber-physical systems, we develop memory-based prototypes that characterize the training distribution for out-of-distribution detection and provide statistical guarantees for a window-based detector. We then leverage these prototypes to adapt the model to inputs from new distributions. For multilingual large language models, we design a reasoning-enabled guardrail to shield against unsafe multilingual prompts. Finally, we study challenging inputs arising from natural distribution shift in a clinical application: acne lesion classification.
Notes:: Advisors: Lee, Insup; Committee members: Roth, Dan; Mangharam, Rahul; Dutta, Souradeep; Source: Dissertations Abstracts International, Volume: 87-07, Section: A.; Ph.D. University of Pennsylvania 2025; Vendor supplied data
Local Notes:: School code: 0175
ISBN:: 9798276005102
Access Restriction:: Restricted for use by site license
