Towards Multilingual Evaluations of Knowledge for Large Language Models / Bryan Li

Dissertations & Theses @ University of Pennsylvania Available online

Format:
Book
Thesis/Dissertation
Author/Creator:
Li, Bryan, author.
Contributor:
University of Pennsylvania. Computer and Information Science., degree granting institution.
Language:
English
Subjects (All):
0464.
0723.
0800.
0984.
Local Subjects:
0464.
0723.
0800.
0984.
Physical Description:
1 electronic resource (189 pages)
Contained In:
Dissertations Abstracts International 87-07A
Place of Publication:
Ann Arbor : ProQuest Dissertations and Theses, 2025
Language Note:
English
Summary:
Contemporary language models (LMs) understand and support interaction in dozens of languages, and have significant potential to empower equitable information access for users across the world. Existing multilingual evaluations have focused on tasks which challenge whether facts learned in one language can be reproduced in the same language. This focus has narrowed the view of multilingual knowledge to simpler tasks which can be translated across languages. In contrast, the real-world distribution of knowledge often involves tasks where topic coverage differs and cultural perspectives conflict across languages. In this thesis, I study several multilingual knowledge-intensive tasks, and how LMs interact with the parametric knowledge encoded internally in their parameters, and with the contextual knowledge presented externally during interaction. The first part of this thesis investigates parametric knowledge, introducing two benchmarks which challenge whether LMs can generalize their knowledge across languages. First, I introduce a multi-task, multilingual complex reasoning benchmark, and find that LMs which reason well in English struggle far more in other languages. I address this with methods leveraging program code to bridge LMs' competencies in multilinguality and in reasoning. Second, I introduce a benchmark of territorial disputes. By posing these queries in different languages (e.g., asking about Crimea in Russian or Ukrainian), I reveal the issue of cross-lingual robustness, where an LM's response to a query is inconsistent depending on the language of interaction. I then show that lightweight methods of leveraging program code and persona-based prompting can mitigate these issues. The second part of this thesis investigates contextual knowledge, focusing on the retrieval-augmented generation (RAG) paradigm, which provides an LM with contextual knowledge retrieved from external knowledge bases (KBs).
I begin with a study on the machine translation task, which finds that contextual knowledge is most effective when presented as stylistically similar demonstrations. I then return to the territorial disputes task. While RAG provides relevant information, it also invites an array of culturally and linguistically informed perspectives. Thus, the composition of KBs determines what information can be retrieved. Experiments find that multilingual retrieval improves robustness over retrieval only in the query languages. I further expand to the multi-source setting, collecting a large-scale dataset of articles from state media outlets. I similarly find that multi-source retrieval improves robustness over single-source retrieval, and furthermore, that foreign-targeted state media leads to far less robust responses than domestic-targeted state media. In summary, this thesis contributes multilingual evaluations of LMs on knowledge-intensive tasks, and explores contextually informed methods to improve them. I ultimately highlight the need for multilingual LMs that can navigate, and assist users in navigating, the real-world distribution of knowledge across languages and sources.
Notes:
Advisors: Callison-Burch, Chris
Committee members: Ungar, Lyle; Watts, Duncan; Roth, Dan; Cherry, Colin; Rasooli, Mohammad Sadegh
Source: Dissertations Abstracts International, Volume: 87-07, Section: A.
Ph.D. University of Pennsylvania 2025
Vendor supplied data
Local Notes:
School code: 0175
ISBN:
9798276005294
Access Restriction:
Restricted for use by site license
