Paraphrase-based models of lexical semantics / Anne O'Donnell Cocos.
LIBRA QA003 2019 .C6477
- Format:
- Book
- Manuscript
- Thesis/Dissertation
- Author/Creator:
- Cocos, Anne O'Donnell, author.
- Language:
- English
- Subjects (All):
- Penn dissertations--Computer and information science.
- Computer and information science--Penn dissertations.
- Physical Description:
- xxi, 197 leaves : illustrations (some color) ; 29 cm
- Production:
- [Philadelphia, Pennsylvania] : University of Pennsylvania, 2019.
- Summary:
- Models of lexical semantics are a key component of natural language understanding. The bulk of work in this area has focused on learning the meanings of words and phrases and their inter-relationships from signals present in large monolingual corpora--including the distributional properties of words and phrases, and the lexico-syntactic patterns within which they appear. Each of these signals, while useful, has drawbacks related to challenges in modeling polysemy or limited coverage. The goal of this thesis is to examine bilingually-induced paraphrases as a different and complementary source of information for building computational models of semantics.
- First, focusing on the two tasks of discriminating word sense and predicting scalar adjective intensity, we build models that rely on paraphrases as a source of signal. In each case, the performance of the paraphrase-based models is compared to that of models incorporating more traditional feature types, such as monolingual distributional similarity and lexico-syntactic patterns. We find that combining these traditional signals with paraphrase-based features leads to the highest-performing models overall, indicating that the different types of information are complementary. Next, we shift focus to the use of paraphrases to model the fine-grained meanings of a word. This idea is leveraged to automatically generate a large resource of meaning-specific word instances called Paraphrase-Sense-Tagged Sentences (PSTS). Distributional models for sense embedding, word sense induction, and contextual hypernym prediction are successfully trained using PSTS as a sense-tagged corpus. In this way we reaffirm the notion that signals from paraphrases and monolingual distributional properties can be combined to construct robust models of lexical semantics.
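The abstract's central claim, that monolingual distributional similarity and paraphrase-based signals are complementary and can be combined, can be illustrated with a minimal sketch. This is not code from the thesis: the toy vectors, the paraphrase sets, the Jaccard-overlap feature, and the linear interpolation weight `alpha` are all illustrative assumptions, not the thesis's actual models or the PSTS resource.

```python
# Hypothetical sketch: combining a monolingual distributional signal
# (cosine similarity of word vectors) with a paraphrase-based signal
# (Jaccard overlap of paraphrase sets). All data below is toy data.
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def paraphrase_overlap(p1, p2):
    """Jaccard overlap between two words' paraphrase sets."""
    return len(p1 & p2) / len(p1 | p2) if p1 | p2 else 0.0

# Toy distributional vectors and (invented) paraphrase sets.
vectors = {
    "bug": [0.9, 0.1, 0.3],
    "insect": [0.8, 0.2, 0.4],
    "glitch": [0.1, 0.9, 0.2],
}
paraphrases = {
    "bug": {"insect", "beetle", "glitch", "flaw"},
    "insect": {"bug", "beetle"},
    "glitch": {"bug", "flaw", "error"},
}

def combined_similarity(w1, w2, alpha=0.5):
    """Linearly interpolate the two complementary signals."""
    return (alpha * cosine(vectors[w1], vectors[w2])
            + (1 - alpha) * paraphrase_overlap(paraphrases[w1], paraphrases[w2]))

print(round(combined_similarity("bug", "insect"), 3))
print(round(combined_similarity("bug", "glitch"), 3))
```

Under these toy inputs, the combined score separates the two readings of "bug" more than either signal alone would, which is the general shape of the complementarity argument in the abstract.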
- Notes:
- Ph.D., University of Pennsylvania, 2019.
- Department: Computer and Information Science.
- Supervisor: Chris Callison-Burch.
- Includes bibliographical references.
- Other Format:
- Online version: Cocos, Anne O'Donnell. Paraphrase-based models of lexical semantics.
- OCLC:
- 1121201999