Exploiting Cross-lingual Representations for Natural Language Processing / Shyam Upadhyay.
- Format:
- Book
- Thesis/Dissertation
- Author/Creator:
- Upadhyay, Shyam, author.
- Language:
- English
- Subjects (All):
- Computer science.
- Computer and information science--Penn dissertations.
- Penn dissertations--Computer and information science.
- Local Subjects:
- Computer science.
- Computer and information science--Penn dissertations.
- Penn dissertations--Computer and information science.
- Genre:
- Academic theses.
- Physical Description:
- 1 online resource (210 pages)
- Contained In:
- Dissertations Abstracts International 81-02B.
- Place of Publication:
- [Philadelphia, Pennsylvania] : University of Pennsylvania ; Ann Arbor : ProQuest Dissertations & Theses, 2019.
- Language Note:
- English
- System Details:
- Mode of access: World Wide Web.
- text file
- Summary:
- Traditional approaches to supervised learning require a generous amount of labeled data for good generalization. While such annotation-heavy approaches have proven useful for some Natural Language Processing (NLP) tasks in high-resource languages (like English), they are unlikely to scale to languages where collecting labeled data is difficult and time-consuming. Translating supervision available in English is also not a viable solution, because developing a good machine translation system requires expensive to annotate resources which are not available for most languages. In this thesis, I argue that cross-lingual representations are an effective means of extending NLP tools to languages beyond English without resorting to generous amounts of annotated data or expensive machine translation. These representations can be learned in an inexpensive manner, often from signals completely unrelated to the task of interest. I begin with a review of different ways of inducing such representations using a variety of cross-lingual signals and study algorithmic approaches of using them in a diverse set of downstream tasks. Examples of such tasks covered in this thesis include learning representations to transfer a trained model across languages for document classification, assist in monolingual lexical semantics like word sense induction, identify asymmetric lexical relationships like hypernymy between words in different languages, or combining supervision across languages through a shared feature space for cross-lingual entity linking. In all these applications, the representations make information expressed in other languages available in English, while requiring minimal additional supervision in the language of interest.
- Notes:
- Source: Dissertations Abstracts International, Volume: 81-02, Section: B.
- Advisors: Roth, Dan; Committee members: Adam Kalai; Chris Callison-Burch; Lyle Ungar; Mitchell Marcus.
- Department: Computer and Information Science.
- Ph.D. University of Pennsylvania 2019.
- Local Notes:
- School code: 0175
- ISBN:
- 9781085565288
- Access Restriction:
- Restricted for use by site license.
- This item must not be sold to any third party vendors.