My Account Log in

2 options

Approximate inference for determinantal point processes.

Online

Available online

View online

Dissertations & Theses @ University of Pennsylvania Available online

View online
Format:
Book
Thesis/Dissertation
Author/Creator:
Gillenwater, Jennifer, author.
Contributor:
Fox, Emily, degree supervisor.
Taskar, Ben, degree supervisor.
University of Pennsylvania. Computer and Information Science.
Language:
English
Subjects (All):
Computer science.
Statistics.
Computer engineering.
Computer and Information Science--Penn dissertations.
Penn dissertations--Computer and Information Science.
Local Subjects:
Computer science.
Statistics.
Computer engineering.
Computer and Information Science--Penn dissertations.
Penn dissertations--Computer and Information Science.
Genre:
Academic theses.
Physical Description:
1 online resource (161 pages)
Contained In:
Dissertation Abstracts International 76-05B(E).
Place of Publication:
[Philadelphia, Pennsylvania] : University of Pennsylvania ; Ann Arbor, MI : ProQuest, 2014.
System Details:
Mode of access: World Wide Web.
text file
Summary:
In this thesis we explore a probabilistic model that is well-suited to a variety of subset selection tasks: the determinantal point process (DPP). DPPs were originally developed in the physics community to describe the repulsive interactions of fermions. More recently, they have been applied to machine learning problems such as search diversification and document summarization, which can be cast as subset selection tasks. A challenge, however, is scaling such DPP-based methods to the size of the datasets of interest to this community, and developing approximations for DPP inference tasks whose exact computation is prohibitively expensive.
A DPP defines a probability distribution over all subsets of a ground set of items. Consider the inference tasks common to probabilistic models, which include normalizing, marginalizing, conditioning, sampling, estimating the mode, and maximizing likelihood. For DPPs, exactly computing the quantities necessary for the first four of these tasks requires time cubic in the number of items or features of the items. In this thesis, we propose a means of making these four tasks tractable even in the realm where the number of items and the number of features is large. Specifically, we analyze the impact of randomly projecting the features down to a lower-dimensional space and show that the variational distance between the resulting DPP and the original is bounded. In addition to expanding the circumstances in which these first four tasks are tractable, we also tackle the other two tasks, the first of which is known to be NP-hard (with no PTAS) and the second of which is conjectured to be NP-hard. For mode estimation, we build on submodular maximization techniques to develop an algorithm with a multiplicative approximation guarantee. For likelihood maximization, we exploit the generative process associated with DPP sampling to derive an expectation-maximization (EM) algorithm. We experimentally verify the practicality of all the techniques that we develop, testing them on applications such as news and research summarization, political candidate comparison, and product recommendation.
Notes:
Source: Dissertation Abstracts International, Volume: 76-05(E), Section: B.
Advisers: Ben Taskar; Emily Fox.
Department: Computer and Information Science.
Thesis Ph.D. University of Pennsylvania 2014.
Local Notes:
School code: 0175.
ISBN:
9781321479508
Access Restriction:
Restricted for use by site license.

The Penn Libraries is committed to describing library materials using current, accurate, and responsible language. If you discover outdated or inaccurate language, please fill out this feedback form to report it and suggest alternative language.

Find

Home Release notes

My Account

Shelf Request an item Bookmarks Fines and fees Settings

Guides

Using the Find catalog Using Articles+ Using your account