Data annotations, provenance, and archiving / Wang-Chiew Tan.

LIBRA Diss. POPM2002.347
Available from offsite location. This item is stored in our repository but can be checked out.
LIBRA QA003 2002 .T161
Available from offsite location. This item is stored in our repository but can be checked out.
LIBRA Microfilm P38:2002
Mixed availability. Some items are available; others may be requested.
Format:
Book
Manuscript
Microformat
Thesis/Dissertation
Author/Creator:
Tan, Wang-Chiew.
Contributor:
Buneman, Peter, 1943- advisor.
Khanna, Sanjeev, advisor.
University of Pennsylvania.
Language:
English
Subjects (All):
Penn dissertations--Computer and information science.
Computer and information science--Penn dissertations.
Local Subjects:
Penn dissertations--Computer and information science.
Computer and information science--Penn dissertations.
Physical Description:
xi, 179 pages : illustrations ; 29 cm
Production:
2002.
Summary:
This dissertation examines the problem of data provenance and two main issues related to provenance: annotation and archiving. The provenance of a piece of data is the description of its origins. Our contribution is the distinction between two kinds of provenance: why-provenance and where-provenance. The why-provenance of a piece of output data is the set of all witnesses to why that piece of data exists in the output. Where-provenance describes which pieces of source data contribute to a piece of output data. We showed that why-provenance and where-provenance can be computed by generating a new query from the original query and applying the new query to the same database.
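The two notions can be illustrated on a toy example. The sketch below is not taken from the dissertation: the relations `EMP` and `DEPT`, the join-and-project query, and the function `query_with_provenance` are all hypothetical, and the code records provenance by instrumenting the evaluation directly rather than by the query-generation technique the summary describes. Each witness is simplified to the combination of source tuples that jointly produces an output tuple.

```python
from collections import defaultdict

# Hypothetical toy source relations (illustration only, not from the dissertation).
EMP = [("ann", "db"), ("bob", "db"), ("ann", "ai")]   # (name, dept)
DEPT = [("db", "phila"), ("ai", "phila")]             # (dept, city)


def query_with_provenance(emp, dept):
    """Evaluate: project (name, city) from EMP join DEPT on dept.

    Returns two maps:
      why[out_tuple]            -> set of witnesses, each a frozenset of
                                   (relation, row_index) source tuples
      where[(out_tuple, field)] -> set of (relation, row_index, field)
                                   source locations the value was copied from
    """
    why = defaultdict(set)
    where = defaultdict(set)
    for i, (name, d1) in enumerate(emp):
        for j, (d2, city) in enumerate(dept):
            if d1 == d2:
                out = (name, city)
                # One witness: the pair of source tuples that joined.
                why[out].add(frozenset({("EMP", i), ("DEPT", j)}))
                # Each output field is copied from exactly one source field.
                where[(out, "name")].add(("EMP", i, "name"))
                where[(out, "city")].add(("DEPT", j, "city"))
    return why, where


WHY, WHERE = query_with_provenance(EMP, DEPT)
```

In this toy instance the output tuple `("ann", "phila")` has two witnesses, one through the `db` department and one through `ai`, while `("bob", "phila")` has a single witness.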
Provenance is related to the view update problem. In particular, where-provenance is related to the annotation placement problem, and why-provenance is related to the view deletion problem. When an annotation is placed on a piece of data in the output, we wish to attach the annotation back to the source. The right source to attach the annotation to is one that will not unnecessarily spread that annotation to other output data. The annotation placement problem is to find the source location at which to attach the annotation so that it spreads to the fewest other pieces of view data. Our results show that there is a dichotomy in the complexity of the annotation placement problem depending on the type of query that is used to generate the view. The view deletion problem is concerned with finding the right sources to delete in order to delete a piece of view data. Our results also show that there is a dichotomy in the complexity of the view deletion problem depending on the type of query that is used to generate the view.
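As a rough illustration of the annotation placement problem, the sketch below enumerates candidate source locations by brute force on a tiny hand-written instance. The map `PROPAGATES` and the function `place_annotation` are hypothetical names introduced here; the code is not the dissertation's algorithm and says nothing about the complexity results mentioned above.

```python
# Hypothetical propagation map (illustration only): each source location
# maps to the set of view locations that would receive its annotation.
PROPAGATES = {
    ("DEPT", 0, "city"): {(("ann", "phila"), "city"), (("bob", "phila"), "city")},
    ("DEPT", 1, "city"): {(("ann", "phila"), "city")},
    ("EMP", 0, "name"): {(("ann", "phila"), "name")},
}


def place_annotation(propagates, target):
    """Pick a source location whose annotation reaches `target` while
    spreading to the fewest other view locations.

    Returns (source_location, side_effect_count), or None if no source
    location propagates to the target.
    """
    candidates = [
        (len(reached - {target}), src)
        for src, reached in propagates.items()
        if target in reached
    ]
    if not candidates:
        return None
    # Compare on the side-effect count only.
    side_effects, src = min(candidates, key=lambda c: c[0])
    return src, side_effects


BEST = place_annotation(PROPAGATES, (("ann", "phila"), "city"))
```

Here annotating `("DEPT", 1, "city")` reaches only the target, whereas annotating `("DEPT", 0, "city")` would also mark the `city` field of `("bob", "phila")`, so the first choice has no side effects.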
We have developed a technique for specifying key constraints for hierarchical data that generalizes the way keys are specified in relational databases. (Abstract shortened by UMI.)
Notes:
Supervisors: Peter Buneman; Sanjeev Khanna.
Thesis (Ph.D. in Computer and Information Science) -- University of Pennsylvania, 2002.
Includes bibliographical references.
Local Notes:
University Microfilms order no.: 3073058.
OCLC:
244972145
