A distributed storage and query subsystem for collaborative data sharing / Nicholas E. Taylor.
LIBRA QA003 2010 .T238
Available from offsite location
- Format:
- Book
- Manuscript
- Thesis/Dissertation
- Author/Creator:
- Taylor, Nicholas E., 1984-
- Language:
- English
- Subjects (All):
- Penn dissertations--Computer and information science.
- Computer and information science--Penn dissertations.
- Local Subjects:
- Penn dissertations--Computer and information science.
- Computer and information science--Penn dissertations.
- Physical Description:
- xi, 211 pages : illustrations ; 29 cm
- Production:
- 2010.
- Summary:
- Cooperative management of data is a difficult challenge. In the absence of a central authority, there is often no single data format, and users may not even agree on what is true and what is not. The data is typically not static and will evolve over time, leading to issues of staleness and conflicting changes. Dedicated machines to run a management system may not be available, and furthermore the machines supplied by the users to run the system may be unreliable or only transiently available. A reliable system must be built over these machines, and should be self-configuring and self-tuning, to avoid placing an undue burden on end users that are unwilling or unable to manage it themselves.
- The Orchestra collaborative data sharing system responds to these challenges by providing a general approach for propagating updates between a heterogeneous collection of peer databases, which are connected by high-level rules that specify the correspondences between them. The system maintains these correspondences while enforcing trust conditions to filter the data from other databases, maintaining transactional atomicity, and respecting database integrity constraints. In this thesis, I detail my work on the semantics of transactional atomicity and dependency in this context, which lead to a general reconciliation algorithm; I also describe the prototype centralized and peer-to-peer implementations of Orchestra. I then develop a specialized reliable peer-to-peer storage and query processor that will enable the logging and computation needed to maintain an Orchestra instance to be distributed. I show ways to extend this system to recover from node failure, to perform load balancing to ensure even distribution of work, and to compensate for node heterogeneity and data skew.
- Notes:
- Adviser: Zachary G. Ives.
- Thesis (Ph.D. in Computer and Information Science) -- University of Pennsylvania, 2010.
- Includes bibliographical references.