Data Cleaning / Ihab F. Ilyas, Xu Chu
- Format:
- Book
- Author/Creator:
- Ilyas, Ihab F., author.
- Series:
- ACM books - Collection 2 ; #28.
- ACM books, 2374-6777 ; #28
- Language:
- English
- Subjects (All):
- Data Cleaning (Computer Science).
- Genre:
- Electronic books.
- Physical Description:
- 1 online resource (xx, 262 pages) illustrations.
- Edition:
- First Edition
- Place of Publication:
- [New York, NY, USA] : Association for Computing Machinery, [2019].
- System Details:
- Mode of access: World Wide Web
- System requirements: Adobe Acrobat Reader
- Contents:
- Preface
- 1 Introduction
- 1.1 Data Cleaning Workflow
- 1.2 Book Scope
- 2 Outlier Detection
- 2.1 A Taxonomy of Outlier Detection Methods
- 2.2 Statistics-Based Outlier Detection
- 2.3 Distance-Based Outlier Detection
- 2.4 Model-Based Outlier Detection
- 2.5 Outlier Detection in High-Dimensional Data
- 2.6 Conclusion
- 3 Data Deduplication
- 3.1 Similarity Metrics
- 3.2 Predicting Duplicate Pairs
- 3.3 Clustering
- 3.4 Blocking for Deduplication
- 3.5 Distributed Data Deduplication
- 3.6 Record Fusion and Entity Consolidation
- 3.7 Human-Involved Data Deduplication
- 3.8 Data Deduplication Tools
- 3.9 Conclusion
- 4 Data Transformation
- 4.1 Syntactic Data Transformations
- 4.2 Semantic Data Transformations
- 4.3 ETL Tools
- 4.4 Conclusion
- 5 Data Quality Rule Definition and Discovery
- 5.1 Functional Dependencies
- 5.2 Conditional Functional Dependencies
- 5.3 Denial Constraints
- 5.4 Other Types of Constraints
- 5.5 Conclusion
- 6 Rule-Based Data Cleaning
- 6.1 Violation Detection
- 6.2 Error Repair
- 6.3 Conclusion
- 7 Machine Learning and Probabilistic Data Cleaning
- 7.1 Machine Learning for Data Deduplication
- 7.2 Machine Learning for Data Repair
- 7.3 Data Cleaning for Analytics and Machine Learning
- 8 Conclusion and Future Thoughts
- References
- Index
- Author Biographies
- Other Format:
- Print version:
- ISBN:
- 3310205
- 9781450371551
- 9781450371544
- Access Restriction:
- Restricted for use by site license.