Using OpenRefine : the essential OpenRefine guide that takes you from data analysis and error fixing to linking your dataset to the Web / Ruben Verborgh, Max De Wilde ; cover image by Aniket Sawant.

EBSCOhost Academic eBook Collection (North America) Available online
Ebook Central Academic Complete Available online
Ebook Central College Complete Available online
O'Reilly Online Learning: Academic/Public Library Edition Available online
Format:
Book
Author/Creator:
Verborgh, Ruben, author.
De Wilde, Max, author.
Contributor:
Sawant, Aniket, cover designer.
Series:
Community experience distilled.
Language:
English
Subjects (All):
Data mining.
Electronic data processing.
Physical Description:
1 online resource (114 p.)
Edition:
1st edition
Place of Publication:
Birmingham, England : Packt Publishing, 2013.
Language Note:
English
System Details:
Mode of access: World Wide Web.
text file
Summary:
With this book on OpenRefine, managing and cleaning your large datasets suddenly got a lot easier! With a cookbook approach and free datasheets included, you’ll quickly and painlessly improve your data managing capabilities.
Create links between your dataset and others in an instant.
Effectively transform data with regular expressions and the General Refine Expression Language.
Spot issues in your dataset and take effective action with just a few clicks.
In Detail
Data is supposed to be the new gold, but how can you unlock the value in your data? Managing large datasets used to be a task for specialists, but you don't have to worry about inconsistencies or errors anymore. OpenRefine lets you clean, link, and publish your dataset with ease. Using OpenRefine takes you on a practical tour of all the handy features of this well-known data transformation tool. It is a hands-on recipe book that teaches you data techniques by example. Starting from the basics, it gradually transforms you into an OpenRefine expert.
This book will teach you all the necessary skills to handle any large dataset and to turn it into high-quality data for the Web. After you learn how to analyze data and spot issues, we'll see how we can solve them to obtain a clean dataset. Messy and inconsistent data is recovered through advanced techniques such as automated clustering. We'll then show how to extract links from keyword and full-text fields using reconciliation and named-entity extraction. Using OpenRefine is more than a manual: it's a guide stuffed with tips and tricks to get the best out of your data.
Contents:
Intro
Using OpenRefine
Table of Contents
Credits
Foreword
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example files
Errata
Piracy
Questions
1. Diving Into OpenRefine
Introducing OpenRefine
Recipe 1 - installing OpenRefine
Windows
Mac
Linux
Recipe 2 - creating a new project
File formats supported by OpenRefine
Recipe 3 - exploring your data
Recipe 4 - manipulating columns
Collapsing and expanding columns
Moving columns around
Renaming and removing columns
Recipe 5 - using the project history
Recipe 6 - exporting a project
Recipe 7 - going for more memory
Summary
2. Analyzing and Fixing Data
Recipe 1 - sorting data
Reordering rows
Recipe 2 - faceting data
Text facets
Numeric facets
Customized facets
Faceting by star or flag
Recipe 3 - detecting duplicates
Recipe 4 - applying a text filter
Recipe 5 - using simple cell transformations
Recipe 6 - removing matching rows
3. Advanced Data Operations
Recipe 1 - handling multi-valued cells
Recipe 2 - alternating between rows and records mode
Recipe 3 - clustering similar cells
Recipe 4 - transforming cell values
Recipe 5 - adding derived columns
Recipe 6 - splitting data across columns
Recipe 7 - transposing rows and columns
4. Linking Datasets
Recipe 1 - reconciling values with Freebase
Recipe 2 - installing extensions
Recipe 3 - adding a reconciliation service
Recipe 4 - reconciling with Linked Data.
Recipe 5 - extracting named entities
A. Regular Expressions and GREL
Regular expressions for text patterns
Character classes
Quantifiers
Anchors
Choices
Groups
Overview
General Refine Expression Language (GREL)
Transforming data
Creating custom facets
Solving problems with GREL
Index.
Notes:
Includes bibliographical references and index.
Description based on online resource; title from PDF title page (ebrary, viewed August 11, 2014).
ISBN:
9781783289097
1783289090
OCLC:
889271264
