Data management for social scientists : from files to databases / Nils B. Weidmann
- Format:
- Book
- Author/Creator:
- Weidmann, Nils B., 1976- author.
- Series:
- Methodological tools in the social sciences
- Language:
- English
- Subjects (All):
- Social sciences--Research.
- Social sciences.
- Social sciences--Data processing.
- Physical Description:
- 1 online resource
- Place of Publication:
- Cambridge : Cambridge University Press, 2023
- Summary:
- Much training in quantitative social science focuses on data analysis and fails to equip researchers with the skills to prepare the data that such analysis requires. This book is a comprehensive introduction to simple and advanced tools for data management, drawing on established concepts and techniques from computer science.
- Contents:
- Cover
- Half-title
- Series information
- Title page
- Copyright information
- Dedication
- Contents
- Preface
- Part I Introduction
- 1 Motivation
- 1.1 Data Processing and the Research Cycle
- 1.2 What We Do (and Don't Do) in this Book
- 1.3 Why Focus on Data Processing?
- 1.4 Data in Files vs. Data in Databases
- 1.5 Target Audience, Requirements and Software
- 1.6 Plan of the Book
- 2 Gearing Up
- 2.1 R and RStudio
- 2.2 Setting Up the Project Environment for Your Work
- 2.3 The PostgreSQL Database System
- 2.4 Summary and Outlook
- 3 Data = Content + Structure
- 3.1 What Is Data?
- 3.2 Data Content and Structure
- 3.3 Tables, Tables, Tables
- 3.4 The Structure of Tables Matters
- 3.5 Summary and Outlook
- Part II Data in Files
- 4 Storing Data in Files
- 4.1 Text and Binary Files
- 4.2 File Formats for Tabular Data
- 4.3 Transparent and Efficient Use of Files
- 4.4 Summary and Outlook
- 5 Managing Data in Spreadsheets
- 5.1 Application: Spatial Inequality
- 5.2 Spreadsheet Tables and (the Lack of) Structure
- 5.3 Retrieving Data from a Table
- 5.4 Changing Table Structure and Content
- 5.5 Aggregating Data from a Table
- 5.6 Exporting Spreadsheet Data
- 5.7 Results: Spatial Inequality
- 5.8 Summary and Outlook
- 6 Basic Data Management in R
- 6.1 Application: Inequality and Economic Performance in the US
- 6.2 Loading the Data
- 6.3 Merging Tables
- 6.4 Aggregating Data from a Table
- 6.5 Results: Inequality and Economic Performance in the US
- 6.6 Summary and Outlook
- 7 R and the tidyverse
- 7.1 Application: Global Patterns of Inequality across Regime Types
- 7.2 A New Operator: The Pipe
- 7.3 Loading the Data
- 7.4 Merging the WID and Polity IV Datasets
- 7.5 Grouping and Aggregation
- 7.6 Results: Global Patterns of Inequality across Regime Types
- 7.7 Other Useful Functions in the tidyverse
- 7.8 Summary and Outlook
- Part III Data in Databases
- 8 Introduction to Relational Databases
- 8.1 Database Servers and Clients
- 8.2 SQL Basics
- 8.3 Application: Electoral Disproportionality by Country
- 8.4 Creating a Table with National Elections
- 8.5 Computing Electoral Disproportionality
- 8.6 Results: Electoral Disproportionality by Country
- 8.7 Summary and Outlook
- 9 Relational Databases and Multiple Tables
- 9.1 Application: The Rise of Populism in Europe
- 9.2 Adding the Tables
- 9.3 Joining the Tables
- 9.4 Merging Data from the PopuList
- 9.5 Maintaining Referential Integrity
- 9.6 Results: The Rise of Populism in Europe
- 9.7 Summary and Outlook
- 10 Database Fine-Tuning
- 10.1 Speeding Up Data Access with Indexes
- 10.2 Collaborative Data Management with Multiple Users
- 10.3 Summary and Outlook
- Part IV Special Types of Data
- 11 Spatial Data
- 11.1 What Is Spatial Data?
- 11.2 Application: Patterns of Violence in the Bosnian Civil War
- 11.3 Reading and Visualizing Spatial Data in R
- 11.4 Spatial Data in a Relational Database
- 11.5 Results: Patterns of Violence in the Bosnian Civil War
- 11.6 Summary and Outlook
- 12 Text Data
- 12.1 What Is Textual Data?
- 12.2 Application: References to (In)equality in UN Speeches
- 12.3 Working with Strings in (Base) R
- 12.4 Natural Language Processing with quanteda
- 12.5 Using PostgreSQL to Manage Documents
- 12.6 Results: References to (In)equality in UN Speeches
- 12.7 Summary and Outlook
- 13 Network Data
- 13.1 What Is Network Data?
- 13.2 Application: Trade and Democracy
- 13.3 Exploring Network Data in R with igraph
- 13.4 Network Data in a Relational Database
- 13.5 Results: Trade and Democracy
- 13.6 Summary and Outlook
- Part V Conclusion
- 14 Best Practices in Data Management
- 14.1 Two General Recommendations
- 14.2 Collaborative Data Management
- 14.3 Disseminating Research Data and Code
- 14.4 Summary and Outlook
- Bibliography
- Index
- Notes:
- Also issued in print: 2023
- Includes bibliographical references and index
- Description based on online resource; title from PDF title page (viewed on April 11, 2023)
- Vendor-supplied metadata
- Other Format:
- ebook version :
- ISBN:
- 1108990428
- 9781108990424
- OCLC:
- 1376019195
- Access Restriction:
- Restricted for use by site license