Beginning data science in R : data analysis, visualization, and modelling for the data scientist / Thomas Mailund.
- Format:
- Book
- Author/Creator:
- Mailund, Thomas, author.
- Language:
- English
- Subjects (All):
- Quantitative research.
- Physical Description:
- 1 online resource
- polychrome
- Place of Publication:
- New York : Apress, [2017].
- System Details:
- text file
- Contents:
- At a Glance; Contents; About the Author; About the Technical Reviewer; Acknowledgments; Introduction; Chapter 1: Introduction to R Programming; Basic Interaction with R; Using R as a Calculator; Simple Expressions; Assignments; Actually, All of the Above Are Vectors of Values; Indexing Vectors; Vectorized Expressions; Comments; Functions; Getting Documentation for Functions; Writing Your Own Functions; Vectorized Expressions and Functions; A Quick Look at Control Structures; Factors; Data Frames; Dealing with Missing Values; Using R Packages
- Data Pipelines (or Pointless Programming); Writing Pipelines of Function Calls; Writing Functions that Work with Pipelines; The magical "." argument; Defining Functions Using .; Anonymous Functions; Other Pipeline Operations; Coding and Naming Conventions; Exercises; Mean of Positive Values; Root Mean Square Error; Chapter 2: Reproducible Analysis; Literate Programming and Integration of Workflow and Documentation; Creating an R Markdown/knitr Document in RStudio; The YAML Language; The Markdown Language; Formatting Text; Cross-Referencing; Bibliographies
- Controlling the Output (Templates/Stylesheets); Running R Code in Markdown Documents; Using Chunks when Analyzing Data (Without Compiling Documents); Caching Results; Displaying Data; Exercises; Create an R Markdown Document; Produce Different Output; Add Caching; Chapter 3: Data Manipulation; Data Already in R; Quickly Reviewing Data; Reading Data; Examples of Reading and Formatting Datasets; Breast Cancer Dataset; Boston Housing Dataset; The readr Package; Manipulating Data with dplyr; Some Useful dplyr Functions; select(): Pick Selected Columns and Get Rid of the Rest
- mutate(): Add Computed Values to Your Data Frame; transmute(): Add Computed Values to Your Data Frame and Get Rid of All Other Columns; arrange(): Reorder Your Data Frame by Sorting Columns; filter(): Pick Selected Rows and Get Rid of the Rest; group_by(): Split Your Data Into Subtables Based on Column Values; summarise/summarize(): Calculate Summary Statistics; Breast Cancer Data Manipulation; Tidying Data with tidyr; Exercises; Importing Data; Using dplyr; Using tidyr; Chapter 4: Visualizing Data; Basic Graphics; The Grammar of Graphics and the ggplot2 Package; Using qplot(); Using Geometries
- Facets; Scaling; Themes and Other Graphics Transformations; Figures with Multiple Plots; Exercises; Chapter 5: Working with Large Datasets; Subsample Your Data Before You Analyze the Full Dataset; Running Out of Memory During Analysis; Too Large to Plot; Too Slow to Analyze; Too Large to Load; Exercises; Subsampling; Hex and 2D Density Plots; Chapter 6: Supervised Learning; Machine Learning; Supervised Learning; Regression versus Classification; Inference versus Prediction; Specifying Models; Linear Regression; Logistic Regression (Classification, Really); Model Matrices and Formula
- Notes:
- Includes index.
- Electronic reproduction. Palo Alto, Calif. Available via World Wide Web.
- Vendor-supplied metadata.
- ISBN:
- 9781484226711
- 1484226712
- Publisher Number:
- 99971659598
- Access Restriction:
- Restricted for use by site license.