Data analysis with R : load, wrangle, and analyze your data using the world's most powerful statistical programming language / Tony Fischetti.
- Format:
- Book
- Author/Creator:
- Fischetti, Tony, author.
- Series:
- Community experience distilled
- Language:
- English
- Subjects (All):
- R (Computer program language).
- Physical Description:
- 1 online resource (388 p.)
- Edition:
- 1st ed.
- Place of Publication:
- Birmingham, England ; Mumbai, [India] : Packt Publishing, 2015.
- Summary:
- Key Features: Load, manipulate, and analyze data from different sources; Gain a deeper understanding of the fundamentals of applied statistics; A practical guide to performing data analysis in practice.
- Book Description: Frequently the tool of choice for academics, R has spread deep into the private sector and can be found in the production pipelines at some of the most advanced and successful enterprises. The power and domain-specificity of R allow the user to express complex analytics easily, quickly, and succinctly. With over 7,000 user-contributed packages, it’s easy to find support for the latest and greatest algorithms and techniques. Starting with the basics of R and statistical reasoning, Data Analysis with R dives into advanced predictive analytics, showing how to apply those techniques to real-world data through real-world examples. Packed with engaging problems and exercises, this book begins with a review of R and its syntax. From there, get to grips with the fundamentals of applied statistics and build on this knowledge to perform sophisticated and powerful analytics. Solve the difficulties relating to performing data analysis in practice and find solutions to working with “messy data”, large data, communicating results, and facilitating reproducibility. This book is engineered to be an invaluable resource through many stages of anyone’s career as a data analyst.
- What you will learn: Navigate the R environment; Describe and visualize the behavior of data and relationships between data; Gain a thorough understanding of statistical reasoning and sampling; Employ hypothesis tests to draw inferences from your data; Learn Bayesian methods for estimating parameters; Perform regression to predict continuous variables; Apply powerful classification methods to predict categorical data; Handle missing data gracefully using multiple imputation; Identify and manage problematic data points; Employ parallelization and Rcpp to scale your analyses to larger data; Put best practices into effect to make your job easier and facilitate reproducibility.
- Who this book is for: Whether you are learning data analysis for the first time or want to deepen the understanding you already have, this book will prove to be an invaluable resource. If you are looking for a book to bring you all the way through the fundamentals to the application of advanced and effective analytics methodologies, and have some prior programming experience and a mathematical background, then this is for you.
- Contents:
- Cover; Copyright; Credits; About the Author; About the Reviewer; www.PacktPub.com; Table of Contents; Preface; Chapter 1: RefresheR; Navigating the basics; Arithmetic and assignment; Logicals and characters; Flow of control; Getting help in R; Vectors; Subsetting; Vectorized functions; Advanced subsetting; Recycling; Functions; Matrices; Loading data into R; Working with packages; Exercises; Summary; Chapter 2: The Shape of Data; Univariate data; Frequency distributions; Central tendency; Spread; Populations, samples, and estimation; Probability distributions; Visualization methods; Exercises
- Summary; Chapter 3: Describing Relationships; Multivariate data; Relationships between a categorical and a continuous variable; Relationships between two categorical variables; The relationship between two continuous variables; Covariance; Correlation coefficients; Comparing multiple correlations; Visualization methods; Categorical and continuous variables; Two categorical variables; Two continuous variables; More than two continuous variables; Exercises; Summary; Chapter 4: Probability; Basic probability; A tale of two interpretations; Sampling from distributions; Parameters
- The binomial distribution; The normal distribution; The three-sigma rule and using z-tables; Exercises; Summary; Chapter 5: Using Data to Reason About the World; Estimating means; The sampling distribution; Interval estimation; How did we get 1.96?; Smaller samples; Exercises; Summary; Chapter 6: Testing Hypotheses; Null Hypothesis Significance Testing; One and two-tailed tests; When things go wrong; A warning about significance; A warning about p-values; Testing the mean of one sample; Assumptions of the one sample t-test; Testing two means; Don't be fooled!
- Assumptions of the independent samples t-test; Testing more than two means; Assumptions of ANOVA; Testing independence of proportions; What if my assumptions are unfounded?; Exercises; Summary; Chapter 7: Bayesian Methods; The big idea behind Bayesian analysis; Choosing a prior; Who cares about coin flips; Enter MCMC - stage left; Using JAGS and runjags; Fitting distributions the Bayesian way; The Bayesian independent samples t-test; Exercises; Summary; Chapter 8: Predicting Continuous Variables; Linear models; Simple linear regression; Simple linear regression with a binary predictor
- A word of warning; Multiple regression; Regression with a non-binary predictor; Kitchen sink regression; The bias-variance trade-off; Cross-validation; Striking a balance; Linear regression diagnostics; Second Anscombe relationship; Third Anscombe relationship; Fourth Anscombe relationship; Advanced topics; Exercises; Summary; Chapter 9: Predicting Categorical Variables; k-Nearest Neighbors; Using k-NN in R; Confusion matrices; Limitations of k-NN; Logistic regression; Using logistic regression in R; Decision trees; Random forests; Choosing a classifier; The vertical decision boundary
- The diagonal decision boundary
- Notes:
- Includes index.
- Description based on online resource; title from PDF title page (ebrary, viewed May 31, 2017).
- ISBN:
- 9781785286445
- 1785286447
- OCLC:
- 951065030