Statistical application development with R and Python : power of statistics using R and Python / Prabhanjan Narayanachar Tattar.
- Format:
- Book
- Author/Creator:
- Tattar, Prabhanjan Narayanachar, author.
- Language:
- English
- Subjects (All):
- Application software--Development.
- Application software.
- R (Computer program language).
- Physical Description:
- 1 online resource (405 pages) : illustrations (some color)
- Edition:
- Second edition.
- Place of Publication:
- Birmingham, England ; Mumbai, [India] : Packt, 2017.
- System Details:
- text file
- Summary:
- Software Implementation Illustrated with R and Python
- About This Book: Learn the nature of data through software which takes the preliminary concepts right away using R and Python. Understand data modeling and visualization to perform efficient statistical analysis with this guide. Get well versed with techniques such as regression, clustering, classification, support vector machines and much more to learn the fundamentals of modern statistics.
- Who This Book Is For: If you want to have a brief understanding of the nature of data and perform advanced statistical analysis using both R and Python, then this book is what you need. No prior knowledge is required. It suits aspiring data scientists and R users trying to learn Python, and vice versa.
- What You Will Learn: Learn the nature of data through software with preliminary concepts right away in R. Read data from various sources and export the R output to other software. Perform effective data visualization with the nature of variables and rich alternative options. Do exploratory data analysis for useful first-sight understanding, building up to the right attitude towards effective inference. Learn statistical inference through simulation, combining classical inference and modern computational power. Delve deep into regression models, such as linear and logistic for continuous and discrete regressands, to form the fundamentals of modern statistics. Introduce yourself to CART - a machine learning tool which is very useful when the data has an intrinsic nonlinearity.
- In Detail: Statistical analysis involves collecting and examining data to describe the nature of the data that needs to be analyzed. It helps you explore the relations in the data and build models to make better decisions. This book explores statistical concepts along with R and Python, which are well integrated from the word go. Almost every concept is accompanied by R code that exemplifies the strength of R and its applications. The R code and programs have been further strengthened with equivalent Python programs. Thus, you will first understand the data characteristics, descriptive statistics, and the exploratory attitude, which will give you a firm footing in data analysis. Statistical inference will complete the technical footing of statistical methods. Regression (linear and logistic modeling) and CART build the essential toolkit. This will help you complete complex problems in the real world. You will begin with a brief understanding of the nature of data and e...
- Contents:
- Cover
- Copyright
- Credits
- About the Author
- Acknowledgment
- About the Reviewers
- www.PacktPub.com
- Customer Feedback
- Table of Contents
- Preface
- Chapter 1: Data Characteristics
- Questionnaire and its components
- Understanding the data characteristics in an R environment
- Experiments with uncertainty in computer science
- Installing and setting up R
- Using R packages
- RSADBE - the book's R package
- Python installation and setup
- Using pip for packages
- IDEs for R and Python
- The companion code bundle
- Discrete distributions
- Discrete uniform distribution
- Binomial distribution
- Hypergeometric distribution
- Negative binomial distribution
- Poisson distribution
- Continuous distributions
- Uniform distribution
- Exponential distribution
- Normal distribution
- Summary
- Chapter 2: Import/Export Data
- Packages and settings - R and Python
- Understanding data.frame and other formats
- Constants, vectors, and matrices
- Time for action - understanding constants, vectors, and basic arithmetic
- What just happened?
- Doing it in Python
- Time for action - matrix computations
- The list object
- Time for action - creating a list object
- The data.frame object
- Time for action - creating a data.frame object
- Have a go hero
- The table object
- Time for action - creating the Titanic dataset as a table object
- Using utils and the foreign packages
- Time for action - importing data from external files
- Importing data from MySQL
- Exporting data/graphs
- Exporting R objects
- Exporting graphs
- Time for action - exporting a graph
- Managing R sessions
- Time for action - session management
- Pop quiz
- Chapter 3: Data Visualization
- Visualization techniques for categorical data
- Bar chart
- Going through the built-in examples of R
- Time for action - bar charts in R
- Dot chart
- Time for action - dot charts in R
- Spine and mosaic plots
- Time for action - spine plot for the shift and operator data
- Time for action - mosaic plot for the Titanic dataset
- Pie chart and the fourfold plot
- Visualization techniques for continuous variable data
- Boxplot
- Time for action - using the boxplot
- Histogram
- Time for action - understanding the effectiveness of histograms
- Scatter plot
- Time for action - plot and pairs R functions
- Pareto chart
- A brief peek at ggplot2
- Time for action - qplot
- Time for action - ggplot
- Chapter 4: Exploratory Analysis
- Essential summary statistics
- Percentiles, quantiles, and median
- Hinges
- Interquartile range
- Time for action - the essential summary statistics for The Wall dataset
- Techniques for exploratory analysis
- The stem-and-leaf plot
- Time for action - the stem function in play
- Letter values
- Data re-expression
- Bagplot - a bivariate boxplot
- Time for action - the bagplot display for multivariate datasets
- What just happened?
- Resistant line
- Time for action - resistant line as a first regression model
- Smoothing data
- Time for action - smoothening the cow temperature data
- Median polish
- Time for action - the median polish algorithm
- Chapter 5: Statistical Inference
- Maximum likelihood estimator
- Visualizing the likelihood function
- Time for action - visualizing the likelihood function
- Finding the maximum likelihood estimator
- Using the fitdistr function
- Time for action - finding the MLE using mle and fitdistr functions
- Confidence intervals
- Time for action - confidence intervals
- Hypothesis testing
- Binomial test
- Time for action - testing probability of success
- Tests of proportions and the chi-square test
- Time for action - testing proportions
- Tests based on normal distribution - one sample
- Time for action - testing one-sample hypotheses
- Tests based on normal distribution - two sample
- Time for action - testing two-sample hypotheses
- Chapter 6: Linear Regression Analysis
- The essence of regression
- The simple linear regression model
- What happens to the arbitrary choice of parameters?
- Time for action - the arbitrary choice of parameters
- Building a simple linear regression model
- Time for action - building a simple linear regression model
- Have a go hero
- ANOVA and the confidence intervals
- Time for action - ANOVA and the confidence intervals
- Model validation
- Time for action - residual plots for model validation
- Multiple linear regression model
- Averaging k simple linear regression models or a multiple linear regression model
- Time for action - averaging k simple linear regression models
- Building a multiple linear regression model
- Time for action - building a multiple linear regression model
- The ANOVA and confidence intervals for the multiple linear regression model
- Time for action - the ANOVA and confidence intervals for the multiple linear regression model
- Useful residual plots
- Time for action - residual plots for the multiple linear regression model
- Regression diagnostics
- Leverage points
- Influential points
- DFFITS and DFBETAS
- The multicollinearity problem
- Time for action - addressing the multicollinearity problem for the gasoline data
- Model selection
- Stepwise procedures
- The backward elimination
- The forward selection
- The stepwise regression
- Criterion-based procedures
- Time for action - model selection using the backward, forward, and AIC criteria
- Chapter 7: Logistic Regression Model
- The binary regression problem
- Time for action - limitation of linear regression model
- Probit regression model
- Time for action - understanding the constants
- Logistic regression model
- Time for action - fitting the logistic regression model
- Hosmer-Lemeshow goodness-of-fit test statistic
- Time for action - Hosmer-Lemeshow goodness-of-fit statistic
- Model validation and diagnostics
- Residual plots for the GLM
- Time for action - residual plots for logistic regression model
- Influence and leverage for the GLM
- Time for action - diagnostics for the logistic regression
- Receiving operator curves
- Time for action - ROC construction
- Logistic regression for the German credit screening dataset
- Time for action - logistic regression for the German credit dataset
- Chapter 8: Regression Models with Regularization
- The overfitting problem
- Time for action - understanding overfitting
- Regression spline
- Basis functions
- Piecewise linear regression model
- Time for action - fitting piecewise linear regression models
- Natural cubic splines and the general B-splines
- Time for action - fitting the spline regression models
- Ridge regression for linear models
- Protecting against overfitting
- Time for action - ridge regression for the linear regression model
- Ridge regression for logistic regression models
- Time for action - ridge regression for the logistic regression model
- Another look at model assessment
- Time for action - selecting iteratively and other topics
- Summary
- Chapter 9: Classification and Regression Trees
- Notes:
- Includes bibliographical references and index.
- Description based on online resource; title from PDF title page (ebrary, viewed September 26, 2017).
- OCLC:
- 1004966445