Building Statistical Models in Python : Develop Useful Models for Regression, Classification, Time Series, and Survival Analysis / Huy Hoang Nguyen, Paul N. Adams, and Stuart J. Miller.
- Format:
- Book
- Author/Creator:
- Nguyễn, Huy Hoàng, TS., author.
- Adams, Paul N., author.
- Miller, Stuart J., 1938- author.
- Language:
- English
- Subjects (All):
- Mathematical statistics.
- Mathematical models.
- Python (Computer program language).
- Statistics.
- Physical Description:
- 1 online resource (420 pages)
- Edition:
- First edition.
- Place of Publication:
- Birmingham, England : Packt Publishing Ltd., [2023]
- Biography/History:
- Nguyen Huy Hoang: Huy Hoang Nguyen is a Mathematician and a Data Scientist with far-ranging experience championing advanced mathematics, strategic leadership, and applied machine learning research. He holds a Master's in Data Science and a PhD in Mathematics. His previous work was related to Partial Differential Equations, Functional Analysis, and their applications in Fluid Mechanics. He transitioned from academia to the healthcare industry and has carried out a range of Data Science projects, from traditional Machine Learning to Deep Learning.
- Adams Paul N: Paul Adams is a Data Scientist with a background primarily in the healthcare industry. Paul applies statistics and machine learning in multiple areas of industry, focusing on projects in process engineering, process improvement, metrics and business rules development, anomaly detection, forecasting, clustering, and classification. Paul holds a Master of Science in Data Science from Southern Methodist University.
- Miller Stuart J: Stuart Miller is a Machine Learning Engineer with degrees in Data Science, Electrical Engineering, and Engineering Physics. Stuart has worked at several Fortune 500 companies, including Texas Instruments and State Farm, where he built software that utilized statistical and machine learning techniques. Stuart is currently an engineer at Toyota Connected, helping to build a more modern cockpit experience for drivers using machine learning.
- Summary:
- The ability to proficiently perform statistical modeling is a fundamental skill for data scientists and essential for businesses reliant on data insights. Building Statistical Models in Python is a comprehensive guide that will empower you to leverage mathematical and statistical principles in data assessment, understanding, and inference generation. This book not only equips you with skills to navigate the complexities of statistical modeling, but also provides practical guidance for immediate implementation through illustrative examples. Through emphasis on application and code examples, you’ll understand the concepts while gaining hands-on experience. With the help of Python and its essential libraries, you’ll explore key statistical topics, including hypothesis testing, regression, time series analysis, classification, and more. By the end of this book, you’ll gain fluency in statistical modeling while harnessing the full potential of Python's rich ecosystem for data analysis.
- Contents:
- Cover
- Copyright
- Contributors
- Table of Contents
- Preface
- Part 1: Introduction to Statistics
- Chapter 1: Sampling and Generalization
- Software and environment setup
- Population versus sample
- Population inference from samples
- Randomized experiments
- Observational study
- Sampling strategies - random, systematic, stratified, and clustering
- Probability sampling
- Non-probability sampling
- Summary
- Chapter 2: Distributions of Data
- Technical requirements
- Understanding data types
- Nominal data
- Ordinal data
- Interval data
- Ratio data
- Visualizing data types
- Measuring and describing distributions
- Measuring central tendency
- Measuring variability
- Measuring shape
- The normal distribution and central limit theorem
- The Central Limit Theorem
- Bootstrapping
- Confidence intervals
- Standard error
- Correlation coefficients (Pearson's correlation)
- Permutations
- Permutations and combinations
- Permutation testing
- Transformations
- References
- Chapter 3: Hypothesis Testing
- The goal of hypothesis testing
- Overview of a hypothesis test for the mean
- Scope of inference
- Hypothesis test steps
- Type I and Type II errors
- Type I errors
- Type II errors
- Basics of the z-test - the z-score, z-statistic, critical values, and p-values
- The z-score and z-statistic
- A z-test for means
- z-test for proportions
- Power analysis for a two-population pooled z-test
- Chapter 4: Parametric Tests
- Assumptions of parametric tests
- Normally distributed population data
- Equal population variance
- T-test - a parametric hypothesis test
- T-test for means
- Two-sample t-test - pooled t-test
- Two-sample t-test - Welch's t-test
- Paired t-test
- Tests with more than two groups and ANOVA
- Multiple tests for significance
- ANOVA
- Pearson's correlation coefficient
- Power analysis examples
- Chapter 5: Non-Parametric Tests
- When parametric test assumptions are violated
- Permutation tests
- The Rank-Sum test
- The test statistic procedure
- Normal approximation
- Rank-Sum example
- The Signed-Rank test
- The Kruskal-Wallis test
- Chi-square distribution
- Chi-square goodness-of-fit
- Chi-square test of independence
- Chi-square goodness-of-fit test power analysis
- Spearman's rank correlation coefficient
- Part 2: Regression Models
- Chapter 6: Simple Linear Regression
- Simple linear regression using OLS
- Coefficients of correlation and determination
- Coefficients of correlation
- Coefficients of determination
- Required model assumptions
- A linear relationship between the variables
- Normality of the residuals
- Homoscedasticity of the residuals
- Sample independence
- Testing for significance and validating models
- Model validation
- Chapter 7: Multiple Linear Regression
- Multiple linear regression
- Adding categorical variables
- Evaluating model fit
- Interpreting the results
- Feature selection
- Statistical methods for feature selection
- Performance-based methods for feature selection
- Recursive feature elimination
- Shrinkage methods
- Ridge regression
- LASSO regression
- Elastic Net
- Dimension reduction
- PCA - a hands-on introduction
- PCR - a hands-on salary prediction study
- Part 3: Classification Models
- Chapter 8: Discrete Models
- Probit and logit models
- Multinomial logit model
- Poisson model
- The Poisson distribution
- Modeling count data
- The negative binomial regression model
- Negative binomial distribution
- Chapter 9: Discriminant Analysis
- Bayes' theorem
- Probability
- Conditional probability
- Discussing Bayes' Theorem
- Linear Discriminant Analysis
- Supervised dimension reduction
- Quadratic Discriminant Analysis
- Part 4: Time Series Models
- Chapter 10: Introduction to Time Series
- What is a time series?
- Goals of time series analysis
- Statistical measurements
- Mean
- Variance
- Autocorrelation
- Cross-correlation
- The white-noise model
- Stationarity
- Chapter 11: ARIMA Models
- Models for stationary time series
- Autoregressive (AR) models
- Moving average (MA) models
- Autoregressive moving average (ARMA) models
- Models for non-stationary time series
- ARIMA models
- Seasonal ARIMA models
- More on model evaluation
- Chapter 12: Multivariate Time Series
- Multivariate time series
- Time-series cross-correlation
- ARIMAX
- Preprocessing the exogenous variables
- Fitting the model
- Assessing model performance
- VAR modeling
- Step 1 - visual inspection
- Step 2 - selecting the order of AR(p)
- Step 3 - assessing cross-correlation
- Step 4 - building the VAR(p,q) model
- Step 5 - testing the forecast
- Step 6 - building the forecast
- Part 5: Survival Analysis
- Chapter 13: Time-to-Event Variables - An Introduction
- What is censoring?
- Left censoring
- Right censoring
- Interval censoring
- Type I and Type II censoring
- Survival data
- Survival Function, Hazard and Hazard Ratio
- Chapter 14: Survival Models
- Kaplan-Meier model
- Model definition
- Model example
- Exponential model
- Cox Proportional Hazards regression model
- Step 1
- Step 2
- Step 3
- Step 4
- Step 5
- Index
- Other Books You May Enjoy.
- Notes:
- Includes index.
- Description based on print version record.
- ISBN:
- 1-80461-215-4
- OCLC:
- 1396227320