R for Health Data Science / Ewen Harrison, Riinu Pius.

Ebook Central Academic Complete Available online

Format:
Book
Author/Creator:
Harrison, Ewen, author.
Pius, Riinu, author.
Language:
English
Subjects (All):
Computational biology.
Physical Description:
1 online resource (364 pages)
Edition:
1st ed.
Place of Publication:
Boca Raton : Chapman and Hall/CRC, 2020.
Language Note:
In English.
Summary:
In this age of information, the manipulation, analysis, and interpretation of data have become a fundamental part of professional life; nowhere more so than in the delivery of healthcare. From the understanding of disease and the development of new treatments, to the diagnosis and management of individual patients, the use of data and technology is now an integral part of the business of healthcare. Those working in healthcare interact daily with data, often without realising it. The conversion of this avalanche of information to useful knowledge is essential for high-quality patient care. R for Health Data Science includes everything a healthcare professional needs to go from R novice to R guru. By the end of this book, you will be taking a sophisticated approach to health data science with beautiful visualisations, elegant tables, and nuanced analyses.
Features
Provides an introduction to the fundamentals of R for healthcare professionals
Highlights the most popular statistical approaches to health data science
Written to be as accessible as possible with minimal mathematics
Emphasises the importance of truly understanding the underlying data through the use of plots
Includes numerous examples that can be adapted for your own data
Helps you create publishable documents and collaborate across teams
With this book, you are in safe hands - Prof. Harrison is a clinician and Dr. Pius is a data scientist, bringing 25 years' combined experience of using R at the coal face. This content has been taught to hundreds of individuals from a variety of backgrounds, from rank beginners to experts moving to R from other platforms.
Contents:
Cover
Half Title
Title Page
Copyright Page
Dedication
Contents
Preface
About the Authors
I. Data wrangling and visualisation
1. Why we love R
1.1. Help, what's a script?
1.2. What is RStudio?
1.3. Getting started
1.4. Getting help
1.5. Work in a Project
1.6. Restart R regularly
1.7. Notation throughout this book
2. R basics
2.1. Reading data into R
2.1.1. Import Dataset interface
2.1.2. Reading in the Global Burden of Disease example dataset
2.2. Variable types and why we care
2.2.1. Numeric variables (continuous)
2.2.2. Character variables
2.2.3. Factor variables (categorical)
2.2.4. Date/time variables
2.3. Objects and functions
2.3.1. data frame/tibble
2.3.2. Naming objects
2.3.3. Function and its arguments
2.3.4. Working with objects
2.3.5. <- and =
2.3.6. Recap: object, function, input, argument
2.4. Pipe - %>%
2.4.1. Using . to direct the pipe
2.5. Operators for filtering data
2.5.1. Worked examples
2.6. The combine function: c ()
2.7. Missing values (NAs) and filters
2.8. Creating new columns - mutate ()
2.8.1. Worked example/exercise
2.9. Conditional calculations - if_else ()
2.10. Create labels - paste()
2.11. Joining multiple datasets
2.11.1. Further notes about joins
3. Summarising data
3.1. Get the data
3.2. Plot the data
3.3. Aggregating: group_by (), summarise ()
3.4. Add new columns: mutate ()
3.4.1. Percentages formatting: percent ()
3.5. summarise () vs mutate ()
3.6. Common arithmetic functions - sum (), mean (), median (), etc.
3.7. select () columns
3.8. Reshaping data - long vs wide format
3.8.1. Pivot values from rows into columns (wider)
3.8.2. Pivot values from columns to rows (longer)
3.8.3. separate () a column into multiple columns.
3.9. arrange () rows
3.9.1. Factor levels
3.10. Exercises
3.10.1. Exercise - pivot_wider ()
3.10.2. Exercise - group_by (), summarise ()
3.10.3. Exercise - full_join (), percent ()
3.10.4. Exercise - mutate (), summarise ()
3.10.5. Exercise - filter (), summarise (), pivot_wider ()
4. Different types of plots
4.1. Get the data
4.2. Anatomy of ggplot explained
4.3. Set your theme - grey vs white
4.4. Scatter plots/bubble plots
4.5. Line plots/time series plots
4.5.1. Exercise
4.6. Bar plots
4.6.1. Summarised data
4.6.2. Countable data
4.6.3. colour vs fill
4.6.4. Proportions
4.6.5. Exercise
4.7. Histograms
4.8. Box plots
4.9. Multiple geoms, multiple aes ()
4.9.1. Worked example - three geoms together
4.10. All other types of plots
4.11. Solutions
4.12. Extra: Advanced examples
5. Fine tuning plots
5.1. Get the data
5.2. Scales
5.2.1. Logarithmic
5.2.2. Expand limits
5.2.3. Zoom in
5.2.4. Exercise
5.2.5. Axis ticks
5.3. Colours
5.3.1. Using the Brewer palettes:
5.3.2. Legend title
5.3.3. Choosing colours manually
5.4. Titles and labels
5.4.1. Annotation
5.4.2. Annotation with a superscript and a variable
5.5. Overall look - theme ()
5.5.1. Text size
5.5.2. Legend position
5.6. Saving your plot
II. Data analysis
6. Working with continuous outcome variables
6.1. Continuous data
6.2. The Question
6.3. Get and check the data
6.4. Plot the data
6.4.1. Histogram
6.4.2. Quantile-quantile (Q-Q) plot
6.4.3. Boxplot
6.5. Compare the means of two groups
6.5.1. t-test
6.5.2. Two-sample t-tests
6.5.3. Paired t-tests
6.5.4. What if I run the wrong test?
6.6. Compare the mean of one group: one sample t-tests
6.6.1. Interchangeability of t-tests.
6.7. Compare the means of more than two groups
6.7.1. Plot the data
6.7.2. ANOVA
6.7.3. Assumptions
6.8. Multiple testing
6.8.1. Pairwise testing and multiple comparisons
6.9. Non-parametric tests
6.9.1. Transforming data
6.9.2. Non-parametric test for comparing two groups
6.9.3. Non-parametric test for comparing more than two groups
6.10. Finalfit approach
6.11. Conclusions
6.12. Exercises
6.12.1. Exercise
6.12.2. Exercise
6.12.3. Exercise
6.12.4. Exercise
6.13. Solutions
7. Linear regression
7.1. Regression
7.1.1. The Question (1)
7.1.2. Fitting a regression line
7.1.3. When the line fits well
7.1.4. The fitted line and the linear equation
7.1.5. Effect modification
7.1.6. R-squared and model fit
7.1.7. Confounding
7.1.8. Summary
7.2. Fitting simple models
7.2.1. The Question (2)
7.2.2. Get the data
7.2.3. Check the data
7.2.4. Plot the data
7.2.5. Simple linear regression
7.2.6. Multivariable linear regression
7.2.7. Check assumptions
7.3. Fitting more complex models
7.3.1. The Question (3)
7.3.2. Model fitting principles
7.3.3. AIC
7.3.4. Get the data
7.3.5. Check the data
7.3.6. Plot the data
7.3.7. Linear regression with finalfit
7.3.8. Summary
7.4. Exercises
7.4.1. Exercise
7.4.2. Exercise
7.4.3. Exercise
7.4.4. Exercise
7.5. Solutions
8. Working with categorical outcome variables
8.1. Factors
8.2. The Question
8.3. Get the data
8.4. Check the data
8.5. Recode the data
8.6. Should I convert a continuous variable to a categorical variable?
8.6.1. Equal intervals vs quantiles
8.7. Plot the data
8.8. Group factor levels together - fct_collapse ()
8.9. Change the order of values within a factor - fct_relevel ()
8.10. Summarising factors with finalfit.
8.11. Pearson's chi-squared and Fisher's exact tests
8.11.1. Base R
8.12. Fisher's exact test
8.13. Chi-squared / Fisher's exact test using finalfit
8.14. Exercises
8.14.1. Exercise
8.14.2. Exercise
8.14.3. Exercise
9. Logistic regression
9.1. Generalised linear modelling
9.2. Binary logistic regression
9.2.1. The Question (1)
9.2.2. Odds and probabilities
9.2.3. Odds ratios
9.2.4. Fitting a regression line
9.2.5. The fitted line and the logistic regression equation
9.2.6. Effect modification and confounding
9.3. Data preparation and exploratory analysis
9.3.1. The Question (2)
9.3.2. Get the data
9.3.3. Check the data
9.3.4. Recode the data
9.3.5. Plot the data
9.3.6. Tabulate data
9.4. Model assumptions
9.4.1. Linearity of continuous variables to the response
9.4.2. Multicollinearity
9.5. Fitting logistic regression models in base R
9.6. Modelling strategy for binary outcomes
9.7. Fitting logistic regression models with finalfit
9.7.1. Criterion-based model fitting
9.8. Model fitting
9.8.1. Odds ratio plot
9.9. Correlated groups of observations
9.9.1. Simulate data
9.9.2. Plot the data
9.9.3. Mixed effects models in base R
9.10. Exercises
9.10.1. Exercise
9.10.2. Exercise
9.10.3. Exercise
9.10.4. Exercise
9.11. Solutions
10. Time-to-event data and survival
10.1. The Question
10.2. Get and check the data
10.3. Death status
10.4. Time and censoring
10.5. Recode the data
10.6. Kaplan Meier survival estimator
10.6.1. KM analysis for whole cohort
10.6.2. Model
10.6.3. Life table
10.7. Kaplan Meier plot
10.8. Cox proportional hazards regression
10.8.1. coxph ()
10.8.2. finalfit ()
10.8.3. Reduced model
10.8.4. Testing for proportional hazards
10.8.5. Stratified models.
10.8.6. Correlated groups of observations
10.8.7. Hazard ratio plot
10.9. Competing risks regression
10.10. Summary
10.11. Dates in R
10.11.1. Converting dates to survival time
10.12. Exercises
10.12.1. Exercise
10.12.2. Exercise
10.13. Solutions
III. Workflow
11. The problem of missing data
11.1. Identification of missing data
11.1.1. Missing completely at random (MCAR)
11.1.2. Missing at random (MAR)
11.1.3. Missing not at random (MNAR)
11.2. Ensure your data are coded correctly: ff_glimpse ()
11.2.1. The Question
11.3. Identify missing values in each variable: missing_plot ()
11.4. Look for patterns of missingness: missing_pattern ()
11.5. Including missing data in demographics tables
11.6. Check for associations between missing and observed data
11.6.1. For those who like an omnibus test
11.7. Handling missing data: MCAR
11.7.1. Common solution: row-wise deletion
11.7.2. Other considerations
11.8. Handling missing data: MAR
11.8.1. Common solution: Multivariate Imputation by Chained Equations (mice)
11.9. Handling missing data: MNAR
11.10. Summary
12. Notebooks and Markdown
12.1. What is a Notebook?
12.2. What is Markdown?
12.3. What is the difference between a Notebook and an R Markdown file?
12.4. Notebook vs HTML vs PDF vs Word
12.5. The anatomy of a Notebook / R Markdown file
12.5.1. YAML header
12.5.2. R code chunks
12.5.3. Setting default chunk options
12.5.4. Setting default figure options
12.5.5. Markdown elements
12.6. Interface and outputting
12.6.1. Running code and chunks, knitting
12.7. File structure and workflow
12.7.1. Why go to all this bother?
13. Exporting and reporting
13.1. Which format should I use?
13.2. Working in a .R file
13.3. Demographics table.
13.4. Logistic regression table.
Notes:
Description based on: online resource; title from PDF information screen (Routledge, viewed December 29, 2022).
ISBN:
1-000-22610-7
0-367-85542-9
1-000-22616-6
9780367855420
OCLC:
1222803132
