The statistical evaluation of medical tests for classification and prediction / Margaret Sullivan Pepe.
Holman Biotech Commons RA409 .P47 2003
Available
- Format:
- Book
- Author/Creator:
- Pepe, Margaret Sullivan, 1961-
- Series:
- Oxford statistical science series ; 28.
- Language:
- English
- Subjects (All):
- Medical statistics.
- Diagnosis, Laboratory--Research--Statistical methods.
- Diagnosis, Laboratory.
- Biometry--methods.
- Clinical Laboratory Techniques--statistics & numerical data.
- Research.
- Statistics.
- Medical Subjects:
- Biometry--methods.
- Clinical Laboratory Techniques--statistics & numerical data.
- Physical Description:
- xvi, 302 pages : illustrations ; 25 cm.
- Place of Publication:
- Oxford : Oxford University Press, 2003.
- Summary:
- The use of clinical and laboratory information to detect conditions and predict patient outcomes is a mainstay of medical practice. This book describes the statistical concepts and techniques for evaluating the accuracy of medical tests. Main topics include: estimation and comparison of measures of accuracy, including receiver operating characteristic curves; regression frameworks for assessing factors that influence test accuracy and for comparing tests while adjusting for such factors; and sample size calculations and other issues pertinent to study design. Problems relating to missing and imperfect reference data are discussed in detail. Additional topics include: meta-analysis for summarizing the results of multiple studies of a test; the evaluation of markers for predicting event time data; and procedures for combining the results of multiple tests to improve classification. A variety of worked examples are provided. The Statistical Evaluation of Medical Tests for Classification and Prediction will be of interest to quantitative researchers and to practicing statisticians. The book also covers the theoretical foundations for statistical inference and will be of interest to academic statisticians.
- Contents:
- 1.1 The medical test 1
- 1.1.1 Tests, classification and the broader context 1
- 1.1.2 Disease screening versus diagnosis 2
- 1.1.3 Criteria for a useful medical test 2
- 1.2 Elements of study design 3
- 1.2.1 Scale for the test result 4
- 1.2.2 Selection of study subjects 4
- 1.2.3 Comparing tests 5
- 1.2.4 Test integrity 5
- 1.2.5 Sources of bias 6
- 1.3 Examples and datasets 8
- 1.3.2 The CASS dataset 8
- 1.3.3 Pancreatic cancer serum biomarkers study 10
- 1.3.4 Hepatic metastasis ultrasound study 10
- 1.3.5 CARET PSA biomarker study 10
- 1.3.6 Ovarian cancer gene expression study 11
- 1.3.7 Neonatal audiology data 11
- 1.3.8 St Louis prostate cancer screening study 11
- 1.4 Topics and organization 11
- 2 Measures of accuracy for binary tests 14
- 2.1 Measures of accuracy 14
- 2.1.2 Disease-specific classification probabilities 14
- 2.1.3 Predictive values 16
- 2.1.4 Diagnostic likelihood ratios 17
- 2.2 Estimating accuracy with data 21
- 2.2.1 Data from a cohort study 21
- 2.2.2 Proportions: (FPF, TPF) and (PPV, NPV) 22
- 2.2.3 Ratios of proportions: DLRs 24
- 2.2.4 Estimation from a case-control study 25
- 2.2.5 Merits of case-control versus cohort studies 26
- 2.3 Quantifying the relative accuracy of tests 27
- 2.3.1 Comparing classification probabilities 28
- 2.3.2 Comparing predictive values 29
- 2.3.3 Comparing diagnostic likelihood ratios 30
- 2.3.4 Which test is better? 31
- 3 Comparing binary tests and regression analysis 35
- 3.1 Study designs for comparing tests 35
- 3.1.1 Unpaired designs 35
- 3.1.2 Paired designs 36
- 3.2 Comparing accuracy with unpaired data 37
- 3.2.1 Empirical estimators of comparative measures 37
- 3.2.2 Large sample inference 38
- 3.3 Comparing accuracy with paired data 41
- 3.3.1 Sources of correlation 41
- 3.3.2 Estimation of comparative measures 41
- 3.3.3 Wide or long data representations 42
- 3.3.4 Large sample inference 43
- 3.3.5 Efficiency of paired versus unpaired designs 44
- 3.3.6 Small sample properties 45
- 3.3.7 The CASS study 45
- 3.4 The regression modeling framework 48
- 3.4.1 Factors potentially affecting test performance 48
- 3.4.2 Questions addressed by regression modeling 50
- 3.4.3 Notation and general set-up 50
- 3.5 Regression for true and false positive fractions 51
- 3.5.1 Binary marginal GLM models 51
- 3.5.2 Fitting marginal models to data 51
- 3.5.3 Illustration: factors affecting test accuracy 53
- 3.5.4 Comparing tests with regression analysis 55
- 3.6 Regression modeling of predictive values 58
- 3.6.1 Model formulation and fitting 58
- 3.6.2 Comparing tests 59
- 3.6.3 The incremental value of a test for prediction 59
- 3.7 Regression models for DLRs 61
- 3.7.1 The model form 61
- 3.7.2 Fitting the DLR model 61
- 3.7.3 Comparing DLRs of two tests 61
- 3.7.4 Relationships with other regression models 62
- 4 The receiver operating characteristic curve 66
- 4.1.1 Examples of non-binary tests 66
- 4.1.2 Dichotomizing the test result 66
- 4.2 The ROC curve for continuous tests 67
- 4.2.2 Mathematical properties of the ROC curve 68
- 4.2.3 Attributes of and uses for the ROC curve 71
- 4.2.4 Restrictions and alternatives to the ROC curve 75
- 4.3 Summary indices 76
- 4.3.1 The area under the ROC curve (AUC) 77
- 4.3.2 The ROC(t[subscript 0]) and partial AUC 79
- 4.3.4 Measures of distance between distributions 81
- 4.4 The binormal ROC curve 81
- 4.4.1 Functional form 82
- 4.4.2 The binormal AUC 83
- 4.4.3 The binormal assumption 84
- 4.5 The ROC for ordinal tests 85
- 4.5.1 Tests with ordered discrete results 85
- 4.5.2 The latent decision variable model 86
- 4.5.3 Identification of the latent variable ROC 86
- 4.5.4 Changes in accuracy versus thresholds 88
- 4.5.5 The discrete ROC curve 89
- 4.5.6 Summary measures for the discrete ROC curve 92
- 5 Estimating the ROC curve 96
- 5.1.1 Approaches 96
- 5.1.2 Notation and assumptions 96
- 5.2 Empirical estimation 97
- 5.2.1 The empirical estimator 97
- 5.2.2 Sampling variability at a threshold 99
- 5.2.3 Sampling variability of ROC[subscript e](t) 99
- 5.2.4 The empirical AUC and other indices 103
- 5.2.5 Variability in the empirical AUC 104
- 5.2.6 Comparing empirical ROC curves 107
- 5.2.7 Illustration: pancreatic cancer biomarkers 109
- 5.2.8 Discrete ordinal data ROC curves 110
- 5.3 Modeling the test result distributions 111
- 5.3.1 Fully parametric modeling 111
- 5.3.2 Semiparametric location-scale models 112
- 5.3.3 Arguments against modeling test results 114
- 5.4 Parametric distribution-free methods: ordinal tests 114
- 5.4.1 The binormal latent variable framework 115
- 5.4.2 Fitting the discrete binormal ROC function 117
- 5.4.3 Generalizations and comparisons 118
- 5.5 Parametric distribution-free methods: continuous tests 119
- 5.5.1 LABROC 119
- 5.5.2 The ROC-GLM estimator 120
- 5.5.3 Inference with parametric distribution-free methods 124
- 5.8 Proofs of theoretical results 128
- 6 Covariate effects on continuous and ordinal tests 130
- 6.1 How and why? 130
- 6.1.2 Aspects to model 131
- 6.1.3 Omitting covariates/pooling data 132
- 6.2 Reference distributions 136
- 6.2.1 Non-diseased as the reference population 136
- 6.2.2 The homogeneous population 137
- 6.2.3 Nonparametric regression quantiles 139
- 6.2.4 Parametric estimation of S[subscript D,Z] 140
- 6.2.5 Semiparametric models 141
- 6.2.6 Application 141
- 6.2.7 Ordinal test results 143
- 6.3 Modeling covariate effects on test results 144
- 6.3.2 Induced ROC curves for continuous tests 144
- 6.3.3 Semiparametric location-scale families 148
- 6.3.4 Induced ROC curves for ordinal tests 150
- 6.3.5 Random effect models for test results 150
- 6.4 Modeling covariate effects on ROC curves 151
- 6.4.1 The ROC-GLM regression model 152
- 6.4.2 Fitting the model to data 154
- 6.4.3 Comparing ROC curves 157
- 6.5 Approaches to ROC regression 164
- 6.5.1 Modeling ROC summary indices 164
- 6.5.2 A qualitative comparison 164
- 7 Incomplete data and imperfect reference tests 168
- 7.1 Verification biased sampling 168
- 7.1.1 Context and definition 168
- 7.1.2 The missing at random assumption 170
- 7.1.3 Correcting for bias with Bayes' theorem 170
- 7.1.4 Inverse probability weighting/imputation 171
- 7.1.5 Sampling variability of corrected estimates 172
- 7.1.6 Adjustments for other biasing factors 175
- 7.1.7 A broader context 177
- 7.1.8 Non-binary tests 179
- 7.2 Verification restricted to screen positives 180
- 7.2.1 Extreme verification bias 180
- 7.2.2 Identifiable parameters for a single test 181
- 7.2.3 Comparing tests 183
- 7.2.4 Evaluating covariate effects on (DP, FP) 185
- 7.2.5 Evaluating covariate effects on (TPF, FPF) and on prevalence 187
- 7.2.6 Evaluating covariate effects on (rTPF, rFPF) 189
- 7.2.7 Alternative strategies 193
- 7.3 Imperfect reference tests 194
- 7.3.2 Effects on accuracy parameters 194
- 7.3.3 Classic latent class analysis 197
- 7.3.4 Relaxing the conditional independence assumption 200
- 7.3.5 A critique of latent class analysis 203
- 7.3.6 Discrepant resolution 205
- 7.3.7 Composite reference standards 206
- 7.6 Proofs of theoretical results 210
- 8 Study design and hypothesis testing 214
- 8.1 The phases of medical test development 214
- 8.1.1 Research as a process 214
- 8.1.2 Five phases for the development of a medical test 215
- 8.2 Sample sizes for phase 2 studies 218
- 8.2.1 Retrospective validation of a binary test 218
- 8.2.2 Retrospective validation of a continuous test 220
- 8.2.3 Sample size based on the AUC 224
- 8.2.4 Ordinal tests 228
- 8.3 Sample sizes for phase 3 studies 229
- 8.3.1 Comparing two binary tests: paired data 229
- 8.3.2 Comparing two binary tests: unpaired data 233
- 8.3.3 Evaluating population effects on test performance 233
- 8.3.4 Comparisons with continuous test results 234
- 8.3.5 Estimating the threshold for screen positivity 237
- 8.3.6 Remarks on phase 3 analyses 238
- 8.4 Sample sizes for phase 4 studies 239
- 8.4.1 Designs for inference about (FPF, TPF) 239
- 8.4.2 Designs for predictive values 241
- 8.4.3 Designs for (FP, DP) 243
- 8.4.4 Selected verification of screen negatives 244
- 8.5 Phase 5 245
- 8.6 Matching and stratification 246
- 9.1 Meta-analysis 253
- 9.1.1 Goals of meta-analysis 253
- 9.1.2 Design of a meta-analysis study 253
- 9.1.3 The summary ROC curve 255
- 9.1.4 Binomial regression models 258
- 9.2 Incorporating the time dimension 259
- 9.2.2 Incident cases and long-term controls 260
- 9.2.3 Interval cases and controls 263
- 9.2.4 Predictive values 266
- 9.2.5 Longitudinal measurements 266
- 9.3 Combining multiple test results 267
- 9.3.1 Boolean combinations 267
- 9.3.2 The likelihood ratio principle 269
- 9.3.3 Optimality of the risk score 271
- 9.3.4 Estimating the risk score 274
- 9.3.5 Development and assessment of the combination score 276
- 9.4.2 New applications and new technologies 277.
- Notes:
- Includes bibliographical references and index.
- ISBN:
- 0198509847
- OCLC:
- 51108967