Analysis of microarray gene expression data / Mei-Ling Ting Lee.

QP624.5.D726 L44 2004

Format: Book
Author/Creator: Lee, Mei-Ling Ting.
Language: English
Subjects (All): DNA microarrays--Statistical methods.; DNA microarrays.; Gene expression--Statistical methods.; Gene expression.; Oligonucleotide Array Sequence Analysis--methods.; Gene Expression--methods.; Statistics.
Medical Subjects: Oligonucleotide Array Sequence Analysis--methods.; Gene Expression--methods.
Physical Description: xvi, 371 pages : illustrations (some color) ; 25 cm
Place of Publication: Boston : Kluwer Academic, [2004]
Summary: After genomic sequencing, microarray technology has emerged as a widely used platform for genomic studies in the life sciences. Microarray technology provides a systematic way to survey DNA and RNA variation. With the abundance of data produced from microarray studies, however, the ultimate impact of the studies on biology will depend heavily on data mining and statistical analysis. The contribution of this book is to provide readers with an integrated presentation of various topics on analyzing microarray data.
Contents: Part I Genome Probing Using Microarrays; 2. DNA, RNA, Proteins, and Gene Expression 7; 2.1 The Molecules of Life 7; 2.2 Genes 8; 2.3 DNA 9; 2.4 RNA 12; 2.5 The Genetic Code 13; 2.6 Proteins 14; 2.7 Gene Expression and Microarrays 15; 2.8 Complementary DNA (cDNA) 16; 2.9 Nucleic Acid Hybridization 16; 3. Microarray Technology 19; 3.1 Transcriptional Profiling 20; 3.1.1 Sequencing-based Transcriptional Profiling 20; 3.1.2 Hybridization-based Transcriptional Profiling 22; 3.2 Microarray Technological Platforms 23; 3.3 Probe Selection and Synthesis 24; 3.4 Array Manufacturing 30; 3.5 Target Labeling 31; 3.6 Hybridization 34; 3.7 Scanning and Image Analysis 35; 3.8 Microarray Data 36; 3.8.1 Spotted Array Data 36; 3.8.2 In-situ Oligonucleotide Array Data 37; 3.9 So I Have My Microarray Data - What's Next? 39; 3.9.1 Confirming Microarray Results 39; 3.9.2 Northern Blot Analysis 40; 3.9.3 Reverse-transcription PCR and Quantitative Real-time RT-PCR 40; 4. Inherent Variability in Array Data 45; 4.1 Genetic Populations 45; 4.2 Variability in Gene Expression Levels 47; 4.2.1 Variability Due to Specimen Sampling 47; 4.2.2 Variability Due to Cell Cycle Regulation 48; 4.2.3 Experimental Variability 48; 4.3 Test the Variability by Replication 50; 4.3.1 Duplicated Spots 50; 4.3.2 Multiple Arrays and Biological Replications 51; 5. Background Noise 53; 5.1 Pixel-by-pixel Analysis of Individual Spots 53; 5.2 General Models for Background Noise 56; 5.2.1 Additive Background Noise 57; 5.2.2 Correction for Background Noise 58; 5.2.3 Example: Replication Test Data Set 59; 5.2.4 Noise Models for GeneChip Arrays 62; 5.2.5 Elusive Nature of Background Noise 63; 6. Transformation and Normalization 67; 6.1 Data Transformations 67; 6.1.1 Logarithmic Transformation 67; 6.1.2 Square Root Transformation 68; 6.1.3 Box-Cox Transformation Family 69; 6.1.4 Affine Transformation 69; 6.1.5 The Generalized-log Transformation 71; 6.2 Data Normalization 72; 6.2.1 Normalization Across G Genes 74; 6.2.2 Example: Mouse Juvenile Cystic Kidney Data Set 75; 6.2.3 Normalization Across G Genes and N Samples 77; 6.2.4 Color Effects and MA Plots 78; 6.2.5 Normalization Based on LOWESS Function 80; 6.2.6 Normalization Based on Rank-invariant Genes 82; 6.2.7 Normalization Based on a Sample Pool 82; 6.2.8 Global Normalization Using ANOVA Models 82; 6.2.9 Other Normalization Issues 83; 7. Missing Values in Array Data 85; 7.1 Missing Values in Array Data 85; 7.1.1 Sources of Problem 85; 7.2 Statistical Classification of Missing Data 86; 7.3 Missing Values in Replicated Designs 88; 7.4 Imputation of Missing Values 89; 8. Saturated Intensity Readings 93; 8.1 Saturated Intensity Readings 93; 8.2 Multiple Power-levels for Spotted Arrays 93; 8.2.1 Imputing Saturated Intensity Readings 95; 8.3 High Intensities in Oligonucleotide Arrays 97; Part II Statistical Models and Analysis; 9. Experimental Design 103; 9.1 Factors Involved in Experiments 103; 9.2 Types of Design Structures 106; 9.3 Common Practice in Microarray Studies 112; 9.3.1 Reference Design 112; 9.3.2 Time-course Experiment 114; 9.3.3 Color Reversal 115; 9.3.4 Loop Design 116; 9.3.5 Example: Time-course Loop Design 117; 10. ANOVA Models for Microarray Data 121; 10.1 A Basic Log-linear Model 121; 10.2 ANOVA With Multiple Factors 123; 10.2.1 Main Effects 123; 10.2.2 Interaction Effects 123; 10.3 A Generic Fixed-Effects ANOVA Model 124; 10.3.1 Estimation for Interaction Effects 126; 10.4 Two-stage Estimation Procedures 126; 10.5 Identifying Differentially Expressed Genes 130; 10.5.1 Standard MSE-based Approach 130; 10.5.2 Other Approaches 132; 10.5.3 Modified MSE-based Approach 132; 10.6 Mixed-effects Models 135; 10.7 ANOVA for Split-plot Design 136; 10.8 Log Intensity Versus Log Ratio 138; 11. Multiple Testing in Microarray Studies 143; 11.1 Hypothesis Testing for Any Individual Gene 143; 11.2 Multiple Testing for the Entire Gene Set 144; 11.2.1 Framework for Multiple Testing 144; 11.2.2 Test Statistic for Each Gene 145; 11.2.3 Two Error Control Criteria in Multiple Testing 146; 11.2.4 Implementation Algorithms 147; 11.2.5 Example of Multiple Testing Algorithms 152; 12. Permutation Tests in Microarray Data 157; 12.2 Permutation Tests in Microarray Studies 160; 12.2.1 Exchangeability in Microarray Designs 160; 12.2.2 Limitation of Having Few Permutations 162; 12.2.3 Pooling Test Results Across Genes 162; 12.3 Lipopolysaccharide-E. coli Data Set 163; 12.3.1 Statistical Model 164; 12.3.2 Permutation Testing and Results 166; 13. Bayesian Methods for Microarray Data 171; 13.1 Mixture Model for Gene Expression 171; 13.1.1 Variations on the Mixture Model 173; 13.1.2 Example of Gamma Models 175; 13.2 Mixture Model for Differential Expression 176; 13.2.1 Mixture Model for Color Ratio Data 176; 13.2.2 Relation of Mixture Model to ANOVA Model 180; 13.2.3 Bayes Interpretation of Mixture Model 182; 13.3 Empirical Bayes Methods 183; 13.3.1 Example of Empirical Bayes Fitting 184; 13.4 Hierarchical Bayes Models 187; 13.4.1 Example of Hierarchical Modeling 189; 14. Power and Sample Size Considerations 193; 14.1 Test Hypotheses in Microarray Studies 194; 14.2 Distributions of Estimated Differential Expression 196; 14.3 Summary Measures of Estimated Differential Expression 196; 14.4 Multiple Testing Framework 197; 14.5 Dependencies of Estimation Errors 199; 14.6 Familywise Type I Error Control 200; 14.6.1 Type I Error Control: the Sidak Approach 201; 14.6.2 Type I Error Control: the Bonferroni Approach 203; 14.7 Familywise Type II Error Control 204; 14.7.1 Type II Error Control: the Sidak Approach 206; 14.7.2 Type II Error Control: the Bonferroni Approach 206; 14.8 Contrast of Planning and Implementation in Multiple Testing 207; 14.9 Power Calculations for Different Summary Measures 208; 14.9.1 Designs with Linear Summary Measure 208; 14.9.2 Numerical Example for Linear Summary 210; 14.9.3 Designs with Quadratic Summary Measure 211; 14.9.4 Numerical Example for Quadratic Summary 213; 14.10 A Bayesian Perspective on Power and Sample Size 214; 14.10.1 Connection to Local Discovery Rates 215; 14.10.2 Representative Local True Discovery Rate 215; 14.10.3 Numerical Example for TDR and FDR 216; 14.11 Applications to Standard Designs 216; 14.11.1 Treatment-control Designs 217; 14.11.2 Sample Size for a Treatment-control Design 218; 14.11.3 Multiple-treatment Designs 221; 14.11.4 Power Table for a Multiple-treatment Design 224; 14.11.5 Time-course and Similar Multiple-treatment Designs 227; 14.12 Relation Between Power, Replication and Design 228; 14.12.1 Effects of Replication 228; 14.12.2 Controlling Sources of Variability 229; 14.13 Assessing Power from Microarray Pilot Studies 230; 14.13.1 Example 1: Juvenile Cystic Kidney Disease 230; 14.13.2 Example 2: Opioid Dependence 231; Part III Unsupervised Exploratory Analysis; 15. Cluster Analysis 237; 15.1 Distance and Similarity Measures 238; 15.2 Distance Measures 239; 15.2.1 Properties of Distance Measures 239; 15.2.2 Minkowski Distance Measures 240; 15.2.3 Mahalanobis Distance 241; 15.3 Similarity Measures 241; 15.3.1 Inner Product 241; 15.3.2 Pearson Correlation Coefficient 242; 15.3.3 Spearman Rank Correlation Coefficient 243; 15.4 Inter-cluster Distance 243; 15.4.1 Mahalanobis Inter-cluster Distance 244; 15.4.2 Neighbor-based Inter-cluster Distance 244; 15.5 Hierarchical Clustering 244; 15.5.1 Single Linkage Method 245; 15.5.2 Complete Linkage Method 245; 15.5.3 Average Linkage Clustering 245; 15.5.4 Centroid Linkage Method 246; 15.5.5 Median Linkage Clustering 246; 15.5.6 Ward's Clustering Method 246; 15.5.7 Applications 246; 15.5.8 Comparisons of Clustering Algorithms 247; 15.6 K-means Clustering 247; 15.7 Bayesian Cluster Analysis 248; 15.8 Two-way Clustering Methods 248; 15.9 Reliability of Clustering Patterns for Microarray Data 249; 16. Principal Components and Singular Value Decomposition 251; 16.1 Principal Component Analysis 251; 16.1.1 Applications of Dominant Principal Components 253; 16.2 Singular-value Decomposition 254; 16.3 Computational Procedures for SVD 255; 16.4 Eigengenes and Eigenarrays 256; 16.5 Fraction of Eigenexpression 256; 16.6 Generalized Singular Value Decomposition 257; 16.7 Robust Singular Value Decomposition 257; 17. Self-Organizing Maps 261; 17.1 The Basic Logic of a SOM 261; 17.2 The SOM Updating Algorithm 265; 17.3 Program GENECLUSTER 267; 17.4 Supervised SOM 268; 17.5 Applications 268; 17.5.1 Using SOM to Cluster Genes 268; 17.5.2 Using SOM to Cluster Tumors 269; 17.5.3 Multiclass Cancer Diagnosis 270; Part IV Supervised Learning Methods; 18. Discrimination and Classification 277; 18.1 Fisher's Linear Discriminant Analysis 278; 18.2 Maximum Likelihood Discriminant Rules 279; 18.3 Bayesian Classification 280; 18.4 k-Nearest Neighbor Classifier 281; 18.5 Neighborhood Analysis 282; 18.6 A Gene-casting Weighted Voting Scheme 283; 18.7 Example: Classification of Leukemia Samples 284; 19. Artificial Neural Networks 287; 19.1 Single-layer Neural Network 288; 19.1.1 Separating Hyperplanes 288; 19.1.2 Class Labels 289; 19.1.3 Decision Rules 290; 19.1.4 Risk Functions 290; 19.1.5 Gradient Descent Procedures 290; 19.1.6 Rosenblatt's Perceptron Method 291; 19.2 General Structure of Multilayer Neural Networks 292; 19.3 Training a Multilayer Neural Network 294; 19.3.1 Sigmoid Functions 294; 19.3.2 Mathematical Formulation 295; 19.3.3 Training Algorithm 296; 19.4 Cancer Classification Using Neural Networks 298; 20. Support Vector Machines 301; 20.1 Geometric Margins for Linearly Separable Groups 301; 20.2 Convex Optimization in the Dual Space 305; 20.3 Support Vectors 306; 20.4 Linearly Nonseparable Groups 307; 20.5 Nonlinear Separating Boundary 308; 20.5.1 Kernel Functions 309; 20.5.2 Kernels Defined by Symmetric Functions 309; 20.5.3 Use of SVM for Classifying Genes 310; 20.6.1 Functional Classification of Genes 311; 20.6.2 SVM and One-versus-All Classification Scheme 313; Sample Size Table for Treatment-control Designs 317; Power Table for Multiple-treatment Designs 327.
Notes: Includes bibliographical references (pages [351]-365) and indexes.
ISBN: 0792370872; 1402077890; 1402077882
OCLC: 54081725
