The phylogenetic handbook : a practical approach to DNA and protein phylogeny / edited by Marco Salemi and Anne-Mieke Vandamme.

Chemistry Library - Books QP624 .P485 2003

Available This item is available for access.
Format:
Book
Contributor:
Salemi, Marco, 1968-
Vandamme, Anne-Mieke, 1960-
Language:
English
Subjects (All):
DNA--Analysis--Handbooks, manuals, etc.
DNA.
Proteins--Analysis--Handbooks, manuals, etc.
Proteins.
Cladistic analysis--Handbooks, manuals, etc.
Cladistic analysis.
Proteins--Analysis.
DNA--Analysis.
Genre:
Handbooks and manuals.
Physical Description:
xxiv, 406 pages : illustrations (some color) ; 26 cm
Place of Publication:
Cambridge, UK ; New York : Cambridge University Press, 2003.
Summary:
The Phylogenetic Handbook is a broad introduction to the theory and practice of nucleotide and amino-acid phylogenetic analysis. As a unique feature of this book, each chapter contains an extensive practical section, in which step-by-step exercises on real data sets introduce the reader to the most widely used phylogeny software, including CLUSTAL, PHYLIP, PAUP*, DAMBE, TREE-PUZZLE, TREECON, SplitsTree, TreeView, MEGA2, PAML, and SimPlot. Chapters 1 through 10 provide a strong background in basic topics such as the use of sequence databases, alignment algorithms, tree-building methods, estimation of genetic distances, and testing models of evolution. Additional chapters briefly survey special topics in evolution; for example, modeling evolution with networks, studying recombination, testing for positive selection, and methods in population genetics. The book will be an invaluable resource for advanced-level undergraduate and graduate students, as well as for professionals working in the fields of molecular biology and evolution.
Contents:
1 Basic concepts of molecular evolution / Anne-Mieke Vandamme 1
1.1 Genetic information 1
1.2 Population dynamics 6
1.3 Data used for molecular phylogenetic analysis 10
1.4 What is a phylogenetic tree? 14
1.5 Methods to infer phylogenetic trees 17
1.6 Is evolution always tree-like? 21
2 Sequence databases 24
Theory / Guy Bottu, Marc Van Ranst 24
2.1 General nucleic acid sequence databases 24
2.2 General protein sequence databases 26
2.3 Nonredundant sequence databases 27
2.4 Specialized sequence databases 28
2.5 Databases with aligned protein sequences 29
2.6 Database documentation search 30
2.6.1 Text-string searching 30
2.6.2 Searching by index 30
2.7 ENTREZ database 32
2.8 Sequence similarity searching: BLAST 33
Practice / Marco Salemi 37
2.9 File formats 37
2.10 Three example data sets 40
2.10.1 Preparing input files: HIV/SIV example data set 41
3 Multiple alignment 45
Theory / Des Higgins 45
3.2 The problem of repeats 46
3.3 The problem of substitutions 47
3.4 The problem of gaps 50
3.5 Testing multiple-alignment methods 51
3.6 Multiple-alignment algorithms 52
3.6.1 Dot-matrix sequence comparison 52
3.6.2 Dynamic programming 54
3.6.3 Genetic algorithms 55
3.6.4 Other algorithms 55
3.7 Progressive alignment 55
3.7.1 Clustal 57
3.7.2 T-Coffee 58
3.8 Hidden Markov models 58
3.9 Nucleotide sequences versus amino-acid sequences 59
Practice / Des Higgins, Marco Salemi 61
3.10 Searching for homologous sequences with BioEdit 61
3.11 File formats for Clustal 63
3.12 Access to ClustalW and ClustalX 64
3.13 Aligning the HIV/SIV sequences with ClustalX 64
3.14 Aligning nucleotide sequences in a coding region with DAMBE 66
3.15 Adding sequences to preexisting alignments 67
3.16 Editing and viewing multiple alignments 68
3.17 Databases of alignments 69
4 Nucleotide substitution models 72
Theory / Korbinian Strimmer, Arndt von Haeseler 72
4.2 Observed and expected distances 73
4.3 Number of mutations in a given time interval *(optional) 74
4.4 Nucleotide substitutions as a homogeneous Markov process 77
4.4.1 The Jukes and Cantor (JC69) model 79
4.5 Derivation of Markov process *(optional) 80
4.5.1 Inferring the expected distances 83
4.6 Nucleotide substitution models 83
4.6.1 Rate heterogeneity over sites 85
Practice: The PHYLIP and TREE-PUZZLE software packages / Marco Salemi 88
4.7 Software packages 88
4.8 Jukes and Cantor (JC69) genetic distances 90
4.9 Kimura 2-parameters (K80) and F84 genetic distances 91
4.10 More complex models 92
4.10.1 Modeling rate heterogeneity over sites 93
4.11 The problem of substitution saturation 95
4.12 Choosing among different evolutionary models 97
5 Phylogeny inference based on distance methods 101
Theory / Yves Van de Peer 101
5.2 Tree-inferring methods based on genetic distances 103
5.2.1 Cluster analysis (UPGMA and WPGMA) 103
5.2.2 Minimum evolution and neighbor-joining 107
5.2.3 Other distance methods 113
5.3 Evaluating the reliability of inferred trees 115
5.3.1 Bootstrap analysis 115
5.3.2 Jackknifing 118
Practice / Marco Salemi 120
5.5 The TreeView program 120
5.6 Procedure to estimate distance-based phylogenetic trees with PHYLIP 120
5.7 Inferring an NJ tree for the mtDNA data set 121
5.8 Inferring a Fitch-Margoliash tree for the mtDNA data set 125
5.9 Inferring an NJ tree for the HIV-1 data set 125
5.10 Bootstrap analysis with PHYLIP 126
5.11 Other programs 133
6 Phylogeny inference based on maximum-likelihood methods with TREE-PUZZLE 137
Theory / Arndt von Haeseler, Korbinian Strimmer 137
6.2 The formal framework 140
6.2.1 The simple case: Maximum-likelihood tree for two sequences 140
6.2.2 The complex case 141
6.3 Computing the probability of an alignment for a fixed tree 142
6.3.1 Felsenstein's pruning algorithm 144
6.4 Finding a maximum-likelihood tree 145
6.4.1 The quartet-puzzling algorithm 146
6.5 Estimating the model parameters with maximum likelihood 149
6.6 Likelihood-mapping analysis 150
Practice / Arndt von Haeseler, Korbinian Strimmer 153
6.7 Software packages 153
6.8 An illustrative example of quartet-puzzling tree reconstruction 153
6.9 Likelihood-mapping analysis of the HIV data set 156
7 Phylogeny inference based on parsimony and other methods using PAUP* 160
Theory / David L. Swofford, Jack Sullivan 160
7.2 Parsimony analysis - background 161
7.3 Parsimony analysis - methodology 163
7.3.1 Calculating the length of a given tree under the parsimony criterion 163
7.4 Searching for optimal trees 166
7.4.1 Exact methods 171
7.4.2 Approximate methods 175
Practice / David L. Swofford, Jack Sullivan 182
7.5 Analyzing data with PAUP* through the command-line interface 182
7.6 Basic parsimony analysis and tree-searching 186
7.7 Analysis using distance methods 193
7.8 Analysis using maximum-likelihood methods 196
8 Phylogenetic analysis using protein sequences 207
Theory / Fred R. Opperdoes 207
8.2 Why protein sequences? 209
8.2.1 The genetic code 210
8.2.2 Codon bias 210
8.2.3 Long time horizon 210
8.2.4 Phylogenetic noise reduction 211
8.2.5 Introns and noncoding DNA 211
8.2.6 Multigene families and post-transcriptional editing 212
8.3 Measurement of sequence divergence in proteins: The PAM 213
8.4 Alignment of protein sequences 215
8.4.1 Sequence retrieval and multiple-sequence alignment 219
8.4.2 Secondary-structure-based alignment 219
8.4.3 Prodom, Pfam, and Blocks databases 220
8.4.4 Manual adjustment of a protein alignment 220
8.5 Tree-building methods for protein phylogeny 221
8.6 Some good advice 224
Practice / Fred R. Opperdoes 226
8.7 A phylogenetic analysis of the Leishmanial GPD gene carried out via the Internet 226
8.8 A comparison of the trypanosomatid phylogeny from nucleotide and protein sequences 230
8.9 Implementing different evolutionary models with DAMBE and TREE-PUZZLE 233
9 Analysis of nucleotide sequences using TREECON 236
Theory / Yves Van de Peer 236
9.2 TREECON, distance trees, and among-site rate variation 236
9.2.1 Taking into account among-site rate variation: An example 241
Practice / Yves Van de Peer 246
9.4 The TREECON software package 246
9.5 Implementation 246
9.6 Substitution rate calibration 251
10 Selecting models of evolution 256
Theory / David Posada 256
10.1 Models of evolution and phylogeny reconstruction 256
10.2 The relevance of models of evolution 257
10.3 Selecting models of evolution 257
10.4 The likelihood ratio test 258
10.4.1 LRTs and parametric bootstrapping 259
10.4.2 Hierarchical LRTs 260
10.4.3 Dynamical LRTs 261
10.5 Information criteria 263
10.5.1 AIC 264
10.5.2 BIC 264
10.6 Fit of a single model to the data 264
10.7 Testing the molecular clock hypothesis 265
10.7.1 The relative rate test 266
10.7.2 LRT of the global molecular clock 267
Practice / David Posada 270
10.8 The model-selection procedure 270
10.9 The program MODELTEST 273
10.10 Implementing the LRT of the molecular clock using PAUP* 275
10.11 Selecting the best-fit model in the example data sets 276
10.11.1 Vertebrate mtDNA 277
10.11.2 HIV envelope gene 278
10.11.3 G3PDH protein 279
11 Analysis of coding sequences 283
Theory / Yoshiyuki Suzuki, Takashi Gojobori 283
11.2 Mutation fraction methods 285
11.2.1 Method of Nei and Gojobori (NG86 method) 285
11.2.2 Method of Zhang et al. (ZRN98 method) 287
11.2.3 Method of Ina (I95 method) 288
11.3 Degenerate site methods 290
11.3.1 Method of Li et al. (LWL85 method) 291
11.3.2 Method of Pamilo and Bianchi, and Li (PBL93 method) 294
11.4 Codon model methods 294
11.4.1 Method of Muse (M96 method) 295
11.4.2 Method of Yang and Nielsen (YN98 method) 296
11.5 Methods for estimating d[subscript S] and d[subscript N] at single codon sites 296
11.5.1 Method of Suzuki and Gojobori (SG99 method) 297
11.6 Test of neutrality for two sequences 298
11.6.1 Z test 298
11.6.2 Likelihood ratio test (LRT) 298
11.6.3 Window analysis 299
11.7 Test of neutrality at single codon sites 299
11.7.1 Method of Nielsen and Yang (1998) (NY98 method) 300
11.7.2 SG99 method 300
Practice / Yoshiyuki Suzuki, Takashi Gojobori 302
11.8 Software for analyzing coding sequences 302
11.9 Estimation of d[subscript S] and d[subscript N] in an HCV data set 302
11.9.1 Estimation of d[subscript S] and d[subscript N] with NG86, ZRN98, LWL85, and PBL93 methods (MEGA2) 303
11.9.2 Estimation of d[subscript S] and d[subscript N] with YN98 method (PAML) 304
11.9.3 Comparing different estimates of d[subscript S] and d[subscript N] 305
11.10 An example of window analysis 306
11.11 Detection of positive selection at single amino acid sites 307
12 SplitsTree: A network-based tool for exploring evolutionary relationships in molecular data 312
Theory / Vincent Moulton 312
12.1 Exploring evolutionary relationships through networks 312
12.2 An introduction to split-decomposition theory 314
12.2.1 The Buneman tree 315
12.2.2 Split decomposition 316
12.3 From weakly compatible splits to networks 318
Practice / Vincent Moulton 320
12.4 The SplitsTree program 320
12.5 Using SplitsTree on the mtDNA data set 320
12.6 Using SplitsTree on the HIV-1 data set 324
13 Tetrapod phylogeny and data exploration using DAMBE 329
Theory / Xuhua Xia, Zheng Xie 329
13.1 The phylogenetic problem and the sequence data 329
13.2 Results of routine phylogenetic analyses without data exploration 330
13.3 Distance-based statistical test of alternative phylogenetic trees (optional) 332
13.4 Likelihood-based statistical tests of alternative phylogenetic trees 333
13.5 Data exploration 335
13.5.1 Nucleotide frequencies 335
13.5.2 Substitution saturation and the rate heterogeneity over sites 337
13.5.3 The pattern of nucleotide substitution 338
13.5.4 Insertion and deletion as phylogenetic characters 339
Practice / Xuhua Xia, Zheng Xie 342
13.6 Data exploration with DAMBE 342
13.6.1 Nucleotide frequencies 342
13.6.2 Basic phylogenetic reconstruction 342
13.6.3 Rate heterogeneity over sites estimated through reconstruction of ancestral sequences 343
13.6.4 Empirical substitution pattern 344
13.6.5 Testing alternative phylogenetic hypotheses with the distance-based method 344
13.6.6 Testing alternative phylogenetic hypotheses with the likelihood-based method 345
14 Detecting recombination in viral sequences 348
Theory / Mika Salminen 348
14.1 Introduction and theoretical background to exploring recombination in viral sequences 348
14.2 Requirements for detecting recombination 349
14.3 Theoretical basis for methods to detect recombination 351
14.4 Examples of viral recombination 360
Practice / Mika Salminen 362
14.5 Existing tools for analysis of recombination 362
14.6 Analyzing example sequences to visualize recombination 364
14.6.1 Exercise 1: Working with Simplot 364
14.6.2 Exercise 2: Mapping recombination with Simplot 368
14.6.3 Exercise 3: Using the "groups" feature of Simplot 369
14.6.4 Exercise 4: Using SplitsTree to visualize recombination 373
15 Lamarc: Estimating population genetic parameters from molecular data 378
Theory / Mary K. Kuhner 378
15.2 Basis of the Metropolis-Hastings MCMC sampler 379
15.2.1 Random sample 381
15.2.2 Stability 381
15.2.3 No other forces 381
15.2.4 Evolutionary model 381
15.2.5 Large population relative to sample 382
15.2.6 Adequate run time 382
Practice / Mary K. Kuhner 384
15.3 The LAMARC software package 384
15.3.1 Fluctuate (Coalesce) 384
15.3.2 Migrate 384
15.3.3 Recombine 385
15.3.4 Lamarc 386
15.4 Starting values 386
15.5 Space and time 387
15.6 Sample size considerations 387
15.7 Virus-specific issues 388
15.7.1 Multiple loci 388
15.7.2 Rapid growth rates 388
15.7.3 Sequential samples 389
15.8 An exercise with LAMARC 389
15.8.1 Exercise using FLUCTUATE 390
15.8.2 Exercise using RECOMBINE 395.
Notes:
Includes bibliographical references and index.
ISBN:
052180390X
OCLC:
50155249
