Genomic Perl : from bioinformatics basics to working code / Rex A. Dwyer.
LIBRA QA76.73.P22 D88 2003 1 v. + disc
Available from offsite location
- Format:
- Book
- Author/Creator:
- Dwyer, Rex A.
- Language:
- English
- Subjects (All):
- Perl (Computer program language).
- Molecular biology--Data processing.
- Molecular biology.
- Bioinformatics.
- Physical Description:
- xvii, 334 pages : illustrations ; 26 cm + 1 CD-ROM (4 3/4 in.)
- Place of Publication:
- Cambridge, UK ; New York : Cambridge University Press, 2003.
- System Details:
- System requirements: ISO 9660 format for Macintosh, Windows, UNIX, and LINUX.
- text file
- Summary:
- This introduction to computational molecular biology will help programmers and biologists learn the skills they need to start work in this important, expanding field. The author explains many of the basic computational problems and gives concise, working programs to solve them in the Perl programming language. With minimal prerequisites, the author explains the biological background for each problem, develops a model for the solution, and then introduces the Perl concepts needed to implement the solution. The book covers pairwise and multiple sequence alignment, fast database searches for homologous sequences, protein motif identification, genome rearrangement, physical mapping, phylogeny reconstruction, satellite identification, sequence assembly, gene finding, and RNA secondary structure. The author focuses on one or two practical approaches for each problem rather than an exhaustive catalog of ideas. His concrete examples and step-by-step approach make it easy to grasp the computational and statistical methods, including dynamic programming, branch-and-bound optimization, greedy methods, maximum likelihood methods, substitution matrices, BLAST searching, and Karlin-Altschul statistics.
- Contents:
- 1 The Central Dogma 1
- 1.1 DNA and RNA 1
- 1.2 Chromosomes 2
- 1.3 Proteins 4
- 1.4 The Central Dogma 5
- 1.5 Transcription and Translation in Perl 7
- 2 RNA Secondary Structure 16
- 2.1 Messenger and Catalytic RNA 16
- 2.2 Levels of RNA Structure 17
- 2.3 Constraints on Secondary Structure 18
- 2.4 RNA Secondary Structures in Perl 20
- 2.4.1 Counting Hydrogen Bonds 21
- 2.4.2 Folding RNA 24
- 3 Comparing DNA Sequences 30
- 3.1 DNA Sequencing and Sequence Assembly 30
- 3.2 Alignments and Similarity 32
- 3.3 Alignment and Similarity in Perl 36
- 4 Predicting Species: Statistical Models 44
- 4.1 Perl Subroutine Libraries 49
- 4.2 Species Prediction in Perl 51
- 5 Substitution Matrices for Amino Acids 55
- 5.1 More on Homology 57
- 5.2 Deriving Substitution Matrices from Alignments 57
- 5.3 Substitution Matrices in Perl 60
- 5.4 The PAM Matrices 65
- 5.5 PAM Matrices in Perl 68
- 6 Sequence Databases 72
- 6.1 FASTA Format 73
- 6.2 GenBank Format 73
- 6.3 GenBank's Feature Locations 75
- 6.4 Reading Sequence Files in Perl 79
- 6.4.1 Object-Oriented Programming in Perl 80
- 6.4.2 The SimpleReader Class 81
- 6.4.3 Hiding File Formats with Method Inheritance 85
- 7 Local Alignment and the BLAST Heuristic 93
- 7.1 The Smith-Waterman Algorithm 94
- 7.2 The BLAST Heuristic 96
- 7.2.1 Preprocessing the Query String 98
- 7.2.2 Scanning the Target String 99
- 7.3 Implementing BLAST in Perl 100
- 8 Statistics of BLAST Database Searches 109
- 8.1 BLAST Scores for Random DNA 109
- 8.2 BLAST Scores for Random Residues 114
- 8.3 BLAST Statistics in Perl 116
- 8.4 Interpreting BLAST Output 123
- 9 Multiple Sequence Alignment I 127
- 9.1 Extending the Needleman-Wunsch Algorithm 128
- 9.2 NP-Completeness 131
- 9.3 Alignment Merging: A Building Block for Heuristics 132
- 9.4 Merging Alignments in Perl 133
- 9.5 Finding a Good Merge Order 137
- 10 Multiple Sequence Alignment II 141
- 10.1 Pushing through the Matrix by Layers 141
- 10.2 Tunnel Alignments 147
- 10.3 A Branch-and-Bound Method 149
- 10.4 The Branch-and-Bound Method in Perl 152
- 11 Phylogeny Reconstruction 155
- 11.1 Parsimonious Phylogenies 155
- 11.2 Assigning Sequences to Branch Nodes 157
- 11.3 Pruning the Trees 160
- 11.4 Implementing Phylogenies in Perl 162
- 11.5 Building the Trees in Perl 168
- 12 Protein Motifs and PROSITE 173
- 12.1 The PROSITE Database Format 174
- 12.2 Patterns in PROSITE and Perl 175
- 12.3 Suffix Trees 177
- 12.3.1 Suffix Links 184
- 12.3.2 The Efficiency of Adding 188
- 12.4 Suffix Trees for PROSITE Searching 189
- 13 Fragment Assembly 196
- 13.1 Shortest Common Superstrings 196
- 13.2 Practical Issues and the PHRAP Program 202
- 13.3 Reading Inputs for Assembly 204
- 13.4 Aligning Reads 207
- 13.5 Adjusting Qualities 212
- 13.6 Assigning Reads to Contigs 217
- 13.7 Developing Consensus Sequences 222
- 14 Coding Sequence Prediction with Dicodons 231
- 14.1 A Simple Trigram Model 232
- 14.2 A Hexagram Model 235
- 14.3 Predicting All Genes 237
- 14.4 Gene Finding in Perl 237
- 15 Satellite Identification 245
- 15.1 Finding Satellites Efficiently 246
- 15.1.1 Suffix Testing 247
- 15.1.2 Satellite Testing 249
- 15.2 Finding Satellites in Perl 251
- 16 Restriction Mapping 257
- 16.1 A Backtracking Algorithm for Partial Digests 258
- 16.2 Partial Digests in Perl 260
- 16.3 Uncertain Measurement and Interval Arithmetic 262
- 16.3.1 Backtracking with Intervals 263
- 16.3.2 Interval Arithmetic in Perl 265
- 16.3.3 Partial Digests with Uncertainty in Perl 267
- 16.3.4 A Final Check for Interval Consistency 269
- 17 Rearranging Genomes: Gates and Hurdles 275
- 17.1 Sorting by Reversals 276
- 17.2 Making a Wish List 278
- 17.3 Analyzing the Interaction Relation 279
- 17.4 Clearing the Hurdles 280
- 17.5 Happy Cliques 284
- 17.6 Sorting by Reversals in Perl 287
- 17.8 Appendix: Correctness of Choice of Wish from Happy Clique 298
- A Drawing RNA Cloverleaves 300
- B Space-Saving Strategies for Alignment 307
- B.1 Finding Similarity Scores Compactly 307
- B.2 Finding Alignments Compactly 309
- C A Data Structure for Disjoint Sets 313
- C.1 Union by Rank 314
- C.2 Path Compression 315
- Notes:
- Includes bibliographical references (pages 318-323) and index.
- Local Notes:
- Acquired for the Penn Libraries with assistance from the Class of 1924 Book Fund.
- ISBN:
- 052180177X
- OCLC:
- 50054317