Mastering regular expressions / Jeffrey E.F. Friedl.
LIBRA QA76.9.T48 F75 2002
Available from offsite location
- Format:
- Book
- Author/Creator:
- Friedl, Jeffrey E. F.
- Language:
- English
- Subjects (All):
- Text processing (Computer science).
- Programming languages (Electronic computers).
- Electronic data processing.
- Physical Description:
- xxii, 460 pages : illustrations ; 24 cm
- Edition:
- Second edition.
- Place of Publication:
- Beijing ; Sebastopol, CA : O'Reilly, 2002.
- Summary:
- Regular expressions are an extremely powerful tool for manipulating text and data. They have spread like wildfire in recent years, and are now offered as standard features in Perl, Java, VB.NET and C# (and any language using the .NET Framework), PHP, Python, Ruby, Tcl, MySQL, awk, Emacs, and many other popular tools and languages. If you don't use regular expressions yet, you will discover in this book a whole new world of mastery over your data. If you already use them, you'll appreciate this book's unprecedented detail and breadth of coverage. If you think you know all you need to know about regular expressions, this book is a stunning eye-opener.
- Despite their wide availability, flexibility, and unparalleled power, regular expressions are frequently underutilized. Regular expressions allow you to code complex and subtle text processing that you never imagined could be automated. Regular expressions can save you time and aggravation. They can be used to craft elegant solutions to a wide range of problems.
- A command of regular expressions is an invaluable skill. Yet what is power in the hands of an expert can be fraught with peril for the unwary. Mastering Regular Expressions will help you navigate the minefield to becoming an expert. Once you've mastered regular expressions, they'll become an invaluable part of your toolkit. You will wonder how you ever got by without them.
- Mastering Regular Expressions, Second Edition, has been thoroughly updated to include all the new features of Perl 5.8 as well as several other languages, including Java, VB.NET, C#, Python, JavaScript, Tcl, and Ruby. Written in the lucid, entertaining tone that makes a complex, dry topic crystal-clear to thousands of programmers, and sprinkled with solutions to complex real-world problems, Mastering Regular Expressions offers a wealth of information that you can put to immediate use.
- Contents:
- 1 Introduction to Regular Expressions 1
- Solving Real Problems 2
- Regular Expressions as a Language 4
- The Filename Analogy 4
- The Language Analogy 5
- The Regular-Expression Frame of Mind 6
- If You Have Some Regular-Expression Experience 6
- Searching Text Files: Egrep 6
- Egrep Metacharacters 8
- Start and End of the Line 8
- Character Classes 9
- Matching Any Character with Dot 11
- Alternation 13
- Ignoring Differences in Capitalization 14
- Word Boundaries 15
- Optional Items 17
- Other Quantifiers: Repetition 18
- Parentheses and Backreferences 20
- The Great Escape 22
- Expanding the Foundation 23
- Linguistic Diversification 23
- The Goal of a Regular Expression 23
- Regular Expression Nomenclature 27
- Improving on the Status Quo 30
- 2 Extended Introductory Examples 35
- A Short Introduction to Perl 37
- Matching Text with Regular Expressions 38
- Toward a More Real-World Example 40
- Side Effects of a Successful Match 40
- Intertwined Regular Expressions 43
- Intermission 49
- Modifying Text with Regular Expressions 50
- Example: Form Letter 50
- Example: Prettifying a Stock Price 51
- Automated Editing 53
- A Small Mail Utility 53
- Adding Commas to a Number with Lookaround 59
- Text-to-HTML Conversion 67
- That Doubled-Word Thing 77
- 3 Overview of Regular Expression Features and Flavors 83
- A Casual Stroll Across the Regex Landscape 85
- The Origins of Regular Expressions 85
- Care and Handling of Regular Expressions 93
- Integrated Handling 94
- Procedural and Object-Oriented Handling 95
- A Search-and-Replace Example 97
- Search and Replace in Other Languages 99
- Care and Handling: Summary 101
- Strings, Character Encodings, and Modes 101
- Strings as Regular Expressions 101
- Character-Encoding Issues 105
- Regex Modes and Match Modes 109
- Common Metacharacters and Features 112
- Character Representations 114
- Character Classes and Class-Like Constructs 117
- Anchors and Other "Zero-Width Assertions" 127
- Comments and Mode Modifiers 133
- Grouping, Capturing, Conditionals, and Control 135
- Guide to the Advanced Chapters 141
- 4 The Mechanics of Expression Processing 143
- Start Your Engines! 143
- Two Kinds of Engines 144
- New Standards 144
- Regex Engine Types 145
- From the Department of Redundancy Department 146
- Testing the Engine Type 146
- Match Basics 147
- Rule 1 The Match That Begins Earliest Wins 148
- Engine Pieces and Parts 149
- Rule 2 The Standard Quantifiers Are Greedy 151
- Regex-Directed Versus Text-Directed 153
- NFA Engine: Regex-Directed 153
- DFA Engine: Text-Directed 155
- First Thoughts: NFA and DFA in Comparison 156
- Backtracking 157
- A Really Crummy Analogy 158
- Two Important Points on Backtracking 159
- Saved States 159
- Backtracking and Greediness 162
- More About Greediness and Backtracking 163
- Problems of Greediness 164
- Multi-Character "Quotes" 165
- Using Lazy Quantifiers 166
- Greediness and Laziness Always Favor a Match 167
- The Essence of Greediness, Laziness, and Backtracking 168
- Possessive Quantifiers and Atomic Grouping 169
- Possessive Quantifiers, ?+, *+, ++, and {m,n}+ 172
- The Backtracking of Lookaround 173
- Is Alternation Greedy? 174
- Taking Advantage of Ordered Alternation 175
- NFA, DFA, and POSIX 177
- "The Longest-Leftmost" 177
- POSIX and the Longest-Leftmost Rule 178
- Speed and Efficiency 179
- Summary: NFA and DFA in Comparison 180
- 5 Practical Regex Techniques 185
- Regex Balancing Act 186
- Continuing with Continuation Lines 186
- Matching an IP Address 187
- Working with Filenames 190
- Matching Balanced Sets of Parentheses 193
- Watching Out for Unwanted Matches 194
- Matching Delimited Text 196
- Knowing Your Data and Making Assumptions 198
- Stripping Leading and Trailing Whitespace 199
- HTML-Related Examples 200
- Matching an HTML Tag 200
- Matching an HTML Link 201
- Examining an HTTP URL 203
- Validating a Hostname 203
- Plucking Out a URL in the Real World 205
- Keeping in Sync with Your Data 208
- Parsing CSV Files 212
- 6 Crafting an Efficient Expression 221
- A Simple Change: Placing Your Best Foot Forward 223
- Efficiency Versus Correctness 223
- Advancing Further: Localizing the Greediness 225
- A Global View of Backtracking 228
- More Work for a POSIX NFA 229
- Work Required During a Non-Match 230
- Being More Specific 231
- Alternation Can Be Expensive 231
- Benchmarking 232
- Know What You're Measuring 234
- Benchmarking with Java 234
- Benchmarking with VB.NET 236
- Benchmarking with Python 237
- Benchmarking with Ruby 238
- Benchmarking with Tcl 239
- Common Optimizations 239
- No Free Lunch 240
- Everyone's Lunch is Different 240
- The Mechanics of Regex Application 241
- Pre-Application Optimizations 242
- Optimizations with the Transmission 245
- Optimizations of the Regex Itself 247
- Techniques for Faster Expressions 252
- Common Sense Techniques 254
- Expose Literal Text 255
- Expose Anchors 255
- Lazy Versus Greedy: Be Specific 256
- Split Into Multiple Regular Expressions 257
- Mimic Initial-Character Discrimination 258
- Use Atomic Grouping and Possessive Quantifiers 259
- Lead the Engine to a Match 260
- Unrolling the Loop 261
- Method 1 Building a Regex From Past Experiences 262
- The Real "Unrolling-the-Loop" Pattern 263
- Method 2 A Top-Down View 266
- Method 3 An Internet Hostname 267
- Observations 268
- Using Atomic Grouping and Possessive Quantifiers 268
- Short Unrolling Examples 270
- Unrolling C Comments 272
- The Freeflowing Regex 277
- A Helping Hand to Guide the Match 277
- A Well-Guided Regex is a Fast Regex 279
- 7 Perl 283
- Regular Expressions as a Language Component 285
- Perl's Greatest Strength 286
- Perl's Greatest Weakness 286
- Perl's Regex Flavor 286
- Regex Operands and Regex Literals 288
- How Regex Literals Are Parsed 292
- Regex Modifiers 292
- Regex-Related Perlisms 293
- Expression Context 294
- Dynamic Scope and Regex Match Effects 295
- Special Variables Modified by a Match 299
- The qr/.../ Operator and Regex Objects 303
- Building and Using Regex Objects 303
- Viewing Regex Objects 305
- Using Regex Objects for Efficiency 306
- The Match Operator 306
- Match's Regex Operand 307
- Specifying the Match Target Operand 308
- Different Uses of the Match Operator 309
- Iterative Matching: Scalar Context, with /g 312
- The Match Operator's Environmental Relations 316
- The Substitution Operator 318
- The Replacement Operand 319
- The /e Modifier 319
- Context and Return Value 321
- The Split Operator 321
- Basic Split 322
- Returning Empty Elements 324
- Split's Special Regex Operands 325
- Split's Match Operand with Capturing Parentheses 326
- Fun with Perl Enhancements 326
- Using a Dynamic Regex to Match Nested Pairs 328
- Using the Embedded-Code Construct 331
- Using local in an Embedded-Code Construct 335
- A Warning About Embedded Code and my Variables 338
- Matching Nested Constructs with Embedded Code 340
- Overloading Regex Literals 341
- Problems with Regex-Literal Overloading 344
- Mimicking Named Capture 344
- Perl Efficiency Issues 347
- "There's More Than One Way to Do It" 348
- Regex Compilation, the /o Modifier, qr/.../, and Efficiency 348
- Understanding the "Pre-Match" Copy 355
- The Study Function 359
- Benchmarking 360
- Regex Debugging Information 361
- 8 Java 365
- Judging a Regex Package 366
- Technical Issues 366
- Social and Political Issues 367
- Object Models 368
- A Few Abstract Object Models 368
- Growing Complexity 372
- Packages, Packages, Packages 372
- Why So Many "Perl5" Flavors? 375
- Lies, Damn Lies, and Benchmarks 375
- Recommendations 377
- Sun's Regex Package 378
- Regex Flavor 378
- Using java.util.regex 381
- The Pattern.compile() Factory 383
- The Matcher Object 384
- Other Pattern Methods 390
- A Quick Look at Jakarta-ORO 392
- ORO's Perl5Util 392
- A Mini Perl5Util Reference 393
- Using ORO's Underlying Classes 397
- 9 .NET 399
- .NET's Regex Flavor 400
- Using .NET Regular Expressions 407
- Regex Quickstart 407
- Core Object Details 412
- Creating Regex Objects 413
- Using Regex Objects 415
- Using Match Objects 421
- Using Group Objects 424
- Static "Convenience" Functions 425
- Regex Caching 426
- Support Functions 426
- Advanced .NET 427
- Regex Assemblies 428
- Matching Nested Constructs 430
- Capture Objects 431
- Notes:
- Includes index.
- Local Notes:
- Acquired for the Penn Libraries with assistance from the Ellis D. Williams, College 1865, Endowment Fund.
- ISBN:
- 0596002890
- OCLC:
- 50713300