Electronic Review of Computer Books

Vital Statistics

Title Mastering Regular Expressions
Author Jeffrey E. F. Friedl
Publisher O'Reilly and Associates
Sebastopol, California
http://www.ora.com
Copyright 1997
ISBN 1-56592-257-3
Pages 342
Price $29.95


Staying Mentally Regular in the World According to Grep

In the forties, Warren McCulloch and Walter Pitts created neuron-level models of how the nervous system operates. The mathematician Stephen Kleene later described these models using a mathematical notation he called regular expressions. Ken Thompson incorporated that notation into qed (the grandfather of the UNIX ed) and eventually into grep. Ever since, regular expressions have steadily seeped into UNIX and UNIX-like utilities. These are not the regular expressions used to explain nerves; these are the ones used to get on them.

Grep, egrep, vi, sed, lex, awk, emacs, Perl, Tcl, and Python all support regular expressions. In fact, regular expressions (regexes) are an essential part of these utilities. Unfortunately, the documentation rarely treats them as important. Man pages mention regexes only in passing, usually with no theoretical explanation and very little practical discussion. This leaves most of us crafting regexes as if we were playing the board game Battleship -- we keep guessing until we sink a solution.

Mastering Regular Expressions is an important work about an often overlooked concept that permeates UNIX: the regular expression. It is clear and practical. The regular expression is explained by concept and not by rote. Friedl is trying to get us to think in regular expressions and not to blindly parrot (or pirate) his examples.

The book poses quick do-you-really-understand questions at key spots in the text. It is laid out so that you have to turn the page to see the answer to a question. Unfortunately, this prevents the "accidental" hint. Be honest. It's human nature to have wandering eyes when a question is too hard. If you can't answer a question in this book, then you know for a fact that you can't answer it. It's too hard to "accidentally" turn a page.

That's only one example of the book's thoughtful design. Its typography is the best use of type as a communication tool that I have seen. With code snippets like "if ($input =~ m/^([-+]?[0-9]+(\.[0-9]*)?)([CF])$/)", the typography takes you by the hand with a combination of underlining, shading, and contrived characters that highlight the important sections of the code. Whereas in most books the "Typographical Conventions Used" section is fluff, read and memorize this section before you cross the river regex with Friedl. You can't journey to the other side without it.
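For readers who want to see what that snippet is doing, here is a minimal sketch that carries the same pattern over from Perl to Python. The helper name parse_temperature and the sample inputs are my own, chosen for illustration; they do not come from the book. The first group captures the signed number, the nested second group its optional fractional part, and the third group the scale letter.

```python
import re

# The pattern quoted above, moved from Perl's m/.../ into a Python raw string.
# Group 1: the signed number, group 2: optional fractional part, group 3: C or F.
TEMPERATURE = re.compile(r'^([-+]?[0-9]+(\.[0-9]*)?)([CF])$')


def parse_temperature(text):
    """Return (value, scale) for input such as '-12.5C', or None if no match.

    Hypothetical helper, not taken from the book.
    """
    match = TEMPERATURE.match(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(3)


if __name__ == "__main__":
    for sample in ["98.6F", "-40C", "warm"]:
        print(sample, parse_temperature(sample))
```

Note that the pattern insists on at least one digit before the decimal point, so an input like ".5C" is rejected, while "12.C" is accepted because the digits after the point are optional.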

So, what exactly is a regular expression? What makes it regular? Don't look for those answers here. Friedl gives you practical concepts, not a theoretical framework. He defines regular expressions as "the key to powerful, flexible, and efficient text processing." He clarifies that with "regular expressions...allow you to describe and parse text." That is the closest thing to a definition of regular expressions in the book. It is like defining the sun as our source of heat and light while failing to mention that it is the ball of flame in the center of the solar system. Then again, the title of the book is not The Theory of Regular Expressions but Mastering Regular Expressions. Look for practical advice on crafting efficient regexes at your desk, not for conversation tidbits in the break room.

-- Don Bryson (dbryson@tclock.com)

 


Quick Rating

Readability 3 Stars
Originality 3 Stars
Organization 3 Stars
Accuracy 3 Stars
Consistency 3 Stars
Depth 2 Stars
Timeliness 3 Stars
Editing 2 Stars
Design 4 Stars
Overall Value 3 Stars

Explanation of ERCB rating scale: No stars = unacceptable, 1 Star = marginal, 2 Stars = average, 3 Stars = above average, 4 Stars = exceptional.


Copyright © 1997 Electronic Review of Computer Books
Created 09/20/97 / Last modified 09/20/97 / webmaster@ercb.com