
Unix Power Tools

41.12. Make Custom grep Commands (etc.) with Perl

All of the various grep-like utilities perform pretty much the same function, with minor differences -- they search for a specified pattern in some or all of a file and display the matching text with varying amounts of surrounding context.

As you use Unix more and more, you will find yourself wanting to do an increasing number of grep-like tasks, but no particular Unix utility will quite suit them all (hence the need for the various grep utilities discussed earlier). You'll start accumulating C programs, awk scripts, and shell scripts to do these different tasks, and you'll be craving one utility that can easily encompass them all so you don't have to waste the disk space for all of those binaries. That utility is Perl (Section 41.1), the "Practical Extraction and Report Language" developed by Larry Wall. According to the documentation accompanying Perl, it is "an interpreted language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information."

For example, to search for a pattern in the header of a Usenet message:

perl -ne 'exit if (/^$/); print if (/pattern/);' filename

[This works because mail and Usenet (Section 1.21) messages always use a blank line -- indicated by ^$ in regular expression syntax -- to separate the header from the body of the message. -- TOR]

[The -n flag tells perl to wrap the contents of -e into the body of a while(<>){ ... } loop. -- JJ]

To do a search for a pattern and print the paragraphs in which it appears:

perl -ne 'BEGIN { $/ = "\n\n" } print if (/pattern/);' filename

[This assumes that paragraphs are delimited by a double linefeed -- that is, a blank line. You'd have to adjust this script for a troff or TeX document where paragraphs are separated by special codes. -- TOR]

Searching through files is one of Perl's strengths, but certainly not its only strength. Perl encompasses all the functionality of sed, awk, grep, find, and other Unix utilities. Furthermore, a Perl program to do something originally done with one or more of these utilities is usually faster and easier to read than the non-Perl solution.

-- JIK




Copyright © 2003 O'Reilly & Associates. All rights reserved.