BIT-101 [2017-2023]

Grep and Sed, Demystified


I’ve kind of half understood grep for a while, but assumed that I didn’t really get it at all. I thought I knew nothing at all about sed. I took some time this weekend to sit down and actually learn about these two commands and discovered I already knew a good deal about both of them and filled in some of what I didn’t know pretty easily. Both are a lot more simple and straightforward than I thought they were.

Grep

grep comes from “global regular expression print”. This is not really an acronym, but comes from the old time ed line editor tool. In that tool, if you wanted to globally search a file you were editing and print the lines that matched, you’d type g/re/p (where re is the regular expression you are using to search. The functionality got pulled out of ed and made into a standalone tool, grep.

Basics

grep, in its simplest use, searches all the lines of a given file or files and prints all the lines that match a regular expression. Syntax:

grep <options> <expression> <file(s)>

So if you want to search the file animals.txt for all the lines that contain the word dog, you just type:

grep dog animals.txt

It’s usually suggested that you include the expression in single quotes. This prevents a lot of potential problems, such as misinterpretations of spaces or unintentional expansion:

grep 'flying squirrel' animals.txt

It’s also a good idea to explicitly use the -e flag before the expression. This explicitly tells grep that the thing coming next is the expression. Say you had a file that was list of items each preceded with a dash, and you wanted to search for -dog

grep '-dog' animals.txt

Even with the quotes, grep will try to parse -dog as a command line flag. This handles it:

grep -e '-dog' animals.txt

You can search multiple files at the same time with wildcards:

grep -e 'dog' *

This will find all the lines that contain dog in any file in the current directory.

You can also recurse directories using the -r (or -R) flag:

grep -r -e 'dog' *

You can combine flags, but make sure that e is the last one before the expression:

grep -re 'dog' *

A very common use of grep is to pipe the output of one command into grep.

cat animals.txt | grep -e 'dog'

This simple example is exactly the same as just using grep with the file name, so is an unnecessary use of cat, but if you have some other command that generates a bunch of text, this is very useful.

grep simply outputs its results to stdout – the terminal. You could pipe that into another command or save it to a new file…

grep -e 'dog' animals.txt > dogs.txt

Extended

When you get into more complex regular expressions, you’ll need to start escaping the special characters you use to construct them, like parentheses and brackets:

grep -e '\(flying \)\?squirrel' animals.txt

This can quickly become a pain. Time for extended regular expressions, using the -E flag:

grep -Ee '(flying )?squirrel' animals.txt

Much easier. Note that -E had nothing at all to do with -e. That confused me earlier. You should use both in this case. You may have heard of the tool egrep. This is simply grep -E. In some systems egrep is literally a shell script that calls grep -E. In others it’s a separate executable, but it’s just grep -E under the hood.

egrep -e '(flying )?squirrel' animals.txt

Other Stuff

The above covers most of what you need to know to use basic grep. There are some other useful flags you can check into as well:

-o prints only the text that matches the expression, instead of the whole line

-h suppresses the file name from printing

-n prints the line number of each printed line.

If the thing you are searching for is simple text, you can use grep -F or fgrep. All the same, but you can’t use regular expressions, just search for a simple string.

grep -Fe 'dog' animals.text
fgrep -e 'dog' animals.text

Perl

There’s also grep with Perl syntax for regular expressions. This is a lot more powerful than normal grep regex syntax, but a lot more complex, so only use it if you really need it. It’s also not supported on every system. To use it, use the -P flag.

grep -Pe 'dog' animals.text

In this simple case, using Perl syntax gives us nothing beyond the usual syntax. Also note that pgrep is NOT an alternate form of grep -P. So much for consistency.

Sed

I thought that I knew next to nothing about sed, but it turns out that I’ve been using it for a few years for text replacement in vim! sed stands for “stream editor” and also hails from the ed line editor program. The syntax is:

sed <options> command <file(s)>

The two main options you’ll use most of the time are -e and -E which work the same way they do in grep.

The most common use of sed is to replace text in files. There are other uses which can edit the files in other ways, but I’ll stick to the basic replacement use case.

Like grep, sed reads each line of text in a file or files and looks for a match. It then performs a replacement on the matched text and prints out the resulting lines. The expression to use for replacement is

s/x/y/

where x is the text you are looking for, and y is what to replace it with. So to replace all instances of cat with the word feline in animals.txt

sed -e 's/cat/feline/' animals.txt

Note that sed will print every line of text in the file, whether or not it found a match or not. But the lines that it matched will be changed the way you specified.

After the final slash in the expression, you can add other regex flags like g for global or i for case insensitivity.

Like grep, sed just outputs to stdout. You can redirect that to another file using > or pipe it to another process using |. But do NOT save the output back to the original file. Try it out some time on a test file that you don’t care about and see what happens.

There are lots of other options you can use with sed but the above will probably get you by for a while to come. As you need more, just read up.

Summary

The biggest thing I took away from the couple of hours I spent on this was actually how easy these two commands were to learn. I’d been avoiding doing so for a long time and now wish that I had spend the effort on this much earlier.

« Previous Post
Next Post »

Comments? Best way to shout at me is on Mastodon

Or share this post directly on Mastodon