A sed Quirk

The sed utility is a useful, if slightly obscure, tool. (Fortunately, there are excellent guides to its use.) Today, however, I ran across an annoying incompatibility in one of its features, which I want to discuss briefly.

Why It’s Cool

The sed utility is a great way to do simple (or substantial) text processing from the UNIX command line. For instance, if you want to replace all instances of the string “egg” with the string “duck” in a program’s output, you could do something like this:

other_program | sed 's/egg/duck/g'
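The g flag on the s command decides whether every match on a line is replaced or only the first one. A minimal sketch of the difference, using printf as a stand-in for other_program (the input text is made up for the demonstration):

```shell
# Stand-in input: a single line that contains "egg" twice.
first_only=$(printf 'egg and egg\n' | sed 's/egg/duck/')
every_match=$(printf 'egg and egg\n' | sed 's/egg/duck/g')

echo "$first_only"    # duck and egg
echo "$every_match"   # duck and duck
```

Quoting the sed script also keeps the shell from interpreting any special characters in it.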

More practically, I use it to install Google Analytics JavaScript on webpages during the deployment process. The relevant snippet of the shell script looks like this:

# Insert GATC into all markup files
# (the patterns and $wrk are quoted so the shell passes them to find unexpanded)
find "$wrk" \( -name '*.html' -o -name '*.php' \) -exec sed -e "/<!-- GATC -->/ {
	r gat.txt
	d
}" -i {} \;

This is a little complicated, so here it is piece by piece:

  • The find utility searches a directory specified by the shell variable $wrk
  • The find utility locates all files with .html or .php suffixes under that directory
  • For every such file, the find utility invokes the sed utility
  • The sed utility searches for lines containing the string “<!-- GATC -->”
  • For every such line, sed inserts the contents of the gat.txt file, and deletes the original line
  • The sed utility runs in “in-place” mode; it writes its changes back to its input file, instead of dumping them to stdout.
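The r-then-d step can be tried in isolation. The sketch below is only an illustration under stated assumptions: the scratch directory, page.html, and the one-line stand-in for gat.txt are all hypothetical, and sed writes to stdout here rather than editing in place.

```shell
# Hypothetical scratch setup; file names and contents are made up for the demo.
demo=$(mktemp -d)
printf '<html>\n<!-- GATC -->\n</html>\n' > "$demo/page.html"
printf '<script>/* tracking code */</script>\n' > "$demo/gat.txt"

# Same sed script as the deployment snippet, minus -i, so the result goes to stdout.
# The subshell changes directory so that "r gat.txt" finds the file.
result=$(cd "$demo" && sed -e '/<!-- GATC -->/ {
	r gat.txt
	d
}' page.html)

echo "$result"
rm -r "$demo"
```

The marker line is replaced by the contents of gat.txt, and the lines around it pass through unchanged.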

The Problem

Unfortunately, the “-i” option isn’t consistently implemented. To see how, it’s necessary to understand that the “-i” option can take an argument; this argument specifies an extension which is used to create backups of the files modified by sed. The difficulty arises when one doesn’t want to create any backups.

Some implementations of sed (e.g. on Mac OS X) require that an argument be supplied for “-i”; if one wishes to suppress backups, one simply supplies an empty string, i.e.:

# Insert GATC into all markup files
find "$wrk" \( -name '*.html' -o -name '*.php' \) -exec sed -e "/<!-- GATC -->/ {
	r gat.txt
	d
}" -i '' {} \;

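Other implementations (e.g. GNU sed on Ubuntu Linux) treat the argument to the “-i” option as optional, and recognize it only when it is attached directly to the option, as in “-i.bak”. Given the command above, they do not read the separate empty string as a backup suffix; they take it for the name of an input file and report an error.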
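One common way to sidestep the difference is to always supply a suffix attached directly to the option, which both styles of sed accept, and then delete the backup. This is a minimal sketch, not the only approach; sed_inplace and menu.txt are hypothetical names introduced for the example.

```shell
# Hypothetical helper: edit a file in place with either GNU or BSD sed.
# Usage: sed_inplace 'sed-script' file
sed_inplace() {
    # -i.bak (suffix attached, no space) is accepted by both implementations;
    # the backup is removed only if sed succeeded.
    sed -i.bak -e "$1" "$2" && rm -f "$2.bak"
}

# Made-up scratch file to exercise the helper.
demo=$(mktemp -d)
printf 'egg\n' > "$demo/menu.txt"
sed_inplace 's/egg/duck/' "$demo/menu.txt"

menu=$(cat "$demo/menu.txt")
leftovers=$(ls "$demo")
echo "$menu"
rm -r "$demo"
```

The trade-off is a short-lived backup file next to each edited file, so the directory has to be writable and a file already named with the .bak suffix would be overwritten.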

Small Potatoes?

Is this a big deal? No. It’s a minor nuisance if you’re trying to get the same script running on incompatible systems, but hardly insurmountable. It is, however, puzzling the first time you run across it; that’s why I wanted to mention it.
