Notes on sed, the stream editor

Table of contents:

Removal example
Appending example
Operating on a list of files, with the find command
About -i
Cross-platform bash calling to sed

Removal example

Removing JavaDoc author tags:

sed -i -e "/^[[:space:]]*\*[[:space:]]*@author.*/d" file.java

-e means "the next argument is an expression to evaluate". This particular sed expression has two parts.

There is a left-hand part (the input part) that contains a regular expression (regex). sed will try to match on that regex.

Let's take a look at the regex. It matches any line that starts with an author tag like * @author. ^ means "at the start of the line". [[:space:]] means "any single whitespace character", and we use it here so that both spaces and tabs match. The * after it means "any amount of the previous, including none", so together they match any run of spaces and tabs. Next, we see \*. Here we want to match an actual * character, so we must "escape" the * with a \ to make sed treat it as a literal character. @author is the author tag itself. .* means any amount (*) of any character (.), which matches whatever author name follows.

When sed finds a match we move to the right-hand part of the sed expression (the output part). The output part contains instructions on what to do when we find a match. The d in the output part of the expression means "delete": the whole line that matched the regex is removed from the output. Below we also have examples of s for substituting and r for reading in the contents of another file.
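To try the delete expression without touching a real file, here is a minimal sketch. The file name Example.java and the author names are made up for illustration. It leaves out -i, so sed prints the result to standard output and the file on disk stays as it was.

```shell
# Hypothetical sample file, created in a temporary directory.
tmpdir=$(mktemp -d)
printf '/**\n * Example class.\n *\t@author Jane Doe\n   * @author John Roe\n */\npublic class Example {}\n' > "$tmpdir/Example.java"

# Same expression as above, without -i: the result goes to stdout.
# One author line uses a tab after the *, the other uses leading spaces.
result=$(sed -e "/^[[:space:]]*\*[[:space:]]*@author.*/d" "$tmpdir/Example.java")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Both @author lines should be gone from the output, while the rest of the comment and the class line remain.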

-i means "in-place": sed writes the result back to the input file instead of printing it to standard output. There is further discussion of -i below.

Appending example

Here we append company information to the author tags.

sed -i -e "s/^[[:space:]]*\*[[:space:]]*@author.*/& (CompanyName)/" file.java

The s at the start of the sed expression means "substitute". We replace the text that the input regex matched with the output expression. The & in the output expression refers to the text that matched the regex. The end result is that we substitute the matched text with the matched text itself, plus " (CompanyName)".
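A quick way to see what & does is to feed a single line through the same expression. This is only a sketch; the sample line and author name are made up.

```shell
# Hypothetical input line, piped through sed instead of read from a file.
line=' * @author Jane Doe'

# & in the replacement stands for everything the regex matched.
result=$(printf '%s\n' "$line" | sed -e "s/^[[:space:]]*\*[[:space:]]*@author.*/& (CompanyName)/")
printf '%s\n' "$result"
```

The output is the original line with " (CompanyName)" appended to it.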

Operating on a list of files, with the find command

Example use with find, as I have used it in a Makefile ($(OUTPUTDIR) and $(FILE) are make variables):

find $(OUTPUTDIR) -type f -name "*.html" -exec sed -i '.wip' -e "/<!-- insert file here -->/r $(FILE)" -e "//d" {} \;

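We find HTML files and pass them to sed. On every line that matches <!-- insert file here -->, the command r $(FILE) reads the contents of $(FILE) and appends them after that line. The r means "read file", not "replace". The //d part uses an empty regex, which reuses the most recent regex, so it instructs sed to delete the matched placeholder line, <!-- insert file here -->. The net effect is that the placeholder line is replaced with the file contents.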
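The combination of r and //d can be tried on a single file first. The sketch below uses made-up file names (page.html, snippet.html) in a temporary directory and prints to standard output instead of editing in place.

```shell
# Hypothetical files for illustration.
tmpdir=$(mktemp -d)
printf '<p>before</p>\n<!-- insert file here -->\n<p>after</p>\n' > "$tmpdir/page.html"
printf '<p>inserted line 1</p>\n<p>inserted line 2</p>\n' > "$tmpdir/snippet.html"

# r queues snippet.html to be written after the matching line;
# //d reuses the same regex and deletes the placeholder line itself.
result=$(cd "$tmpdir" && sed -e '/<!-- insert file here -->/r snippet.html' -e '//d' page.html)
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The placeholder line should no longer appear, and the two snippet lines should sit between the "before" and "after" paragraphs.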

About -i

-i allows us to edit a whole list of files in-place, independently. Handy when using the output of find!

The sed man page on macOS describes -i like this:

Edit files in-place similarly to -I, but treat each file independently from other files. In particular, line numbers in each file start at 1, the “$” address matches the last line of the current file, and address ranges are limited to the current file. (See Sed Addresses.) The net result is as though each file were edited by a separate sed instance.

Note that sed on macOS (BSD sed) treats the argument that follows -i as a filename extension for a backup copy of each file it edits. If we don't pass one, sed takes the next argument (for example -e) as the extension, and we get strange errors that don't clarify the problem. I like the idea of passing ".wip" as the extension. We can easily recursively remove the .wip files afterwards, with the following command:

find path/to/target/directory -type f -name "*.wip" -delete

Be careful, that command will remove all files with names ending in ".wip". Any unrelated files that have a filename that ends with ".wip" will also be gone.
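Here is a small sketch of the per-file behaviour of -i, the backup copies, and the cleanup step, run in a throwaway directory. The file names are made up. It assumes the attached form -i.wip (no space before the extension), which as far as I know both GNU sed and the BSD sed on macOS accept, so the sketch is not tied to one platform.

```shell
# Hypothetical files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'a1\na2\na3\n' > "$tmpdir/a.txt"
printf 'b1\nb2\n' > "$tmpdir/b.txt"

# '$d' deletes the last line. With -i, "$" means the last line of EACH file,
# so both files lose their own last line.
sed -i.wip -e '$d' "$tmpdir/a.txt" "$tmpdir/b.txt"

a_after=$(cat "$tmpdir/a.txt")
b_after=$(cat "$tmpdir/b.txt")
a_backup=$(cat "$tmpdir/a.txt.wip")   # the untouched original

# Remove the backups, then count how many .wip files are left.
find "$tmpdir" -type f -name "*.wip" -delete
wip_left=$(find "$tmpdir" -type f -name "*.wip" | wc -l | tr -d ' ')

printf 'a.txt:\n%s\nb.txt:\n%s\nbackups left: %s\n' "$a_after" "$b_after" "$wip_left"
rm -rf "$tmpdir"
```

Running the cleanup inside a dedicated directory like this keeps unrelated ".wip" files out of harm's way.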

Cross-platform bash calling to sed

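Below is an example of using bash's OSTYPE variable to change the sed argument, based on whether we are on macOS or not. We pass -i '.wip' if we are on macOS, and only -i otherwise. In the script, $1 is the directory to search, $2 is the regex for the placeholder line, and $3 is the file to read in.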

if [[ "$OSTYPE" == "darwin"* ]]; then
    find "$1" -type f -name "*.html" -exec sed -i '.wip' -e "/$2/r $3" -e "//d" {} \;
else
    find "$1" -type f -name "*.html" -exec sed -i -e "/$2/r $3" -e "//d" {} \;
fi
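One possible variation, shown only as a sketch, is to keep the platform check in a single place so the sed expressions are not written twice. The function name sed_in_place, the PLACEHOLDER marker, and the file names are all made up for illustration. Because find -exec cannot call a shell function, this version pipes the file list into a loop, which assumes file names without newlines.

```shell
# Hypothetical helper: picks the -i form based on OSTYPE.
# If OSTYPE is unset, it falls back to plain -i.
sed_in_place() {
    case "${OSTYPE:-}" in
        darwin*) sed -i '.wip' "$@" ;;
        *)       sed -i "$@" ;;
    esac
}

# Hypothetical test data in a temporary directory.
tmpdir=$(mktemp -d)
printf 'top\nPLACEHOLDER\nbottom\n' > "$tmpdir/page.html"
printf 'inserted\n' > "$tmpdir/snippet.txt"

# Same r and //d expressions as above, written once for both platforms.
find "$tmpdir" -type f -name "*.html" | while IFS= read -r file; do
    sed_in_place -e "/PLACEHOLDER/r $tmpdir/snippet.txt" -e "//d" "$file"
done

page_after=$(cat "$tmpdir/page.html")
printf '%s\n' "$page_after"
rm -rf "$tmpdir"
```

The trade-off is one sed process per file either way, but the expressions live in one place, so a change to the regex cannot drift between the two branches.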