there is no place like ~

sed introduction

sed

sed Synopsis

bash$ sed [options] program [inputfile]

The following simple program consists of the single command 'd'. It tells sed to delete the pattern buffer, so nothing is printed.

bash$ sed -e 'd' /etc/hosts

Another command is 'p'. It tells sed to print the pattern buffer. Because sed also prints the pattern buffer automatically at the end of each cycle, every line is printed twice.

bash$ sed -e 'p' /etc/hosts
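The doubling comes from the automatic print at the end of each cycle; the -n option switches it off. A minimal sketch, with helper names (print_twice, print_once) that are only illustrative:

```shell
# Hypothetical helpers, named for this illustration only.
print_twice() { sed -e 'p'; }      # explicit print plus the automatic print
print_once()  { sed -n -e 'p'; }   # -n suppresses the automatic print

printf 'one\ntwo\n' | print_twice   # 4 lines of output
printf 'one\ntwo\n' | print_once    # 2 lines of output
```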

We don't always want to work on the whole document → There must be a mechanism to address a line or several lines

Addresses

n selects the line n
$ selects the last line
/re/ selects the lines matching the RE re
\crec selects the lines matching the RE re. The c may be any character
first~step GNU extension! Selects every step'th line starting with line first
addr1,addr2 Address range: selects all input lines which match the inclusive range of lines starting from the first address and continuing to the second address
addr! select lines that do not match addr
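The address forms listed above can be tried on a small input. This is a sketch that assumes GNU sed (needed for first~step); the helper numbers is hypothetical and only produces the lines 1 to 10.

```shell
# Sample input: the numbers 1 to 10, one per line (illustrative helper).
numbers() { printf '%s\n' 1 2 3 4 5 6 7 8 9 10; }

numbers | sed -n -e '3p'        # line 3
numbers | sed -n -e '$p'        # last line: 10
numbers | sed -n -e '/^1/p'     # lines matching ^1: 1 and 10
numbers | sed -n -e '\%^1%p'    # same RE, with % as the delimiter
numbers | sed -n -e '2~4p'      # GNU only: 2, 6, 10
numbers | sed -n -e '4,6p'      # range: 4, 5, 6
numbers | sed -n -e '4,6!p'     # everything except 4, 5, 6
```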

Examples

The command = prints the current line number. A substitute program for "wc -l" might be:

bash$ sed -n -e '$='

This one emulates "head":

bash$ sed -n -e '1,10p'
bash$ sed -e '10q'

Substitution Command

Eliminate comments

bash$ sed -e 's/#.*//' /etc/inetd

Eliminate comments and empty lines

bash$ sed -e 's/#.*//;/^$/d' /etc/inetd

Have a 133t prompt

bash$ ls -l | sed -e 's/o/0/;s/l/1/;s/e/3/'
bash$ ls -l | sed -e 's/o/0/g;s/l/1/g;s/e/3/g'
bash$ ls -l | sed -e 'y/ole/013/'
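The three variants differ in how many characters they touch: 's' without a flag replaces only the first match per line, 's' with the g flag replaces all matches, and 'y' transliterates every occurrence (it takes no flags). A small sketch on a fixed string; the helper names are made up for this example.

```shell
# Hypothetical helpers for the three variants.
leet_first() { sed -e 's/o/0/;s/l/1/;s/e/3/'; }     # first match only
leet_all()   { sed -e 's/o/0/g;s/l/1/g;s/e/3/g'; }  # every match
leet_y()     { sed -e 'y/ole/013/'; }               # transliteration

echo 'hello world' | leet_first   # h31l0 world
echo 'hello world' | leet_all     # h3110 w0r1d
echo 'hello world' | leet_y       # h3110 w0r1d
```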

Convert a file from DOS to UNIX and vice versa

# Under UNIX: convert DOS newlines (CR/LF) to Unix format
bash$ sed 's/.$//' file    # assumes that all lines end with CR/LF
bash$ sed 's/^M$//' file   # in bash/tcsh, press Ctrl-V then Ctrl-M
# Under DOS: convert Unix newlines (LF) to DOS format
C:\> sed 's/$//' file    # method 1
C:\> sed -n p file       # method 2
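Typing a literal ^M is awkward inside scripts. A minimal sketch, assuming a POSIX shell, that builds the carriage return with printf instead; the names cr and strip_cr are hypothetical.

```shell
# Build a literal carriage return without typing Ctrl-V Ctrl-M.
cr=$(printf '\r')

# Hypothetical helper: strip one trailing CR from every line.
strip_cr() { sed -e "s/${cr}\$//"; }

# od -c makes the line endings visible: no \r should remain.
printf 'first\r\nsecond\r\n' | strip_cr | od -c
```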

Alternatively use the utilities dos2unix and unix2dos, or the command

tr -d '\r' < inputfile > outputfile

for a conversion from DOS to UNIX, or

:set fileformat=dos
:set fileformat=unix

from within vim, or...

Comments

The character "#" starts a comment: it is a command (which cannot have any address) whose text, up to the end of the line, is ignored. This is useful if the sed program is stored in a file. The whole program can be executed with

bash$ sed -f programfile < inputdata

The "{" and "}" commands group several commands under one address. "}" is itself a command → it must be preceded by a semicolon.

bash$ sed -ne '/gimme this line number/{=;q;}'

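The command "n" prints the pattern space (unless -n is given) and then replaces it with the next line of input. The commands that follow "n" therefore work on that next line. Do not put "d" in front of it: "d" ends the cycle at once, so a following "n" is never reached.

/skip this line/n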
 # do some ugly stuff
 ...
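A common use of "n" is to act on every second line. The sketch below prints only the even-numbered lines; the helper name even_lines is made up for this example.

```shell
# Hypothetical helper: "n" loads the next line, "p" prints it.
# With -n nothing else is printed, so only lines 2, 4, ... appear.
even_lines() { sed -n -e 'n;p'; }

printf '1\n2\n3\n4\n5\n' | even_lines   # prints 2 and 4
```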

REs are greedy

Example: eliminating HTML-tags from a file

bash$ sed -e 's/<.*>//g' text.html

If the file contains a line like:

This <b> is </b> a <i>example</i>.

then the result will be:

This .

Solution:

bash$ sed -e 's/<[^>]*>//g' text.html
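Both expressions can be compared directly on the sample line. A short sketch; the variable line and the two helper names are only illustrative.

```shell
line='This <b> is </b> a <i>example</i>.'

# Hypothetical helpers for the two expressions.
strip_greedy() { sed -e 's/<.*>//g'; }      # matches from the first < to the last >
strip_tags()   { sed -e 's/<[^>]*>//g'; }   # each match stops at the next >

echo "$line" | strip_greedy   # This .
echo "$line" | strip_tags     # This  is  a example.
```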

References

The "elleff"-Language:

Every vowel (or run of vowels) c in a word is replaced with clcfc. → The ampersand (&) holds the matched string:

bash$ sed -e 's/[aeiou]\+/&l&f&/g'
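Applied to a short phrase, the substitution looks like this. The sketch assumes GNU sed (the \+ operator is an extension in basic REs); the helper name elleff is hypothetical.

```shell
# Hypothetical helper wrapping the substitution above.
elleff() { sed -e 's/[aeiou]\+/&l&f&/g'; }

echo 'hello world' | elleff   # helefellolofo woloforld
```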

Referencing a substring

Substrings enclosed with "\(" and "\)" can be referenced with "\n" (n is a digit from 1 to 9)

bash$ sed -e 's/\([^ ]\+\)  *\([^ ]\+\)  *\([^ ]\+\)/\3 \2 \1/'
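The command above reverses the order of the first three words of a line and leaves the rest alone. A quick check, again assuming GNU sed for \+; the helper name swap3 is made up.

```shell
# Hypothetical helper: \1, \2, \3 are the first three words.
swap3() { sed -e 's/\([^ ]\+\)  *\([^ ]\+\)  *\([^ ]\+\)/\3 \2 \1/'; }

echo 'one two three four' | swap3   # three two one four
```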

The "elleff"-Backtransform

The following RE also matches strings which are not "elleff" vowels, because the three vowels need not be identical (for example "alefi").

[aeiou]l[aeiou]f[aeiou]

Basic REs can use the backreference in the RE itself!

bash$ sed -e 's/\([aeiou]\+\)l\1f\1/\1/g'
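A short check of the back-transform on an already encoded phrase, assuming GNU sed; the helper name elleff_decode is hypothetical. Note that an ordinary word which happens to contain such a pattern (say "alafa") would be shortened as well.

```shell
# Hypothetical helper: \1 inside the RE forces all three vowel runs to be equal.
elleff_decode() { sed -e 's/\([aeiou]\+\)l\1f\1/\1/g'; }

echo 'helefellolofo woloforld' | elleff_decode   # hello world
echo 'alefi' | elleff_decode                     # alefi (vowels differ, no match)
```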

Space Balls

D Delete text in the pattern space up to the first newline, then restart the cycle without reading new input
N Add a newline to the pattern space, then append the next line of input to the pattern space
P Print out the portion of the pattern space up to the first newline
h Replace the contents of the hold space with the contents of the pattern space
H Append a newline to the contents of the hold space, and then append the contents of the pattern space to that of the hold space
g Replace the contents of the pattern space with the contents of the hold space
G Append a newline to the contents of the pattern space, and then append the contents of the hold space to that of the pattern space
x Exchange the contents of the hold and pattern spaces
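N, P and D are typically used together to slide a two-line window over the input. As a sketch, the well-known one-liner below emulates uniq by collapsing identical adjacent lines; the helper name squeeze_dups is made up for this example.

```shell
# Hypothetical helper:
#   $!N   append the next line (except on the last line)
#   !P    print the first line of the window unless both lines are equal
#   D     drop the first line and restart with what is left
squeeze_dups() { sed -e '$!N' -e '/^\(.*\)\n\1$/!P' -e 'D'; }

printf 'a\na\nb\nb\nb\nc\n' | squeeze_dups   # prints a, b, c
```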

Space Balls: Example

Print the first line as last

bash$ sed -n -e '1h;1!p;${g;p;}'

h: hold space <- pattern space

g: pattern space <- hold space
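Run on three lines, the program behaves as follows. This is only a sketch; the helper name first_to_last is hypothetical.

```shell
# Hypothetical helper wrapping the program above:
#   1h      save line 1 in the hold space
#   1!p     print every other line as it arrives
#   ${g;p;} on the last line, fetch line 1 back and print it
first_to_last() { sed -n -e '1h;1!p;${g;p;}'; }

printf 'a\nb\nc\n' | first_to_last   # prints b, c, a
```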

Emulation of tac

bash$ sed -n -e 'G;h;$p'

G: pattern space <<- '\n' hold space

Problem: The output shows an extra newline at the end. This happens because "G" adds a newline followed by the content of the hold buffer to the pattern buffer, even for the first line (which is printed at the end), when the hold buffer is still empty.

tac improved

bash$ sed -n -e 'G;h;$s/.$//p'
bash$ sed -n -e '1!G;h;$p'
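The difference between the first attempt and the improved version shows up in the line count. A small sketch; the helper names tac_naive and tac_fixed are made up.

```shell
# Hypothetical helpers for the two versions.
tac_naive() { sed -n -e 'G;h;$p'; }     # appends the empty hold space on line 1
tac_fixed() { sed -n -e '1!G;h;$p'; }   # skips G on line 1

printf 'a\nb\nc\n' | tac_naive | wc -l   # 4: three lines plus one empty line
printf 'a\nb\nc\n' | tac_fixed           # prints c, b, a
```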

Example: a counter in sed

/^[[:digit:]][[:digit:]]*$/!n;         # the line must contain only digits
x;s/.*//;x;                            # clear the hold space
: add
/9$/{s/9$//;x;s/.*/0&/;x;b add;};      # eliminate the last 9 from the p.s.
                                       # and add a 0 in front of the h.s.
s/8$/9/
s/7$/8/
s/6$/7/
s/5$/6/
s/4$/5/
s/3$/4/
s/2$/3/
s/1$/2/
s/0$/1/
s/^$/1/
G;s/\n//g;            # add the content of the h.s to the p.s
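To try the counter, the script can be kept in a shell variable and fed numbers on stdin. The sketch below restates the script without comments (and with the label written as ":add" and "b add" ending its line, which is an assumption made for portability); the names counter and increment are hypothetical.

```shell
# The counter script in condensed form (same commands, comments dropped).
counter='
/^[[:digit:]][[:digit:]]*$/!n
x;s/.*//;x
:add
/9$/{s/9$//;x;s/.*/0&/;x;b add
}
s/8$/9/;s/7$/8/;s/6$/7/;s/5$/6/;s/4$/5/
s/3$/4/;s/2$/3/;s/1$/2/;s/0$/1/
s/^$/1/
G;s/\n//g
'

# Hypothetical helper: add 1 to every number read from stdin.
increment() { sed -e "$counter"; }

printf '7\n9\n199\n' | increment   # prints 8, 10, 200
```

Trailing 9s are moved to the hold space as 0s, the remaining last digit is bumped by one, and "G" glues the zeros back on.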

Branches

: label Definition of label (up to 8 characters)
b label unconditionally branch to label
t label branch to label only if there has been a successful 's'ubstitution since the last input line was read or 't' branch was taken

If label is omitted in the b or t command, sed branches to the end of the script, i.e. the next cycle is started.
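A typical use of "t" is to repeat a substitution until it no longer matches. The sketch below inserts thousands separators into a number, one comma per pass; the label again and the helper name group_digits are made up for this example.

```shell
# Hypothetical helper: each pass inserts one comma before the last
# ungrouped block of three digits; "t again" loops while a substitution
# was made.
group_digits() {
    sed -e ':again' -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e 't again'
}

echo '1234567' | group_digits   # 1,234,567
echo '123' | group_digits       # 123
```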

Eliminate K/K++ comments

#!/bin/sed -f

# delete K++ comments
/^[[:blank:]]*kk.*/d
s/kk.*//

# If no comment is found, then start a new cycle
: test
/ko/!b

# Append new lines to the pattern space until an entire K-comment is in the
# pattern space
: append
/ok/!{N;b append;}

# delete every K-comment (but don't be greedy!)
s/ko\([^o]\|o[^k]\)*o\?ok//g

t test