there is no place like ~

Regular Expressions Tutorial

Regular Expressions

Atoms

Atoms are the basic components of a RE

x the character 'x' itself
\X if X is an 'a', 'b', 'f', 'n', 'r', 't', or 'v', then the ANSI-C interpretation of \x. Otherwise, a literal 'X' (used to escape operators such as '*')
\123 the character with octal value 123
\xe5 the character with hexadecimal value e5
. any character (byte) except newline
[xyz] a "character class": x OR y OR z
[ako-sP] a "character class" with a range in it; matches an 'a', a 'k', any letter from 'k' through 's', or a 'P'
[^A-Z] a "negated character class": i.e., any character but those in the class. In our example, any character EXCEPT an uppercase letter
[:str:] a "character class expression": Allowed only within another character class. The valid contents of str are: alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit

Pieces

Pieces are used to concatenate one or more REs, or to specify how often a precedent piece must be repeated

(r) the RE r itself
rs the RE r followed by the RE s
r|s the RE r OR the RE s
r* the RE r zero or more times
r+ the RE r one or more time
r? the RE r zero or one time
r{2,6} the RE r anywhere from two to six times
r{2,} the RE r two or more times
r{,6} the RE r up to six times
r{4} the RE r exactly for times

Regular Examples

(x|y|z) is equivalent to the RE [xyz]

(a|b) is equivalent to (b|a)

The RE (Sch|Pr)eis{2} matches both the strings "Scheiss" and "Preiss".

Regular Examples

Real numbers (simple)

[0-9]+\.[0-9]*([eE][+-]?[0-9]+)?

Real numbers (character class)

[[:digit:]]+\.[[:digit:]]*([eE][+-]?[[:digit:]]+)?

Problem: numbers like "3." are acepted, but not ".3".

Real numbers (catch all)

(([[:digit:]]+\.[[:digit:]]*)|(\.[[:digit:]]+))([eE][+-]?[[:digit:]]+)?

Basic REs

Real numbers as Extended RE

[0-9]+\.[0-9]*([eE][+-]?[0-9]+)?

Real numbers as Basic RE

[0-9][0-9]*\.[0-9]*\([eE][+-]\{0,1\}[0-9][0-9]*\)\{0,1\}

Real numbers as Basic RE, written in the shell

\[0-9\]\[0-9\]\*\\.\[0-9\]\*\\\(\[eE\]\[+-\]\\\{0,1\\\}\[0-9\]\[0-9\]\*\\\)\\\{0,1\\\}