sed: Escapes

 
 5.8 Escape Sequences - specifying special characters
 ====================================================
 
 Until this chapter, we have only encountered escapes of the form '\^',
 which tell 'sed' not to interpret the circumflex as a special character,
 but rather to take it literally.  For example, '\*' matches a single
 asterisk rather than zero or more backslashes.
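
    As a small illustration of this first kind of escape, the sketch
 below contrasts an escaped and an unescaped asterisk.  It assumes GNU
 'sed' in its default (basic regular expression) mode; the variable names
 are illustrative only.

```shell
# '\*' is taken literally: only the asterisk itself is replaced.
literal=$(echo 'a*b' | sed 's/\*/+/')
echo "$literal"      # a+b

# An unescaped '*' is a repetition operator: 'a*' matches the leading "a".
repeat=$(echo 'a*b' | sed 's/a*/+/')
echo "$repeat"       # +*b
```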
 
    This chapter introduces another kind of escape(1)--that is, escapes
 that are applied to a character or sequence of characters that
 ordinarily are taken literally, and that 'sed' replaces with a special
 character.  This provides a way of encoding non-printable characters in
 patterns in a visible manner.  There is no restriction on the appearance
 of non-printing characters in a 'sed' script, but when a script is being
 prepared in the shell or by text editing, it is usually easier to use
 one of the following escape sequences than the binary character it
 represents.
 
    The list of these escapes is:
 
 '\a'
      Produces or matches a BEL character, that is, an "alert" (ASCII 7).
 
 '\f'
      Produces or matches a form feed (ASCII 12).
 
 '\n'
      Produces or matches a newline (ASCII 10).
 
 '\r'
      Produces or matches a carriage return (ASCII 13).
 
 '\t'
      Produces or matches a horizontal tab (ASCII 9).
 
 '\v'
      Produces or matches a so-called "vertical tab" (ASCII 11).
 
 '\cX'
      Produces or matches 'CONTROL-X', where X is any character.  The
      precise effect of '\cX' is as follows: if X is a lower case letter,
      it is converted to upper case.  Then bit 6 of the character (hex
      40) is inverted.  Thus '\cz' becomes hex 1A, but '\c{' becomes hex
      3B, while '\c;' becomes hex 7B.
 
 '\dXXX'
      Produces or matches a character whose decimal ASCII value is XXX.
 
 '\oXXX'
      Produces or matches a character whose octal ASCII value is XXX.
 
 '\xXX'
      Produces or matches a character whose hexadecimal ASCII value is
      XX.
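
    The escapes listed above can be tried out as follows.  This is a
 minimal sketch, assuming GNU 'sed' and a POSIX shell whose 'printf'
 understands '\t'; the variable names are illustrative only.  The last
 part does not call 'sed' at all: it merely reproduces the '\cX'
 arithmetic (upper-case the letter, then invert hex 40) with shell
 arithmetic.

```shell
# '\t' on the pattern side matches a literal tab.
tab_to_comma=$(printf 'a\tb\n' | sed 's/\t/,/')
echo "$tab_to_comma"                            # a,b

# Decimal, octal and hexadecimal forms all name the character 'A' (65).
dec=$(echo A | sed 's/\d065/x/')
oct=$(echo A | sed 's/\o101/x/')
hex=$(echo A | sed 's/\x41/x/')
echo "$dec $oct $hex"                           # x x x

# '\cX' arithmetic: invert hex 40 of the (upper-cased) character.
ctrl_z=$(printf '%02X' $(( 0x5A ^ 0x40 )))      # 'z' -> 'Z' is hex 5A
ctrl_brace=$(printf '%02X' $(( 0x7B ^ 0x40 )))  # '{' is hex 7B
ctrl_semi=$(printf '%02X' $(( 0x3B ^ 0x40 )))   # ';' is hex 3B
echo "$ctrl_z $ctrl_brace $ctrl_semi"           # 1A 3B 7B
```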
 
    '\b' (backspace) was omitted because of the conflict with the
 existing "word boundary" meaning.
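
    The word-boundary meaning of '\b' can be seen in the sketch below,
 which assumes GNU 'sed'.  Using '\x08' to match an actual backspace is
 an inference from the list above (backspace is ASCII 8), not something
 this section states; the variable names are illustrative only.

```shell
# '\b' matches at a word edge, so only the stand-alone "cat" changes.
bounded=$(echo 'cat concat' | sed 's/\bcat\b/dog/g')
echo "$bounded"      # dog concat

# A real backspace character can still be named numerically (assumption).
backspace=$(printf 'a\bb\n' | sed 's/\x08/<BS>/')
echo "$backspace"    # a<BS>b
```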
 
 5.8.1 Escaping Precedence
 -------------------------
 
 GNU 'sed' processes escape sequences _before_ passing the text on to the
 regular-expression matching of the 's///' command and address matching.
 Thus the following two commands are equivalent ('0x5e' is the hexadecimal
 ASCII value of the character '^'):
 
      $ echo 'a^c' | sed 's/^/b/'
      ba^c
 
      $ echo 'a^c' | sed 's/\x5e/b/'
      ba^c
 
    As are the following ('0x5b' and '0x5d' are the hexadecimal ASCII
 values of '[' and ']', respectively):
 
      $ echo abc | sed 's/[a]/x/'
      xbc
      $ echo abc | sed 's/\x5ba\x5d/x/'
      xbc
 
    However, it is recommended to avoid producing such special characters
 through escapes, because of unexpected edge cases.  For example, the
 following are not equivalent:
 
      $ echo 'a^c' | sed 's/\^/b/'
      abc
 
      $ echo 'a^c' | sed 's/\\\x5e/b/'
      a^c
 
    ---------- Footnotes ----------
 
    (1) All the escapes introduced here are GNU extensions, with the
 exception of '\n'.  In basic regular expression mode, setting
 'POSIXLY_CORRECT' disables them inside bracket expressions.