sed: Back-references and Subexpressions

 
 5.7 Back-references and Subexpressions
 ======================================
 
 "Back-references" are regular expression commands that refer to a
 previous part of the matched regular expression.  Back-references are
 specified with a backslash and a single digit (e.g.  '\1').  The part of
 the regular expression they refer to is called a "subexpression", and is
 designated with parentheses ('(' and ')' in extended regular expressions
 with '-E', '\(' and '\)' in basic regular expressions).
 
    Back-references and subexpressions are used in two cases: in the
 regular expression search pattern, and in the REPLACEMENT part of the
 's' command (⇒Regular Expression Addresses and ⇒The "s" Command).
 
    In a regular expression pattern, back-references are used to match
 the same content as a previously matched subexpression.  In the
 following example, the subexpression is '.' - any single character
 (being surrounded by parentheses makes it a subexpression).  The
 back-reference '\1' asks to match the same content (same character) as
 the subexpression.
 
    The command below matches three-letter words that consist of any
 character, followed by the letter 'o', followed by the same character as
 the first.
 
      $ sed -E -n '/^(.)o\1$/p' /usr/share/dict/words
      bob
      mom
      non
      pop
      sos
      tot
      wow
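
    As a self-contained variant that does not depend on a dictionary
 file, the sketch below prints input lines that contain any doubled
 character.  The function names 'has_doubled_char' and
 'has_doubled_char_bre' are hypothetical helpers for illustration, not
 part of sed; the second one shows the same pattern in basic regular
 expression syntax, assuming no '-E' option is given.

```shell
# Hypothetical helper: print lines containing a doubled character.
# With -E (extended syntax), plain parentheses form the subexpression.
has_doubled_char() {
  sed -E -n '/(.)\1/p'
}

# Same pattern in basic syntax (no -E): the parentheses are escaped.
has_doubled_char_bre() {
  sed -n '/\(.\)\1/p'
}

# Both pipelines print "book" and "hello"; "cat" has no doubled character.
printf 'book\ncat\nhello\n' | has_doubled_char
printf 'book\ncat\nhello\n' | has_doubled_char_bre
```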
 
    Multiple subexpressions are automatically numbered from left to
 right.  This command searches for 6-letter palindromes (the first three
 letters are 3 subexpressions, followed by 3 back-references in reverse
 order):
 
      $ sed -E -n '/^(.)(.)(.)\3\2\1$/p' /usr/share/dict/words
      redder
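
    When subexpressions are nested, the numbering follows the position
 of each opening parenthesis, as in POSIX regular expressions.  The
 sketch below makes the numbering visible by printing each group in
 brackets; 'show_groups' is a hypothetical helper name used only for this
 illustration.

```shell
# Hypothetical helper: show which text each numbered group captured.
# Group 1 is the outer pair, groups 2 and 3 are the inner pairs,
# and the trailing \1 in the pattern requires "ab" to repeat.
show_groups() {
  sed -E 's/((a)(b))\1/[\1][\2][\3]/'
}

# Prints: [ab][a][b]
echo 'abab' | show_groups
```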
 
    In the 's' command, back-references can be used in the REPLACEMENT
 part to refer back to subexpressions in the REGEXP part.
 
    The following example uses two subexpressions in the regular
 expression to match two space-separated words.  The back-references in
 the REPLACEMENT part print the words in a different order:
 
      $ echo "James Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./'
      The name is Bond, James Bond.
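
    The same technique can reorder fields.  The sketch below, which
 assumes input dates in YYYY-MM-DD form, rewrites them as DD.MM.YYYY;
 the function name 'iso_to_dmy' is a hypothetical helper, not part of
 sed.

```shell
# Hypothetical helper: reorder an ISO-style date using three groups.
# \1 = year, \2 = month, \3 = day (numbered left to right).
iso_to_dmy() {
  sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3.\2.\1/'
}

# Prints: 15.01.2024
echo "2024-01-15" | iso_to_dmy
```

    Lines that do not match the pattern pass through unchanged, since
 the 's' command only rewrites the text it matches.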
 
    When a back-reference is used with alternation and its group does
 not participate in the match, the back-reference makes the whole match
 fail.  For example, 'a(.)|b\1' will not match 'ba'.  When multiple
 regular expressions are given with '-e' or from a file ('-f FILE'),
 back-references are local to each expression.