Info: (sed) Multiline techniques

Info Catalog
sed: Hold and Pattern Buffers
sed: advanced sed
sed: Branching and flow control
sed: Multiline techniques

 
 6.3 Multiline techniques - using D,G,H,N,P to process multiple lines
 ====================================================================
 
 Multiple lines can be processed as one buffer using the 'D', 'G', 'H',
 'N' and 'P' commands.  They are similar to their lowercase counterparts
 ('d', 'g', 'h', 'n', 'p'), except that they append or remove data while
 respecting embedded newlines, which allows adding and removing lines
 from the pattern and hold spaces.
 
    They operate as follows:
 'D'
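      _deletes_ text from the pattern space up to and including the
      first newline, and restarts the cycle with the remaining text,
      without reading a new line of input.  If the pattern space
      contains no newline, it acts like 'd'.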
 
 'G'
      _appends_ the contents of the hold space to the pattern space,
      with a newline before it.
 
 'H'
      _appends_ the contents of the pattern space to the hold space,
      with a newline before it.
 
 'N'
      _appends_ a newline and the next line of input to the pattern
      space.
 
 'P'
      _prints_ the pattern space up to the first newline.
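
    As a rough illustration of the descriptions above, the following
 sketch runs 'P', 'G' and 'H' on small made-up inputs.  The variable
 names and sample data are assumptions chosen for this example only.

```shell
# 'N' joins lines in pairs; 'P' prints only the first line of each pair.
first_of_pairs=$(printf '1\n2\n3\n4\n' | sed -n 'N;P')
printf '%s\n' "$first_of_pairs"

# 'G' appends the (still empty) hold space after a newline, so every
# input line is followed by an empty line.
double_spaced=$(printf 'a\nb\n' | sed 'G')
printf '%s\n' "$double_spaced"

# 'H' collects every line in the hold space.  On the last line, 'x'
# swaps the collected text into the pattern space, and the embedded
# newlines are replaced with commas.
joined=$(printf '1\n2\n3\n' | sed -n 'H;${x;s/\n/,/g;p;}')
printf '%s\n' "$joined"
```

    The last command prints ',1,2,3'.  The leading comma appears because
 'H' always adds a newline before the appended text, even when the hold
 space is still empty.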
 
    The following example illustrates the operation of 'N' and 'D'
 commands:
 
      $ seq 6 | sed -n 'N;l;D'
      1\n2$
      2\n3$
      3\n4$
      4\n5$
      5\n6$
 
   1. 'sed' starts by reading the first line into the pattern space (i.e.
      '1').
   2. At the beginning of every cycle, the 'N' command appends a newline
      and the next line to the pattern space (i.e.  '1', '\n', '2' in the
      first cycle).
   3. The 'l' command prints the content of the pattern space
      unambiguously.
   4. The 'D' command then removes the content of pattern space up to the
      first newline (leaving '2' at the end of the first cycle).
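   5. At the next cycle the 'N' command appends a newline and the next
      input line to the pattern space (i.e.  '2', '\n', '3').
   6. After the fifth cycle 'D' leaves '6' in the pattern space; 'N'
      then finds no more input and 'sed' exits.  Because of '-n'
      nothing more is printed, so no line for '6' alone appears.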
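
    The steps above can be contrasted with two small variations.  This
 is only a sketch on a made-up six-line input; the variable names are
 assumptions for this example.

```shell
# Without 'D', each cycle starts from a fresh input line, so the lines
# are listed in non-overlapping pairs instead of a sliding window.
pairs=$(printf '1\n2\n3\n4\n5\n6\n' | sed -n 'N;l')
printf '%s\n' "$pairs"

# A two-line sliding window: '$!N' appends the next line unless the
# current one is the last, and '$!D' drops the oldest line unless the
# last line has been reached.  Only the final window is auto-printed.
last_two=$(printf '1\n2\n3\n4\n5\n6\n' | sed '$!N;$!D')
printf '%s\n' "$last_two"
```

    The second command prints the lines '5' and '6', which is a
 well-known way to emulate 'tail -n 2'.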
 
    A common technique to process blocks of text such as paragraphs
 (instead of line-by-line) is using the following construct:
 
      sed '/./{H;$!d} ; x ; s/REGEXP/REPLACEMENT/'
 
   1. The first expression, '/./{H;$!d}' operates on all non-empty lines,
      and adds the current line (in the pattern space) to the hold space.
      On all lines except the last, the pattern space is deleted and the
      cycle is restarted.
 
   2. The other expressions 'x' and 's' are executed only on empty lines
      (i.e.  paragraph separators) and on the last line of input.  The
      'x' command exchanges the hold and pattern spaces, bringing the
      accumulated lines back to the pattern space and leaving the hold
      space ready for the next paragraph.  The 's///' command then
      operates on all the text in the paragraph (including the embedded
      newlines).
 
    The following example demonstrates this technique:
 
      $ cat input.txt
      a a a aa aaa
      aaaa aaaa aa
      aaaa aaa aaa
 
      bbbb bbb bbb
      bb bb bbb bb
      bbbbbbbb bbb
 
      ccc ccc cccc
      cccc ccccc c
      cc cc cc cc
 
      $ sed '/./{H;$!d} ; x ; s/^/\nSTART-->/ ; s/$/\n<--END/' input.txt
 
      START-->
      a a a aa aaa
      aaaa aaaa aa
      aaaa aaa aaa
      <--END
 
      START-->
      bbbb bbb bbb
      bb bb bbb bb
      bbbbbbbb bbb
      <--END
 
      START-->
      ccc ccc cccc
      cccc ccccc c
      cc cc cc cc
      <--END
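
    The same construct can be combined with other commands in place of
 the 's' command.  The sketch below uses a made-up three-paragraph input
 (the 'paragraphs' function, the pattern 'b1' and the variable names are
 assumptions for this example) to print only the paragraphs matching a
 pattern, and to join each paragraph into a single line.

```shell
# Hypothetical sample input: three paragraphs separated by empty lines.
paragraphs() { printf 'a1\na2\n\nb1\nb2\n\nc1\nc2\n'; }

# Print only the paragraphs that contain 'b1': after 'x' the whole
# paragraph is in the pattern space, and '/b1/!d' deletes the others.
matching=$(paragraphs | sed -e '/./{H;$!d;}' -e 'x;/b1/!d')
printf '%s\n' "$matching"

# Join each paragraph into one line: drop the leading newline that 'H'
# added, then replace the remaining embedded newlines with spaces.
joined=$(paragraphs | sed -e '/./{H;$!d;}' -e 'x;s/^\n//;s/\n/ /g')
printf '%s\n' "$joined"
```

    The first command prints an empty line followed by 'b1' and 'b2';
 the empty line is the newline that 'H' places before the first line of
 each paragraph.  The second command prints 'a1 a2', 'b1 b2' and
 'c1 c2'.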
 
    For more annotated examples, see ⇒Text search across multiple
 lines and ⇒Line length adjustment.