m4: Incompatibilities

 
 16.2 Facilities in System V 'm4' not in GNU 'm4'
 ================================================
 
 The version of 'm4' from System V contains a few facilities that have
 not been implemented in GNU 'm4' yet.  Additionally, POSIX requires some
 behaviors that GNU 'm4' has not implemented yet.  Relying on the
 current GNU behavior in these areas is non-portable, as a future
 release of GNU 'm4' may change it.
 
    * POSIX requires support for multiple arguments to 'defn', without
      any clarification on how 'defn' behaves when one of the multiple
      arguments names a builtin.  System V 'm4' and some other
      implementations allow mixing builtins and text macros into a single
      macro.  GNU 'm4' only supports joining multiple text arguments,
      although a future implementation may lift this restriction to
      behave more like System V.  The only portable way to join text
      macros with builtins is via helper macros and implicit
      concatenation of macro results.
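 
      As a minimal sketch of that portable approach (the names 'label',
      'my_divnum', and 'joined' are hypothetical, chosen only for this
      illustration), a builtin can be copied to a helper name on its own
      and then placed next to a text macro, so that the two results are
      concatenated on expansion rather than joined by 'defn':

      ```m4
      define(`label', `current diversion: ')dnl
      dnl Helper macro: a second name for the builtin, copied by itself.
      define(`my_divnum', defn(`divnum'))dnl
      dnl Empty quotes separate the two names; defn is not asked to join.
      define(`joined', `label`'my_divnum')dnl
      joined
      ```

      With GNU 'm4' this is expected to print 'current diversion: 0'.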
 
    * POSIX requires an application to exit with non-zero status if it
      wrote an error message to stderr.  This has not yet been
      consistently implemented for the various builtins that are required
      to issue an error (such as 'eval' (⇒Eval) when an argument
      cannot be parsed).
 
    * Some traditional implementations only allow reading standard input
      once, but GNU 'm4' correctly handles multiple instances of '-' on
      the command line.
 
    * POSIX requires 'm4wrap' (⇒M4wrap) to act in FIFO (first-in,
      first-out) order, but GNU 'm4' currently uses LIFO order.
      Furthermore, POSIX states that only the first argument to 'm4wrap'
      is saved for later evaluation, but GNU 'm4' saves and processes all
      arguments, with output separated by spaces.
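 
      The two differences can be illustrated with a short sketch; the
      bracketed strings are arbitrary placeholder text, not part of any
      interface:

      ```m4
      dnl Two registrations: the order in which the saved text is reread
      dnl at end of input depends on the implementation.
      m4wrap(`[first]')dnl
      m4wrap(`[second]')dnl
      dnl Several arguments in one call: POSIX saves only the first.
      m4wrap(`[a]', `[b]')dnl
      ```

      Going by the description above, GNU 'm4' would be expected to
      reread '[second]' before '[first]' and to save '[a] [b]', whereas
      a FIFO implementation would reread '[first]' before '[second]'
      and save only '[a]'.  Input that calls 'm4wrap' once, with a
      single argument, avoids depending on either difference.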
 
    * POSIX states that builtins that require arguments, but are called
      without arguments, have undefined behavior.  Traditional
      implementations simply behave as though empty strings had been
      passed.  For example, 'a`'define`'b' would expand to 'ab'.  But GNU
      'm4' ignores certain builtins if they have missing arguments,
      giving 'adefineb' for the above example.
 
    * Traditional implementations handle 'define(`f',`1')' (⇒
      Define) by undefining the entire stack of previous definitions,
      as if doing 'undefine(`f')' first.  GNU 'm4' replaces just the top
      definition on the stack, as if doing 'popdef(`f')' followed by
      'pushdef(`f',`1')'.  POSIX allows either behavior.
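 
      A small sketch of where the difference becomes visible; the
      definitions 'old', 'temp', and 'new' are placeholder text:

      ```m4
      define(`f', `old')dnl
      pushdef(`f', `temp')dnl
      dnl With GNU m4, this replaces only the definition pushed above.
      define(`f', `new')dnl
      f
      popdef(`f')dnl
      f
      ```

      With GNU 'm4' the two uses of 'f' are expected to print 'new' and
      then 'old'.  An implementation that discards the whole stack
      would leave 'f' undefined after the 'popdef', so the second use
      would print 'f' itself.  Input that needs one particular behavior
      can spell it out, using either 'undefine(`f')' before the
      'define', or 'popdef(`f')' followed by 'pushdef'.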
 
    * POSIX 2001 requires 'syscmd' (⇒Syscmd) to evaluate command
      output for macro expansion, but this was a mistake that is
      anticipated to be corrected in the next version of POSIX.  GNU 'm4'
      follows the traditional behavior, in which the output of 'syscmd'
      is not rescanned, and provides the extension 'esyscmd', whose
      output is rescanned.
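 
      The following sketch contrasts the two builtins.  It assumes a
      shell that provides 'echo'; the macro 'hello' is a hypothetical
      name used only here:

      ```m4
      define(`hello', `Hello, world')dnl
      dnl syscmd: the command output is passed through, not rescanned.
      syscmd(`echo hello')dnl
      dnl esyscmd (GNU extension): the command output is rescanned.
      esyscmd(`echo hello')dnl
      ```

      With GNU 'm4' the first call is expected to print 'hello'
      unchanged, and the second to print 'Hello, world'.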
 
    * At one point, POSIX required 'changequote(ARG)' (⇒
      Changequote) to use newline as the close quote, but this was a
      bug, and the next version of POSIX is anticipated to state that
      using empty strings or just one argument is unspecified.
      Meanwhile, the GNU 'm4' behavior of treating an empty end-quote
      delimiter as ''' is not portable, as Solaris treats it as repeating
      the start-quote delimiter, and BSD treats it as leaving the
      previous end-quote delimiter unchanged.  For predictable results,
      never call 'changequote' with just one argument, or with empty
      strings for arguments.
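 
      A sketch of the predictable form, with both delimiters always
      given and non-empty (the bracket delimiters and the macro names
      'greet' and 'greet_again' are arbitrary choices for illustration):

      ```m4
      dnl Switch to brackets, naming both delimiters explicitly.
      changequote(`[', `]')dnl
      define([greet], [Hello])dnl
      greet
      dnl Restore the default quotes, again with both arguments present.
      changequote([`], ['])dnl
      define(`greet_again', `Hello again')dnl
      greet_again
      ```

      This is expected to print 'Hello' and then 'Hello again'.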
 
    * At one point, POSIX required 'changecom(ARG,)' (⇒Changecom)
      to make it impossible to end a comment, but this was a bug, and the
      next version of POSIX is anticipated to state that using empty
      strings is unspecified.  Meanwhile, the GNU 'm4' behavior of
      treating an empty end-comment delimiter as newline is not portable,
      as BSD treats it as leaving the previous end-comment delimiter
      unchanged.  It is also impossible in BSD implementations to disable
      comments, even though that is required by POSIX.  For predictable
      results, never call 'changecom' with empty strings for arguments.
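 
      A sketch of the predictable form, with both comment delimiters
      given and non-empty (the C-style delimiters and the macro 'x' are
      arbitrary choices for illustration):

      ```m4
      define(`x', `expanded')dnl
      dnl Both delimiters are spelled out; neither is an empty string.
      changecom(`/*', `*/')dnl
      x /* x is left alone inside the comment */ x
      ```

      This is expected to print 'expanded', then the comment text
      copied unchanged, then 'expanded' again.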
 
    * Most implementations of 'm4' give macros a higher precedence than
      comments when parsing, meaning that if the start delimiter given to
      'changecom' (⇒Changecom) starts with a macro name, comments
      are effectively disabled.  POSIX does not specify what the
      precedence is, so the parser in this version of GNU 'm4'
      recognizes comments, then macros, then quoted strings.
 
    * Traditional implementations allow argument collection, but not
      string and comment processing, to span file boundaries.  Thus, if
      'a.m4' contains 'len(', and 'b.m4' contains 'abc)', 'm4 a.m4 b.m4'
      outputs '3' with traditional 'm4', whereas GNU 'm4' gives an error
      message that the end of file was encountered inside a macro.  On
      the other hand, traditional implementations do end of file
      processing for files included with 'include' or 'sinclude' (⇒
      Include), while GNU 'm4' seamlessly integrates the content of
      those files.  Thus, with GNU 'm4',
      'include(`a.m4')include(`b.m4')' outputs '3' instead of giving an
      error.
 
    * Traditional 'm4' treats 'traceon' (⇒Trace) without arguments
      as a global variable, independent of named macro tracing.  Also,
      once a macro is undefined, named tracing of that macro is lost.  On
      the other hand, when GNU 'm4' encounters 'traceon' without
      arguments, it turns tracing on for all existing definitions at the
      time, but does not trace future definitions; 'traceoff' without
      arguments turns tracing off for all definitions regardless of
      whether they were also traced by name; and tracing by name, such as
      with '-tfoo' at the command line or 'traceon(`foo')' in the input,
      is an attribute that is preserved even if the macro is currently
      undefined.
 
      Additionally, while POSIX requires trace output, it makes no
      demands on the formatting of that output.  Parsing trace output is
      not guaranteed to be reliable, even between different releases of
      GNU M4; however, the intent is that any future changes in trace
      output will only occur under the direction of additional
      'debugmode' flags (⇒Debug Levels).
 
    * POSIX requires 'eval' (⇒Eval) to treat all operators with
      the same precedence as C.  However, earlier versions of GNU 'm4'
      followed the traditional behavior of other 'm4' implementations,
      where bitwise and logical negation ('~' and '!') had lower
      precedence than equality operators; and where equality operators
      ('==' and '!=') had the same precedence as relational operators
      (such as '<').  Use explicit parentheses to ensure proper
      precedence.  As extensions to POSIX, GNU 'm4' gives well-defined
      semantics to operations that C leaves undefined, such as when
      overflow occurs, when shifting negative numbers, or when performing
      division by zero.  POSIX also requires '=' to cause an error, but
      many traditional implementations allowed it as an alias for '=='.
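 
      A sketch of the recommended parenthesization; the expressions
      are arbitrary and were picked because their unparenthesized forms
      could be grouped differently under the traditional precedence
      described above:

      ```m4
      dnl Negation is applied before the comparison.
      eval(`(~0) == -1')
      eval(`(!0) == 2')
      dnl The relational test is evaluated before the equality test.
      eval(`2 == (1 < 1)')
      ```

      These are expected to print '1', '0', and '0' respectively,
      independent of which precedence rules the implementation uses.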
 
    * POSIX 2001 requires 'translit' (⇒Translit) to treat each
      character of the second and third arguments literally.  However, it
      is anticipated that the next version of POSIX will allow the GNU
      'm4' behavior of treating '-' as a range operator.
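 
      A sketch of the two spellings; the input word is arbitrary:

      ```m4
      dnl Range syntax: relies on the GNU m4 treatment of the dash.
      translit(`hello', `a-z', `A-Z')
      dnl Spelled out: every character is literal, so no range is needed.
      translit(`hello', `abcdefghijklmnopqrstuvwxyz', `ABCDEFGHIJKLMNOPQRSTUVWXYZ')
      ```

      With GNU 'm4' both calls are expected to print 'HELLO'.  An
      implementation that treats each character literally would map
      only 'a', '-', and 'z' in the first call, leaving 'hello'
      unchanged, so the second spelling is the safer one.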
 
    * POSIX requires 'm4' to honor the locale environment variables of
      'LANG', 'LC_ALL', 'LC_CTYPE', 'LC_MESSAGES', and 'NLSPATH', but
      this has not yet been implemented in GNU 'm4'.
 
    * POSIX states that only unquoted leading newlines and blanks (that
      is, space and tab) are ignored when collecting macro arguments.
      However, this appears to be a bug in POSIX, since most traditional
      implementations also ignore all whitespace (formfeed, carriage
      return, and vertical tab).  GNU 'm4' follows tradition and ignores
      all leading unquoted whitespace.
 
    * A strictly-compliant POSIX client is not allowed to use
      command-line arguments not specified by POSIX.  However, since this
      version of M4 ignores 'POSIXLY_CORRECT' and enables the option
      '--gnu' by default (⇒Invoking m4: Limits control), a client
      desiring to be strictly compliant has no way to disable GNU
      extensions that conflict with POSIX when directly invoking the
      compiled 'm4'.  A future version of GNU M4 will honor the
      environment variable 'POSIXLY_CORRECT', implicitly enabling
      '--traditional' if it is set, in order to allow a
      strictly-compliant client.  In the meantime, a client needing
      strict POSIX compliance can use the workaround of invoking a shell
      script wrapper, where the wrapper then adds '--traditional' to the
      arguments passed to the compiled 'm4'.