m4: Improved foreach

 
 17.3 Solution for 'foreach'
 ===========================
 
 The 'foreach' and 'foreachq' macros (⇒Foreach) as presented
 earlier each have flaws.  First, we will examine and fix the quadratic
 behavior of 'foreachq':
 
      $ m4 -I examples
      include(`foreachq.m4')
      =>
      traceon(`shift')debugmode(`aq')
      =>
      foreachq(`x', ``1', `2', `3', `4'', `x
      ')dnl
      =>1
      error->m4trace: -3- shift(`1', `2', `3', `4')
      error->m4trace: -2- shift(`1', `2', `3', `4')
      =>2
      error->m4trace: -4- shift(`1', `2', `3', `4')
      error->m4trace: -3- shift(`2', `3', `4')
      error->m4trace: -3- shift(`1', `2', `3', `4')
      error->m4trace: -2- shift(`2', `3', `4')
      =>3
      error->m4trace: -5- shift(`1', `2', `3', `4')
      error->m4trace: -4- shift(`2', `3', `4')
      error->m4trace: -3- shift(`3', `4')
      error->m4trace: -4- shift(`1', `2', `3', `4')
      error->m4trace: -3- shift(`2', `3', `4')
      error->m4trace: -2- shift(`3', `4')
      =>4
      error->m4trace: -6- shift(`1', `2', `3', `4')
      error->m4trace: -5- shift(`2', `3', `4')
      error->m4trace: -4- shift(`3', `4')
      error->m4trace: -3- shift(`4')
 
    Each successive iteration was adding more quoted 'shift' invocations,
 and the entire list contents were passing through every iteration.  In
 general, when recursing, it is a good idea to make the recursion use
 fewer arguments, rather than adding additional quoted uses of 'shift'.
 By doing so, 'm4' uses less memory, invokes fewer macros, is less likely
 to run into machine limits, and most importantly, performs faster.  The
 fixed version of 'foreachq' can be found in
 'm4-1.4.18/examples/foreachq2.m4':
 
      $ m4 -I examples
      include(`foreachq2.m4')
      =>
      undivert(`foreachq2.m4')dnl
      =>include(`quote.m4')dnl
      =>divert(`-1')
      =># foreachq(x, `item_1, item_2, ..., item_n', stmt)
      =>#   quoted list, improved version
      =>define(`foreachq', `pushdef(`$1')_$0($@)popdef(`$1')')
      =>define(`_arg1q', ``$1'')
      =>define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@))')')
      =>define(`_foreachq', `ifelse(`$2', `', `',
      =>  `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')')
      =>divert`'dnl
      traceon(`shift')debugmode(`aq')
      =>
      foreachq(`x', ``1', `2', `3', `4'', `x
      ')dnl
      =>1
      error->m4trace: -3- shift(`1', `2', `3', `4')
      =>2
      error->m4trace: -3- shift(`2', `3', `4')
      =>3
      error->m4trace: -3- shift(`3', `4')
      =>4
 
    Note that the fixed version calls unquoted helper macros in
 '_foreachq' to trim elements immediately; those helper macros in turn
 must re-supply the layer of quotes lost in the macro invocation.
 Contrast the use of '_arg1q', which quotes the first list element, with
 '_arg1' of the earlier implementation that returned the first list
 element directly.  Additionally, by calling the helper method
 immediately, the 'defn(`ITERATOR')' no longer contains unexpanded
 macros.
 
    The astute m4 programmer might notice that the solution above still
 uses more memory and macro invocations, and thus more time, than
 strictly necessary.  Note that '$2', which contains an arbitrarily long
 quoted list, is expanded and rescanned three times per iteration of
 '_foreachq'.  Furthermore, every iteration of the algorithm effectively
 unboxes then reboxes the list, which costs a couple of macro
 invocations.  It is possible to rewrite the algorithm for a bit more
 speed by swapping the order of the arguments to '_foreachq' in order to
 operate on an unboxed list in the first place, and by using the
 fixed-length '$#' instead of an arbitrary length list as the key to end
 recursion.  The result is an overhead of six macro invocations per loop
 (excluding any macros in TEXT), instead of eight.  This alternative
 approach is available as 'm4-1.4.18/examples/foreach3.m4':
 
      $ m4 -I examples
      include(`foreachq3.m4')
      =>
      undivert(`foreachq3.m4')dnl
      =>divert(`-1')
      =># foreachq(x, `item_1, item_2, ..., item_n', stmt)
      =>#   quoted list, alternate improved version
      =>define(`foreachq', `ifelse(`$2', `', `',
      =>  `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')')
      =>define(`_foreachq', `ifelse(`$#', `3', `',
      =>  `define(`$1', `$4')$2`'$0(`$1', `$2',
      =>    shift(shift(shift($@))))')')
      =>divert`'dnl
      traceon(`shift')debugmode(`aq')
      =>
      foreachq(`x', ``1', `2', `3', `4'', `x
      ')dnl
      =>1
      error->m4trace: -4- shift(`x', `x
      error->', `', `1', `2', `3', `4')
      error->m4trace: -3- shift(`x
      error->', `', `1', `2', `3', `4')
      error->m4trace: -2- shift(`', `1', `2', `3', `4')
      =>2
      error->m4trace: -4- shift(`x', `x
      error->', `1', `2', `3', `4')
      error->m4trace: -3- shift(`x
      error->', `1', `2', `3', `4')
      error->m4trace: -2- shift(`1', `2', `3', `4')
      =>3
      error->m4trace: -4- shift(`x', `x
      error->', `2', `3', `4')
      error->m4trace: -3- shift(`x
      error->', `2', `3', `4')
      error->m4trace: -2- shift(`2', `3', `4')
      =>4
      error->m4trace: -4- shift(`x', `x
      error->', `3', `4')
      error->m4trace: -3- shift(`x
      error->', `3', `4')
      error->m4trace: -2- shift(`3', `4')
 
    In the current version of M4, every instance of '$@' is rescanned as
 it is encountered.  Thus, the 'foreachq3.m4' alternative uses much less
 memory than 'foreachq2.m4', and executes as much as 10% faster, since
 each iteration encounters fewer '$@'.  However, the implementation of
 rescanning every byte in '$@' is quadratic in the number of bytes
 scanned (for example, making the broken version in 'foreachq.m4' cubic,
 rather than quadratic, in behavior).  A future release of M4 will
 improve the underlying implementation by reusing results of previous
 scans, so that both styles of 'foreachq' can become linear in the number
 of bytes scanned.  Notice how the implementation injects an empty
 argument prior to expanding '$2' within 'foreachq'; the helper macro
 '_foreachq' then ignores the third argument altogether, and ends
 recursion when there are three arguments left because there was nothing
 left to pass through 'shift'.  Thus, each iteration only needs one
 'ifelse', rather than the two conditionals used in the version from
 'foreachq2.m4'.
 
    So far, all of the implementations of 'foreachq' presented have been
 quadratic with M4 1.4.x.  But 'forloop' is linear, because each
 iteration parses a constant amount of arguments.  So, it is possible to
 design a variant that uses 'forloop' to do the iteration, then uses '$@'
 only once at the end, giving a linear result even with older M4
 implementations.  This implementation relies on the GNU extension that
 '$10' expands to the tenth argument rather than the first argument
 concatenated with '0'.  The trick is to define an intermediate macro
 that repeats the text 'm4_define(`$1', `$N')$2`'', with 'n' set to
 successive integers corresponding to each argument.  The helper macro
 '_foreachq_' is needed in order to generate the literal sequences such
 as '$1' into the intermediate macro, rather than expanding them as the
 arguments of '_foreachq'.  With this approach, no 'shift' calls are even
 needed!  Even though there are seven macros of overhead per iteration
 instead of six in 'foreachq3.m4', the linear scaling is apparent at
 relatively small list sizes.  However, this approach will need
 adjustment when a future version of M4 follows POSIX by no longer
 treating '$10' as the tenth argument; the anticipation is that '${10}'
 can be used instead, although that alternative syntax is not yet
 supported.
 
      $ m4 -I examples
      include(`foreachq4.m4')
      =>
      undivert(`foreachq4.m4')dnl
      =>include(`forloop2.m4')dnl
      =>divert(`-1')
      =># foreachq(x, `item_1, item_2, ..., item_n', stmt)
      =>#   quoted list, version based on forloop
      =>define(`foreachq',
      =>`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')')
      =>define(`_foreachq',
      =>`pushdef(`$1', forloop(`$1', `3', `$#',
      =>  `$0_(`1', `2', indir(`$1'))')`popdef(
      =>    `$1')')indir(`$1', $@)')
      =>define(`_foreachq_',
      =>``define(`$$1', `$$3')$$2`''')
      =>divert`'dnl
      traceon(`shift')debugmode(`aq')
      =>
      foreachq(`x', ``1', `2', `3', `4'', `x
      ')dnl
      =>1
      =>2
      =>3
      =>4
 
    For yet another approach, the improved version of 'foreach',
 available in 'm4-1.4.18/examples/foreach2.m4', simply overquotes the
 arguments to '_foreach' to begin with, using 'dquote_elt'.  Then
 '_foreach' can just use '_arg1' to remove the extra layer of quoting
 that was added up front:
 
      $ m4 -I examples
      include(`foreach2.m4')
      =>
      undivert(`foreach2.m4')dnl
      =>include(`quote.m4')dnl
      =>divert(`-1')
      =># foreach(x, (item_1, item_2, ..., item_n), stmt)
      =>#   parenthesized list, improved version
      =>define(`foreach', `pushdef(`$1')_$0(`$1',
      =>  (dquote(dquote_elt$2)), `$3')popdef(`$1')')
      =>define(`_arg1', `$1')
      =>define(`_foreach', `ifelse(`$2', `(`')', `',
      =>  `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')')
      =>divert`'dnl
      traceon(`shift')debugmode(`aq')
      =>
      foreach(`x', `(`1', `2', `3', `4')', `x
      ')dnl
      error->m4trace: -4- shift(`1', `2', `3', `4')
      error->m4trace: -4- shift(`2', `3', `4')
      error->m4trace: -4- shift(`3', `4')
      =>1
      error->m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
      =>2
      error->m4trace: -3- shift(``2'', ``3'', ``4'')
      =>3
      error->m4trace: -3- shift(``3'', ``4'')
      =>4
      error->m4trace: -3- shift(``4'')
 
    It is likewise possible to write a variant of 'foreach' that performs
 in linear time on M4 1.4.x; the easiest method is probably writing a
 version of 'foreach' that unboxes its list, then invokes '_foreachq' as
 previously defined in 'foreachq4.m4'.
 
    In summary, recursion over list elements is trickier than it appeared
 at first glance, but provides a powerful idiom within 'm4' processing.
 As a final demonstration, both list styles are now able to handle
 several scenarios that would wreak havoc on one or both of the original
 implementations.  This points out one other difference between the list
 styles.  'foreach' evaluates unquoted list elements only once, in
 preparation for calling '_foreach', similary for 'foreachq' as provided
 by 'foreachq3.m4' or 'foreachq4.m4'.  But 'foreachq', as provided by
 'foreachq2.m4', evaluates unquoted list elements twice while visiting
 the first list element, once in '_arg1q' and once in '_rest'.  When
 deciding which list style to use, one must take into account whether
 repeating the side effects of unquoted list elements will have any
 detrimental effects.
 
      $ m4 -I examples
      include(`foreach2.m4')
      =>
      include(`foreachq2.m4')
      =>
      dnl 0-element list:
      foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
      => / 
      dnl 1-element list of empty element
      foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
      =><> / <>
      dnl 2-element list of empty elements
      foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
      =><><> / <><>
      dnl 1-element list of a comma
      foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
      =><,> / <,>
      dnl 2-element list of unbalanced parentheses
      foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
      =><(><)> / <(><)>
      define(`ab', `oops')dnl using defn(`iterator')
      foreach(`x', `(`a', `b')', `defn(`x')') /dnl
       foreachq(`x', ``a', `b'', `defn(`x')')
      =>ab / ab
      define(`active', `ACT, IVE')
      =>
      traceon(`active')
      =>
      dnl list of unquoted macros; expansion occurs before recursion
      foreach(`x', `(active, active)', `<x>
      ')dnl
      error->m4trace: -4- active -> `ACT, IVE'
      error->m4trace: -4- active -> `ACT, IVE'
      =><ACT>
      =><IVE>
      =><ACT>
      =><IVE>
      foreachq(`x', `active, active', `<x>
      ')dnl
      error->m4trace: -3- active -> `ACT, IVE'
      error->m4trace: -3- active -> `ACT, IVE'
      =><ACT>
      error->m4trace: -3- active -> `ACT, IVE'
      error->m4trace: -3- active -> `ACT, IVE'
      =><IVE>
      =><ACT>
      =><IVE>
      dnl list of quoted macros; expansion occurs during recursion
      foreach(`x', `(`active', `active')', `<x>
      ')dnl
      error->m4trace: -1- active -> `ACT, IVE'
      =><ACT, IVE>
      error->m4trace: -1- active -> `ACT, IVE'
      =><ACT, IVE>
      foreachq(`x', ``active', `active'', `<x>
      ')dnl
      error->m4trace: -1- active -> `ACT, IVE'
      =><ACT, IVE>
      error->m4trace: -1- active -> `ACT, IVE'
      =><ACT, IVE>
      dnl list of double-quoted macro names; no expansion
      foreach(`x', `(``active'', ``active'')', `<x>
      ')dnl
      =><active>
      =><active>
      foreachq(`x', ```active'', ``active''', `<x>
      ')dnl
      =><active>
      =><active>