At the March 2024 MediaWiki Engineering Offsite, User:MatmaRex proposed a lightning talk titled "What the for-each loop is missing". Without spoiling his talk too much, his observations boil down to two common features programmers have to manually implement on top of standard for-loops:
1. Some way to detect whether you are at the first or last iteration of the loop. For example:
function totuple(list)
  local s = ''
  for _,item in ipairs(list) with first, last do
    if first then
      s = s .. '('
    end
    s = s .. tostring(item)
    if not last then
      s = s .. ", "
    else
      s = s .. ')'
    end
  end
  return s
end
2. Code which executes "only if the loop iterated at least once" or "only if the loop never executed". For example:
function max(list)
  local result = nil
  for _,item in ipairs(list) with first, last do
    if first or item > result then
      result = item
    end
  then
    -- only if loop executed at least once
    return result
  else
    -- if loop never executed (no items in list)
    error("maximum of zero length list")
  end
end
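For comparison, here is roughly what the two examples above look like in standard Lua, with the bookkeeping written out by hand. This is only a sketch: the names totuple_plain and max_plain are made up for illustration, and the index-based "last" test assumes the list is a proper sequence whose length is known up front.

```lua
-- Plain Lua version of totuple: 'first' and 'last' are derived by hand
-- from the index, which only works because the list length is known.
function totuple_plain(list)
  local s = ''
  local n = #list
  for i, item in ipairs(list) do
    local first, last = (i == 1), (i == n)
    if first then
      s = s .. '('
    end
    s = s .. tostring(item)
    if not last then
      s = s .. ', '
    else
      s = s .. ')'
    end
  end
  return s
end

-- Plain Lua version of max: a flag records whether the loop ever ran.
function max_plain(list)
  local result = nil
  local ran = false
  for _, item in ipairs(list) do
    if not ran or item > result then
      result = item
    end
    ran = true
  end
  if ran then
    return result
  end
  error("maximum of zero length list")
end

print(totuple_plain({1, 2, 3}))  --> (1, 2, 3)
print(max_plain({1, 2, 42}))     --> 42
```

The extra index, length, and flag variables are exactly the boilerplate the proposed with and then/else clauses are meant to remove.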
MatmaRex presented his proposal in a language-independent way, and partial implementations for various languages exist. For example, Python 3 has an else clause on for-loops which executes only if the loop completes normally (that is, without a break). A custom iterator was also written in PHP that provides the first-iteration and last-iteration booleans. Since I had a Lua grammar and interpreter handy, I decided to take a shot at a Lua implementation of the full proposal.
As shown in the above examples, there are two additions to the Lua grammar. First, an optional with <Id>, <Id> clause is added in both the for-in and for-num productions. The first Id names a boolean local variable which is true during the first iteration, and the second Id names a boolean local variable which is true during the final iteration. Note that these can be named anything, although in most of my examples they will be named first and last for clarity. But for nested for loops you may very well have outer_first and inner_first, or first1 and first2, etc.
For simplicity and clarity I've chosen to always make both the "first" and "last" identifiers mandatory; there is no way to ask for only "first" and not "last", or only "last" and not "first". This does have some runtime implications in the for-in case: with Lua's implementation of iterators/generators, we can't determine whether we are on the last element without requesting the element after it. Thus, when a with clause is present, a for-in loop always executes "one iteration behind"; that is, it requests element N+1 before executing the loop body with element N. In some corner cases with user-implemented generator functions this behavior might be observable. Consider this example, adapted from the Lua manual's description of how ipairs is implemented:
function iter(a, i)
  i = i + 1
  local v = a[i]
  if v then
    print(i)
    return i, v
  end
end
function myipairs(a)
  return iter, a, 0
end
for i,v in myipairs({"one", "two", "three"}) with first, last do
  print(v)
end
Without the with first, last clause on the for-in loop, this prints:
1
one
2
two
3
three
But when with first, last is added, this prints:
1
2
one
3
two
three
It would be possible to add with first (without the , last) as an alternative production, and when only the "first" boolean is required we wouldn't need to execute one iteration behind, but I haven't done that in this implementation.
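The "one iteration behind" behavior can be imitated in standard Lua with a lookahead wrapper. The following is only a sketch under stated assumptions: with_first_last and run_demo are hypothetical names, not part of mlua, and the wrapper assumes a stateless iterator that yields exactly two values, like ipairs. The demo iterator appends to a log table instead of printing, so the interleaving can be checked.

```lua
-- Hypothetical helper: wraps an iterator triple and yields
-- (key, value, first, last) by always fetching one element ahead.
function with_first_last(iter, state, control)
  local next_k, next_v = iter(state, control)  -- prefetch element 1
  local first = true
  return function()
    if next_k == nil then
      return nil                               -- underlying iterator is exhausted
    end
    local k, v = next_k, next_v
    next_k, next_v = iter(state, k)            -- prefetch element N+1
    local is_first = first
    first = false
    return k, v, is_first, next_k == nil       -- 'last' means nothing was prefetched
  end
end

-- Re-creates the myipairs example, logging instead of printing.
function run_demo()
  local log = {}
  local function iter(a, i)
    i = i + 1
    local v = a[i]
    if v then
      log[#log + 1] = tostring(i)              -- stands in for print(i)
      return i, v
    end
  end
  local function myipairs(a)
    return iter, a, 0
  end
  for _, v, first, last in with_first_last(myipairs({"one", "two", "three"})) do
    log[#log + 1] = v                          -- stands in for print(v)
  end
  return table.concat(log, " ")
end

print(run_demo())  --> 1 2 one 3 two three
```

A variant that reports only "first" would not need the prefetch at all, which is why a with first production could avoid the reordering.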
The second grammar feature is adding optional then and else clauses to the for-in and for-num loops. We have a number of design questions here: what local variables should be visible in the scope of the then and/or else block, and what should their values be? What should the behavior of the then and else block be when break is used in the for loop? (Lua does not have a continue statement.) And, finally: our choice regarding break behavior made it desirable sometimes to combine the "more than zero iterations" (then) and "zero iterations" (else) cases; how should this be done?
I made the following choices:
1. In then blocks, the iteration variable is visible and it is reset to the value it had on the last iteration of the loop. (Any local writes to this variable in the do block are discarded.) In else blocks, the iteration variable is not defined; it would not have a useful value in this case at any rate. This makes this example adapted from Python work in Lua as well:
for _,item in ipairs(list) do
  print(item)
  item = nil
then
  -- note that 'item' is still bound here, and reset to the final item
  print("Final item is:", item)
end
If a with clause is present, neither of its local variables is defined in the then or else block.
2. When executing a break statement, then and else blocks are skipped. This tweaks the semantics for then and else blocks: they are executed only on normal (non-break) completion of the for loop, after non-zero or zero iterations respectively. This makes this example adapted from Python work in Lua as well:
-- Primality testing
for n = 2, 10 do
  for x = 2, n-1 do
    if n % x == 0 then
      print(n, "equals", x, "*", n/x)
      break
    end
  then
    -- executes only if loop completed normally (ie, no break)
    print(n, "is a prime number")
  end
end
3. If a then block is present without an else block, then the then block is executed on any normal completion of the loop, even if it had zero iterations. This seems to match the common use case when only a then block is present, as in the example above. You can think of this as effectively duplicating the then block and using it as the else block as well, but note that (unlike in a regular else block) the loop variables are declared in the body of the then block; if the loop was not executed they will all be set to nil. It could be argued that we should use a different grammatical marker for this case, perhaps something like thenelse (as a single keyword), but I've opted to keep it simple in this implementation.
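One way to read the three choices above is as a hand-written desugaring into standard Lua. The sketch below is an illustration of the intended semantics only, not how the mlua interpreter actually implements them; is_prime_plain and final_item_plain are hypothetical names.

```lua
-- Hand-desugared form of:  for x = 2, n-1 do ... break ... then ... end
function is_prime_plain(n)
  local broke = false              -- set just before every 'break'
  for x = 2, n - 1 do
    if n % x == 0 then
      broke = true
      break
    end
  end
  -- A 'then' block with no 'else' runs on any normal completion,
  -- including zero iterations (n = 2), so only 'broke' is checked.
  return not broke
end

-- Hand-desugared form of:  for _,item in ipairs(list) do ... then ... else ... end
function final_item_plain(list)
  local ran, last_item = false, nil
  for _, item in ipairs(list) do
    ran, last_item = true, item    -- saved before the body can overwrite 'item'
    item = nil                     -- the body's write to the loop variable
  end
  if ran then
    local item = last_item         -- 'then': variable restored to its final value
    return item
  else
    return nil, "empty list"       -- 'else': the loop variable is not in scope
  end
end

print(is_prime_plain(7), is_prime_plain(9))  --> true    false
print(final_item_plain({"a", "b", "c"}))     --> c
```

The second function contains no break, so it needs only the ran flag; a loop that uses both break and then/else would need both flags.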
The final grammar, using the LPegRex grammar formalism, looks like:
[==[
ForNum <== `for` Id `=` @expr @`,` @expr ((`,` @expr) / $false) (ForWith / $false) @ForBody
ForIn <== `for` @idlist `in` @exprlist (ForWith / $false) @ForBody
ForWith <== `with` @Id @`,` @Id
ForBody <== `do` Block (`then` Block / $false) (`else` Block / $false) @`end`
]==]
The Lua grammar and interpreter are written to be compatible with Scribunto and can be used on wiki. One caveat is that Scribunto enforces syntax-checking on Lua code stored in the Module namespace, which means that Lua code using with/for-then/for-else can't be successfully saved in that namespace. However, we can parse and execute modules from other namespaces; my examples will use Lua code stored under my user namespace.
To execute code using mlua, which supports this extended for-loop syntax, you just need to replace {{#invoke: with {{#invoke:User:Cscott/mlua|invokeUser| in your wikitext. Note that mlua's invoke method defaults to the Module namespace like Scribunto's #invoke does; because our extended syntax "is not syntactically-correct Lua" we need to use invokeUser, which can execute from the User (or other) namespace. The arguments after invokeUser are the title of the module and then the function name within that module, just as with #invoke.
Live examples using mlua:
...example1|max|1|2|42: 42
...example1|isprime|<N>:
...example1|bignum_digits_to_string (see below):
This works in French as well (use mlua|invokeFr):
...example1/fr|max|1|2|42: 42
...example1/fr|estPremier|<N>:
...example1/fr|chiffresÀChaîner (see below):
function bignum_digits_to_string(digit_list)
  -- 'local' keeps s from leaking into the global environment
  local s = ''
  for _,d in ipairs(digit_list) do
    s = s .. tostring(d)
  else
    -- if there are no digits in the digit list
    s = '0'
  end
  return s
end