Volume 25, No.3

A way to write functions

by Phil Last (phil.last@ntlworld.com)

A companion article to Phil Last’s Co-operators paper to explain some of his more unusual or obscure programming techniques.

Someone recently stopped to look at my Dyalog development session. But only for a moment. He made a gesture to indicate his bewilderment and moved on. This is a man who writes most of his code in the direct (formerly known as “dynamic”) functional subset of Dyalog APL (D-fns or D:).

As do I.

So what was the problem?

Having bewildered many more in a presentation at Dyalog’11 at Boston and having been rebuked by a number of them since, I’ve come to the conclusion that it’s because I’ve discovered and use various techniques for doing common tasks in APL that have more verbose and perhaps more familiar counterparts.

As I’ve come to value all of these idioms and now find them highly recognisable I’d come to assume their self-explanatory nature but now accept that they present a relative impenetrability to many. This is hopefully to counter the latter and realise the former. As mentioned above almost all of my code is in D notation about which you can read in John Scholes' paper, “Dynamic Functions in Dyalog APL” [1] so I’ll only mention here the parts whose possible unfamiliarity is the reason for this article.

Ambivalence

I probably should start where most of my functions do. All D-fns are potentially ambivalent: they can be called with or without a left argument. Whether the code actually refers to that argument is another matter. The simplest D-fn is: {} that doesn’t refer to either of its arguments or anything else for that matter. Nevertheless most functions need to do something and if they expect a left argument the avoidance of a value error is something to be considered.

Traditional ways of doing this mostly check the name-class ⎕NC of the argument name for zero and conditionally assign an array value.

When I started using IBM’s APL2, which was the first time the problem arose for me, an earlier observation made me realise that a more generally useful course would be to attempt to fix a right identity function with the argument name:

         ∇ RES←LARG F00 RARG
      [1]  0 0⍴⎕FX,⊂'R←LARG R'
      ...
         ∇

If LARG is supplied, ⎕FX fails gracefully returning ⎕IO because it won’t fix a function in place of an array. But if LARG has been elided then ⎕FX creates it as an identity function and any use of the name within F00 as a left argument behaves as if it isn’t there. In [2] below if LARGis an identity function, F01 has no left argument and its result is passed to RES. So for elided LARG:

      [2] RES←LARG F01 RARG

is exactly the same as:

      [2] RES←F01 RARG

I went so far as to create a utility function whose argument was the left argument-name of the function that called it:

         ∇ AMBIVALENT AMBIVALENT
      [1]  AMBIVALENT←⎕FX,⊂'R←',AMBIVALENT,' R'
         ∇

It reassigns its eponymous argument with the result of ⎕FX possibly having created a semi-global identity function with the name contained in that argument.

         ∇ R←LEFT F02 RIGHT
      [1]  AMBIVALENT'LEFT'
      [2]  R←LEFT F03 …
      …
         ∇

It was published in Vector 6.2 as a tongue-in-cheek response to something much worse in Vector 5.4 but didn’t go down well with the other correspondents. I thought it was nicely declarative and self-documenting.

I have to admit there are some who are surprised by the fact that what they expect to be an array could be a function instead. Given the flexibility of infix notation what surprises me is that most will immediately assume the expression e t y, which constitutes three letters randomly chosen from the line above, to represent a three item vector rather than any of the other eleven valid three-fold array or function expressions that can be formed from permutations of arrays, functions or operators.

Ambivalence in D-fns

The D method to accommodate an elided left argument ⍺ is a single assignment that is executed conditionally on the argument’s absence.

I believe John Scholes was surprised to find that the equivalent of the above:

      [1] ⍺←{⍵}

actually worked. It has exactly the same effect creating the identity function as ⎕FX did. I used this until Dyalog Version 13.0 finally introduced the right function ⊢ that implements the right identity as a primitive. Thus many of my functions now start with the even more terse but equally useful and recognisable:

      [1] ⍺←⊢

In twenty-five years I’ve found no situation where this technique fails to work and none where any other excels it.

Assigning a default

Some might object here that it’s easier to assign a known value to the argument as default.

      [2] ⍺←default

This can still be done after [1] above if we are content to use another name:

      [2] larg←⍺⊣default

This uses the other new identity function of Dyalog 13.0, left ⊣ that returns its left argument if it has one or its right otherwise – following the J implementation of left [ that I believe was Ken Iverson’s later preference, rather than that of SAPL and APLX that don’t return a result from the monad. Thus in our D-fn if ⍺ is supplied then ⊣ in the above expression gets its value as left argument and returns it to be assigned as larg; if not then ⍺ is ⊢ from [1]; ⊣ is called as a monad and returns its right argument default instead, to be passed by ⊢ and assigned to larg.

Very often though, assigning a default to be passed to another function is not the best thing to do. If a called function f04 is designed to be ambivalent we can assume it must be able to take care of its own defaults. Were an ambivalent caller f03 to provide another default it would override whatever was the intention of the author of f04. This must be decided case by case.

All this becomes much more important in the case of operators. If a derived function, an operator with its operand(s), is called without a left argument it very reasonably can be inferred that, all else being equal, the function operand is also expected to be called as a monad. Each ¨ is a primitive case in point.

More ambivalence

We usually apply the word ambivalent to functions permitting the optional elision of left argument ⍺: argument ambivalence. The dichotomy above where ⍺ can be an array or a function (supplied or assigned) also occurs in operators' accepting either an array or a function as operand ⍺⍺ and/or ⍵⍵. Hence: operand ambivalence. The left operand of the / operator of APL2 and the right of the power ⍣ operator of Dyalog both have this property.

In:

      [4] ... ⍺⍺⊣⍵

the use of left ⊣ allows us to supply left operand ⍺⍺ as either a function or an array without our having to code separate cases for either of them. If ⍺⍺ is an array then ⍺⍺⊣⍵ is just ⍺⍺, the left argument of ⊣. If it’s a function then it’s applied to the result of monadic ⊣ which is ⍵.

Differential processing

Perhaps unusually, different processing is sometimes required for the monadic case of a function than the dyadic. Unusually in defined functions, that is; we should be surprised if primitives didn’t exhibit this behaviour. In a D-fn this would probably entail the using a guard. What should be the condition?

In:

      [2] ⍺←⊢
      [3] 1≡⍺ 1: expression

1≡⍺ 1 returns true and invokes expression if the value 1 matches the value of ⍺ 1. This might seem unlikely. How can a scalar 1 match a two item vector? Obviously it can’t but if ⍺ is ⊢ then ⍺ 1 is ⊢ 1 which is just 1. In fact this works equally well for any array replacing the 1 on both sides of the equality so, for instance, to make it more explicit:

      [3] 'monad'≡⍺'monad': expression

Scripting languages from very early on have permitted similar constructs:

      If . = . %1 Then expression-for-missing-%1

which behaved identically after parameter substitution for optional %1 though the explanation would be somewhat different.

Conditions and multi-line expressions

It’s possible to break any D-fn inside its enclosing left or right brace. The next two D-fns are equivalent:

      {single expression}

      {
            single expression
      }

Not only this but long lines embedding anonymous D-fns can also be broken:

      {this long line{embeds another D-fn}within it}

      {this long line{
            embeds another D-fn
      }within it}

Guards sometimes present a problem in that a quite long line is required for the terminating expression to run when the condition is true. By embedding the expression in an anonymous D-fn we can extend the statement onto more than one line:

      conditional expression: ⍺{
            long terminating expression to be triggered when true
      }⍵

Sometimes we want to run an expression conditionally or perhaps one expression or another to extract the value within, rather than terminating, the current function. In this case embedding the entire guard with its conditional, consequent and alternative expressions in an anonymous D-fn serves very well:

      value←⍺{condition: expression if true
            expression if false
      }⍵

In both these cases it’s worth supplying the original arguments so that all names and tokens inside the inner D-fn reflect those outside.

Commented one-liners

The following doesn’t define a D-fn:

      f03←{expression ⍝ comment}
SYNTAX ERROR
      f03←{expression ⍝ comment}
                                ∧

The reason is that the final brace is a part of the comment and doesn’t actually close the D-fn. This has unfortunately and unjustifiably caused some to claim that D promotes uncommented code. The ability to create functions – D-fns or otherwise – without comments; believing them to be unnecessary; or that one doesn’t have time to write them promotes uncommented code.

Fortunately the left ⊣ function comes to the rescue yet again. We can use:

      f03←{expression⊣'comment'}

or the prettier:

      f03←{expression⊣'⍝ comment'}

or even:

      nb←⊣
      f03←{expression nb' comment'}

Naming operands

This is something I’ve relearned to do only very recently having gone perhaps too far down the road of designing algorithms that required no assignments whatsoever.

It’s always useful and very often necessary or desirable to be able to define anonymous D-fns embedded within the coding of a D-fn or D-op. This entails another level of parameter substitution so for instance, operator where, that needn’t be understood, contains two levels of embedded operations:

      where←{
            (⍴⍵)⍴⍺⍺{⍵ ⍺⍺{⍵⊖⍺,[-0.1]⍵\⍺⍺ ⍵/⍺},⍵⍵⊣⍵}⍵⍵,⍵
      }
                   ↑                             ↑
                        ↑                  ↑

In order that the operands ⍺⍺and ⍵⍵ continue to refer to the same thing they must be passed in as operands at each level making the inner operations operators also. It is also quite difficult to see that the innermost operation is a monadic operator deriving a dyadic function while the next out is a dyadic operator deriving a monadic function as is the outer, named, D-op. The following version appears much simpler merely by naming the operands:

      where←{f←⍺⍺ ⋄ g←⍵⍵
            (⍴⍵)⍴{⍵{⍵⊖⍺,[-0.1]⍵\f ⍵/⍺},g⊣⍵},⍵
      }

Lexical scope rules mean that named items are visible to any D-fn or D-op defined within the one wherein they are named. Thus they don’t have to be respecified for the inner levels so we actually reduce the number of tokens in the line as well as avoiding the added complication of the embedded operations’ being promoted to D-ops.

Separating operands and arguments

Operator operands can be either functions or arrays. The power ⍣ operator takes either a terminating function or an iteration count as its right operand. In the latter case we have an array (scalar integer) operand and an array right argument. This brings up the question of the relative strengths of operand- versus vector-binding. Implementors chose differently in this regard, vector binding being the strongest of all in Dyalog APL and NARS2000, while both bracket- and right-operand binding exceed it in APL2 and APLX.

Consequently we have the following dichotomy. The expression:

      f dop a b c

where f is a function, dop is a dyadic operator and a, b and c are arrays, is parsed as follows:

APL2, APLX &c.:

      f dop a b c ←→ (f dop a)(b c)

an array expression; function with argument; a is the operand and b c is the argument.

Dyalog, NARS2000 &c.:

      f dop a b c ←→ f dop(a b c)

a function expression without an argument; a b c is the right operand.

This difference adds an extra complication to Dyalog which is well worth the cost considering the alternative:

APL2:

      f dop 1 2 3 ←→ (f dop 1) 2 3

what appears to be a numeric vector is treated as a scalar separated from the remainder.

So in Dyalog we need to separate the array right operand from the argument. The surest way to do it is to parenthesise the entire function expression:

      (f dop operand)argument

But a neater way is to insert an identity function between the operand and argument. Right-operand binding is stronger than left-argument so the identity is parsed as monadic:

      f dop operand+argument
      f dop operand,vector

Until recently plus + as above was common as a separator because its monadic definition was to return the right argument unchanged. But Dyalog 13.0 has complex numbers and monadic plus is now redefined as conjugate for that domain:

      + 12J34 'zxc'  →  12J¯34 'zxc'

so it is no longer a reliable identity in all cases. Fortunately and in part consequently Dyalog 13.0 also includes the two new identities mentioned above: ⊣ and⊢; either of which suffices for our purpose:

      f op operand⊢argument
      f op operand⊣argument

Boolean equivalents

We learn that there are ten scalar dyadic boolean functions in APL: < ≤ = ≥ > ≠ ∨ ∧ ⍱ ⍲. Each one maps each of the four pairs (0 0)(0 1)(1 0)(1 1) onto one of the pair 0 1 in a different way. So there should be exactly sixteen such functions; there are six others missing from APL. And given that there is that mapping it must be possible to enumerate them in a standard way.

Applying r ← ,0 1 ∘.f 0 1 gives us the four results. Applying 2⊥r gives us a number between 0 and 15. Here they are put into the order dictated by that result:

   .---.--------------.---------.----.
   | ∧ | ,0 1 ∘.∧ 0 1 | 0 0 0 1 |  1 |
   | > | ,0 1 ∘.> 0 1 | 0 0 1 0 |  2 |
   | < | ,0 1 ∘.< 0 1 | 0 1 0 0 |  4 |
   | ≠ | ,0 1 ∘.≠ 0 1 | 0 1 1 0 |  6 |
   | ∨ | ,0 1 ∘.∨ 0 1 | 0 1 1 1 |  7 |
   | ⍱ | ,0 1 ∘.⍱ 0 1 | 1 0 0 0 |  8 |
   | = | ,0 1 ∘.= 0 1 | 1 0 0 1 |  9 |
   | ≥ | ,0 1 ∘.≥ 0 1 | 1 0 1 1 | 11 |
   | ≤ | ,0 1 ∘.≤ 0 1 | 1 1 0 1 | 13 |
   | ⍲ | ,0 1 ∘.⍲ 0 1 | 1 1 1 0 | 14 |
   '---'--------------'---------'----'

Alternatively we can define the operator or adverb:

      fnum←{2⊥,0 1 ∘.⍺⍺ 0 1}
      ∧fnum 0
1
      ∨fnum 0
7
      ...

The six missing would be scalar pervasive equivalents of: 0:{0}; 3: ⊣; 5: ⊢; 10: {~⍵}; 12: {~⍺}; and 15: {1}.

Given so many boolean dyads it seems odd that much of the time only two of them get used. We’ve all seen such as:

      :If p∨~q ⋄ ...
      :If (~p)∧q ⋄ ...

Applying fnum to each of these:

      {⍺∨~⍵}fnum''
11
      {(~⍺)∧⍵}fnum''
4

we find that {⍺∨~⍵} is function 11, ≥ while {(~⍺)∧⍵} is function 4, < so why don’t we see:

      :If p≥q ⋄ ...
      :If p<q ⋄ ...

for boolean p and q? I’ve been told that they’re unfamiliar in this context. But so was most of APL to each one of us when we started. Nevertheless many such equivalences exist. And the reason we don’t automatically associate them with the logical domain is because of their common names.

Most people refer to ≥ as greater than or equal and some as not less than; just two obvious ways to say the same thing. Very few if any call it or not or implied by but if the comparands are propositions or logicals with boolean values 1 or 0 then it makes complete sense.

Below are some of the more common boolean equivalents many of which I’ve seen in use. This is not just a matter of terseness versus verbosity. If p and q are large then applying two, three or even four functions to them rather than one is significant.

   .-------------.--------------.----.---.-----.
   | expression  | D-fn         |fnum| f | exp |
   |-------------+--------------+----+---+-----|
   | ~p∧q        | {~⍺∧⍵}       | 14 | ⍲ | p⍲q |
   | ~p∨q        | {~⍺∨⍵}       | 8  | ⍱ | p⍱q |
   | p∧~q        | {⍺∧~⍵}       | 2  | > | p>q |
   | p∨~q        | {⍺∨~⍵}       | 11 | ≥ | p≥q |
   | ~p∧~q       | {~⍺∧~⍵}      | 13 | ≤ | p≤q |
   | ~p∨~q       | {~⍺∨~⍵}      | 4  | < | p<q |
   | (~p)∧q      | {(~⍺)∧⍵}     | 4  | < | p<q |
   | (~p)∨q      | {(~⍺)∨⍵}     | 13 | ≤ | p≤q |
   | (~p)∧~q     | {(~⍺)∧~⍵}    | 8  | ⍱ | p⍱q |
   | (~p)∨~q     | {(~⍺)∨~⍵}    | 14 | ⍲ | p⍲q |
   | ~(~p)∧~q    | {~(~⍺)∧~⍵}   | 7  | ∨ | p∨q |
   | ~(~p)∨~q    | {~(~⍺)∨~⍵}   | 1  | ∧ | p∧q |
   | (p∨q)∧~p∧q  | {(⍺∨⍵)∧~⍺∧⍵} | 6  | ≠ | p≠q |
   | (p∧q)∨~p∨q  | {(⍺∧⍵)∨~⍺∨⍵} | 9  | = | p=q |
   '-------------'--------------'----'---'-----'

Iteration

It’s often necessary to be able to repeat a function sequentially with one argument different but each time using the result of a previous iteration as the other argument. Say we have prototype p and vector v of items to be applied in order a b c d e. Depending which way round our function requires its arguments this will be:

      ((((p flr a)flr b)flr c)flr d)flr e

or:

      e frl(d frl(c frl(b frl(a frl p))))

where: frl ←→ flr⍨

Given the likely variable length of our vector most will code this as:

      :For i :In v
            p←p flr i   ⍝ or: p←i frl p
      :End

but look again at the expression with repeated frl. Not everyone will recognise it immediately but this is a precise definition of reduction:

      ⊃frl/e d c b a p

The right argument of the reduction is merely v reversed and with p stuck on the end:

      ⊃frl/(⌽v),⊂p

Applying a monad to one of a pair

Given a 2-vector it’s required occasionally to keep the first item intact while running a function between the two preferably without having to prise the two apart beforehand. The description might not immediately suggest scan but that’s exactly what it requires:

      f\x y ←→ x(x f y)
      f\two ←→ (0⌷two),f/two

At least as often I find that what I really want to do is to keep the first while applying a monad to the second. All D-fns are ambivalent. But in this case I want the monadic definition of the function to apply. If the function ignores its left argument then the above is all we need:

      {2×⍵}\two ←→ (0⌷two),{2×⍵}/two
                ←→ (0⌷two),{2×⍵}1⌷two ⍝ for a de facto monad

But if our function is properly ambivalent and works differently in the two cases this will always use the dyad. We need a way to force an ambivalent function to act as a de facto monad. Composing it with right ⊢ as ⊢∘{…} pushes the left argument out to become that of ⊢ which it will ignore. Our function never sees the argument at all:

      minus←{⍺←⊢ ⋄ ⍺-⍵}
      minus 5           ⍝ monad: negate
¯5
      8 minus 5         ⍝ dyad: subtract
3
      8 ⊢∘minus 5       ⍝ de facto monad: negate
¯5
      minus\8 5
8 3
      ⊢∘minus\8 5
8 ¯5

Afterword

We all have our own favourite tricks and tips. The first version of this was a preamble to a presentation I gave at Dyalog’11 in Boston and this lengthier version is designed to accompany a re-presentation of that paper that appears in the same edition of Vector as this. I hope it’s been of some use.

References

http://dyalog.com/download/dfns.pdf

Current issue

Volumes