At Play With J
Token Counting: APL versus J
One measure of the difference between APL\360 and J is the number of tokens needed in a function to do the same work. I’ll discuss two examples, comparing well-written, but rather old, APL solutions, to well-written J solutions.
Direct from Atomic
The first APL example is a marvellous function written by Luther Woodrum, that appears on page B.11 of the APL\360 Users Manual of 1968. Luther is best known to me by his design and implementation of the original upgrade and downgrade. His function PERM, below, is given a left argument A, the length of the permutation to be constructed, and a right argument B, the anagram index of the permutation to be constructed. It uses an algorithm first discussed, I believe, by D. H. Lehmer of the University of California in Berkeley. Here it is:
∇ Z←A PERM B;I;Y
[1] I←⍴Z←1+(⌽⍳A)⊤B-1
[2] →0×⍳0=I←I-1
[3] Z[Y]←Z[Y]+Z[I]≤Z[Y←I+⍳A-1]
[4] →2
∇
Disregarding the header, footer and line numbers, this has 55 tokens. What does it do? Lets suppose that A is 9 and N is 288918. Then the result Z is
7 1 3 2 6 4 0 5 8
and this is the 288918th permutation of order 9.
Here I should tell you that much of the material that follows is taken from my At Play With J article in Vector 12.1 (July 1995). This is not surprising, since it deals with the subject of Luthers function.
The first step is to make a function that gives the factorial digits of permutations of length A. Luthers function uses origin 1, and that obfuscates things, so I’ll use the more suitable 0-origin indigenous to J. To give you some idea of what the factorial digits number system is like, here are the six factorial digits in the system for three:
fdb =: >: @ i. @ - (fdb 3) #: i. ! 3 0 0 0 0 1 0 1 0 0 1 1 0 2 0 0 2 1 0
You can see the regularity in the rows. Notice also that every row ends in zero. This is true for all factorial digits systems.
In PERM, line 1 gives Z the factorial digits value for B.
fdb 9 NB. fdb n yields the radix digits of order 9 9 8 7 6 5 4 3 2 1 (fdb 9) #: 288918 7 1 2 1 3 1 0 0 0
Lines 2 and 4 control the executions of line 3, so that line 3 is executed only as long as I is positive. Line 3 can be defined as function g :
g =: [ , ] + ] <: [ NB. left , right + right >: left
Here Ill have to pause, and to point out that what Im doing with g is taking the clutter out of PERM line 3. What weve done so far reduces line 3 from 26 tokens to 7, yet it does precisely what line 3 does. Perhaps if I show the successive uses of g youll get the idea:
0 0 1 0 1 2 1 0 2 3 3 1 0 2 4 1 4 2 0 3 5 2 1 5 3 0 4 6 1 3 2 6 4 0 5 7 7 1 3 2 6 4 0 5 8
Successive lines are formed this way: for example, given line
1 4 2 0 3 5
the next line is formed by beginning with the corresponding factorial digit, in this case 2, and following with the previous line in which each item greater than or equal to 2 has 1 added to it.
2 1 5 3 0 4 6
Notice that each line is a permutation.
The function g is made of three forks:
g +-+-+--------------+ |[|,|+-+-+--------+| | | ||]|+|+-+--+-+|| | | || | ||]|>:|[||| | | || | |+-+--+-+|| | | |+-+-+--------+| +-+-+--------------+
The three forks are:
fz =: ] >: [ fy =: ] + fz fx =: [ , fy
Heres a J function, a functional duplicate of PERM:
sr NB. standard form from reduced /:@/:@,/ ra NB. reduced form from atomic ([: fdb [) #: ] sra =: sr@ra f. NB. standard from atomic sra /:@/:@,/@(([: >:@i.@- [) #: ])
;: sra NB. tokens of sra +--+-+--+-+-+-+-+-+-+--+--+-+--+-+-+-+-+--+-+-+ |/:|@|/:|@|,|/|@|(|(|[:|>:|@|i.|@|-|[|)|#:|]|)| +--+-+--+-+-+-+-+-+-+--+--+-+--+-+-+-+-+--+-+-+
# ;: 5!:5 <'sra' NB. count of tokens in sra 20
The function sr uses an identity I found March 9, 1970, when I was looking at Luthers PERM once more:
N =: 7 P =: 1 3 2 6 4 0 5 7 (/:/:N,P) -: (N,P+N<:P) NB. double upgrade matches addition 1 N,P+N<:P 7 1 3 2 6 4 0 5 8 /:/:N,P 7 1 3 2 6 4 0 5 8
So double-upgrade can take the place of Luthers line 3, and sra squeezes PERM down from 55 to 20 tokens.
Pyramigram
In APL Quote Quad 11.1, September, 1980, I asked for solutions to a problem posed by Linda Alvord, of Scotch Plains Fanwood High School in Scotch Plains, New Jersey. Here it is:
Write an APL function PG that takes a scalar integer argument from 1 to 26 and produces a rectangular character matrix containing a pattern like this:
PG 5 Q W Q Q E W R Q W E Q E R T W
In each row r there are r randomly selected and randomly ordered letters, separated by single spaces, arranged to form an equilateral triangle. The (n-1) letters in row n-1 are selected from the n letters in row n.
This was one of the most popular problems Id ever given, and there were a wide variety of solutions, including ones by some fairly gifted programmers, but one stood out from the rest, from Roger Hui. It took me several hundred words to describe what his function did. Roger recently told me that he had written his function without having access to an APL implementation. His function PG was not written in the conventional way that a function was defined in APL\360. Instead, it uses the alpha-omega form introduced by Ken Iverson, in which the left and right arguments are denoted by ⍺ and ⍵. Here is a J function which uses the same algorithm as his from 1980:
PG =: ([:i.[:-])(|."0 1)1j1"_ #"0 1(([:/:(([:-/\]- ~[:|.[:i.[:+:])#([:i.[:+:]) *[:*:])+[:?~[:*:]){[:,(],]){.(('ABCDEFGHIJKLMNOPQRSTUVWXYZ'"_{~]?26"_ ) {.~[:-.[:+:])$~],[:+:])$~],]
I wont dwell on this version other than to say that (1) it has 103 tokens and (2) it is in tacit form. The details are discussed in the cited issue of APL Quote Quad. It is a truth universally acknowledged that a good programming language, worked over and pondered over for a sufficiently long time by the same people who had produced the original, may very well show advantages over the original. Thus I sent an email to the J Forum list, and messages to key people in the APL community, for new solutions to the problem. I made it clear that the degree of improvement in expressiveness, as measured by token count, would be the criterion used. I received new solutions from the J and the APL communities. The shortest token count among the numerous APL solutions was 30, and there were several that used up to 60 tokens. The shortest J solution, by Roger Hui, was 20 tokens long, so Ill discuss that one only. This is it:
h =: /: # ? # pyr =: i. &. - @ # |. " _1 [: 1j1 & # @ h \ h
His h has 4 tokens, and pyr has 16, totalling 20.
The subfunction h is a hook. It randomizes its argument. Its first function is /: and its second function is the fork # ? # .
h 'qwert' ertqw
The function pyr’s structure may best be seen using box display:
+---------------+---------+--------------------------+ |+---------+-+-+|+--+-+--+|+--+-------------------+-+| ||+--+--+-+|@|#||||.|"|_1|||[:|+---------------+-+|h|| |||i.|&.|-|| | ||+--+-+--+|| ||+---------+-+-+|\|| || ||+--+--+-+| | || || |||+---+-+-+|@|h|| || || |+---------+-+-+| || ||||1j1|&|#|| | || || || | | || |||+---+-+-+| | || || || | | || ||+---------+-+-+| || || | | || |+---------------+-+| || | | |+--+-------------------+-+| +---------------+---------+--------------------------+ A B C
The three outer boxes, A, B, and C, tell us that we have a fork. Box C yields the object to be rotated, and is the most interesting part. It produces a right-triangle version of the equilateral result required.
Its copy function # is a dyad. By bonding ( & ) it on the left with 1j1 we create a monad that makes one copy of its argument item and follows this with one fill item. A further function is made by using atop ( @ ) between the aforesaid monad and h. One further step is to apply the prefix adverb ( \ ) to this combined function to yield 1j1 & # @ h \ . One last step supplies cap ( [: ) to the left of this and h to the right, and we have a function that randomizes the argument qwert, then randomizes each prefix, and provides a space after each item of the prefix, like this:
([: 1j1 & # @ h \ h) 'qwert' r r t t q r e r t q w q e r t
Rotation (box B) uses reversal, item rank (|. “ _1) so that scalar items from the left argument (box A) rotate list arguments from the right argument (box C).
Box A creates the left argument. It makes good use of negative arguments to i. and of dual. Ordinarily i. – y yields a descending list of positive integers, but we want to do right rotations, which require a negative value. Thats the reason for using dual minus. (&. - ).
i. _5 4 3 2 1 0 (i. &. -) 5 _4 _3 _2 _1 0
Thus, the whole result is given by 20 tokens:
(i. &. - @ # |. " _1 [: 1j1 & # @ h \ h)'qwert' e r e r e t e q r t r w q t e
From 103 tokens in 1980 to 20 in 2005, and by the same author, is a huge reduction.
[12 Jan 2006]