Current issue

Vol.26 No.4

Vol.26 No.4


© 1984-2024
British APL Association
All rights reserved.

Archive articles posted online on request: ask the archivist.


Volume 11, No.4

Jot-Dot-Min: an outer-product of rock-bottom APL matters

by Ian Clark, .

Robomaniacs of the Seventies used to delight in computerising out the Man-in-the-Loop (meaning the Wiener feedback loop) as the most unreliable component. This led to a proliferation of soul-destroying jobs where the Man was only there because the programmer couldn’t figure out how to do without him. Many weapons systems had a man left in them solely for the purpose of having someone to court-martial if it went off by mistake. The celebrated cybernetician Stafford Beer uttered a warning that the progress of computer systems, far from increasing wealth, would rob vast numbers of the possibility of gainful employment. Perhaps even a soul-destroying job is better than no job at all.

Then the Human-Centred movement arose and turned the computing profession on its head. Far from being a weak link in a feedback loop, a sort of liveware fuse, the human in the system (the campaign for inclusive language was underway too) was its only reason for existence. Indeed all computers were only person-to-person information pumps and a computer system nothing but a communications medium or a personal amplifier. The US Federal Communications Commission (FCC) permitted Ma Bell to compete with Big Blue, thus wiping out the legally enforced distinction between data processing and electronic communication, ARPA fostered what became the Internet, and Apple promoted truly interactive computers for the rest-of-us.

What happened? A generation of users grew up oblivious of job control languages, except in the twilight world of DOS batch language (the language never even got a proper name and was deliberately crippled by a lack of numerical variables or any sort of arithmetic). The soul-destroying job re-emerged, as personal computer users struggled to earn their crust by endlessly repeating the same few menu selections and keystrokes again and again. People rediscovered the first principle of computer use: if you ever succeed in doing anything useful with a computer, you’re going to want to do it again and again, but with variations here and there, which you hope you can anticipate. Isn’t that the whole of programming?

After trying to bolt-on interactive event recorders like MacroMaker in a vain attempt to record and replay useful sequences, Apple has just reinvented JCL (Job Control language) in the form of AppleScript, and appeals for all new applications to be made ‘scriptable’, i.e. programmable by a series of lines of text in a ‘script’ (read: batch file) as well as executable by screen interactions. Windows, the great me-too, will go the same way in due course. Ah, but scripts can be generated by user interaction, they say. Yes, but nobody writes them that way. They’re all writing programs again. And in a language that isn’t even as advanced as BASIC.

In spite of Adrian Smith’s article on control structures in VECTOR 11.1, which pondered the ins and outs of loops very nicely, perhaps it’s time for a few musings on the subject of Branch (). Since we’re all beginners together, we won’t ask whether you should be using Branch at all, but consider some of the funny things people do with it. And one of these funny things is to write loops.

Fifty years of program language development (far, far longer if you go back to Babbage and Countess Lovelace) and humans are still writing loops! It’s like the Buddhist Wheel of Becoming. Santayana – he say: one who forgets the past is condemned to relive it.

One of the joys of APL programming is to replace loops with bit-strings, which you put together by Boolean operations. It’s just like rubber-stamping a pattern instead of trying to knit it. Novices who don’t grasp that, erstwhile Fortranners and Cobollers who make straight for the right-arrow and sigh – ah! intelligent life! – are stuck in a Santayanan loop. Here’s how you write a loop if you’re a C programmer:

for(i=0; i<10; i++) { 

and you write it again and again until you can write it in your sleep (and generally do). Because if there’s one thing that C programs do to the exclusion of nearly all else, it’s step through a stream of characters, ‘filtering’ them. C arose out of the efforts to implement UNIX, and UNIX reduces the whole world to streams of characters which get piped through filters. So C is a language with a lot of plumbing in it.

It loops ten times, of course, stepping i from 0 to 9. The widget: for( ; ; ) has three slots separated by semicolons, these say what to do at the start of the first loop, the condition for another loop, and what to do at the end of each loop. If you simply must copy this loop in APL, then APL lets you (because it’s very forgiving), and the right-arrow looks comforting. “Here, I’ll show you the way”, it seems to say. And the way is? – back where you’ve come from. If you’re lucky.

Here’s the for-loop of C done into APL:

 DoSomethingWith i 

Looks uglier than it does in C, doesn’t it? In both cases, the code inside the loop can be imagined to do something like: pass forward the i’th character if it’s not blank. That’s how you filter out blanks the C way.

Instead of knitting the output, character by character, the APL way is to construct a bit-string with one bit for each character in the input string (let’s call it S), a 0 bit if it’s blank, a 1 bit otherwise. The 0 bits mark the characters we want to filter out.

S≠' '

Put it into B like this: B←(S≠' ')

Then the following is S with all the blanks squeezed out: B/S

You can put it all together in one expression: SS←(S≠' ')/S

which avoids introducing B as a receptacle for the bit-string, using it immediately it is formed.

Yes, but how far can you stretch that principle? You can do some pretty fancy tests inside a loop, to decide whether you want to pass forward the character or not. Surely the bit-string approach only works for simple things?

No. The only limit to what you can do with bit-strings is your imagination (or your command of Boolean algebra, but you need that in C too). Suppose we want to throw out, not just blanks, but periods too.

B marks the blanks with zero: B←(S≠' ')

C marks the periods with zero: C←(S≠'.')

so B∨C (read: B or C) is the bit-string we need. Or putting it all together as before: SS←((S≠' ')∨(S≠'.'))/S

In practice you’d use (~S∊' .') instead of (S≠' ')∨(S≠'.') and you could put further characters you might want to expel between the quotes. But it serves to illustrate the principle. And if you’d like to throw out periods only if they’re followed by blanks, then just shift B to the left and combine it with C: (1↓B,1)∨C which flags only those periods for which there is a blank to the right of it. All together, now:

SS←((1↓(S≠' '),1)∨(S≠'.'))/S

You can do all your filtering this way. It would make a nice car-sticker:

“APL-ers Do It With Bit-Strings”.

To quote David Diamond:

...alas for me:
My programs map to binary.
No other reason need be sought
Why half my work comes out to naught.

As a beginner I used to feel edgy about composing the whole history, in advance, of what I expected a loop to do. Wasn’t it inefficient? After all, a loop only creates the truth values it needs for the current cycle. However, APL bit-strings are efficient. Very efficient. A typical C compiler will reserve a whole integer (2 bytes, or 16 bits) for a Boolean value, even in an array, but APL literally stores bits in bits. So a 16-bit string would only occupy 2 bytes for its actual data. Of course there’s the workspace overhead for a variable, or even a transient value, but this is the same for a long bit-string as a short one. Furthermore, when you combine two bit-strings as in BŸC, the ‘or’ operation isn’t done a bit at a time but a word at a time. In 32-bit computers, that would combine 32 bits in a single cycle, the net result of 32 loops the C way.

The more advanced the computer, the more you can expect APL to start competing with C in filtering tasks. APL is about to come into its own with the advent of parallel computing.

Meanwhile there are occasions when you simply must have a loop. The commonest is the outer loop or duty cycle in a program which waits for user input, then acts on it. What in BASIC they used to call the ‘input-execute-output’ cycle. Instead of the C-faithful example above, you’re better off using a computed branch back to the start of the cycle.

 →LOOP IF carryOn

Computed branches are the mischief. There’s a hundred and one ways they can go wrong, and what every program needs like a hole in the head is a branch to goodness knows where. For that reason alone, the right-arrow should be handled like high explosive.

In this example, carryOn is supposed to be a scalar Boolean (a 0 or a 1), or a function (or expression) returning a scalar Boolean. Notice how we’ve used a dyadic function IF to make the branch more readable and provide some reassurance it isn’t going to blow up and go off on a wild branch. In I-APL you can define IF in either of two ways:

     ∇ Z←A IF B 
[1]  Z←B/A 

or: IF: ?/⍺

The use of IF is effectively the same as the commoner: →(carryOn)/LOOP because in the expression: LOOP IF carryOn takes the value: LOOP and ? takes the value: carryOn when IF starts to execute.

How does it work anyway? Well, 1/xxx is equivalent to xxx whatever xxx happens to be, char or num, vector or scalar. On the other hand, 0/xxx is always the empty vector. And right-arrow followed by the empty vector is a branch to nowhere – it does just nothing.

There’s more ways of collapsing a label to the empty vector than applying ‘/’ to it. You might come across: →(carryOn)↑LOOP which has much the same effect. You can also use a down-arrow to mean “branch if carryOn is false”, i.e. →(carryOn)↓LOOP behaves like: →(~carryOn)↑LOOP.

One of the dreams of programming is that whatever you write works first time. But take a loop like the first example we gave. Does it loop 9 times or 10? It could be a matter of life or death.

I heard about a US Army conscript who was put to work on programming. He inserted a line of code in a monthly run to print an analysis of his sergeant’s parentage the day after his demob date. Unfortunately he got his mixed up with his < and spent his last month in uniform behind bars. The moral is: if you want something to branch reliably, always use tried and tested code.

The following never lets me down:

10 repeatsOf 'HALLO'         where: 

     ∇n repeatsOf s 
[1]  ⍎s 
[2]  →×n←n-1 

but that’s because I never try to write down repeatsOf from memory, but always copy it from a workspace of my pet utilities. That’s the whole idea of reusable code. If it’s wrong, somebody’d know by now. Eat your heart out, Formal Methodology!

Speed Test: how does this work? Quickly now, quick-quick-quick...

[1]  →carryOn,DoingSomething 

That’s right. The Boolean expression carryOn is supposed to return a 1 or a 0. DoingSomething must return a numeric result, or else you have to split LoopyLoop into two lines. The catenation is a red herring, since Branch is only going to look at carryOn. So we have a Branch either to [1] (start over) or to [0] (quit). And now you know how repeatsOf works too.

That’s all just aversion therapy to make you love bitmaps. But using bitmaps for filtering really becomes interesting at the classroom level when you apply them to 2-dimensional arrays. How do you flag the neighbours of a random assortment of elements in a 2D array? How can you use the result to build a population model where individuals infect their neighbours at random with given probability? How can you spot a Dover sole underwater? (Both the sandy bottom and the sole are random bit-arrays.) All this in the next issue, plus the rudest (uncommented) APL program ever published – Conway’s Game of Life as a single-liner!

Of course, if anyone wants to submit solutions to these problems to the Editor, EV, in time for the next issue, feel free! I shall publish acknowledgements of the best solutions along with my own. Or instead of my own. And remember this – Binary Data is Fun Data, but for real laughs use a Branch!

(webpage generated: 18 October 2006, 03:53)

script began 12:05:05
caching off
debug mode off
cache time 3600 sec
indmtime not found in cache
cached index is fresh
recompiling index.xml
index compiled in 0.1869 secs
read index
read issues/index.xml
identified 26 volumes, 101 issues
array (
  'id' => '10013250',
regenerated static HTML
article source is 'HTML'
source file encoding is ''
read as 'Windows-1252'
URL: =>
URL: =>
completed in 0.2174 secs