Kalgol resurrection — status chez DHo

There is some more recent info here.

Strategy

The hope is that we can preserve the source code of Kidsgrove Algol (Kalgol) so that it can be browsed on-line and executed so as to compile Algol60 programs. The Kalgol.compiler operated within a magnetic tape based filing system called POST. The whole system ran as a series of overlays called “bricks” each of which occupied store from word 1600 upwards. The first 1600 words of store held a permanently resident module (KAB00) which supplied facilities to the overlays via a subroutine call to word 70 (JSE70; in Usercode). The API for this interface was in terms of “channels” which were I/O streams implemented on magnetic tapes.

We do not have quite all the bricks, but have had considerable success in reverse engineering missing material. So we are not confident that we shall be able to get it to work sufficiently to demonstrate compilation and execution.

Current State − June 2017

You can run the compiler on your own machine.

Download the following files:
        systape.txt, mksys2.bin, mkchan.c, KAB00.bin
These are the only files specific to the kAlgol compiler, but you also need a KDF9 emulator, and a Usercode assembler.

The emulator is in a single file of C source:
        kdf9.c
The assembler is a yacc program, but the yacc output files are available on-line, so you need to download:
        kal3.c, y.tab.c, y.tab.h.

If you want to use yacc yourself this is the input file: kal3.y.

There is a file containing test programs: test.a60. The compiler runs the first program in the file. Re-ordering the file enables others to be run. The running of the compiler has been packaged in: testpt.sh and testpt.bat.

You also need a system tape (emulated in a file) which is generated by the following commands:
        kdf9 -d mksys2.bin systape.txt
        mv -i tape4.mt tape9.mt


When you have copiled your program, the binary program will be in a file called mt.out. It can be run by the command:
        kdf9 -d mt.out
If you wish to read data from paper tape on stream 20, the name of the data file is an extra parameter. Only streams 20 and 30 work, and the only routines are open, close, read, writetext, and output.

Current State − January 2017

We have discovered a hand-written version of brick 20. Obviously it is an early version, and will have imperfections. Nonetheless, David Huxtable has been able to coax it into life, and we plan to develop it into a viable brick 20. David Holdsworth&rsquo's efforts to write brick 20 in C have been put on hold.

Quite a few test programs compile without errors, and we have made a start on producing a runtime system. The Algol60 standard functions and the KDF9 I/O routines were incorporated from library system. We have an interim implementation such that a program that starts:
        !begin !library A0;
will include the contents of a file called A0.a60 in the current directory. The layout has to be exactly as above. There is a new brick 83 written in C which takes the generated code from channel 7 and Produces a file that can be assembled with kal3.

I think that it is time to produce a demo ZIP file.

State as at October 2016

We have recovered loads of documention and put digitised images on-line here.

We now have a re-engineered brick 20 which allows us to successfully compile programs without the optimiser. We get the generated Usercode. When we try the optimiser we get an error report 1/41, which is not yet understood.

The POST system is started by running KAB00, which immediately calls KAB80 alongside a comment “CALL SETUP BRICK”. KAB80 is among the absentees. Also absent is brick 20 KAB20, which is documented as procedure classification, and brick 84, the Usercode compiler. There is also a reference in the code to brick 10, but this may be a misread awaiting our copy-typing.

We haver dealt with the absence of brick 80, the setup brick, by implementing the JSE70 API using files for the channels. This is surely much simpler than the original KAB00 which was mapping the channels onto magnetic tapes, the number of which could be different on different runs. It was some time before we found documentation of this API, but by examining the code and running KAB01 with a KAB00 replacement containing diagnostic facilities, we started learn its secrets. Since then we have the documentation, and are quietly confident that our implementation is faithful.

Our strategy for dealing with the absence of brick 84, the Usercode compiler, was originally to feed the generated Usercode into our existing assembler kal3.c, which is a modern implementation written in C and yacc. However, we may well try to modify the paper tape compiler to make a KAB84 substitute.

It is the absence of brick 20 that still gives most cause for concern, but we now have a brick 20 replacement which is proving sufficient for non-optimised compilations.

It remains to build a run-time system, which we plan to achieve by using code from the Whetstone system.

Example showing library contents -- i.e. runtime code.

Digitisation

The source code has survived as a printer listing which we have as scanned PDF images, which can be found at:
        http://sw.ccs.bcs.org/kalgol.src/index.html

We have had a partially successful OCR exercise done by Terry Froggatt, which has then been proof read to produce the source code with which we are experimenting to explore the APIs for missing bits. From this exercise, we have compiled a list of common misreads, which is used to the a correction to the raw OCR output. We originally had a programme of copy-typing from the PDF images and consolidating with the proof-read OCR source code. So far this has only detected one error, but the behaviour of the code would suggest that there are more errors to discover. However, the effort has rather fizzled out, as it rather looks as though the combination of OCR, editing of common mis-reads, and straight proof reading works quite well. From time to time we discover that an execution failure is to do a typo.

Execution

Nothing encourages a software preservation exercise more than seeing the code beginning to work. Even our unverified code has had some considerable success, both as exploring the API and finding OCR read errors.

We have now succeeded in running a trivial (very trivial) program that has been compiled.

Browsing the Source Code

The listings from assembling the various bricks are here:
        KAB00     KAB01     KAB02     KAB20     KAB22     KAB24     KAB40     KAB41     KAB45     KAB60     KAB99
Note: KAB20 is our emerging re-engineered replacement for the missing brick.

Documentation

Here is stuff from Blythe House. The latter part defines JSE70. Here is a list of the KDF9 Algol Basic Symbol code.

Kalgol error numbers for bricks 01, 02, 20 and 22:
          01 300 to 01 315      01 316 to 01 402      01 403 to 01 406      all brick 02 failures
          01/20 to 15/20      16/20 to 33/20      34/20 to 50/20      51/20 to 82/20           01/22 to 20/22      21/22 to 82/22

Library tape formats:
          page 22-2-0      page 22-3-0      page 22-4-0      page 22-5-0

Flow diagram of Kalgol bricks
          page F-1-0      page F-2-0      from the Algol Users Manual.

Also from the Algol Users Manual we have the pages describing those parts of the runtime system that can be used by code procedures:
          page 2-5-0  Start reading at §2.1.8.
          page 2-6-0      page 2-7-0      page 2-8-0      page 2-9-0      page 2-10-0      page 2-11-0      page 2-12-0      page 2-13-0     

Here is the Service Routine Library Manual. There is a full list of contents and some of pages have been scanned. My personal holding of KDF9 documentation is in this picture plus an Algo60 mini-manual.

Usercode digression

As a by product, our searches within the Science Museum revealed a listing of the Usercode Paper Tape Compiler (KAA01). Not only is this interesting in itself, but it may also form the basis of a replacement for the missing KAB84. However, incompleteness seems to be a curse. There is one short section of the code missing, This is explained at the head of this test program:
          little.txt A program to check the correct translation of Usercode
This test program avoids using any features implemented in the missing section, and can be successfully translated and run using the current version of KAA01:
          usercode2.txt
You need to translate usercode2.txt:
          kal3 usercode2.txt > usercode2.lst
The resulting binary program in mt.out can then be run with my emulator:
          kdf9 -d -B101 mt.out little.txt
The output appears in a file called punch.txt. The -B101 emulates the effect of TINT;B101.→ immediately after loading the program. For this you need the latest version of the emulator from:
          http://kdf9.settle.dtdns.net/KDF9/kdf9.c
There is a post-processed version of the listing which links to the original images in:
          usercode2.htm

Back to Kalgol

The copy-typing of the code has dwindled somewhat, but efforts continue to execute the versions that we have. It is pretty well certain that they will still have typographical errors in them, but results are nontheless encouraging. Brian Wichmann is manfully typing KAB01 and has found several typos in comments, and a real one of substance: OCR version J51P715 correct version JS1P715.

We have collected a little bunch of mostly syntactically valid, but mostly semantically non-sensical programs in:
        http://kdf9.settle.dtdns.net/KDF9/kalgol/DavidHo/test.a60

All the programs get through brick 02. The only invalid example has a call to a procedure that has not been declared, and the error is correctly reported on the printer channel, before FAIL 00/02 (which is also what should happen). The syntactically valid ones then enter our surrogate brick 20 which does almost nothing, so it is hardly surprising that we soon come to grief in brick 22, usually reporting FAIL 17/22.

It must be stressed that we do not have copy-typed confirmation of all of the code. We have corrected some typos that have been found rather by chance because the code just looked wrong.

I have changed the way that I do comparisons, and abandoned use of the Good sub-directory. Seen as we have David Hu’s proof read version of the entire text, I have written software so that I can do the comparisons directly between a copy-typed page and the amalgamated source text, which I then correct and upload afresh.

Downloads of files to reproduce the above

You can reproduce the results reported above on your own machine. The software is all run via the command line and works on a wide variety of systems. You need several downloads (right mouse click and select save link as):

There are 6 Usercode files:
          KABDH.k3 My substitute for the real KAB00
          KAB95.k3
My substitute for the real KAB95
          KAB20.k3
My substitute for the real KAB20
          KAB01.txt
The real genuine KAB01
          http://kdf9.settle.dtdns.net/KDF9/kalgol/DavidHu/KAB02.txt
The real genuine KAB02
          http://kdf9.settle.dtdns.net/KDF9/kalgol/DavidHu/KAB22.txt
The real genuine KAB22

There are 2 C programs (emulator and assembler) involving the following files:
          http://kdf9.settle.dtdns.net/KDF9/kdf9.c The emulator for KDF9
          http://kdf9.settle.dtdns.net/KDF9/kal3.c
The assembler for KDF9 Usercode, which also needs the next two
          http://kdf9.settle.dtdns.net/KDF9/y.tab.c
The yacc output from kal3s.y
          http://kdf9.settle.dtdns.net/KDF9/y.tab.h


There is a mystery input file which is read on channel 0 and consists of little more than end message
          chan0.mt

The Algol60 source text needs to be prepared as in:
          test.a60
This can then be turned into channel 6 input by:
          mkchan test.a60 chan6.mt
The mkchan.c program is here:
          mkchan.c
mkchan only reads up to the line of stars in test.a60, so it is easy to edit test.a60 in order to try different test programs, and to keep previous tests.

Here is a list of the KDF9 Algol Basic Symbol code as used for input to mkchan. In some cases alternate representations are shown. Compound symbols are shown prefixed by exclamation mark, which was the convention used on Eldon2 teletypes at Leeds University and NPL.

Compilations, etc

You will need to compile the 3 C programs, e.g.:
          gcc -g -o kdf9 kdf9.c
          gcc -g -o kal3 kal3.c y.tab.c
          gcc -g -o mkchan mkchan.c


You will need to assemble the 6 Usercode programs, e.g.:
          kal3 KABDH.k3 > lp.txt
          move mt.out KABDH.bin
          kal3 KAB95.k3 >> lp.txt
          move mt.out KAB95.bin
          kal3 KAB20.k3 >> lp.txt
          move mt.out KAB20.bin
          kal3 KAB01.txt >> lp.txt
          move mt.out KAB01.bin
          kal3 KAB02.txt >> lp.txt
          move mt.out KAB02.bin
          kal3 KAB22.txt >> lp.txt
          move mt.out KAB22.bin

Running the emulator

First take the first source text from test.a60 and make the POST format input file:
          mkchan test.a60 chan6.mt
This next command should work as described above:
          kdf9 -v0 -d -t3111/0 -a999999 KABDH.bin
This will run with minimal diagnostics (-v0) and then turn on maximum diagnostics when executions reaches the address 3111/0, which happens on entry to brick 20, so bricks 01 and 02 run without diagnostic output.
More detailed instructions for using kdf9 are at the head of the source code.

Background

In October we received news of the discovery of a listing of the Kalgol compiler in Colorado. This has been scanned and processed by a custom OCR program written by Terry Froggatt.

It was a disappointment that Terry’s OCR program for lineprinter output was not accurate enough to eliminate copy-typing, but it was sufficiently successful to combine with one copy-typed copy, thus eliminating half the drudgery.

A further disappointment comes from the discovery that we do not have all of the software environment within which the compiler operated, and it would appear that one of the overlaid sections (known as “bricks”) is missing.

We have processed the OCR output by collecting a list of common mis-reads and then automatically substituting the correct code. We then produce one copy by proof-reading this and putting it through our Usercode assembler. David Huxtable (one of Kalgol’s authors) has been invaluable, both with recollections and with processing the OCRed code. Volunteers are also copy-typing pages which are then compared with the proof-read version.

The source code contains a couple of features of Usercode that do not appear in the manual — Z-stores and H-stores. We think that we have dealt with these correctly (adequately), but time will tell.

The compiler consists of a series of passes known as bricks which are overlaid above a piece of resident code that contains routines for manipulating the magnetic tapes upon one of which the Algol source code was to be found. The other tapes were used as workspace, and the resulting binary program was output to tape. Although we have the code of this permanently resident code (brick 00, or KAB00), we do not have the overlay that initialises its operation (KAB80). There are 147 comments in 1257 lines of code, and in KDF9 Usercode you typically have 3 or 4 instructions on a line. The API to KAB00 is in terms of a number of channels, and we are experimenting with reverse engineering this API by attempting to run brick 01 (KAB01), and producing minimal implementatons of API calls as they arise.

Perhaps the biggest disappointment is the omission of one brick from the set, brick 20, which does procedure classification and computes the call matrix. It may just be possible to implement a substitute. Otherwise we shall only be able to run the first two passes.

Invalid Algol − an early success

Here is an early attempt at reverse engineering the missing I/O utility code. Brick 01 (KAB01) reads in a trivial invalid Algol source text, and correctly reports the failure.
          begin integer i; end end
          &rarr

The spaces were not actually in the input stream of Algol Basic Symbols (ABSs).
F:\dh\KDF9\kalgol>..\kdf9 -d -v7 -a9999 KABDH.bin
*** JSE70;  3174/3  0 cells N1=121003 ic=188 Open channel 3 (double)
*** JSE70;  3176/4  0 cells N1=121021 ic=201 Open channel 17 (double)
*** JSE70;  5763/5  1 cells N1=121105 ic=228 Open channel 5 (double)
*** JSE70;  6264/0  1 cells N1=121107 ic=262 Open channel 7 (double)
*** JSE70;  3203/4  1 cells N1=   106 ic=283 Ch. 6 open etc.
*** JSE70;  6264/0  1 cells N1=160007 ic=973 Write to channel 7
*** JSE70;  3314/0  1 cells N1=157400 ic=1017 Find(?) channel 0
*** JSE70;  4721/1  2 cells N1=121100 ic=1042 Open channel 0 (double)
*** JSE70;  4762/1  2 cells N1=160000 ic=1069 Read from channel 0
*** JSE70;  6264/0  2 cells N1=160007 ic=1773 Write to channel 7
*** JSE70;  4757/2  2 cells N1= 60000 ic=1806 Close ch. 0 (paper tape?)
*** JSE70;  4721/1  2 cells N1=121106 ic=1831 Open channel 6 (double)
*** JSE70;  4762/1  3 cells N1=160006 ic=1932 Read from channel 6
*** JSE70;  6043/1  4 cells N1=134321 ic=3847 Fail 01313 Line 00001 ??i;??(*)000000
*** JSE70;  3343/5  4 cells N1=   200 ic=3865 FAIL 0/1
*** JSE70;     0/0  2 cells N1=157010 ic=3886 KAB95.bin  Brick change 3125
Instruction 136 MBSKQq not yet implemented
FAILS Unimplemented instruction
LINK 03103/5
So I went to my Algol Users Manual and looked up Fail 01313 and it said:
          too many ends in Algol program
My then current implementation of monitor output only understood ABSs that can be represented as a single ASCII character. Consequently only the letter i and the semi-colon do not appear as question marks and the source text appears as: ??i;??