KDF9 character sets

This description relates to paper tape and printer and the PROMPT file system, as used within the Time-Sharing Director and the Eldon2 operating system. The encoding on punched cards was different, and is not covered here.

We have preserved four language processing systems from KDF9, and all are available on-line: two Algol60 systems, Kidsgrove Algol (KALGOL) and Whetstone Algol (WALGOL), and two assembly-level languages, English Electric’s KDF9 Usercode and Leeds University’s KAL4.

This page describes a common scheme for producing input to each of these language systems, and for seeing their output as would have been seen on a real KDF9 system.

Algol Basic Symbols

Although KDF9 I/O devices used 6-bit characters, the systems which stored and manipulated files on magnetic tape (POST) and disc (PROMPT) used an 8-bit code representing Algol Basic Symbols.

The KDF9 Algol Basic Symbol code is listed in this web page, where the character form is shown as on a KDF9 Flexowriter. As well as the symbols listed in the Algol60 Revised Report §2, KDF9 Algol also includes encodings for * (asterisk), space, newline, tab, KDF9, ALGOL and EXIT. The last three are concerned with procedure bodies written in KDF9 Usercode. Asterisk is used as a space marker in strings. In the official Algol60 Revised Report the compound basic symbols are distinguished from normal letters by being set in bold type, e.g. begin in bold. On a KDF9 Flexowriter this was indicated by underlining, e.g. begin underlined, which was represented on the paper tape by an underline character before each of the five letters. The non-advancing underline was also used for creating ≥ and ≤ as underlined > and <.

Data read into Algol programs was also handled in terms of Algol Basic Symbols.

The Eldon2 operating system used a PROMPT compatible filestore, and operated in Algol Basic Symbols.

On input, some Algol Basic Symbols are accepted in more than one form.

Source code representation

The KALGOL compiler operated within POST or PROMPT and read the Algol source code in Algol Basic Symbols. This compiler generated Usercode which it output in Algol Basic Symbols.

The KAL4 assembler always read source code from a disc file and operated entirely in Algol Basic Symbols.

The Usercode compiler in the PROMPT and POST systems (KAZ84 and KAB84) read Algol Basic Symbols. We do not have this brick, but have made a substitute by producing a surrogate KAB84 which reads the Algol Basic Symbols and outputs paper tape code which is then fed into the English Electric paper tape Usercode compiler (KAA01PT).

The WALGOL compiler read paper tape.

Hard copy

Input to the POST/PROMPT filesystem was originally via paper tape produced on a Flexowriter. The introduction of the Eldon2 multi-access system added teletype capability for creating and editing files. On a teletype, a compound Algol Basic Symbol was written as the word prefixed by an exclamation mark, e.g. !BEGIN. The KDF9 line printer had a more restricted character set than the Flexowriter, and neither the line printer nor the teletype provided lower-case letters. The English Electric documentation for character codes can be found here and at the end of this page.

KDF9 used 6-bit characters for input and output, with slightly different sets on paper tape and lineprinter. Both character codes are described towards the end of this page.

The paper tape code includes several symbols that are absent from most keyboards, normally ÷, ×, , ₁₀ and an underline character that did not advance the character position on the official paper tape processing equipment, viz. the Flexowriter. There was also a special character called end message, which indicated the end of data. It printed as a right-pointing arrow, somewhat like → or >.

The line printer was capable of printing only a subset of the characters. It lacked lower case letters and also ÷, ×, , , and the underline capability. The only character on the printer which is not found on most keyboards is the subscript ten, ₁₀.
modern keyboard   KDF9 Algol            official Algol60   notes
'begin'           begin (underlined)    begin (bold)       etc.
>=                > (underlined)        ≥
<=                < (underlined)        ≤
!=                ≠                     ≠
%                 ÷                     ÷                  integer divide
*                 ×                     ×                  multiply
$                 *                     n/a                align word in Usercode/KAL4
~                 ₁₀                    ₁₀                 or backslash, c.f. Algol 68
_                 *                     n/a                space in a string in KDF9 Algol
|                 →                     n/a                for end message
{                 [ (underlined)        ‘                  curly brackets for underlined square ones (Algol60 string quotes)
}                 ] (underlined)        ’
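
As a rough illustration of how the table might be applied, the following Python sketch maps the modern-keyboard spellings to printed forms. The names KEYBOARD_TO_PRINT and to_print are hypothetical and are not taken from the actual conversion program. The sketch ignores compound symbols and string context, and it prints ≥ and ≤ as single characters rather than as underlined > and <.

```python
# Hypothetical table: modern-keyboard spelling -> printed form.
# Two-character spellings are tried before single characters.
KEYBOARD_TO_PRINT = {
    ">=": "≥",
    "<=": "≤",
    "!=": "≠",
    "%": "÷",   # integer divide
    "*": "×",   # multiply
    "$": "*",   # asterisk, e.g. align word in Usercode/KAL4
    "~": "₁₀",  # subscript ten
    "_": "*",   # space marker in a string
    "|": "→",   # end message
}


def to_print(text):
    """Translate keyboard spellings in a single left-to-right pass."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in KEYBOARD_TO_PRINT:
            out.append(KEYBOARD_TO_PRINT[pair])
            i += 2
        else:
            out.append(KEYBOARD_TO_PRINT.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(to_print("a >= b * c % 2"))
```

Because the text is scanned once and each output is never rescanned, a keyboard $ becomes * without that * then being turned into ×.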

Input

The requirement here is to be able to create files which represent sequences of KDF9 paper tape characters. So the table shows how to represent KDF9 Algol Basic Symbols on a modern keyboard, and how they would have looked on a KDF9 Flexowriter. Compound symbols can also be input "Eldon2-style" with an exclamation mark, e.g. !begin, or even with underlines, e.g. _b_e_g_i_n.
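
The three spellings of a compound symbol could be reduced to one canonical form along the following lines. This is a minimal Python sketch, not the real converter: the function name normalise_compound is hypothetical, folding to lower case is an assumption, and the input is assumed to contain no strings (where an underscore would instead mean a space).

```python
import re

# !begin or !BEGIN: an exclamation mark followed directly by letters.
_BANG_FORM = re.compile(r"!([A-Za-z]+)")
# _b_e_g_i_n: every letter preceded by an underline character.
_UNDERLINE_FORM = re.compile(r"(?:_[A-Za-z])+")


def normalise_compound(text):
    """Rewrite !word and _w_o_r_d spellings as quoted 'word'."""
    text = _BANG_FORM.sub(lambda m: "'" + m.group(1).lower() + "'", text)
    text = _UNDERLINE_FORM.sub(
        lambda m: "'" + m.group(0).replace("_", "").lower() + "'", text
    )
    return text


print(normalise_compound("!BEGIN x := 1 _e_n_d"))
```

The not-equals spelling != is left alone, because the exclamation mark is only treated as a prefix when a letter follows it.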

There was much agonising over multiply. There is widespread use of star for multiply (e.g. FORTRAN). Star is not a true Algol Basic Symbol, but is used in KDF9 Algol to represent a space in a string. This is the rationale behind the choice shown in the table. Also dollar can be used for a star symbol, which looks better in Usercode and KAL4 than an underline.

The expectation is that the above recipe will work on any modern keyboard independently of whether the host character set is ISO-Latin-1, UTF-8 or ASCII.

For Kidsgrove Algol, Usercode or KAL4, the user’s ASCII input is converted into the 8-bit Algol Basic Symbol code. For Whetstone Algol, the user’s ASCII input is converted into the KDF9 paper tape code.

Output

Output from programs is currently on the line printer, where we can use ASCII directly, except for subscript ten, which is not currently very consistently represented. In any case, the current output routine uses the letter E, as can Algol 68.

There is also the need to show the software on the screen. Algol Basic Symbols always looked their best on a Flexowriter, so we have chosen to show source text as on a Flexowriter using HTML entities to deal with the fancy bits. In addition subroutine calls have been made into hot links to the called routine.
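
How such a rendering might be produced can be sketched in Python as follows. This is an assumption-laden illustration only: the function name to_flexowriter_html is hypothetical, the input is assumed to use the quoted 'begin' spelling, and the hot links for subroutine calls are not attempted.

```python
import html
import re

# A quoted compound symbol such as 'begin' or 'go to'.
_QUOTED_SYMBOL = re.compile(r"'([a-z]+(?: [a-z]+)?)'")


def to_flexowriter_html(source):
    """Render keyboard-form source roughly as a Flexowriter would show it."""
    # Escape &, < and > first; quotes are kept so symbols can still be found.
    text = html.escape(source, quote=False)
    # Compound symbols appear underlined on a Flexowriter.
    text = _QUOTED_SYMBOL.sub(r"<u>\1</u>", text)
    # Multiply and integer divide use HTML entities.
    return text.replace("*", "&times;").replace("%", "&divide;")


print(to_flexowriter_html("'begin' a := b * c 'end'"))
```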

21st Century assemblers

As part of the process of rescuing and resurrecting the KDF9 software, two yacc-based assemblers were written, and can be used on-line. They take ASCII input and are believed to accept all variants of syntactically valid Algol Basic Symbols.

ASCII, UTF-8 and ISO-Latin-1

The term ASCII is sometimes used loosely here to refer to the facilities of commonly available keyboards. It is basically those characters that are represented in 8 bits in UTF-8. Some ISO-Latin-1 non-ASCII characters are also accepted as part of source code, notably × and ÷.
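
The distinction can be checked directly. In the Python sketch below (the function name classify is hypothetical), an ASCII character occupies one byte in UTF-8, whereas × and ÷ occupy one byte in ISO-Latin-1 but two bytes in UTF-8.

```python
def classify(ch):
    """Return 'ascii', 'latin-1' or 'other' for a single character."""
    code = ord(ch)
    if code < 128:
        return "ascii"
    if code < 256:
        return "latin-1"
    return "other"


for ch in "*×÷":
    # Show the class and the encoded length in each character set.
    print(ch, classify(ch), len(ch.encode("utf-8")), len(ch.encode("latin-1")))
```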

Whitespace

The Revised Report on Algol60 says:
Typographical features ... have no significance ... They, however, may be used freely for facilitating reading.
It would appear that the original POST would allow any amount of whitespace within Algol Basic Symbols, so that even
f
o
r

       was valid.

Our program for converting text to Algol Basic Symbols is less liberal than this. A single space is accepted in : = and also go to. Other variants of goto are 'go to' and !go to. Whether or not the space in go to should be underlined seems ambiguous in the Report, as it depends on whether the space is in bold type or plain type.
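
The rule described above can be sketched in Python as follows. The function name squeeze_symbol_spaces is hypothetical and this is not the actual conversion program. It assumes that exactly one space is tolerated and that go to appears in its quoted or exclamation-mark spelling.

```python
import re


def squeeze_symbol_spaces(text):
    """Remove the single optional space inside := and goto."""
    # ": =" with exactly one space becomes ":=".
    text = re.sub(r": =", ":=", text)
    # 'go to' and !go to become 'goto' and !goto.
    text = re.sub(r"(['!])go to\b", r"\1goto", text)
    return text


print(squeeze_symbol_spaces("x : = 1; 'go to' L; !go to M"))
```

Anything wider than one space, such as a colon followed by two spaces and an equals sign, is deliberately left unchanged so that a later stage can reject it.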