The Niche takes a first look at Basic Compilers for the Spectrum
There’s a world of difference between ZX BASIC and machine code. Programs in BASIC tend to be slow, with jerky graphics and poor sound effects. Programs in machine code run hundreds of times faster — which doesn’t mean that the graphics just jerk around the screen at Warp Factor ten. No, machine code programming permits smoother movement, ‘3D’ perspectives, simultaneous sound, animation and so on.
Eight standard BenchMark programs were used in the comparison: timings for the execution of each benchmark program are given in seconds, with the speedup ratios achieved by each of the compilers printed on a grey background.
Of course, machine code doesn’t necessarily make a game playable, and some classic games have been programmed entirely in BASIC — Mined Out, Football Manager and Velnor’s Lair for instance. But if you want to write a shoot-em-up game, or a program with sophisticated graphics, you’ll almost certainly find BASIC too slow.
Back in the olden days of Spectrum programming, when programmers assembled code in their heads, and the back pages of the old orange manual were the first to fall out, there wasn’t much alternative to learning machine code once you’d come up against the limitations of BASIC.
Learning machine code was a traumatic process: before disks and microdrives (which brought their own meaning to the term ‘random access’) every programming mistake meant a crash. Teaching yourself machine code was a frustrating process as, after each mistake, it took several minutes to re-load your assembler, debugger and program source from tape in preparation for another crack at the problem.
Predictably it wasn’t long before someone figured out that, if the computer was so fiendishly clever, it really ought to be able to make up machine code for itself. In this article we take a look at currently available BASIC to machine code translators, or ‘compilers’. Next month we hope to examine Colt and Blast, two new and aggressive-sounding BASIC compilers which are under development. In future Niches we’ll blow the dust off other Spectrum languages such as Logo, Forth, C, and Pascal.
Meanwhile, back at the keyboard....
BASIC is slow because everything you type is carefully checked to make sure it is correct. This would be fair enough if it only happened once, when the program was entered, but the exhaustive checking continues even while a program is running.
If you write a program in BASIC to add 2 and 2 a hundred times, the computer will take as long to work out the answer the last time as it did on each of the first 99. The actual adding is done fairly quickly in machine code (which is the only language the Spectrum’s Z80 processor can really understand), but the overall effect is still very slow. This is partly because BASIC checks the syntax of lines over and over again, even after they have been entered (in case some stray POKE or Cosmic Ray has changed the contents of program memory?).
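The cost of that repeated checking can be illustrated with a small Python sketch. It is only a toy model, not the way the Spectrum ROM or any of the compilers actually works: the function names and the tiny 'parser' (which understands nothing but 'a + b') are assumptions made up for the example.

```python
def parse(line):
    """Stand-in for syntax checking: accept only 'a + b' with whole numbers."""
    left, op, right = line.split()
    if op != "+":
        raise SyntaxError(line)
    return int(left), int(right)

def run_interpreted(line, times):
    """Interpreter style: the line is checked again on every pass."""
    parses = 0
    total = 0
    for _ in range(times):
        left, right = parse(line)
        parses += 1
        total = left + right
    return total, parses

def run_compiled(line, times):
    """Compiler style: the line is checked once, before the loop runs."""
    left, right = parse(line)
    total = 0
    for _ in range(times):
        total = left + right
    return total, 1

print(run_interpreted("2 + 2", 100))
print(run_compiled("2 + 2", 100))
```

Both versions arrive at the same answer; the second value returned is the number of times the line had to be checked along the way.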
BASIC is also hampered by the need to cope with all sorts of special cases. The routine to add numbers in the Spectrum ROM has to be able to cope with functions, arrays, numbers and variables; these can have almost any value from minus several zillion upwards. The Z80 processor can only cope with a few digits at a time; it has to do all its arithmetic in several steps, just in case. Worse still, it can’t multiply and divide at all, so these operations must be performed ‘longhand’.
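To give a flavour of 'longhand' multiplication, here is a minimal Python sketch of the shift-and-add method commonly used on processors with no multiply instruction. It is an illustration only, not the Spectrum ROM routine: the function name and the 16-bit wrap-around are assumptions made for the example.

```python
def multiply_longhand(a, b):
    """Multiply two unsigned 16-bit values using only shifts and adds."""
    result = 0
    while b:
        if b & 1:
            # Lowest bit of the multiplier is set: add the shifted multiplicand.
            result = (result + a) & 0xFFFF
        a = (a << 1) & 0xFFFF   # shift the multiplicand left one bit
        b >>= 1                 # move on to the next bit of the multiplier
    return result

print(multiply_longhand(6, 7))
```

Each pass through the loop handles one bit of the multiplier, so a single multiplication costs many simple steps rather than one instruction.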
Much of the code in the Spectrum’s ROM is taken from the earlier ZX81 BASIC, which was squashed into just 8K. In order to keep the size down, parts of ZX BASIC were written using a leisurely version of the compact Forth language, rather than machine code. A new ROM for the Spectrum is planned (though not by Sinclair), but nothing has materialised yet.
One of the nice features of Spectrum BASIC is the way that it lets you type in new lines of program and scrub out old ones as you test your program. This is hell for the BASIC system, which has to keep scrabbling around in tables to keep track of shifting variables and program lines. The longer your BASIC program, the worse this gets, so that a 20K program may run at half the speed of a 2K one. In compiled BASIC, however, the position of every line and variable is fixed. This makes programs fast, but means that you have to re-compile the whole lot if you change even one line.
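The difference between hunting for a line and knowing where it lives can be sketched in Python. This assumes a simple interpreter that searches the program from the top each time, and a compiler that works out every position once; the three-line program and the function names are invented for the illustration.

```python
# A toy program held as (line number, text) pairs, in order.
program = [(10, "LET a=1"), (20, "PRINT a"), (30, "GO TO 20")]

def find_line_by_scanning(program, target):
    """Interpreter style: walk the program from the top, counting lines examined."""
    steps = 0
    for index, (number, _) in enumerate(program):
        steps += 1
        if number == target:
            return index, steps
    raise KeyError(target)

def build_fixed_table(program):
    """Compiler style: work out the position of every line once, in advance."""
    return {number: index for index, (number, _) in enumerate(program)}

print(find_line_by_scanning(program, 30))
print(build_fixed_table(program)[30])
```

The scanning version examines more lines as the program grows, while the fixed table costs the same however long the program is. The price is that the table must be rebuilt whenever a line changes.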
Finally, ZX BASIC is cursed by the stupid way humans like to write things. We write ‘X = 6 + 7’ when the computer would be much happier with ‘6 7 + = X’. It can’t do anything with the name X till it finds the equals sign (meaning that a value must be stored). Similarly, the equals sign isn’t really relevant till the computer knows what is to be stored. The plus sign means add two values, so there’s no point telling the computer about it until it has found both the numbers. ZX BASIC actually performs calculations in the second sequence (which is called Reverse Polish Notation), but it has to re-order them from the first sequence every time it finds them, and that is a slow process.
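As a rough illustration of the re-ordering, the Python sketch below turns a simple expression into Reverse Polish Notation and then evaluates it with a stack. It is a cut-down version of the well-known shunting-yard method, not the ROM's own routine: it assumes whole numbers, space-separated tokens and no brackets, and all the names are made up for the example.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def to_rpn(tokens):
    """Re-order infix tokens (no brackets) into Reverse Polish Notation."""
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Operators of equal or higher priority must be emitted first.
            while stack and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]:
                output.append(stack.pop())
            stack.append(tok)
        else:
            output.append(tok)
    while stack:
        output.append(stack.pop())
    return output

def eval_rpn(rpn):
    """Evaluate RPN tokens with a stack, using whole-number arithmetic."""
    stack = []
    for tok in rpn:
        if tok in PRECEDENCE:
            right = stack.pop()
            left = stack.pop()
            if tok == "+":
                stack.append(left + right)
            elif tok == "-":
                stack.append(left - right)
            elif tok == "*":
                stack.append(left * right)
            else:
                stack.append(left // right)
        else:
            stack.append(int(tok))
    return stack.pop()

variables = {}
rpn = to_rpn("6 + 7".split())     # re-ordered once
variables["X"] = eval_rpn(rpn)    # then evaluated and stored in X
print(rpn, variables["X"])
```

A compiler can do the `to_rpn` step once and keep the result, whereas an interpreter working from the original text has to repeat it each time the line is run.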
The Spectrum BASIC compilers are programs which read a BASIC program and produce a machine code equivalent. The compiler and both programs have to be in memory all at once, which limits the size of compiled programs to 10-20K.
Compiled code may be anything from 2 to 200 times faster than the original, depending upon the compiler you are using and the intricacy of the original program. We ran (or at least, tried to run!) the standard BASIC benchmark programs on each of the compilers. The results are shown in the Timing Table, along with the published timings for ZX BASIC.
The timings are not as fast as for pure machine code, which allows much more freedom to the programmer, but they are easily fast enough for most games programming. A number of commercial games are written in compiled BASIC (including Frank ’N’ Stein, published by PSS) and look none the worse for it, although you’d be hard put to write Knight Lore with a compiler.
A few of the positions in the table contain asterisks, because the test program could not be processed by that compiler. In order to keep compiled programs fast, and reduce the complexity of the compiler, the packages all impose restrictions on what they can compile.
Softek’s FP compiler is the only one that can cope with decimal values, for instance — this makes it much slower than the others, but means that it is the only compiler suitable for use in business programming. But who wants to run a business on a Spectrum anyway? The other compilers restrict you to whole numbers between -32767 and 32767, although you can use values up to 65535 in POKEs and suchlike.
You can switch back and forth from normal BASIC, machine code and compiled code with USR calls and RETURN instructions, so it is possible to write programs in a mixture of languages if you need speed at one point and sophistication elsewhere.
The Softek compilers (FP and IS) are the only ones which allow full use of BASIC string-handling; Mcoder gives you a fairly complete set of facilities to work with short strings (up to 255 characters) but Zip and the Mehmood compiler can only offer simple routines to read and write characters. You could probably write a text adventure using Mcoder or one of the Softek compilers, but you’d be much better off using The Quill.
Array handling is similarly limited: none of the compilers allow arrays of more than one dimension, and the IS and Mehmood programs won’t allow arrays at all. You can use long variable names with Mcoder and the Softek compilers, but the cheaper packages restrict you to 52 short variable names.
The ‘core’ of ZX BASIC commands — PRINT, INPUT, PLOT, DRAW, LET, GO SUB, IF, and so on — are allowed by all the compilers. The Mehmood compiler doesn’t allow FOR loops, which meant that we couldn’t run some of the benchmark programs.
One of the snags of real machine code is the fact that you can’t ‘break in’ to programs. This suits software houses, who want to discourage piracy, but it is very inconvenient for programmers. The only way you can stop a machine code program is to pull out the plug and re-load it. Zip and Mcoder allow you to break into compiled programs at will, but the Softek compilers require a special command wherever you might wish to break into compiled programs. You can’t break into programs produced by the Mehmood compiler at all.
The Softek compilers allow you to put special instructions in REM statements. These instructions only work once a program has been compiled, which is inconvenient since you can’t test such programs fully in normal BASIC — one of the big advantages of BASIC compilers over machine code is the fact that you can test your programs interactively, with all the BASIC checks and hand-holding to help you, before you compile them.
Softek’s special instructions allow you to check for the Break key, enter machine code into the program, and move simple (character-sized) sprites smoothly around the screen. On the FP compiler you can also trap errors and simulate the ON... GO TO statement. None of the compilers let you GO TO a calculated line number — you must always GO TO a specific number.
Mcoder offers some REM instructions, but these are designed for program testing. You can turn off BREAK checks, giving marginally faster code, or turn on a ‘trace’ facility which shows the current line as it is executed. Mcoder and Zip allow you to pass variable values back and forth between BASIC and machine code.
So far we’ve taken a broad overview, looking at the compilers together. In the following section we look closely at each of the five Spectrum compilers (there were six, but the first Spectrum BASIC compiler, SUPER C, is no longer available).
The Mehmood compiler was featured in a trio of Popular Computing Weekly articles in April this year. It doesn’t come with any instructions, so you’ll need a copy of PopCW Volume 4 No 17, and ideally the two following issues as well. A simple demonstration game is supplied on the other side of the tape.
The compiler is written in BASIC, so it works very slowly, and line numbers below 1000 are reserved for the compiler program. As soon as an error is found a message is printed and compilation stops. The messages are usually quite helpful, but the stop is annoying since you have to start again to find the next error, and this might involve a wait of several minutes.
At £2.75 this is not a bad low-cost compiler, and would probably be useful for ‘spicing up’ BASIC programs. It is a shame that it doesn’t allow FOR NEXT loops or even PRINTing of strings. The poor timings it achieves on the Benchmarks result from the fact that the Spectrum’s very slow, built-in ‘division’ routine is used.
Mcoder has the longest history of all the Spectrum compilers. It began life in the July 1983 issue of Your Computer magazine; in those days it was a 2K program for the ZX81 called ZX-GT. Like the Softek compilers, Mcoder now occupies about 6K.
The Mcoder documentation is unimpressive: seven cassette-sized pages, in the form of a brief question and answer session and a list of commands allowed by the compiler. PSS offer a three page ‘help sheet’ to users who miss the significance of some of the comments in the cassette insert. The code produced is faster than that from the IS compiler, especially when it comes to string handling.
Mcoder looks very much like the Softek compilers (or should that be the other way around?) and performs in a similar way, with the same fast compilation and simple error indication. Mcoder and IS are very similar — Mcoder handles numeric arrays and has better debugging facilities, while IS is slightly more compatible with normal BASIC and offers simple sprites.
This is yet another refugee from a computer magazine: an early version of Zip was listed (in the wrong order, mainly) by the troubleshooting goblins at Your Spectrum last year (issues 3-6). Zip is mainly written in BASIC and consequently works more slowly than Mcoder or IS, although it’s not as lethargic as the Popular Computing Weekly offering. Line numbers above 5000 are used by the compiler and optimiser.
Zip produces faster code than the other compilers, as the benchmark timings show. Like IS and Mcoder it works with whole numbers only — unlike them, it doesn’t allow strings or DATA and variable names are restricted.
The documentation is better than for the other compilers, consisting of twelve pages of A5 (reduced from the A4 originals), with appendices covering benchmark performance, useful subroutines and error messages. There is also a section on customising the compiler program. A demonstration game is recorded after the compiler.
As compilation takes place your program is listed on the screen, and errors are shown in context. Zip is the only compiler which detects all the errors in a program at once — this is just as well, in view of the compilation rate. Zip produces error messages in plain English, whereas the other compilers just stop at the location of the error.
Softek’s FP is the most expensive compiler by a clear tenner, so it had better be good, or at least different! As the benchmark timings show, it produces fairly pedestrian code, typically 2-10 times faster than normal BASIC, but it is very flexible. You can use FP to speed up almost any ZX BASIC program that doesn’t use arrays of more than one dimension, or the VAL and VAL$ functions. The compiler also disallows calculations in DATA and GO TOs, but we wouldn’t dream of using those, would we?
The documentation is barely adequate — a single large sheet of paper with an introduction, list of compiled statements and brief technical discussion.
The FP compiler displays the current line being processed as it works. When an error is found the compiler stops and shows the line containing the problem, with a question-mark to show where the problem was found. You can’t go on to detect subsequent errors, but this doesn’t matter much since the compiler is very fast. You can compile several programs into different areas of memory by using CLEAR between one compilation and the next.
FP is a well-written program, but it is expensive and may not be useful to many Spectrum users, since it doesn’t offer a dramatic speed increase over well-written BASIC. We’ll look at it again next month, when we examine BLAST, a new compiler also designed to process ‘off the shelf’ programs.
Softek’s IS compiler is very like the FP one in presentation: it shares the same instruction sheet, but it restricts itself to arithmetic using whole numbers (IS stands for Integer and String, whereas FP stood for Floating Point). This restriction makes IS about ten times faster than its stablemate. Again, compilation is fast and you can compile several programs into different areas of memory.
The compiled code is slower than that generated by Mcoder, and quite a lot slower than Zip, but the IS compiler has the bonus of support for very simple sprites. The lack of array-handling is annoying, although not too hard to get around if you’re prepared to use PEEK and POKE or string-slicing instead.
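The PEEK and POKE workaround might look something like the following Python sketch, which fakes a one-dimensional array of whole numbers in a block of memory, two bytes per element. Everything here is an assumption for illustration: the `memory` bytearray stands in for the Spectrum's address space, `ARRAY_BASE` is a hypothetical free address, and the helper names are invented.

```python
memory = bytearray(65536)   # stand-in for the Spectrum's 64K address space

def poke(address, value):
    """Store one byte, as BASIC's POKE does."""
    memory[address] = value & 0xFF

def peek(address):
    """Read one byte back, as BASIC's PEEK does."""
    return memory[address]

ARRAY_BASE = 60000          # hypothetical free area, chosen for the example

def array_store(index, value):
    """Store a value from 0 to 65535 as two bytes, low byte first."""
    address = ARRAY_BASE + 2 * index
    poke(address, value & 0xFF)
    poke(address + 1, (value >> 8) & 0xFF)

def array_fetch(index):
    """Rebuild the value from its two bytes."""
    address = ARRAY_BASE + 2 * index
    return peek(address) + 256 * peek(address + 1)

array_store(3, 1234)
print(array_fetch(3))
```

In BASIC the same idea comes down to a POKE for each byte when storing, and `PEEK a + 256 * PEEK (a + 1)` when reading, with `a` worked out from the element number. Negative values would need extra handling, which this sketch leaves out.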
None of these compilers really offer ‘instant translation’ for your BASIC. With the possible exception of Softek’s FP you really have to write your program with compilation in mind — it is hard work to convert existing BASIC to suit any of the compilers. Also, there are some things which are hard to do without the flexibility of real machine code. That said, the packages all produce working code pretty effortlessly, and you can be reasonably confident that compiled programs will work first time — unlike hand-coded ones!
Next month, PR companies willing, we should be able to report on two new compilers — Colt, from HiSoft, which is a development of Mcoder, and Oxford Computer Systems’ Blast, which promises to compile absolutely any ZX BASIC program, without alteration. At the moment we’re having a bit of trouble wheedling copies out of the manufacturers — they both seem to be holding back until they’ve had a chance to dismantle their competitor’s product! We’ll compile more information next Niche....