We’ve always taken a special interest in BASIC compilers in TECH TIPS. In August and September 1985 we looked at eight packages that promised to turn slow BASIC into fast, efficient machine code. One of those — The Colt, now remaindered at £9.99 — was published by Hisoft, but that has not stopped the firm launching another compiler.
If ‘compiler’ is a new word to you, don’t give up reading yet. A compiler is a programming tool that writes machine code for you. It can be used to write games (indeed, any program) or just to ‘soup up’ existing BASIC software. The 1984 rave from PSS, Frank’n’Stein, was written with a BASIC compiler. You don’t have to be a programmer to use a compiler, but you do have to be vaguely interested in listings, gameplay or speed. The more interest you take in programming — at any level — the more useful a compiler will be.
Sinclair planned the Spectrum to be a machine for budding programmers, rather than for games players, so the built-in language is very ‘friendly’. It lets you alter programs as you test them. It rejects lines that are ‘obviously’ incorrect, although it can’t pick up all errors by looking at a line in isolation. You pay for this convenience, because ZX BASIC can make no assumptions. It checks every part of a program as it runs, in case a POKE has gone astray or you’ve changed something and re-started the program.
ZX BASIC is termed an ‘interpreter’ because of the way it works out everything it must do as it goes along. Hisoft BASIC is a ‘compiler’ — a program that runs once, to work out the exact meaning of a program. A compiler generates purpose-made machine code to do the same job — but with fewer checks and no time-wasting.
An analogy may help to make the point. Imagine that you hire a worker who is very fast and efficient, but who can’t speak your language. You’re the boss — the program — and the clerk is the processor.
Imagine that you’ve already got details of the clerk’s task, written on a set of filing cards. The old clerk used to read through them, in sequence, and do the indicated jobs. If your new employee were to work like an INTERPRETER he would take a dictionary and start looking up the words on the cards, one by one. When he’d translated and checked a complete card, he’d do the job — then he’d translate the next card, and so on. The worker can start almost at once, using the old cards you can read, but he is slowed by the need to translate every word every time. It might be better to use the approach of a COMPILER. In this case you give the clerk a biro and a set of blank cards, and let him translate the instructions once and for all. You end up with a copy of the original instructions, in a different language. You can’t understand them, but the clerk can work with them more easily.
A bright clerk — or compiler — can incorporate other improvements into the new set of instructions. For instance, an old card might say ‘empty the 200th bucket’, requiring the worker to count through the buckets from the start to find the correct one. The new clerk might be able to re-write the instruction, or make a map, so that the correct bucket (or variable, or program line) can be found at once. This saves time, as long as the situation is consistent. If changes are required a new set of specific instructions must be made.
To summarise: interpreters are quick and simple while you’re working out what needs to be done, but compiled instructions are more efficient once you know what to do.
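The filing-card analogy can be tried out in a few lines. What follows is only a toy sketch in Python, not anything taken from Hisoft BASIC or ZX BASIC: the three-instruction "language" and the names `interpret`, `compile_program` and `run_compiled` are invented for illustration. It counts how often a card has to be decoded under each approach.

```python
# Toy illustration only: a made-up three-card program, not real ZX BASIC.
PROGRAM = [
    ("let", "total", 0),
    ("add", "total", 5),
    ("add", "total", 7),
]


def interpret(program, runs):
    """Interpreter style: decode every card again on every run."""
    decodes = 0
    variables = {}
    for _ in range(runs):
        for op, name, value in program:
            decodes += 1  # the clerk reaches for the dictionary again
            if op == "let":
                variables[name] = value
            elif op == "add":
                variables[name] += value
    return variables["total"], decodes


def compile_program(program):
    """Compiler style: decode each card once and keep ready-to-run steps."""
    steps = []
    decodes = 0
    for op, name, value in program:
        decodes += 1
        if op == "let":
            steps.append(lambda v, n=name, x=value: v.__setitem__(n, x))
        elif op == "add":
            steps.append(lambda v, n=name, x=value: v.__setitem__(n, v[n] + x))
    return steps, decodes


def run_compiled(steps, runs):
    """Run the pre-translated steps; no decoding happens here."""
    variables = {}
    for _ in range(runs):
        for step in steps:
            step(variables)
    return variables["total"]


if __name__ == "__main__":
    print(interpret(PROGRAM, 100))        # (12, 300): 300 decodes for 100 runs
    steps, decodes = compile_program(PROGRAM)
    print(run_compiled(steps, 100), decodes)  # 12 3: same answer, 3 decodes
```

Both routes give the same answer. The difference is that the translated steps never go back to the dictionary, however many times the job is repeated.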
If a BASIC interpreter is very inefficient, and the underlying task is very simple, a compiler can make BASIC hundreds of times faster. In other cases — say, when evaluating trigonometric functions or printing to the screen — the interpreter and compiler work at much the same speed. The time needed to interpret such commands is small compared to the time needed to carry them out.
ZX BASIC is a ‘general purpose’ language, which makes it concise and easy to learn, but sloppy. All numbers are held in a fiddly floating point form which the computer must work with, piecemeal. An enormous range of numbers can be handled, to nine digits of precision — but that’s often a vast overkill.
The processor can handle whole numbers of up to five digits much more easily, but ZX BASIC does not take full advantage of these common ‘special cases’. The interpreter struggles on, doing everything slowly, the hard way. Every number is kept in five bytes of memory, even if it would fit into one or two.
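The cost of five bytes per number is easy to put into figures. This is a rough sketch in Python rather than anything a Spectrum would run. The five-byte figure comes from the text above; the two-byte integer is an assumption about how a compiler might store a whole number, and the helper names are made up for the example.

```python
BYTES_PER_ZX_NUMBER = 5   # from the article: every ZX BASIC number takes five bytes
BYTES_PER_16BIT_INT = 2   # assumption: a compiler keeps a whole number in two bytes


def array_bytes(elements, bytes_per_element):
    """Memory needed for the values of a numeric array (header overhead ignored)."""
    return elements * bytes_per_element


def fits_in_16_bits(n):
    """True if n is a whole number in the signed 16-bit range."""
    return isinstance(n, int) and -32768 <= n <= 32767


if __name__ == "__main__":
    as_floats = array_bytes(1000, BYTES_PER_ZX_NUMBER)
    as_ints = array_bytes(1000, BYTES_PER_16BIT_INT)
    print(as_floats, as_ints, as_floats - as_ints)  # 5000 2000 3000
    print(fits_in_16_bits(30000), fits_in_16_bits(40000))  # True False
```

On a machine with well under 48K to spare, 3000 bytes saved on one array of a thousand numbers is worth having.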
String handling is similarly messy. ZX BASIC has no way of knowing how big an undimensioned string will become as a program runs. It chooses to shuffle values around to make room as required. This usually means that it spends more time moving other things out of the way than it does storing or recalling your text. That doesn’t matter much when you’re testing a new program, but it makes finished software much slower than it need be.
The aim of a BASIC compiler is to give you the speed of machine code — and hence the freedom to use animation, sound effects, and other time-intensive techniques — while still letting you develop your program in friendly, familiar ZX BASIC.
It’s easy to learn BASIC, and mistakes are usually obvious and easy to fix. By contrast, machine code programming is complicated, repetitive, pedantic and hard to test. It’s just the kind of job you should give to a computer — hence the existence of compilers.
Real programs, and their programmers, are hard to compare, so meaningless but standard programs (called ‘benchmarks’) are used to test interpreters and compilers in a quantitative way. I’ve tested Hisoft BASIC in this way, to see how it stands up to the competition. But benchmarks are far from the whole story, so I’ve also given Hisoft BASIC some ‘real’ programs to chew on.
Before Hisoft BASIC arrived there were two main types of ZX BASIC compiler. Some required you to adapt your program, or write it in a special way, so that it could easily be re-expressed in machine code. These compilers — such as Softek’s IS, Hisoft’s Colt, and my own ZIP — cannot cope with decimal fractions and limit the use of arrays or strings.
These ‘subset’ compilers can give very good results. They are easier to use than compilers for other languages, because you can test programs beforehand with the interpreter. But they’re really just a ‘middle ground’ between BASIC and machine code. You need quite a lot of knowledge to use subset compilers effectively. They’re not a general solution if you’ve got a slow ZX BASIC program and want to make it faster with the minimum of fuss.
The other sort of compiler is aimed at people who have already got a program, and just want to speed it up without having to study or rewrite the code. ‘Full’ compilers can cope with decimal numbers as well as whole numbers (‘integers’) but their code is usually 10 to 30 times slower than that of more restrictive compilers. There’s a trade-off between compatibility and speed, in that compiled code tends to get slower as it works more and more like the interpreter.
General-purpose compilers are hard to write. The first Spectrum one was Softek’s FP, which coped well with sums and strings but had quite a few restrictions. It couldn’t handle user-defined functions and calculations after GO TOs or DATA. It also banned arrays of more than one dimension.
BLAST was a compiler that sounded wonderful but never worked properly. The publishers, Oxford Computer Systems, went bust not long after it was launched. In many ways Hisoft BASIC is what BLAST should have been — and carries the same £25 price-tag.
But 18 months ago Mcoder III arrived, from Ere Informatique via PSS. To my mind, Mcoder III is the best French program yet released in the UK. It will compile almost anything, although it draws the line at array re-dimensioning and ‘add-on’ (for example, Microdrive) commands.
Mcoder III is stylish — it runs entirely in video memory, shares variables with normal BASIC, and generates a mixture of compiled and interpreted code with many optimisations. But the documentation is poor, compilation is slow, with two mandatory tape loads, and the implementors have taken a few ‘short cuts’, trading compatibility for speed.
Hisoft BASIC will have to beat Mcoder III if it is to ‘surpass all others’ as Hisoft claim in their manual. It was originally written in Canada on a TS-2068 — an American Spectrum. Hisoft have improved it and added full support for the 128. Their magic formula is a combination of the speed of the restricted compilers with the flexibility (at comparable speed) of the ‘full’ versions.
“Hisoft BASIC combines the advantages of these two types of compiler without any of the disadvantages”, the publishers crow, claiming “simultaneously the fastest integer compiler and the fastest floating-point compiler available for the Spectrums”.
These quotations from the manual sell it short — after a page and a half of hysterical self-congratulation it launches into a readable, well-designed tutorial. The compiler loads in 100 seconds from tape. It can be moved to disk or Microdrive without much hassle. It won’t work with some SAGA keyboards, but SAGA have admitted responsibility for the problem and will supply a ‘fix’ to anyone who runs into trouble.
Nine small demonstration programs follow the code on either side of the tape. The examples are well-chosen to illustrate the way directives — extra REM statements — are used to control compilation.
A simple graphics demonstration is 3.3 times faster after compilation. A compiler can speed up the number-crunching but it can’t do much to change the amount of time actually spent plotting points on the screen.
A loop PEEKing and POKEing display information is 15 times faster when first compiled; an extra directive, to say that some variables are integers, makes it 409 times faster than BASIC! This is a ‘perfect’ example, in the sense that it is something machine code does very well and BASIC very badly. A graphics program working from DATA with DRAW is about four times faster after compilation — this ratio looks more impressive than it sounds.
The Sieve of Eratosthenes is a program to find prime numbers. It’s another near-ideal example, although I found it easy to condense and speed up Hisoft’s original BASIC. The compiler ran The Sieve about 174 times faster than ZX BASIC. ZIP — once the speed champ — accelerates the same code by a factor of 160.
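For readers who haven’t met The Sieve, here is the general idea as a short Python sketch. It is not Hisoft’s demonstration listing, nor my condensed version of it; the function name `sieve` and the limit used are simply chosen for the example. The work is all whole-number loops and array flags, which is why it suits an integer compiler so well.

```python
def sieve(limit):
    """Return all primes up to and including limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Cross out every multiple of n, starting at n squared.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]


if __name__ == "__main__":
    print(sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```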
The last two examples are sorting routines, which Hisoft BASIC makes 20 times faster. The lines that generate random data for sorting only speed up by about 50 per cent. I compiled several games and utilities from my own collection, and found Hisoft BASIC fast-working and reliable. I found one minor bug — STOP and RETURN are treated as the same instruction, so you can’t STOP inside a subroutine.
The 50-page A5 manual by programmer Cameron Hayne reached Hisoft in handwritten form! The publishers have neatly laser-printed it, but it is easier to read than it is to use — it’s more a collection of interesting essays than a reference guide. The tutorial is excellent and the other parts seem quite comprehensive, but examples are rare and the lack of an index is inexcusable.
The compiler is started by typing *C. Programs of up to a hundred lines or so are compiled almost instantly. A 3.5K ZX BASIC game compiled into 3.2K of integer code in about eight seconds. As Hisoft BASIC compiles your program, gibberish appears on the screen, which is used as a temporary store to free other memory for code.
Messages indicate errors, routine addresses, and the length of interpreted and compiled code. You must press a key twice to clear messages and step between the phases of compilation. The compiler stops as soon as one error is found. Hisoft BASIC works so fast that this is not an irritation. You can fix the offending line at once — just press EDIT.
Compiled code can be started with *R, or a RAND USR command. *X clears out all the code, but it doesn’t tell the system, so a subsequent *R will usually crash the computer — beware!
*T is the most innovative feature — this controls a ‘trace’ which keeps track of variable values as a program is interpreted at about half-speed. When you stop the program you can see a list showing which variables were only used to hold integers, and the maximum length strings reached. If you test a program thoroughly while using *T you can use the list to tell Hisoft BASIC which optimisations to perform, and thus get the speed of a ‘subset’ compiler with no need to study the code yourself.
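The kind of book-keeping a trace has to do can be sketched quite simply. The Python below is a guess at the principle only, not a description of how Hisoft BASIC does it internally; the function `trace` and the sample assignments are invented for illustration.

```python
def trace(assignments):
    """Summarise a run: which numeric variables only ever held whole
    numbers, and the longest value each string variable reached."""
    integer_only = {}
    max_length = {}
    for name, value in assignments:
        if isinstance(value, str):
            max_length[name] = max(max_length.get(name, 0), len(value))
        else:
            whole = float(value).is_integer()
            integer_only[name] = integer_only.get(name, True) and whole
    return integer_only, max_length


if __name__ == "__main__":
    # Hypothetical record of every assignment made during a test run.
    run = [("i", 1), ("i", 2), ("x", 1.5), ("x", 2),
           ("a$", "HI"), ("a$", "HELLO")]
    print(trace(run))  # ({'i': True, 'x': False}, {'a$': 5})
```

A summary like this is only as good as the test run behind it. A variable that happened to stay whole during testing may still pick up a fraction later, which is why thorough testing matters before you trust the list.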
Hisoft BASIC is more restrictive than Mcoder III, but probably a little more compatible. Mcoder tries to ‘guess’ at optimisations, but Hisoft expect you to indicate them by adding directives in REM statements. These indicate signed and unsigned integer variables and maximum string sizes; the default is a rather wasteful 255 characters.
Arrays of one or two dimensions — but not more — are allowed. Dimensions must be fixed, not calculated as a program runs. VAL only works with numbers in quotes, and not with numbers in strings — an annoying limitation. CLEAR and RUN are banned, as are expressions in DATA, disk and Microdrive commands. The music command PLAY works fine on any 128, and is ignored by earlier computers.
Minor quirks include a limit of 450 targeted lines for GO TOs, no coercion in READ (so you must mark integer DATA statements), and no CLS when a program starts. PLOT, DRAW and CIRCLE assume the current colour attributes, for top speed, so you may have to add PAPER 8 or BRIGHT 8 statements in a few places. Division is always done with floating-point arithmetic — even if you’re using integer variables — unless you put an INT around the division.
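The point about division is worth a small illustration. This Python sketch only mimics the arithmetic; it is not compiler output. It relies on the fact that ZX BASIC’s INT always rounds down, which `math.floor` matches, and the function names are made up for the example.

```python
import math


def float_divide(a, b):
    """Plain division: the result may carry a fraction."""
    return a / b


def int_divide(a, b):
    """Division wrapped in INT: rounded down to a whole number."""
    return math.floor(a / b)


if __name__ == "__main__":
    print(float_divide(7, 2))  # 3.5
    print(int_divide(7, 2))    # 3
    print(int_divide(-7, 2))   # -4, because INT rounds down, not towards zero
```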
You can generally break into compiled code in the usual way while the compiler is loaded, but not thereafter. INPUT statements are BREAK-proof and crash-proof. The new BREAK scan stops ZX BASIC messages, so you can’t tell what line you were on, even when you BREAK normally into the interpreter.
Code is concise, as library routines are not included unless they are required. Floating-point programs tend to grow by about 20-30 per cent as a result of compilation, whereas programs that use integers exclusively usually shrink a little. A convoluted 15K ZX BASIC program was compiled into 20K of machine code in about 80 seconds.
There is 29K free for BASIC and compiled code on a standard Spectrum. You can compile code and DATA separately if there is not room for everything together. Hisoft BASIC can also work a bit like Mcoder III, overwriting BASIC with compiled code, if you give the word. Spectrum 128 users get the best deal. They can put the compiler entirely on RAM disk, allowing up to 40K of code in memory at once. Hisoft BASIC also soups up the 128’s BASIC editor.
We have printed two sets of benchmark ratios for Hisoft BASIC, showing the speed of floating-point and signed integer code. The table also shows the code tested and the benchmark ratios of the two fastest compilers of 1985 — ZIP 1.5, for integer-handling, and Mcoder III, tops for flops (FLOating Point OPerationS, dummy!). Each number is the ratio of the ZX BASIC time to the compiled time. The table shows that Hisoft have succeeded in one of their aims — to produce a top-speed subset compiler — but Mcoder III is evidently an unknown quantity in Canada. You should read TECH TIPS, Cameron!
ZIP, MCODER III and Hisoft BASIC: BENCHMARK RATIOS
| Code | Test | ZIP | HISOFT BASIC (Integer) | HISOFT BASIC (Floating point) | MCODER III |
|---|---|---|---|---|---|
| 4+ GO SUB | 5 | 219 | 265 | 3.32 | 3.7 |
The compilers have all been specially tuned to compile such arbitrary benchmarks, but it appears that Hisoft have neglected their floating-point routines, most of which just call the Spectrum ROM. In real programs Mcoder may have less of an edge, especially as it leaves DATA and functions un-compiled, but it’s still a serious competitor for Hisoft BASIC if you don’t want to bother with REM directives — especially as it’s half the price.
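The arithmetic behind the table is simple enough to check for yourself. In this Python sketch the four ratios are the ones printed in the table row above; the timings passed to `benchmark_ratio` are hypothetical, and both function names are invented for the example.

```python
def benchmark_ratio(basic_seconds, compiled_seconds):
    """Ratio of the ZX BASIC time to the compiled time, as used in the table."""
    return basic_seconds / compiled_seconds


def relative_speed(ratio_a, ratio_b):
    """How many times faster compiler A's code is than compiler B's."""
    return ratio_a / ratio_b


# Ratios from the GO SUB row of the table.
ROW = {
    "ZIP": 219,
    "Hisoft integer": 265,
    "Hisoft floating point": 3.32,
    "Mcoder III": 3.7,
}

if __name__ == "__main__":
    # Hypothetical timings: 60 seconds interpreted, half a second compiled.
    print(benchmark_ratio(60.0, 0.5))  # 120.0
    print(round(relative_speed(ROW["Hisoft integer"], ROW["ZIP"]), 2))  # 1.21
    print(round(relative_speed(ROW["Mcoder III"], ROW["Hisoft floating point"]), 2))  # 1.11
```

On this one test, then, Hisoft’s integer code is about a fifth faster than ZIP’s, while Mcoder III’s lead in floating-point is roughly a tenth.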
If you’ve got a 128, or lots of programs to compile, or you want to produce commercial software without paying royalties, Hisoft BASIC is the best compiler on the market. It is well-designed and Hisoft have a good reputation for supporting and developing their products. But the price is high, although you get a lot for your money. Hisoft BASIC is going to do well, but it won’t sweep the market.
Hands shaking from too much zapping? Feel like giving it a rest and compiling something? Instant gratification! You may already have a compiler lurking in your software collection. The first decent Spectrum BASIC compilers — WVS and Mcoder II — were effective, although limited. They were rather tatty in design, loading into a fixed 6K hole at the top of the Spectrum’s 48K address range. For some strange, undocumented reason all compiled code was tangled up with code in the compiler, so it was impossible to save a compiled program without saving the compiler as well! The budget game Nuke Lear from CCS contains a compiler, and so does Frank’n’Stein from PSS. There’s nothing, other than lack of documentation, to stop you playing with these ‘free’ compilers. I’ve no right to print the original instructions here, so — like me — you’ll have to experiment to find out what the compilers can do. This should get you started.
Nuke Lear is a simple, fast, politically unsound game on CCS’s Charlie Charlie Sugar budget label. To load the compiler type:
CLEAR 40000
LOAD "W" CODE
Martin Lewis’s WVS 2.2 compiler will load, from part-way through the Nuke Lear tape. Type in some simple BASIC and type LET X=USR 59900 to compile it. If all goes well the address of the compiled code will be printed, and you can run it with another USR call. The compiler is integer-only and quite restrictive. If a command is rejected at first, try adding some brackets: this compiler likes LOTS of brackets!
Frank’n’Stein, a platform game, was one of the classics of 1984, and has appeared on compilations since. Published by PSS, it was written with David Threlfall’s compiler, the original Mcoder. CRASH reader Stuart Green claims (Okay, DATEL?) to have dug out the compiler inside. This is what you type:
CLEAR 24750
LOAD "" SCREEN$
LOAD "" CODE
CLEAR 40000
LET T=USR 60000 starts the compiler. You set the address where compiled code is put with a CLEAR statement. Again only whole numbers are allowed, but you can use one-dimensional arrays and most string functions. Redundant brackets may again be useful.
These are old compilers, and not of the same standard as recent releases, but they’re still fun to play with if you’re interested in compilers and not yet convinced about them.