Skip to the content.

Bootstrapping Forth with Quarter

In the previous article we explored the self-hosting power of Forth. We saw how support for if/then control structures, () parenthesized comments and .s"" string literals can be implemented within Forth itself.

We continue below by seeing how the Forth text interpreter can be self-hosted, and why this is a good idea. Finally we ponder the bootstrapping issue into which we run, and consider how this can be resolved.

Writing a Forth interpreter in Forth

The Quarter Forth implementation of the Forth text interpreter is split into separate definitions for the interpreter ([) and the colon-compiler (:). This is slightly non-standard, and isn’t essential, but avoids having a global state variable. Here is the implementation of the interpreter:

: [
word dup s" ]" s= if drop exit then
dup find dup if swap drop execute recurse then
drop number? if ( number is on stack ) recurse then
." ** Interpreter: '" type ." ' ?" cr recurse
; immediate

This in written in my preferred loop-less style; using recurse for tail-recursive iteration to the current definition. The main code path is the word-find-execute loop. There is a special case for reaching the ] word, which is handled as a terminating marker, rather that being its own word. Another special case handles numbers, which is tried after a failed dictionary lookup. Then finally we have error handling for when a word is neither in the dictionary or parse-able as a number.

The interpreter builds upon lower-level Forth definitions such as word, find, s= and number?word reads a blank-delimited word from the input stream. find looks in the dictionary for the execution token associated with a named word. s= is string equality. number? attempts to parse a string as a numeric literal. These definition will in turn be defined using the really basic primitives, such as key, to read individual characters from the input stream, and emit, to write individual characters to the output stream.

Having the kernel provide implementations for basic operations such as key and emit seems entirely reasonable. These implementations will likely be tied to specifics of the host architecture (say x86 BIOS calls). But using Forth for the next layer of definitions: word, find, s= and then eventually [ and : seems a nice approach. Forth is way easier to code and understand than Asm. (Not easy, but easier!) But more importantly, the Forth code wont have to be rewritten if we re-target out system to a new architecture. Win-Win.

But here we stumble on a problem: how can we process these Forth definitions without already having an interpreter/compiler available?

The bootstrap problem

Is there any way to bootstrap a Forth system (by which we mean a working interpreter/compiler) without already having a working interpreter/compiler available? Happily, the answer to this question is YES. The details of exactly how are explored by Quarter Forth.

A first step towards solving this bootstrapping problem is figuring out how to deal with numerics. The Forth interpreter/compiler must recognize numeric literals, and convert from their string representation in a given base. In Quarter Forth this is implemented by the word number?. The implementation of number? can be written in Forth, and although it requires support for some primitive arithmetic operations, it crucially does not require support for numeric literals; except for a few specific numbers like 10 and 16. We can arrange for the kernel to provide primitives for any number we require, but it is sufficient for the kernel to provide just 1 and 0. We can build any other number we need in Forth.

1 1 +
constant 2

2 1 + dup * 1 +
constant 10

We might also conceive how to have the kernel provide us with just an interpreter, and then build the colon-compiler in Forth. The compiler is after all just a convenient way to construct the executable code for a new dictionary definition. With the right primitives, i.e. , (comma), compile, and ret,, we can build dictionary entries within interpreter mode. However, even with numerics and the colon-compiler solved, how can we avoid having to code in Asm, an interpreter which implements the full logic of word and find?

Quarter

The solution to the bootstrap problem taken by Quarter Forth is Quarter.

Quarter is a spartan version of Forth. Computation is directed using individual characters, rather than blank-delimited words. Instead of word and find, we have primitives key and dispatch:

By accessing the input stream at the level of individual characters, and avoiding the use of the standard Forth linked dictionary structure, much of the complexity required from the kernel is eliminated. The kernel need only provide a simple key-dispatch-execute-loop instead of the standard interpreter (which could be described as a word-fetch-execute-loop).

The code for the key-dispatch-execute-loop looks as follows in my x86 implementation. Even with basic error support for a failed dispatch, it is rather small. (Aside: The underscore prefix is just a convention to indicate an Asm routine which conforms to Forth style, using the parameter stack, not registers, to transmit arguments.)

.loop:
    call _key
    call _dispatch
    call _dup
    call _if
    jz .ignore
    call _execute
    jmp .loop
.ignore:
    call _drop
    jmp .loop

Next time

In the next article we dive deeper into Quarter and Quarter Code.