Let's start with the basics: what are the characteristics of a Forth? First, it's a stack-based language, so it'll need a stack. Actually, it'll need at least two stacks --- the data stack and the return stack (where return addresses are normally stored). Modern Forths also have a floating point stack.
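As a rough sketch of what those two stacks could look like in C++14: a fixed-capacity stack backed by `std::array` never touches the heap. The names `Cell`, `Stack`, and `Machine` and the 64-cell depth are placeholders made up for illustration, not anything the standard prescribes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: one cell is wide enough to hold an address on the target.
using Cell = std::intptr_t;

// Hypothetical fixed-capacity stack; storage lives inside the object itself.
template <typename T, std::size_t N>
class Stack {
public:
    // Pushes a value; returns false on overflow instead of throwing.
    bool push(T value) {
        if (depth_ == N) return false;
        cells_[depth_++] = value;
        return true;
    }

    // Pops the top value into out; returns false on underflow.
    bool pop(T &out) {
        if (depth_ == 0) return false;
        out = cells_[--depth_];
        return true;
    }

    // Peeks at the top value; the precondition is only checked in debug builds.
    T top() const {
        assert(depth_ > 0);
        return cells_[depth_ - 1];
    }

    std::size_t depth() const { return depth_; }

private:
    std::array<T, N> cells_{};
    std::size_t depth_ = 0;
};

// The two stacks a Forth needs; the sizes are arbitrary placeholders.
struct Machine {
    Stack<Cell, 64> data;     // operands and results
    Stack<Cell, 64> returns;  // return addresses
};
```

Returning `bool` rather than throwing is a design guess on my part: it keeps the sketch usable on small targets where exceptions may be disabled.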
Forth calls its functions "words", and the FORTH-83 standard defines a required word set that every implementation must provide. There is also a later ANS Forth standard, but I'll target FORTH-83 first for simplicity. The required words are split across four layers:
Nucleus layer:
! * */ */MOD + +! - / /MOD 0< 0= 0> 1+ 1- 2+ 2- 2/ < = > >R ?DUP @ ABS AND C! C@ CMOVE CMOVE> COUNT D+ D< DEPTH DNEGATE DROP DUP EXECUTE EXIT FILL I J MAX MIN MOD NEGATE NOT OR OVER PICK R> R@ ROLL ROT SWAP U< UM* UM/MOD XOR
Device layer:
BLOCK BUFFER CR EMIT EXPECT FLUSH KEY SAVE-BUFFERS SPACE SPACES TYPE UPDATE
Interpreter layer:
# #> #S #TIB ' ( -TRAILING . .( <# >BODY >IN ABORT BASE BLK CONVERT DECIMAL DEFINITIONS FIND FORGET FORTH FORTH-83 HERE HOLD LOAD PAD QUIT SIGN SPAN TIB U. WORD
Compiler layer:
+LOOP , ." : ; ABORT" ALLOT BEGIN COMPILE CONSTANT CREATE DO DOES> ELSE IF IMMEDIATE LEAVE LITERAL LOOP REPEAT STATE THEN UNTIL VARIABLE VOCABULARY WHILE [ ['] [COMPILE] ]
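To make the idea of words concrete, here is a minimal sketch of how a handful of the words above (DUP, DROP, SWAP, and +) might be wired up as plain functions over a data stack, with a small table to look them up by name. Everything here is hypothetical: `DataStack`, `WordFn`, `Entry`, `find_word`, and `execute` are names invented for illustration, and a real dictionary would also need to hold user-defined words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Cell = std::intptr_t;  // assumption: cell width matches a pointer

// Minimal fixed-size data stack; bounds are only checked by assert.
struct DataStack {
    Cell cells[32];
    std::size_t depth = 0;
    void push(Cell v) { assert(depth < 32); cells[depth++] = v; }
    Cell pop() { assert(depth > 0); return cells[--depth]; }
};

// Every built-in word is a function that manipulates the data stack.
using WordFn = void (*)(DataStack &);

// DUP ( n -- n n )
void word_dup(DataStack &s) { Cell a = s.pop(); s.push(a); s.push(a); }
// DROP ( n -- )
void word_drop(DataStack &s) { s.pop(); }
// SWAP ( a b -- b a )
void word_swap(DataStack &s) { Cell b = s.pop(); Cell a = s.pop(); s.push(b); s.push(a); }
// + ( a b -- sum )
void word_plus(DataStack &s) { Cell b = s.pop(); Cell a = s.pop(); s.push(a + b); }

struct Entry {
    const char *name;
    WordFn fn;
};

// A static table standing in for the dictionary.
const Entry kDictionary[] = {
    {"DUP", word_dup},
    {"DROP", word_drop},
    {"SWAP", word_swap},
    {"+", word_plus},
};

// Linear search by name; returns nullptr when the word is unknown.
WordFn find_word(const char *name) {
    for (const Entry &e : kDictionary) {
        if (std::strcmp(e.name, name) == 0) return e.fn;
    }
    return nullptr;
}

// Runs a word by name; returns false when it isn't in the dictionary.
bool execute(DataStack &s, const char *name) {
    WordFn fn = find_word(name);
    if (fn == nullptr) return false;
    fn(s);
    return true;
}
```

With this in place, pushing 2 and 3 and then calling `execute(s, "+")` leaves 5 on top of the stack, which is the same thing typing `2 3 +` would do at a Forth prompt.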
In a lot of cases, Forth is also the operating system for the device. This won't be a target at first, but it's something to keep in mind as I progress.
Eventually, I'd like to build a zero-allocation Forth that can run on an STM32 or an MSP430, but the first goal is to get a minimal Forth working. I'll define the stages tentatively as:
- Runs on Linux (that's what my Pixelbook runs, more or less).
- Implements the nucleus layer.
- Has a REPL that works in a terminal.
- Explicit non-goal: performance. I'll build a working minimal Forth to get a baseline experience.
- Implement the compiler and interpreter layers.
- Define a block layer interface.
- Implement a Linux block layer interface.
- Build a memory management system.
- Replace all managed memory with the homebrew memory management system.
- Switch to a JPL rule #3 (no heap allocation) implementation.
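The last few stages hinge on never touching the heap. One shape that could take is a fixed-size bump arena, loosely in the spirit of how HERE and ALLOT behave. This is only a sketch under my own assumptions: `Arena` and its member names are hypothetical, the capacity is a template parameter chosen by the caller, and alignment and freeing are ignored entirely.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size arena: all storage is a member array, so an
// instance with static storage duration needs no heap at all.
template <std::size_t N>
class Arena {
public:
    // Address of the next free byte (similar in spirit to HERE).
    std::uint8_t *here() { return buffer_ + used_; }

    // Reserves n bytes (similar in spirit to ALLOT).
    // Returns the start of the reserved region, or nullptr when full.
    std::uint8_t *allot(std::size_t n) {
        if (n > N - used_) return nullptr;
        std::uint8_t *start = buffer_ + used_;
        used_ += n;
        return start;
    }

    // Bytes still available for future reservations.
    std::size_t available() const { return N - used_; }

private:
    std::uint8_t buffer_[N] = {};
    std::size_t used_ = 0;
};
```

For example, an `Arena<16>` hands out a 10-byte region, then refuses a 7-byte request because only 6 bytes remain.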
I've decided to use C++ for two reasons: it's supported by all the targets I want (amd64, arm/arm64, msp430, avr), and I know it well enough (and importantly, I know the tooling) to get by. Typically, the TI compilers lag behind the others in supporting newer C++ standards, so those will be the limiting factor. Fortunately, just a few days before I started this, the TI wiki was updated to note that the latest compilers now support C++11 and C++14, so I'll target C++14.
As a reminder to myself: this is not going to be the prettiest, best, most secure, or most production-ready code. The goal is to have fun writing some software again and to rekindle some of the joy of computing that I had before. Once I have something working, I can go back and make an exercise of cleaning it up and refactoring it. The prose in this series is also not going to be my finest writing ever; again, it suffices just to do it. The goal is to have something to show, not to achieve perfection; it's mostly going to be hacked on while I'm on the bus or when I have a bit of downtime here and there.
I don't really know what I'm doing, so in the next section, I'll build out the basic framework and set up the build.