2018-02-21: Revisiting Forth

At work lately I've been feeling like I'm losing my skills as a programmer. I need something to work on, something that captures my interest. One project I enjoyed working on in the past was kforth, but I never got far with it. I picked up some tiny, cheap STM32 M0 boards and now I'm trying to implement a Forth on there.

One thing I liked at Echostar was that we had a lot of components that you could build with a flag like S=PC to build for your workstation or S=STB to build for a set top box. With that in mind, I'm trying to make this Forth modular so that it's easy to test on my laptop and then deploy to the board --- the biggest difference will be setting system limits (my Linux laptop can probably deal with a larger stack than a Cortex M0, for example) and the I/O subsystem --- on my laptop, I don't want to have to interact with a serial console, but that's how I'm going to have to do it on the microcontrollers.

Anyways, I've been thinking about the problem of how to build the backbone. My first thought was something like

struct entry {
        char    word[MAX_WORD_LEN];
        void    (*fun)(void);
        struct entry *next;
};

to represent an entry. The fun field is the function that's being pointed to; I made it take no arguments as it should be taking them from the stack; I suspect void is the wrong return type, though.

From here, how do we build up a lexicon? My (what I suspect to be naïve) approach is to build out a fixed size list of entries with some room to grow (e.g. in the case of :, the word definer).

The question arises: where does the stack come from? That is, the fun field is supposed to operate on a stack, but where is this stack? I started filling out a system.h file with a set of globals, which I suspect is an acceptable approach. So what are the globals we'll want?

  • a stack
  • a binding hash table of some sort, so we can store variables

I got this far, then my immediate thought was that the binding table shouldn't just be for variables: what if it was for functions, too?

Okay, so we create a linked list of bindings, and store the tail of this list; each entry should point to the previous entry. Why? Because then you can do things like this:

: adjustment 4 ;           -- bind 'adjustment' to the value 4
: magic-box adjustment * ; -- bind 'magic-box' to multiply by adjustment
: adjustment 2 ;           -- later we want to tune down our adjustment

This is supposed to be one of the features of Forth, so this approach lets us keep it. But before I can do that, how to represent a word? A word, after all, can be either a value (a number or a string) or a function. This is where I'm stopped at (the day job requires attention), but then the dictionary is straightforward:

struct vocabulary {
        struct entry *last;
        size_t        length;
};

The length here is mostly to help with memory management, to catch possible memory overflows before they happen.

I think this makes sense. I need to figure out what a word looks like, though. I do have ideas for functions, though:

constexpr uint8_t NFUN = 1; // native fun
constexpr uint8_t LFUN = 2; // lambda fun
// native_fun points to a native function.
struct native_fun {
        uint8_t type;
        (void *)(fun)(void);
};

// fun points to either another lambda_fun (e.g. an intepreted
// function) or to a native_fun; next points to the next function (or
// NULL if there's no next function).
struct lambda_fun {
        uint8_t type;
        void *fun;
        void *next;
};

It feels good to actually think about problems again. I've so missed this kind of work.


Tags: , ,