Introducing a simple experimental Lisp compiler, written in uLisp, that will compile a Lisp function into ARM machine code. My aim was to make the compiler simple enough so that its listing will fit on a single sheet of A4 paper.
For a set of example programs, and a commented listing of the compiler, see the full article: A Lisp compiler for ARM written in Lisp.
Introduction
When I added the facility of executing machine code to uLisp I had in mind the eventual goal of being able to compile uLisp functions into machine code, and this is a first step in that direction.
The nice thing about compiling Lisp is that you don’t have to write a tokeniser or parser, because Lisp programs are already in a consistent structure that can be processed by another Lisp program.
The compiler program is written in the subset of Common Lisp supported by uLisp, and will run on an ARM board with at least 5000 objects of workspace. I used a Seeed Studio XIAO nRF52840. You can also run it using Common Lisp on a laptop or desktop computer, and display the code it generates, but of course you won’t be able to run the ARM machine code because Common Lisp doesn’t have uLisp’s defcode command.
Although I wrote this for ARM boards it should be relatively straightforward to convert it to generate AVR or RISC-V machine code.
Using the compiler
To use this compiler you simply call compile on a Lisp function; for example:
(compile 'fibonacci)
The function will be compiled into a machine code function, replacing the original Lisp code, so that calling fibonacci will now execute the ARM machine-code version.
You can also display the code generated for an expression by calling comp on the expression; for example:
> (pprint (comp '(* 13 17)))
(:integer
($mov 'r0 13)
($push '(r0))
($mov 'r0 17)
($push '(r0))
($pop '(r0 r1))
($mul 'r1 'r0)
($push '(r1)))
Specification
The compiler understands the following Lisp commands:
Defining a function: defun
Symbols: nil, t
Arithmetic functions: +, -, *, logand, logior, logxor
Arithmetic comparisons: =, <, <=, >, >=, /=
Conditionals: if, and, or
Evaluation: progn, setq
Limitations
This first version of the compiler has a few limitations, including:
- It can only compile functions of up to four integer arguments, and returning an integer result.
- It assumes that all arithmetic and logical functions take exactly two arguments.
- It assumes that the function you are compiling is small enough so that branch instructions can be used for jumps within the function.
- There is relatively little error checking.
- It generates inefficient code compared to what you could write by hand.
Some of these limitations would be simple to remedy; some of them are more fundamental.
To use the compiler you first need to load the ARM assembler, written in Lisp, from: ARM assembler in uLisp.
For more information about the assembler see ARM assembler overview.
How the compiler works
Register usage
To avoid needing to keep track of register usage the compiler makes use of the stack to pass values to an expression, and store the value returned by an expression.
The following table shows how the ARM registers are used within the compiler:
Registers | Use |
---|---|
r0 r1 r2 r3 | Used to pass the parameters to the main function’s arguments. |
r0 | Contains the value returned by the main function. |
r4 r5 r6 r7 | Contain copies of the function arguments within the function. |
r0 r1 | Used to pass the arguments to each operator. |
r1 | Used to return the value from each operator. |
r2 | Used to return the true/nil value from comparisons. |
Compiling an expression
The following steps show the sequence of compiling an expression, such as:
(* x 13)
- Code is generated to evaluate each of the arguments, in this case x and 13, and each result is pushed onto the stack.
- The two values are then popped from the stack into registers r0 and r1.
- The function, in this case *, is then evaluated for r1 and r0, with the result in r1.
- This result in r1 is then pushed onto the stack.
This stack-based approach ensures that a more complex expression, such as:
(* (- x 1) (+ x 13))
will also compile into correct code, without conflicts between registers.
Calling the function recursively
The compiler supports calling a function recursively from within the function itself. Because the registers corresponding to the parameters and local variables would be overwritten by the recursive call they are stored on the stack around the function call.
Types
For boolean operations I decided to represent nil as zero, and t as 1. A problem I hadn’t anticipated was that I would need to keep track of what type of object each function returned, integer or boolean. For example, consider the problem of compiling the statement:
(and x y)
If x has the value 0 and y has the value 7 this should return 7. However, if x has the value nil and y has the value 7 this should return nil. Representing nil as zero leads to an ambiguity.
I solved this by returning a type, :integer or :boolean, with each compiled expression, according to the following rules:
- Predicates, and t or nil, always return a :boolean.
- Arithmetic operations always return an :integer.
- A if form requires a :boolean test form and returns an :integer.
- A progn or let block returns the type of its last expression.
Examples
Here is one of the example programs I used to test the compiler:
Takeuchi function
This is the standard highly-recursive benchmark I use for comparing versions of Lisp; see Benchmarks - Takeuchi function:
(defun tak (x y z)
(if (>= y x)
z
(tak
(tak (- x 1) y z)
(tak (- y 1) z x)
(tak (- z 1) x y))))
Lisp version:
> (time (tak 18 12 6))
7
Time: 8.9 s
Compiled version
> (time (tak 18 12 6))
7
Time: 68 ms
For further example programs, and a commented listing of the compiler, see the full article: A Lisp compiler for ARM written in Lisp.