How Kal Compiles Itself
Kal is written in itself and is self-compiling. I've had a couple requests on Hacker News and Reddit to explain how this is accomplished and how the current npm packages came to exist.
This post should show you how to compile Kal, give a quick intro to how the compiler works, and a brief history of how we got here.
About Kal and Its Parser
class Statement inherits from ASTBase method parse() me.statement = me.req BlankStatement, TryCatch, ClassDefinition, ReturnStatement, WaitForStatement, IfStatement, WhileStatement, ForStatement, ThrowStatement, SuperStatement, DeclarationStatement, AssignmentStatement, ExpressionStatement class WhileStatement inherits from ASTBase method parse() me.req_val 'while' me.lock() me.expr = me.req Expression me.block = me.req Block
WhileStatment::parse is called, the first thing that happens is that the parser requires the value
"while". If the current token isn't
"while", parsing the while statement will fail. If
req_val fails, it throws an error and the parent (
Statement) will continue on and try the next option in its list (
lock indicates that we are now sure this is a
while statement. If any
req_val calls fail after this, it will throw syntax error and abort compilation.
Next, we require an
"while" (which has its own syntax declaration), then we require a code block.
How it Compiles (Now)
If you clone the repository, you'll find you get a
source directory that contains several
.kal files. If you dig deeper, you'll notice there's no
.js files anywhere. So it compiles itself, but it needs to be already compiled to do so.
In order to make this work, you need to complete the following steps:
sudo npm install -g kal.
- Compile the repository using the Kal "binary" into the
compiled/directory. This creates the "stage 1" product.
- Use the "stage 1" compiled version in
compiled/to recompile itself. This is called the "stage 2" product.
- Run the tests with the stage 2 product to make sure nothing broke.
Steps 2-4 can all be accomplished with
npm run-script bootstrap.
So where did the binary come from? To create the 0.4.5 binary, I ran these steps with the previous version's binary (0.4.4). But where did that come from?
How It Got Here
So how'd I compile the first version? Is it turtles all the way down?
At the bottom of Kal, deep in the history, still present in the code (though in memory only), is CoffeeScript.
anarbi posted a great link about how compilers can contain a memory of things that are no longer in the code.
Kal was originally written in CoffeeScript up until about version 0.2.0. If you look here, you can see an old version of the compiler written as a set of
.coffee files. I chose CoffeeScript because:
- I was familiar with it
- It's well supported and full-featured
- It's syntactically similar to Kal
- It runs on node.js which made cross-compiling easy
I use CoffeeScript extensively in my own professional work and I love it. Kal really isn't ready for prime time yet, so right now I just use it for personal projects. I point this out because CoffeeScript has its own self-build system which is very well done and it inspired a lot of this work.
My goal with the CoffeeScript version was to get it to a point where enough tests were passing and enough features were supported that I could port things over to Kal. I wanted a bare minimum feature set.
One-by-one, I hand-translated the
.coffee source files to
.kal, running tests and fixing bugs along the way. Once the Kal source was mature enough (and bug-free enough), I started switching the
require links to point to the
.kal versions. Here is the first commit where I've switched a reference. At this point, the compiler was written partly in Kal and partly in CoffeeScript. Luckily node.js allows these to play nicely together.
Once I had all the files translated over and all the tests passing, I removed the CoffeeScript source here.
Since then, I've used the previous version of the Kal compiler to bootstrap forward when making a new release. Extensive use of tests helps enormously. If you wanted to rebuild Kal without a compiled version, you'd need to go back to the original CoffeeScript source and compile up step-by-step.
GCC uses a similar bootstrap method to compile itself. According to the GCC install page, the bootstrapping process is as follows:
- Build tools necessary to build the compiler.
- Perform a 3-stage bootstrap of the compiler. This includes building three times the target tools for use by the compiler such as binutils (bfd, binutils, gas, gprof, ld, and opcodes) if they have been individually linked or moved into the top level GCC source tree before configuring.
- Perform a comparison test of the stage2 and stage3 compilers.
- Build runtime libraries using the stage3 compiler from the previous step.
Kal does not currently bother do a third stage compile/compare. This would entail using the stage 2 output of
npm run-script bootstrap to build itself again, then doing a diff. The goal is to ensure the compiler is stable and not changing its own function. I will probably add this step in the future, though for now I am content with test cases as assurance that nothing is broken.