From Abstract Syntax Trees to Machine Code with LLVM
A compiler is a program that translates source code written in one language into another language, for instance from C to x86 machine code. Internally, the process is typically split into multiple stages, each handling one particular aspect of the translation. Roughly speaking, one can distinguish three main stages: parsing, semantic analysis, and code generation. The first stage transforms the input into a tree-shaped representation of the program, to which the second stage assigns a meaning. The last stage rephrases the program in the output language.
In this session, we will focus on this last stage and use LLVM to translate abstract syntax trees (ASTs) into executable code. LLVM is a compiler toolchain that handles code generation while remaining agnostic to the input language. It is built around its own intermediate representation, LLVM IR, which it then transforms into machine code for a particular architecture. Our goal will be to translate a program expressed as an AST into LLVM IR. This will challenge us to design and implement techniques to encode high-level features, such as object-orientation, into much lower-level constructs.
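To give a flavor of what translating an AST into LLVM IR involves, here is a minimal sketch in Swift. It is not Cocodol's actual code: the `Expr` type, the `Emitter` struct, and the `emitFunction` helper are hypothetical names invented for this illustration. The sketch also produces LLVM IR as plain text rather than going through LLVM's API, which is an assumption made to keep the example self-contained.

```swift
// Hypothetical toy AST for integer expressions (not Cocodol's actual AST).
indirect enum Expr {
  case literal(Int)
  case add(Expr, Expr)
  case mul(Expr, Expr)
}

/// Accumulates textual LLVM IR instructions while walking an expression tree.
struct Emitter {
  /// The instructions emitted so far, one per line.
  var lines: [String] = []
  /// Counter used to create fresh virtual register names.
  var nextID = 0

  /// Emits the instructions computing `expr` and returns the operand
  /// (a constant or a virtual register) that holds its value.
  mutating func emit(_ expr: Expr) -> String {
    switch expr {
    case .literal(let value):
      return String(value)
    case .add(let lhs, let rhs):
      return emitBinary("add", lhs, rhs)
    case .mul(let lhs, let rhs):
      return emitBinary("mul", lhs, rhs)
    }
  }

  /// Emits both operands first, then a single binary instruction on them.
  mutating func emitBinary(_ op: String, _ lhs: Expr, _ rhs: Expr) -> String {
    let l = emit(lhs)
    let r = emit(rhs)
    let register = "%t\(nextID)"
    nextID += 1
    lines.append("  \(register) = \(op) i64 \(l), \(r)")
    return register
  }
}

/// Wraps the code computing `body` in a function that returns its value.
func emitFunction(named name: String, returning body: Expr) -> String {
  var emitter = Emitter()
  let result = emitter.emit(body)
  let prologue = ["define i64 @\(name)() {", "entry:"]
  let epilogue = ["  ret i64 \(result)", "}"]
  return (prologue + emitter.lines + epilogue).joined(separator: "\n")
}

// The AST of the expression `(1 + 2) * 3`.
let program = Expr.mul(.add(.literal(1), .literal(2)), .literal(3))
print(emitFunction(named: "main", returning: program))
```

Each node of the tree becomes at most one instruction, and the tree structure is flattened into a sequence in which intermediate results are passed through virtual registers such as `%t0`. High-level features follow the same principle, only with more elaborate encodings.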
We’ll use Swift as the implementation language in this journey and work on the code of a simple compiler, called Cocodol. You may want to install Swift 5.3+ and LLVM 11.x on your system before the tutorial.