3.4 Iteration 3: Adding Control Flow
3.4.1 Milestone Summary
The goals of this milestone are to (1) practice exposing low-level features from the target language to the compiler, (2) introduce designing abstractions to tame low-level operations, (3) demonstrate the problems that control flow causes with program analysis, (4) introduce linking, and (5) practice thinking about programming at different levels of abstraction. We’ll expose labels and jmp instructions from x64, provide abstractions for using them in the source language, compile the latter abstraction to the former implementation, and practice reasoning about our new source language.
You can use the interrogator to get limited access to the reference solution: https://soft.vub.ac.be/compilers/interrogator?an=a4.
3.4.1.1 Learning Objectives
3.4.1.2 Checklist
Provided passes
Completely new passes
Major modifications
Minor modifications
assign-homes
assign-fvars
flatten-begins
3.4.2 Reading
The reading for this week is Structured Control Flow. As usual, this milestone description links to the documentation for each exercise in the chapter for convenience, but you are responsible for the reading the entire chapter.
3.4.3 Assignment
Here are some useful notes:
As part of the exercises you will extend the register allocator to support structured control flow. However, you can also test the other compilation passes independent of the register allocator extension by using the unoptimized assign-homes instead of the optimized one assign-homes-opt.
To compile halt in generate-x64, it may help if you generate a labeled instruction that essentially does nothing. This way, you can insert a label at the end of your program. Otherwise, you’ll need to think carefully about how to arrange and generate blocks.
The new flatten-program pass flattens all basic blocks into straight-line code. This replaces the previous flatten-begins pass.
In the implementation of the expose-basic-blocks pass you are essentially accumulating a set of blocks. However once created these do not change. Instead of passing the set of generated blocks around, you may simply use mutable state. See set! or better box, set-box!, and unbox. You may want to use fresh-label from cpsc411/compiler-lib.
You may find this implementation of alpha equivalence for Values-lang v4 helpful to test your implementation of uniquify: a4-alpha-equal.rkt. The code is annotated to demonstrate idiomatic Racket style, and explain features you may be unfamiliar with.
3.4.4 Validators and interpreters
In these exercises you extend the well-formedness checkers and the interpreters for the source and target language. Generic definitions are given for these by default which you have to replace with your own implementations. You should however prioritize the exercises above, which do not have default implementation and are therefore more important for obtaining a fully functioning compiler pipeline.
Design and implement link-paren-x64, which resolves all labels into the address of the instruction in the instruction sequence.
Extend the implementation of interp-paren-x64 to support control-flow primitives.
It should use link-paren-x64 to resolve all labels, and should use a program counter to loop over the instruction sequence instead of folding over the sequence. It will help to struture the main loop to match the interface of something like interp-loop, although you are not required to define it this way.
For this exercise, it is sufficient approximate the real EFLAGS-based semantics, i.e. you can assume that you only work with signed integers, that a conditional jump always appears after a compare instruction, and these two instructions are the only control-flow relevant ones. Furthermore, instead of calculating the concrete sign, overflow, etc. flags, your compare instruction can store alternative information like an ordering (less, equal or greater) or even the compared numbers themselves.
Redesign and extend the function check-paren-x64, a validator for Paren-x64 v4.
This only needs to check the correct syntax of the language, since we cannot in general decide if a program is using uninitialized values or not. You should check that all instructions are indeed valid x64 instructions and that every used label is also declared somewhere in the program, and that labels cannot be declared twice.
Extend the function interp-values-lang, an interpreter for Values-lang v4.
Extend the function check-values-lang, a validator for Values-lang v4.