On this page:
1.6.1 Preface:   What’s wrong with our language?
1.6.2 Designing a source language with structured control flow
1.6.3 Exposing Control-Flow Primitives
generate-x64
1.6.3.1 The semantics of labels and jumps as linking
link-paren-x64
interp-paren-x64
interp-loop
1.6.3.2 Finding the next abstraction boundary
implement-fvars
patch-instructions
1.6.4 New Abstractions:   Blocks and Predicates
flatten-program
1.6.4.1 Designing A Language for Optimization
resolve-predicates
1.6.4.2 Abstracting Away Jumps
expose-basic-blocks
optimize-predicates
1.6.5 Register Allocation
assign-homes-opt
1.6.5.1 Uncovering Locals
uncover-locals
1.6.5.2 Undead Analysis
undead-analysis
1.6.5.3 Conflict Analysis
conflict-analysis
1.6.5.4 Assign Registers
assign-registers
1.6.5.5 Replace Locations
replace-locations
1.6.6 Imperative Abstractions
select-instructions
normalize-bind
sequentialize-let
uniquify
1.6.7 Polishing Version 4
interp-values-lang
check-values-lang
1.6.8 Appendix:   Overview

1.6 Structured Control Flow

1.6.1 Preface: What’s wrong with our language?

For our last abstraction, we designed the language Values-lang v3. This language is an improvement over x64, but has a significant limitation: we can only express simple, straight-line arithmetic computations. We’ll never be able to write any interesting programs!

In this chapter, we will expose a machine feature, control flow instructions in the form of labels and jmp instructions, and systematically abstract these into a structured control-flow primitive: if expressions. Control flow is complex, and adding it requires changes to nearly every pass and every intermediate language.

The overview of this version of the compiler is given in Figure 4.

1.6.2 Designing a source language with structured control flow

As usual, we’ll start with our goal and then design the compiler bottom-up. We want to extend Values-lang v3 with a structured control-flow feature: a form of if expression.

Our goal is Values-lang v4, duplicated below.

  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (let ([x value] ...) pred)
  | (if pred pred pred)
     
  tail ::= value
  | (let ([x value] ...) tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (let ([x value] ...) value)
  | (if pred value value)
     
  triv ::= int64
  | x
     
  x ::= name?
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (let ([x value] ...) pred)
  | (if pred pred pred)
     
  tail ::= value
  | (let ([x value] ...) tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (let ([x value] ...) value)
  | (if pred value value)
     
  triv ::= int64
  | x
     
  x ::= name?
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
  p ::= (module tail)
     
  tail ::= value
  | (let ([x value] ...) tail)
     
  value ::= triv
  | (binop triv triv)
  | (let ([x value] ...) value)
     
  triv ::= int64
  | x
     
  x ::= name?
     
  binop ::= *
  | +
     
  int64 ::= int64?

Note that an if expression, (if pred e e), is limited: it cannot branch on an arbitrary expression. It is restricted so that a comparison between two values must appear in the predicate position. This is a necessary artifact we’ll inherit from x64. Even by the end of this chapter, we will not yet have enough support from the low-level language to add values that represent the result of a comparison, i.e., we won’t have booleans.

Design digression:
This is a design choice. With the abstraction we have so far, we could add pseudo-boolean expressions by giving an arbitrary interpretation to any values in the predicate position. For example, we could interpret 0 as true and 1 as false, or vice versa. Unfortunately, this would expose our source language to undefined behaviour—what happens when an if expression branches on any other value? We could remedy this slightly by making any non-zero integer true, for example. However, it also means we become unable to distinguish booleans from integers, leading to type confusion when reading and writing data. If a programmer sees the number -5 printed or in a program, is it a boolean or a number?

There are many ways to solve the problem, but to keeping with our design principles of designing clear abstraction boundaries, we will later add proper booleans as a separate primitive data type. That is a task separate from adding control flow, so we deal with it later, and instead implement a limited form of structured control-flow first that booleans can compile to later.

1.6.3 Exposing Control-Flow Primitives

When we want to add a new feature to the source language, we must always ask if there’s some existing abstraction in the target language that we can use. So far, Paren-x64 v2 exposes only intructions to move data between locations and perform simple arithmetic computation on locations. This is insufficiently expressive, so we must reach even lower and expose new primitives from the machine.

Thankfully, the machine does expose a primitive. x64 exposes labels, written l: before some instruction, and jumps, written jmp trg where trg is either a label or a register containing a label. Labels can be chained in x64, as we’ve seen in prior assignments. For example, the following assigns two labels to the same instruction:

L1:

L2:

  mov rax, 42

To use labels, we have a variety of jmp instructions, which take a label and cause control to move to that label, instead of the next instruction. In this example, rax is set to 5 at the end of the program, as the instruction that sets rax to 42 is not executed due to the jmp instruction.

  mov rax, 5

  jmp L1

  mov rax, 42

L1:

We’ll begin the next version of our compiler by designing a new Paren-x64 v4 to expose the additional features of x64 necessary to implement control flow abstractions. This extends the previous Paren-x64 v2 with comparison operations, labels, and conditional and unconditional jump operations.

  p ::= (begin s ...)
     
  s ::= (set! addr int32)
  | (set! addr trg reg)
  | (set! reg loc)
  | (set! reg triv)
  | (set! reg_1 (binop reg_1 int32))
  | (set! reg_1 (binop reg_1 loc))
  | (with-label label s)
  | (jump trg)
  | (compare reg opand)
  | (jump-if relop label)
     
  trg ::= reg
  | label
     
  triv ::= trg
  | reg
  | int64
     
  opand ::= int64
  | reg
     
  loc ::= reg
  | addr
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  int32 ::= int32?
     
  dispoffset ::= dispoffset?
     
  label ::= label?
  p ::= (begin s ...)
     
  s ::= (set! addr int32)
  | (set! addr trg)
  | (set! reg loc)
  | (set! reg triv)
  | (set! reg_1 (binop reg_1 int32))
  | (set! reg_1 (binop reg_1 loc))
  | (with-label label s)
  | (jump trg)
  | (compare reg opand)
  | (jump-if relop label)
     
  trg ::= reg
  | label
     
  triv ::= trg
  | int64
     
  opand ::= int64
  | reg
     
  loc ::= reg
  | addr
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r10
  | r11
  | r12
  | r13
  | r14
  | r15
     
  addr ::= (fbp - dispoffset)
     
  fbp ::= frame-base-pointer-register?
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  int32 ::= int32?
     
  dispoffset ::= dispoffset?
     
  label ::= label?
  p ::= (begin s ...)
     
  s ::= (set! addr int32)
  | (set! addr reg)
  | (set! reg loc)
  | (set! reg triv)
  | (set! reg_1 (binop reg_1 int32))
  | (set! reg_1 (binop reg_1 loc))
     
  triv ::= reg
  | int64
     
  loc ::= reg
  | addr
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r10
  | r11
  | r12
  | r13
  | r14
  | r15
     
  addr ::= (fbp - dispoffset)
     
  fbp ::= frame-base-pointer-register?
     
  binop ::= *
  | +
     
  int64 ::= int64?
     
  int32 ::= int32?
     
  dispoffset ::= dispoffset?

Labels are too complex to define by grammar; instead, they’re defined by the label? predicate in cpsc411/compiler-lib.

NASM and x64 impose some additional restriction on labels, some of which we’ve discussed already.
  • Recall that a mov instruction to an addr can only move 32-bit integers literals. While on a 64-bit machine, labels could be 64-bits, the SYS V ABI also specifies a small code model. In this model, all labels are 32-bits, so can be moved into memory directly.

  • The only valid characters in labels are letters, numbers, _, $, #, @, ~, ., and ?. The only characters which may be used as the first character of an identifier are letters, . (which has a special meaning and shouldn’t be used), _ and ?.

  • Labels cannot be longer than 4095 characters.

The library function sanitize-label turns a label? into a string that NASM will accept, as long as it is not too long.

In Paren-x64 v4, we model labels with the (with-label label s) instruction, which defines a label label at the instruction s in the instruction sequence. This corresponds to the x64 string label:\n s. Note that they can be nested, allowing the same behaviour as chaining labels in x64. For convenience, we assume all labels are symbols of the form L.<name>.<number>, and are globally unique.

Note that (with-label label s) does not behave like a define in Racket or in Values-lang v3. The instruction s gets executed after the previous instruction in the sequence, even if the previous instruction was not a jump. with-label additionally names the instruction so that we can jump to it later, the same as a label in x64.

The new comparison instruction (compare reg opand) corresponds to the x64 instruction cmp reg, opand. This instruction compares reg to opand and sets some flags in the machine describing their relation, such as whether reg is less than opand, or whether they are equal. The flags are used by the next conditional jump instruction. If the opand is an integer literal, it must restricted to 32-bits.

The conditional jump instructions in x64, in the same order as the definition of relop, are: jl label, jle label, je label, jge label, jg label, and jne label and each corresponds to "jump to trg if the comparison flag is set to ___". For example, the instruction je label jumps to label if the comparison flag "equal" is set.

In Paren-x64 v4, we abstract the various conditional jump instructions into a single instruction with multiple flags. The instruction je l corresponds to (jump-if = l). jl l jumps to l if comparison flag "less than" is set, and corresponds to (jump-if < l). The rest of the instructions follow this pattern.

We make an additional simplifying restriction in Paren-x64 v4 compared to x64. We assume that a compare instruction is always followed immediately by a jump-if, and similarly, that any jump-if is immediately preceeded by a compare. This is necessay since other instructions in x64, such as binary operations, can affect comparison flags. However, we do not want to try to reason about how the flags are affected by arbitrary comparison, and our compiler will always generate a compare-and-then-conditional-jump sequence of instructions.

To implement Paren-x64 v4, we define the procedure generate-x64, which simply converts each instruction to its x64 string form.

procedure

(generate-x64 p)  (and/c string? x64-instructions?)

  p : paren-x64-v4?
Compile the Paren-x64 v4 program into a valid sequence of x64 instructions, represented as a string.

1.6.3.1 The semantics of labels and jumps as linking

Labels and jumps are a small change to the language syntactically, but have a large effect on the semantics. We can see this by writing an interpreter for the language with labels and jumps and comparing it to an interpreter for a language without them.

We can no longer write the Paren-x64 v4 interpreter in one simple loop over the instructions. Instead, we need some way to resolve labels. That way, when running the interpreter, we must be able to jump to any expression at any time—a possibility the language now allows. This process of resolving labels is called linking.

We only cover linking local labels in this chapter. More generally, linking requires resolving references to external labels, i.e.,, linking imported libraries, and in the context of many operating systems, preparing some additional metadata so the operating system knows where to begin executing. The more general instance is conceptually similar to what we present here—the key concept is transforming the abstraction of labels to code and data into some encoding used by the underlying implementation.

We can view the process of linking as yet another compiler, and thus a language design problem. We design a simple linker for Paren-x64 v4 to give you a rough idea of how the operating system’s linker works. We use a low-level linking implementation that is similar to the operating system’s linker: we first resolve all labels to their address in memory (in our case, their index in the instruction sequence) and then implement jumps by simply setting a program counter to the instruction’s address.

To do this, we design a new language Paren-x64-rt v4 (paren-x64-rt-v4), which represents the run-time language used by the interpreter after linking.

  p ::= (begin s ...)
     
  s ::= (set! addr int32)
  | (set! addr trg)
  | (set! reg loc)
  | (set! reg triv)
  | (set! reg_1 (binop reg_1 int32))
  | (set! reg_1 (binop reg_1 loc))
  | (with-label label s)
  | (jump trg)
  | (compare reg opand)
  | (jump-if relop pc-addr label)
     
  trg ::= reg
  | pc-addr
  | label
     
  int64 ::= int64?
     
  int32 ::= int32?
     
  pc-addr ::= natural-number/c
     
  dispoffset ::= dispoffset?
     
  label ::= label?
  p ::= (begin s ...)
     
  s ::= (set! addr int32)
  | (set! addr trg)
  | (set! reg loc)
  | (set! reg triv)
  | (set! reg_1 (binop reg_1 int32))
  | (set! reg_1 (binop reg_1 loc))
  | (jump trg)
  | (compare reg opand)
  | (jump-if relop pc-addr)
     
  trg ::= reg
  | pc-addr
     
  triv ::= trg
  | int64
     
  opand ::= int64
  | reg
     
  loc ::= reg
  | addr
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r10
  | r11
  | r12
  | r13
  | r14
  | r15
     
  addr ::= (fbp - dispoffset)
     
  fbp ::= frame-base-pointer-register?
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  int32 ::= int32?
     
  pc-addr ::= natural-number/c
     
  dispoffset ::= dispoffset?
  p ::= (begin s ...)
     
  s ::= (set! addr int32)
  | (set! addr trg)
  | (set! reg loc)
  | (set! reg triv)
  | (set! reg_1 (binop reg_1 int32))
  | (set! reg_1 (binop reg_1 loc))
  | (with-label label s)
  | (jump trg)
  | (compare reg opand)
  | (jump-if relop label)
     
  trg ::= reg
  | label
     
  triv ::= trg
  | int64
     
  opand ::= int64
  | reg
     
  loc ::= reg
  | addr
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r10
  | r11
  | r12
  | r13
  | r14
  | r15
     
  addr ::= (fbp - dispoffset)
     
  fbp ::= frame-base-pointer-register?
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  int32 ::= int32?
     
  dispoffset ::= dispoffset?
     
  label ::= label?

We remove the instruction (with-label label s) and turn all label values into pc-addr, a representation of an address recognized by the interpreter. This encodes the idea that the linker, the compiler from Paren-x64 v4 to Paren-x64-rt v4, resolves labels into addresses. In our case, since programs are represented as lists of instructions, a pc-addr is a natural number representing the position of the instruction in the list.

To implement Paren-x64-rt v4 and thus perform linking, we define the procedure link-paren-x64.

procedure

(link-paren-x64 p)  paren-x64-rt-v4?

  p : paren-x64-v4?
Compiles Paren-x64 v4 to Paren-x64-rt v4 by resolving all labels to their position in the instruction sequence.

Now we can give a semantics to Paren-x64-rt v4, and thus give a semantics to Paren-x64 v4 and understand the meaning of labels and jumps. We define an interpreter interp-paren-x64 for Paren-x64 v4 by first linking using link-paren-x64, and then running an interpreter for Paren-x64-rt v4. The main loop of the Paren-x64-rt v4 interpreter uses a program counter to keep track of its position in an instruction sequence. To interpret a jump, we simply change the program counter to the number indicated in the jump.

procedure

(interp-paren-x64 p)  int64?

  p : paren-x64-v4?
Interpret the Paren-x64 v4 program p as a value, returning the final value of rax.

procedure

(interp-loop code env pc)  int64?

  code : (listof paren-x64-rt-v4.s)
  env : dict?
  pc : natural-number/c
The main loop of the interpreter for Paren-x64-rt v4. code does not change. env is a dict? mapping physical locations (as symbol?s) to their values (int64?). pc is the program counter, indicating the current instruction being executed as an index into the list code.

1.6.3.2 Finding the next abstraction boundary

Having exposed x64 features to our compiler internal languages, we now need to find the right boundary at which to abstract away from the low-level representation of control-flow—labels and jumps—and introduce a more structured form of control flow, if.

Our next two languages in the pipeline, bottom-up, are Paren-x64-fvars v4 and Para-asm-lang v4. Both of these languages abstract machine-specific details about physical locations. This doesn’t seem very related to control-flow, so we simply want to propagate our new primitives up through these layers of abstraction. We expose the new instructions while abstracting away from the machine constraints about which instructions work on which physical locations. Now jumps can target arbitrary locations, and compare can compare arbitrary locations. We can also move labels into arbitrary locations.

Below we typeset Paren-x64-fvars v4 (paren-x64-fvars-v4).

  p ::= (begin s ...)
     
  s ::= (set! fvar int32)
  | (set! fvar trg reg)
  | (set! reg loc)
  | (set! reg triv)
  | (set! reg_1 (binop reg_1 int32))
  | (set! reg_1 (binop reg_1 loc))
  | (with-label label s)
  | (jump trg)
  | (compare reg opand)
  | (jump-if relop label)
     
  trg ::= reg
  | label
     
  triv ::= trg
  | reg
  | int64
     
  opand ::= int64
  | reg
     
  loc ::= reg
  | fvar
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  label ::= label?
  p ::= (begin s ...)
     
  s ::= (set! fvar int32)
  | (set! fvar trg)
  | (set! reg loc)
  | (set! reg triv)
  | (set! reg_1 (binop reg_1 int32))
  | (set! reg_1 (binop reg_1 loc))
  | (with-label label s)
  | (jump trg)
  | (compare reg opand)
  | (jump-if relop label)
     
  trg ::= reg
  | label
     
  triv ::= trg
  | int64
     
  opand ::= int64
  | reg
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r10
  | r11
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  int32 ::= int32?
     
  fvar ::= fvar?
     
  label ::= label?
  p ::= (begin s ...)
     
  s ::= (set! fvar int32)
  | (set! fvar reg)
  | (set! reg loc)
  | (set! reg triv)
  | (set! reg_1 (binop reg_1 int32))
  | (set! reg_1 (binop reg_1 loc))
     
  triv ::= reg
  | int64
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r10
  | r11
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  int64 ::= int64?
     
  int32 ::= int32?
     
  fvar ::= fvar?

Nothing important changes in Paren-x64-fvars v4. We simply add the new control-flow primitives.

Compile the Paren-x64-fvars v4 to Paren-x64 v4 by reifying fvars into displacement mode operands. The pass should use current-frame-base-pointer-register.

Next we typeset Para-asm-lang v4.

  p ::= (begin s effect ... (halt triv))
     
  s (halt opand) ::= effect
  | (set! loc triv)
  | (set! loc_1 (binop loc_1 opand triv))
  | (jump trg)
  | (with-label label s)
  | (compare loc opand)
  | (jump-if relop trg)
     
  triv ::= opand
  | label
     
  opand triv ::= int64
  | loc
     
  trg ::= label
  | loc
     
  loc ::= reg
  | fvar
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  fvar ::= fvar?
     
  label ::= label?
  p ::= (begin s ...)
     
  s ::= (halt opand)
  | (set! loc triv)
  | (set! loc_1 (binop loc_1 opand))
  | (jump trg)
  | (with-label label s)
  | (compare loc opand)
  | (jump-if relop trg)
     
  triv ::= opand
  | label
     
  opand ::= int64
  | loc
     
  trg ::= label
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  fvar ::= fvar?
     
  label ::= label?
  p ::= (begin effect ... (halt triv))
     
  effect ::= (set! loc triv)
  | (set! loc_1 (binop loc_1 triv))
     
  triv ::= int64
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  int64 ::= int64?
     
  fvar ::= fvar?

While halt is still an instruction, we must handle it differently now that we have control flow. halt is a control-flow effect, immediately ending the control flow of the program, similar to exit. Previously, this did not matter, since we had no control flow that could be affected, and we ensured halt was the final instruction. Now, we have control flow.

We restrict Para-asm-lang v4, requiring that there is exactly one dynamic halt and that it is the final instruction executed in the program. We cannot restrict the syntax to require this, since we now support jumps. Jumps mean our syntax does not give us a clear indication of which instruction is executed last. It might be the case that halt is the second instruction in the instruction sequence, but is always executed last because of the control flow of the program. It could also be that there are multiple halt instructions syntactically, but only one will ever be executed due to conditional jumps.

This means compiling halt is slightly more complicated. When implementing halt, we must ensure that halt is the last instruction executed. Previously, all halt had to do was set the appropriate register, since it was always the final instruction. When compiling halt, our job is to ensure our restrictions about halt when compiling. This is simple to do by adding a unique label after all instructions in the program, and implementing halt with a jump to that label.

For convenience, the cpsc411/2c-run-time provides such a label, done, which your implementation can use instead.

To implement Para-asm-lang v4, we extend patch-instructions. The implementation is essentially similar to the definition from Value Orientation.

The tricky operation to support is (jump-if relop loc), since x64 gives us only (jump-if relop label). We can do this by generating a 2-instruction sequence and negating the relop.

Compiles Para-asm-lang v4 to Paren-x64-fvars v4 by patching each instruction that has no x64 analogue into a sequence of instructions using auxiliary register from current-patch-instructions-registers.

1.6.4 New Abstractions: Blocks and Predicates

Working our way up the pipeline, the next language from the previous version of our compiler is the Asm-lang v2 family of languages. Recall that this family of languages includes several administrative languages, including Asm-lang v2/assignments. This is the output language of the register allocator.

So here we must stop and ask: is Asm-lang v2 the right place to start from when abstracting away from labels and jumps?

For that, we need to think about what happens in Asm-lang v2. The register allocation and related analyses all happen in Asm-lang v2. If we continue to propagate the primitives up, then the register allocator will be forced to deal with labels and jumps. If we abstract away, then we can design an abstraction that might work better with the register allocator and the analyses.

When control can jump to any instruction at any time, giving semantics to programs, such as writing an interpreter or an analysis, is very difficult. At any label, we must make assumptions about the state of the machine, since the state is not affected only by the sequence of instructions that came before the label in the instruction sequence, but potentially arbitrary instructions that were executed prior to a jumping to the label. This is why we introduced a linking pass before the interpreter. We do not want to have to deal with linking during register allocation.

We therefore want to abstract away from labels and jumps before register allocation.

Design digression:
There are advantages to making the register allocator more aware of labels and jumps. We could write a more complex analysis that tries to resolve labels and jumps, essentially resolving linking and then doing the analysis over the linked program. This would give the register allocator more accurate information about control flow, allowing it to do a better job of minimizing register conflicts, but at the expense of a more complex and slower analysis.

Question: Thinking ahead, what is the problem with analyzing jump instructions? Which parts of the register allocator must change to handle them: conflict analysis, undead analysis, register allocation, or some combination of the three?

To simplify reasoning about programs with control flow, we can organize code into basic blocks, labeled blocks where control can only enter the beginning of the block and must exit at the end of the block. This gives us more structure on which to hang assumptions, and can make more assumptions about code when writing analyses. In particular, we will be able to annotate which registers are undead on entry to and on exit from a block, so our analysis does not have to resolve labels and jumps.

We need to develop this basic block abstraction before we get to the register allocator, so we introduce it next, as an abstraction of Para-asm-lang v4.

We design Block-asm-lang v4, a basic-block-structured abstract assembly language in which sequences of statements are organized into basic blocks, and code can jump between blocks. Labels are no longer instructions that can happen anywhere; instead, each block is labeled. Jumps cannot appear just anywhere; instead, they happen only at the end of a block.

  p ::= (module b ... b)
  | (begin s ...)
     
  b ::= (define label tail)
     
  tail ::= (halt opand)
  | (jump trg)
  | (begin effect ... tail)
  | (if (relop loc opand) (jump trg) (jump trg))
     
  effect s ::= (halt opand)
  | (set! loc triv)
  | (set! loc_1 (binop loc_1 opand))
  | (jump trg)
  | (with-label label s)
  | (compare loc opand)
  | (jump-if relop trg)
     
  triv ::= opand
  | label
     
  opand ::= int64
  | loc
     
  trg ::= label
  | loc
     
  loc ::= reg
  | fvar
  p ::= (module b ... b)
     
  b ::= (define label tail)
     
  tail ::= (halt opand)
  | (jump trg)
  | (begin effect ... tail)
  | (if (relop loc opand) (jump trg) (jump trg))
     
  effect ::= (set! loc triv)
  | (set! loc_1 (binop loc_1 opand))
     
  triv ::= opand
  | label
     
  opand ::= int64
  | loc
     
  trg ::= label
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  fvar ::= fvar?
     
  label ::= label?
  p ::= (begin s ...)
     
  s ::= (halt opand)
  | (set! loc triv)
  | (set! loc_1 (binop loc_1 opand))
  | (jump trg)
  | (with-label label s)
  | (compare loc opand)
  | (jump-if relop trg)
     
  triv ::= opand
  | label
     
  opand ::= int64
  | loc
     
  trg ::= label
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  fvar ::= fvar?
     
  label ::= label?

In Block-asm-lang v4, a program is a non-empty sequence of labeled blocks. We consider the first block in the sequence to be the start of the program. A tail represents a self-contained block of statements. Jumps can only appear at the end of blocks, and jumps only enter the beginning of blocks.

The basic block abstraction essentially forces us to add an if statement. We want to ensure jumps happen only at the end of a block, but how could that be if we only have separate jump-if instructions as in Para-asm-lang v4? At the very least, we would need to support a block that ends in three instruction sequences: compare, followed by a jump-if, followed by a jump. This is the low-level implementation of an if statement. Rather than trying to recognize a three-instruction sequence, we simply abstract the sequence into a single instruction: (if (relop loc opand) (jump trg) (jump trg)). This buys us simplicity in analyzing basic blocks.

The halt instruction should only be executed at the end of the final block; it cannot stop control flow, but only indicates that if the program has ended, the opand is the final value. Again, we cannot enforce this syntactically due to jumps. Instead, we require that halt appears at the end of a block, and assume only one halt instruction is ever executed during execution.

To implement Block-asm-lang v4, we simply flatten blocks, moving the label from the define to the first instruction in the block using with-label.

Compile Block-asm-lang v4 to Para-asm-lang v4 by flattening basic blocks into labeled instructions.

1.6.4.1 Designing A Language for Optimization

When introducing a new statement or expression, we should ask ourselves: what equations do we want to be true of this expression? For example, should we be able to rewrite

(if (< 0 1) (jump trg_1) (jump trg_2))

to (jump trg_1)?

This would be ideal, as it optimizes away the predicate test. What about this: are the following two programs equivalent?

(if (< 0 1) (jump trg_1) (jump trg_2))

(if (>= 0 1) (jump trg_2) (jump trg_1))

This... should be true, but it’s less obvious why we might do this. But perhaps there are cases where >= is faster than <, or perhaps for some reason we would like (jump trg_1) to be the final jump because another optimization would be able to inline that jump.

While it would be straightforward to write an analysis to support these transformations, we could do better by recognizing a pattern and introducing an abstraction. In the first case, what we really want to do is transform any expression where the predicate is obviously truewe’ll write this as (true). Then we could write a simple optimization to transform (if (true) (jump trg_1) (jump trg_2)) into (jump trg_1). Similarly, if we had a predicate that was obviously false, written (false), we could rewrite (if (false) (jump trg_1) (jump trg_2)) into (jump trg_2). If we had a language with a predicate abstraction, we could separate the analysis of which comparisons are obvious from the optimization that rewrites if statements with obvious predicates.

We therefore introduce the language Block-pred-lang v4. It introduces pred position. A pred is not a boolean; we can easily compile all preds into either a simple (relop loc opand) or eliminate them entirely. They exist as a way to express the output of some analysis over predicates and enable us to easily rewrite if statements.

  p ::= (module b ... b)
     
  b ::= (define label tail)
     
  pred ::= (relop loc opand)
  | (true)
  | (false)
  | (not pred)
     
  tail ::= (halt opand)
  | (jump trg)
  | (begin effect ... tail)
  | (if pred (relop loc opand) (jump trg) (jump trg))
     
  effect ::= (set! loc triv)
  | (set! loc_1 (binop loc_1 opand))
     
  opand ::= int64
  | loc
     
  loc ::= reg
  | fvar
  p ::= (module b ... b)
     
  b ::= (define label tail)
     
  pred ::= (relop loc opand)
  | (true)
  | (false)
  | (not pred)
     
  tail ::= (halt opand)
  | (jump trg)
  | (begin effect ... tail)
  | (if pred (jump trg) (jump trg))
     
  effect ::= (set! loc triv)
  | (set! loc_1 (binop loc_1 opand))
     
  triv ::= opand
  | label
     
  opand ::= int64
  | loc
     
  trg ::= label
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  fvar ::= fvar?
     
  label ::= label?
  p ::= (module b ... b)
     
  b ::= (define label tail)
     
  tail ::= (halt opand)
  | (jump trg)
  | (begin effect ... tail)
  | (if (relop loc opand) (jump trg) (jump trg))
     
  effect ::= (set! loc triv)
  | (set! loc_1 (binop loc_1 opand))
     
  triv ::= opand
  | label
     
  opand ::= int64
  | loc
     
  trg ::= label
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  fvar ::= fvar?
     
  label ::= label?

The pred position allows relops as before, but also obviously true and false predicates, and predicate negation. This abstraction gives some later pass the ability to optimize (> 1 0) to (true).

Obvious predicates, like (true) and (false) simply compile by transforming the if statement into either the first or second branch. The negation predicate, (not pred), swaps the branches and continues compiling (if pred (jump trg_2) (jump trg_1)). We leave the relop predicate alone.

We implement Block-pred-lang v4 with a simple compiler, resolve-predicates.

Compile the Block-pred-lang v4 to Block-asm-lang v4 by manipulating the branches of if statements to resolve branches.

Note that this pass is not an optimization. Optimization passes are intra-language. However, its existence allows us to implement an optimization pass by transforming predicates in the predicate language. We delay writing this optimization for one more language, as an additional abstraction will help us unlock further optimizations.

1.6.4.2 Abstracting Away Jumps

We have already introduced the basic-block abstraction to simplify the structure of jumps for register allocation, but can simplify further. Since our source language, Values-lang v4, doesn’t use jumps at all and exposes only an if expression, there is no point (yet) to exposing jumps further up the pipeline. If we abstract away from them here and now, then no part of the register allocator will need to deal with jumps.

To abstract away from jumps, we need to design a feature that is sufficient to express if expressions in terms of if statements without jumps. The key difference between the two can be seen clearly in the pseudo-grammar-diff below:

  e ::= (if pred e_1 e_2 (jump trg_1) (jump trg_2))

In if expressions, like other expressions, we support arbitrarily nested sub-expressions in the branches. In if statements, we restrict the branches.

To compile if expressions to if statements, we must generate new basic blocks with fresh labels from nested branches, and transform the branches into jumps. Phrased top-down, the question is if we should do that before or after register allocation? Phrased bottom-up, should we expose jumps through the register allocation languages or not?

As discussed earlier when describing the semantics of labels and jumps, jumps are difficult to give semantics to. We want to avoid analyzing them if we can. We therefore choose to not expose jumps any further up the pipeline.

We introduce Nested-asm-lang v4, which allows nesting begin and if expressions that would otherwise need to be expressed with labeled blocks and jumps. This means we could have an if statement of the form (if pred (begin s ... (halt loc)) (begin s ... (halt loc))). The nesting structure allows all compiler passes over this language, and targeting this language, to ignore jumps. The language roughly corresponds to an imperative programming language without loops, but one assembly-like feature still remains: physical locations.

  p ::= (module tail b ... b)
     
  b ::= (define label tail)
     
  pred ::= (relop loc triv opand)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= (halt triv opand)
  | (jump trg)
  | (begin effect ... tail)
  | (if pred tail tail (jump trg) (jump trg))
     
  effect ::= (set! loc triv)
  | (set! loc_1 (binop loc_1 triv opand))
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= opand
  | label
     
  triv opand ::= int64
  | loc
     
  trg ::= label
  | loc
     
  label ::= label?
  p ::= (module tail)
     
  pred ::= (relop loc triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= (halt triv)
  | (begin effect ... tail)
  | (if pred tail tail)
     
  effect ::= (set! loc triv)
  | (set! loc_1 (binop loc_1 triv))
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= int64
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  fvar ::= fvar?
     
  int64 ::= int64?
  p ::= (module b ... b)
     
  b ::= (define label tail)
     
  pred ::= (relop loc opand)
  | (true)
  | (false)
  | (not pred)
     
  tail ::= (halt opand)
  | (jump trg)
  | (begin effect ... tail)
  | (if pred (jump trg) (jump trg))
     
  effect ::= (set! loc triv)
  | (set! loc_1 (binop loc_1 opand))
     
  triv ::= opand
  | label
     
  opand ::= int64
  | loc
     
  trg ::= label
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
     
  fvar ::= fvar?
     
  label ::= label?

Note that Nested-asm-lang v4 enables much of the same nesting we find in monadic form. We skipped over a-normal form. Unnesting if requires jumps, unless we want to duplicate code; for efficiency and simplicity, it is beneficial to maintain monadic form until this very low level in the compiler.

To implement Nested-asm-lang v4, we define the procedure expose-basic-blocks. The strategy for writing this is slightly complex. Each helper for processing a non-terminal may need to introduce new basic blocks, and transforming a nested if expression requires knowing the target of each branch.

The transformation for predicates should transform predicates and generate an if statement whose branches are jumps. When processing a pred, we need two additional inputs, a "true" and a "false" label, used to generate the output if instruction. For a base predicate, such as (true) or (relop aloc triv), you can generate an if statement. When you find an if in predicate position, you’ll need to generate two new basic blocks, and rearrange the current true and false labels.

The transformer for effects should take care to unnest begin statements. This is not really related to exposing basic blocks, but it is trivial to deal with using the right abstraction, and so does not warrant a separate compiler pass.

Compile the Nested-asm-lang v4 to Block-pred-lang v4, eliminating all nested expressions by generating fresh basic blocks and jumps.

We can now express various optimizations in Nested-asm-lang v4. For example, we can express the following rewrites:

Source

   

   

Target

(begin (set! reg 1) (> reg 0))

   

   

(begin (set! reg 1)
       (true))

(begin (set! reg 1) (< reg 0))

   

   

(begin (set! reg 1)
       (false))
(begin (set! reg opand_1)
       (< reg ,(max-int 64)))

   

   

(begin (set! reg opand_1)
       (true))
(begin (set! reg opand_1)
       (= reg opand_1))

   

   

(begin (set! reg opand_1)
       (true))
(begin (set! reg int64_1) (= reg int64_2))

   

   

(begin (set! reg int64_1) (false))

The language doesn’t allow us to express relational opreations directly on opands, so we have to be a little more clever to record the possible values of abstract locations, and detect (> loc 0), when loc is surely greater than 0.

More generally, we might define an abstract interpreter. This interpreter would run during compile-time, and thus over possibly incomplete programs. This means it has to define some abstract notion of the value of a statement. In the worst case, such an abstract value will represent "any run-time value", meaning that we don’t have enough static information to predict the result. However, we might be able to evaluate a predicate to determine that in (begin (set! fv0 5) (> fv0 5)), fv0 is surely 5, and that (not (true)) is surely (false) in the abstract interpreter, and if so, this justifies optimizations.

Note that when rewriting predicates, we must be careful to preserve any effects, since we can’t (locally) know whether they’re necessary or not.

Question: Can you think of any predicates that require using nested if statements?

Optimize Nested-asm-lang v4 programs by analyzing and simplifying predicates.

1.6.5 Register Allocation

Next, we design Asm-pred-lang v4, an imperative language that supports some nested structured control-flow. Like Asm-lang v2, this language is a family of administrative languages, each differing only in its info fields.

  p ::= (module info tail)
     
  info ::= info?
     
  pred ::= (relop aloc loc triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= (halt triv)
  | (begin effect ... tail)
  | (if pred tail tail)
     
  effect ::= (set! aloc loc triv)
  | (set! aloc_1 loc_1 (binop aloc_1 loc_1 triv))
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= int64
  | aloc
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  aloc ::= aloc?
     
  fvar ::= fvar?
     
  int64 ::= int64?
  p ::= (module info tail)
     
  info ::= info?
     
  pred ::= (relop aloc triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= (halt triv)
  | (begin effect ... tail)
  | (if pred tail tail)
     
  effect ::= (set! aloc triv)
  | (set! aloc_1 (binop aloc_1 triv))
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?
  p ::= (module info tail)
     
  info ::= info?
     
  pred ::= (relop aloc triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= (halt triv)
  | (begin effect ... tail)
  | (if pred tail tail)
     
  effect ::= (set! aloc triv)
  | (set! aloc_1 (binop aloc_1 triv))
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?
  p ::= (module tail)
     
  pred ::= (relop loc triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= (halt triv)
  | (begin effect ... tail)
  | (if pred tail tail)
     
  effect ::= (set! loc triv)
  | (set! loc_1 (binop loc_1 triv))
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= int64
  | loc
     
  loc ::= reg
  | fvar
     
  reg ::= rsp
  | rbp
  | rax
  | rbx
  | rcx
  | rdx
  | rsi
  | rdi
  | r8
  | r9
  | r12
  | r13
  | r14
  | r15
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  fvar ::= fvar?
     
  int64 ::= int64?

The big difference is that physical locations have changed to abstract locations. Recall that this is the big abstraction register allocation buys us, so it ought to be the only big change.

As before, we treat the register allocator as a single compiler from Asm-pred-lang v4 to Nested-asm-lang v4.

Recall that in our register allocator, we are not designing layers of abstraction like most of our compiler. We are following an existing design: undead analysis, conflict analysis, graph colouring register allocation. We therefore walk through the register allocator in this order.

1.6.5.1 Uncovering Locals

First, we extend uncover-locals to analyze if. We design the administrative language Asm-pred-lang v4/locals below. As with other administrative languages, the only change is the info field for the module. It now contains a locals set, describing all variables used in the module.

  info ::= (#:from-contract (info/c (locals (aloc ...))))
  | info?

Compiles Asm-pred-lang v4 to Asm-pred-lang v4/locals, analysing which abstract locations are used in the module and decorating the module with the set of variables in an info field.

1.6.5.2 Undead Analysis

Now our undead analysis must change to follow the branches of if statements. Asm-pred-lang v4/undead defines the output of undead-analysis.

  info ::= (#:from-contract (info/c (locals (aloc ...)) (undead-out undead-set-tree?)))

The key to describing the analysis is designing a representation of undead-set tree that can representing the new structure of our statements. Now, statements can branch at if.

Our new undead-set tree is one of:
  • an undead-out set (aloc ...), corresponding to the undead-out set for a single instruction such as (halt triv) or (set! aloc triv).

  • a list of undead-set trees, (undead-set-tree?_1 ... undead-set-tree?_2), corresponding to a begin statement (begin effect_1 ... effect_2) or (begin effect_1 ... tail). The first element of the list represents undead-set tree for the first effect, the second element represents the undead-set tree for the second effect, and so on.

  • a list of exactly three undead-set trees, (undead-set-tree?_p undead-set-tree?_1 undead-set-tree?_2), corresponding to a if statement (if pred tail_1 tail_2). undead-set-tree?_p corresponds to the undead-set tree of pred, while undead-set-tree?_1 corresponds to tail_1 and undead-set-tree?_2 corresponds to tail_2.

Warning: this datatype is non-canonical. The second and third cases can overlap. We always traverse an undead-set tree together with a corresponding program; the undead-set tree cannot be interpreted in isolation.

For example, consider the following undead-set tree.

((x.1) (x.1 y.2) ((y.2 b.3) (b.3) (b.3 c.4) ((c.4) () ((c.4) ()))))

This corresponds to the following tail.
(module
  ((locals (x.1 y.2 b.3 c.4)))
  (begin
    (set! x.1 5)
    (set! y.2 x.1)
    (begin
      (set! b.3 x.1)
      (set! b.3 (+ b.3 y.2))
      (set! c.4 b.3)
      (if (= c.4 b.3)
          (halt c.4)
          (begin
            (set! x.1 c.4)
            (halt c.4))))))

Performs undeadness analysis, decorating the program with undead-set-tree. Only the info field of the program is modified.

1.6.5.3 Conflict Analysis

Next we need to compute the conflict graph.

Below, we design Asm-pred-lang v4/conflicts below with structured control-flow.

  info ::= (#:from-contract (info/c (locals (aloc ...)) (conflicts ((aloc (aloc ...)) ...)) (undead-out undead-set-tree?)))

The conflict-analysis does not change significantly. We simply extend the algorithm to support the new statements. Note that new statements only reference but never define an abstract location.

Decorates a program with its conflict graph.

1.6.5.4 Assign Registers

Finally, we design the register allocator as the administrative language Asm-pred-lang v4/assignments (pronounced "Asm-pred-lang v4, with assignments").

  info ::= (#:from-contract (info/c (locals (aloc ...)) (assignment conflicts ((aloc rloc (aloc ...)) ...))))

The allocator should run the same algorithm as before. Since the allocator doesn’t traverse programs, it shouldn’t need any changes.

Performs graph-colouring register allocation, compiling Asm-pred-lang v4/conflicts to Asm-pred-lang v4/assignments by decorating programs with their register assignments.

1.6.5.5 Replace Locations

Finally, we actually replace abstract locations with physical locations. In the process, we’re free to discard the info from the analyses.

  info ::= info?
  | (#:from-contract (info/c (locals (aloc ...)) (assignment ((aloc rloc) ...))))

Compiles Asm-pred-lang v4/assignments to Nested-asm-lang v4 by replacing all abstract location with physical locations using the assignments described in the assignment info field.

1.6.6 Imperative Abstractions

Finally, we have all the abstractions in place to abstract away from imperative statements.

First we abstract to an imperative language from our abstract assembly language. We design Imp-cmf-lang v4, a pseudo-ANF restricted imperative language.

  p ::= (module info tail)
     
  info ::= info?
     
  pred ::= (relop triv aloc triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= value
  | (halt triv)
  | (begin effect ... tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
     
  effect ::= (set! aloc value triv)
  | (set! aloc_1 (binop aloc_1 triv))
  | (begin effect ... effect)
  | (if pred effect effect)
  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= value
  | (begin effect ... tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
     
  effect ::= (set! aloc value)
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?
  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= value
  | (begin effect ... tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
     
  effect ::= (set! aloc value)
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?
  p ::= (module info tail)
     
  info ::= info?
     
  pred ::= (relop aloc triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= (halt triv)
  | (begin effect ... tail)
  | (if pred tail tail)
     
  effect ::= (set! aloc triv)
  | (set! aloc_1 (binop aloc_1 triv))
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?

It is mostly a straightforward extension of Imp-cmf-lang v3 to include the if and pred. However, note that it breaks ANF by allowing nested effects in pred position. This means the language is even more non-canonical. The following two programs are equal:

(if (begin (set! x.1 5) (= x.1 5)) x.1 6)

(begin (set! x.1 5) (if (= x.1 5) x.1 6))

Compiles Imp-cmf-lang v4 to Asm-pred-lang v4, selecting appropriate sequences of abstract assembly instructions to implement the operations of the source language.

Similarly, we easily extend Imp-mf-lang v3 to Imp-mf-lang v4, defined below.

  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= value
  | (begin effect ... tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (if pred value value)
  | (begin effect ... value)
     
  effect ::= (set! aloc value)
  | (if pred effect effect)
  | (begin effect ... effect)
  | (if pred effect effect)
     
  triv ::= int64
  | aloc
     
  aloc ::= aloc?
     
  int64 ::= int64?
  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= value
  | (begin effect ... tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (if pred value value)
  | (begin effect ... value)
     
  effect ::= (set! aloc value)
  | (if pred effect effect)
  | (begin effect ... effect)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?
  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= value
  | (begin effect ... tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (if pred value value)
  | (begin effect ... value)
     
  effect ::= (set! aloc value)
  | (if pred effect effect)
  | (begin effect ... effect)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?

procedure

(normalize-bind p)  imp-cmf-lang-v4?

  p : imp-mf-lang-v4?
Compiles Imp-mf-lang v4 to Imp-cmf-lang v4, pushing set! under begin and if so that the right-hand-side of each set! is a simple value-producing operation.

This normalizes Imp-mf-lang v4 with respect to the equations
(set! aloc
      (begin effect_1 ...
             value))

   

=

   

(begin effect_1 ...
       (set! aloc value))
(set! aloc
      (if pred
          value_1
          value_2))

   

=

   

(if pred
    (set! aloc value_1)
    (set! aloc value_2))

We describe Values-unique-lang v4 below.

  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (let ([aloc value] ...) pred)
  | (if pred pred pred)
     
  tail ::= value
  | (let ([aloc value] ...) tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (let ([aloc value] ...) value)
  | (if pred value value)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?
  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (let ([aloc value] ...) pred)
  | (begin effect ... pred)
  | (if pred pred pred)
     
  tail ::= value
  | (let ([aloc value] ...) tail)
  | (begin effect ... tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (let ([aloc value] ...) value)
  | (if pred value value)
  | (begin effect ... value)
     
  effect ::= (set! aloc value)
  | (if pred effect effect)
  | (begin effect ... effect)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?
  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (let ([aloc value] ...) pred)
  | (if pred pred pred)
     
  tail ::= value
  | (let ([aloc value] ...) tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (let ([aloc value] ...) value)
  | (if pred value value)
     
  triv ::= int64
  | aloc
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?

In Values-unique-lang v4, we extend expressions with an if expression that takes a predicate. The predicate form, or sub-language, is still not a true boolean datatype. They cannot be bound to abstract locations or returned as values. We have to restrict the predicate position this way since we have no explicit run-time representation of the value that a comparison operation produces, i.e., we don’t have booleans.

Compiles Values-unique-lang v4 to Imp-mf-lang v4 by picking a particular order to implement let expressions using set!.

Finally, we abstract away from abstract locations and introduce lexical identifiers.

We defined Values-lang v4 below.

  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (let ([x value] ...) pred)
  | (if pred pred pred)
     
  tail ::= value
  | (let ([x value] ...) tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (let ([x value] ...) value)
  | (if pred value value)
     
  triv ::= int64
  | x
     
  x ::= name?
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?
  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (let ([x aloc value] ...) pred)
  | (if pred pred pred)
     
  tail ::= value
  | (let ([x aloc value] ...) tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (let ([x aloc value] ...) value)
  | (if pred value value)
     
  triv ::= int64
  | x
  | aloc
     
  x ::= name?
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  aloc ::= aloc?
     
  int64 ::= int64?
  p ::= (module tail)
     
  pred ::= (relop triv triv)
  | (true)
  | (false)
  | (not pred)
  | (let ([x value] ...) pred)
  | (if pred pred pred)
     
  tail ::= value
  | (let ([x value] ...) tail)
  | (if pred tail tail)
     
  value ::= triv
  | (binop triv triv)
  | (let ([x value] ...) value)
  | (if pred value value)
     
  triv ::= int64
  | x
     
  x ::= name?
     
  binop ::= *
  | +
     
  relop ::= <
  | <=
  | =
  | >=
  | >
  | !=
     
  int64 ::= int64?

procedure

(uniquify p)  values-unique-lang-v4?

  p : values-lang-v4?

1.6.7 Polishing Version 4

procedure

(interp-values-lang p)  int64?

  p : values-lang-v4?
Interpret the Values-lang v4 program p as a value. For all p, the value of (interp-values-lang p) should equal to (execute p).

procedure

(check-values-lang p)  values-lang-v4?

  p : any/c
Takes an arbitrary value and either returns it, if it is a valid Values-lang v4 program, or raises an error with a descriptive error message.

1.6.8 Appendix: Overview

%3L0Values-lang v4L1Values-unique-lang v4L0->L1 uniquifyL15Imp-mf-lang v4L1->L15 sequentialize-letL2Asm-pred-lang v4L7Nested-asm-lang v4L2->L7 assign-homes-optL3Asm-pred-lang v4/localsL2->L3 uncover-localsL16Imp-cmf-lang v4L15->L16 normalize-bindL16->L2 select-instructionsL7->L7 optimize-predicatesL8Block-pred-lang v4L7->L8 expose-basic-blocksL9Block-asm-lang v4L8->L9 resolve-predicatesL10Para-asm-lang v4L9->L10 flatten-programL17Paren-x64-fvars v4L11Paren-x64 v4L17->L11 implement-fvarsL10->L17 patch-instructionsL13x64L14int64L13->L14 executeL4Asm-pred-lang v4/undeadL3->L4 undead-analysisL5Asm-pred-lang v4/conflictsL4->L5 conflict-analysisL6Asm-pred-lang v4/assignmentsL5->L6 assign-registersL6->L7 replace-locationsL11->L13 generate-x64L11->L14 interp-paren-x64L12Paren-x64-rt v4L11->L12 link-paren-x64L12->L14 interp-loop

Figure 4: Overview of Compiler Version 4