3.2 Compiler Support
(require cpsc411/compiler-lib) | package: cpsc411-lib |
This library provides support functions and parameters for implementing the CPSC411 project compiler.
3.2.1 Data Types
These data types are used in the implementation of the CPSC411 compiler for representing intermediate language constructs.
procedure
(dispoffset? v) → boolean?
v : any/c
> (dispoffset? 0) #t
> (dispoffset? 1) #f
> (dispoffset? 2) #f
> (dispoffset? 3) #f
> (dispoffset? 4) #f
> (dispoffset? 8) #t
> (dispoffset? 8) #t
> (dispoffset? 15) #f
> (dispoffset? 16) #t
> (dispoffset? 17) #f
> (dispoffset? 'x) #f
3.2.1.1 Labels
> (label? 'L.start.1) #t
> (label? 'Lstart.1) #f
> (label? 'L.start1) #f
> (label? 'start.1) #f
> (label? "L.start.1") #f
procedure
(fresh-label [x]) → label?
x : (or/c string? symbol?) = 'tmp
> (fresh-label) 'L.tmp.1
> (fresh-label) 'L.tmp.2
> (fresh-label) 'L.tmp.3
> (fresh-label 'meow) 'L.meow.4
> (fresh-label "hello") 'L.hello.5
procedure
(sanitize-label l) → string?
l : label?
> (sanitize-label 'L.main.1) "L.main.1"
> (sanitize-label 'L.$!!@#*main.2) "L.$24$$21$$21$@#$2a$main.2"
3.2.1.2 Identifiers
> (name? 'L.start.1) #t
> (name? 'Lstart.1) #t
> (name? 'L.start1) #t
> (name? 'start.1) #t
> (name? 'start) #t
> (name? 'x) #t
> (name? "L.start.1") #f
> (name? "x") #f
> (name? 5) #f
> (aloc? 0) #f
> (aloc? 'x) #f
> (aloc? 'x.1) #t
> (aloc? 'L.start.1) #f
> (aloc? 'Lstart.1) #t
> (aloc? 'L.start1) #f
> (aloc? 'start.1) #t
> (aloc? 'start) #f
> (aloc? 'x) #f
> (aloc? "L.start.1") #f
> (aloc? "x") #f
> (aloc? 5) #f
Optionally takes a symbol v to represent the name part of the aloc?.
> (fresh) 'tmp.1
> (fresh) 'tmp.2
> (fresh) 'tmp.3
> (fresh 'L.start.1) '.L.start.1.4
> (fresh 'L.start.1) '.L.start.1.5
3.2.1.3 Frame Variables
> (fvar? 0) #f
> (fvar? 'x) #f
> (fvar? 'x.1) #f
> (fvar? 'fv) #f
> (fvar? 'fv.1) #f
> (fvar? 'fv1) #t
> (fvar? 'fv2) #t
procedure
i : exact-nonnegative-integer?
procedure
v : fvar?
> (fvar->index (make-fvar 1)) 1
> (fvar->index 'fv1) 1
3.2.1.4 Undead Sets
procedure
(undead-set? v) → boolean?
v : any/c
> (undead-set? '(a.1 b.2 c.3)) #t
> (undead-set? '(a.1 b.2 a.1)) #t
> (undead-set? '(5 3)) #f
procedure
(undead-set-tree? ust) → boolean?
ust : any/c
> (undead-set-tree? '(a.1 b.2 c.3)) #t
> (undead-set-tree? '((a.1 b.2 c.3))) #t
> (undead-set-tree? '((a.1 b.2 c.3) (a.1 b.2 c.3) (a.1 b.2 c.3))) #t
> (undead-set-tree? '((a.1 b.2 c.3) ((a.1 b.2 c.3) (a.1 b.2 c.3) (a.1 b.2 c.3)))) #t
> (undead-set-tree? '((5 b.2 c.3) ((a.1 b.2 c.3) (a.1 b.2 c.3) (a.1 b.2 c.3)))) #f
procedure
(undead-set-tree/rloc? ust) → boolean?
ust : any/c
> (undead-set-tree/rloc? '(a.1 b.2 c.3)) #t
> (undead-set-tree/rloc? '((a.1 b.2 c.3))) #t
> (undead-set-tree/rloc? '((a.1 b.2 c.3) (a.1 b.2 c.3) (a.1 b.2 c.3))) #t
> (undead-set-tree/rloc? '((a.1 b.2 c.3) ((a.1 b.2 c.3) (a.1 b.2 c.3) (a.1 b.2 c.3)))) #t
> (undead-set-tree/rloc? '((5 b.2 c.3) ((a.1 b.2 c.3) (a.1 b.2 c.3) (a.1 b.2 c.3)))) #f
> (undead-set-tree/rloc? '((rax rbx r8 c.3) ((a.1 r9 c.3) (a.1 b.2 c.3) (a.1 b.2 c.3)))) #t
3.2.2 Compilation Harness
This section describes abstractions for executing the compiler and compiled code in Racket.
procedure
(bin-format [type]) →
(or/c "elf64" "macho64" "win64") type : (or/c 'unix 'windows 'macosx) = (system-type)
> (bin-format) "elf64"
> (bin-format 'macosx) "macho64"
> (bin-format 'unix) "elf64"
> (ld-flags) '("-e" "start")
> (ld-flags 'macosx) '("-static" "-e" "start")
> (ld-flags 'unix) '("-e" "start")
value
start-label : string? = "start"
parameter
(current-pass-list) → (list-of (-> any/c any/c))
(current-pass-list passes) → void? passes : (list-of (-> any/c any/c))
= exn:fail
Raises an error when accessed before being initialized with a valid pass list.
> (current-pass-list) #f
> (parameterize ([current-pass-list (list (lambda (x) (displayln x)))]) (compile 'x)) x
> (current-pass-list (list (lambda (x) (displayln x)))) > (compile 'x) x
parameter
(current-run/read) → (-> string? any/c)
(current-run/read run/read) → void? run/read : (-> string? any/c)
= nasm-run/read
A run-reader, run/read, takes a string representing a complete x64 program in Intel syntax, capable of compiling via nasm, linking via ld, and running as a binary, and should return the result of the program as a Racket value. The default, nasm-run/read, compiles and executes the program and returns the result of the standard output as read by read.
See also nasm-run/exit-code, nasm-run/print-string, nasm-run/print-number, nasm-run/error-string, nasm-run/error-string+code, and nasm-run/observe.
procedure
v : any/c run/read : (-> string? any/c) = (current-run/read)
The intention is that the program is compiled to assembly, then assembled with nasm, then linked, then the binary is executed and the result returned as a Racket value, enabling testing the end-to-end compiler.
Expects e to be a valid input program to the first pass in the current-pass-list, and the compiler to produce a valid input to the run/read procedure.
> (require cpsc411/reference/a2-solution cpsc411/2c-run-time)
> (parameterize ([current-pass-list (list generate-x64 wrap-x64-run-time wrap-x64-boilerplate)]) (execute '(begin (set! rax 120)))) 120
> (parameterize ([current-pass-list (list implement-fvars generate-x64 wrap-x64-run-time wrap-x64-boilerplate)]) (execute '(begin (set! fv1 120) (set! rax fv1)))) 120
> (require cpsc411/reference/a1-solution)
> (parameterize ([current-pass-list (list generate-x64 wrap-x64-run-time wrap-x64-boilerplate)]) (execute '(begin (set! rax 120)) nasm-run/exit-code)) 120
The runner is expected to run the executable, and return an observation.
This procedure performs additional error checking and cleans up temporary files before returning the observation.
procedure
(nasm-run/exit-code s) → (in-range/c 0 256)
s : string?
procedure
(nasm-run/print-string s) → string?
s : string?
procedure
(nasm-run/print-number s) → number?
s : string?
procedure
(nasm-run/read s) → any/c
s : string?
procedure
(nasm-run/error-string s) → string?
s : string?
procedure
(trace-compiler!) → void?
procedure
3.2.3 Compiler Parameters
This section describes parameters defining various values used in the CPSC411 compiler, such as those defining design choices or target machine details.
parameter
(current-word-size-bytes size) → void? size : exact-nonnegative-integer?
= 8
parameter
(current-register-set) → (set/c symbol?)
(current-register-set set) → void? set : (set/c symbol?)
= '(rsp rbp rax rbx rcx rdx rsi rdi r8 r9 r10 r11 r12 r13 r14 r15)
parameter
(current-stack-size size) → void? size : exact-nonnegative-integer?
= (* 8 1024 1024)
parameter
(current-heap-size size) → void? size : exact-nonnegative-integer?
= (* 128 1024 1024)
parameter
(current-frame-base-pointer-register reg) → void? reg : register?
= 'rbp
procedure
v : any/c
parameter
(current-auxiliary-registers reg) → void? reg : register?
= '(r10 r11)
parameter
(current-patch-instructions-registers reg) → void? reg : register?
= '(r10 r11)
parameter
(current-return-value-register reg) → void? reg : register?
= 'rax
parameter
(current-assignable-registers reg) → void? reg : register?
= '(rsp rbx rcx rdx rsi rdi r8 r9 r13 r14 r15)
parameter
(current-parameter-registers reg) → void? reg : (set/c register?)
= '(rdi rsi rdx rcx r8 r9)
parameter
(current-return-address-register reg) → void? reg : register?
= 'r15
parameter
(current-heap-base-pointer-register reg) → void? reg : register?
= 'r12
3.2.4 Two’s Complement Integers
This section defines utilities for working with fixed-width two’s complement integers.
procedure
v : exact-nonnegative-integer?
procedure
v : exact-nonnegative-integer?
procedure
word-size : exact-nonnegative-integer? i : number?
> (int-size? 2 1) #t
> (int-size? 2 5) #f
> (int-size? 2 (max-int 2)) #t
> (int-size? 2 (min-int 2)) #t
> (int-size? 2 (- (min-int 2) 1)) #f
> (int-size? 2 (+ (min-int 2) 1)) #t
> (int-size? 32 (max-int 32)) #t
> (int-size? 32 (min-int 32)) #t
> (int-size? 32 (- (min-int 32) 1)) #f
> (int-size? 32 (+ (min-int 32) 1)) #t
> (int32? 0) #t
> (int32? 5) #t
> (int32? (max-int 32)) #t
> (int32? (min-int 32)) #t
> (int32? (- (min-int 32) 1)) #f
> (int32? (+ (min-int 32) 1)) #t
> (int32? 'x) #f
> (int32? 'x.1) #f
> (int64? 0) #t
> (int64? 5) #t
> (int64? (max-int 64)) #t
> (int64? (min-int 64)) #t
> (int64? (- (min-int 64) 1)) #f
> (int64? (+ (min-int 64) 1)) #t
> (int64? 'x) #f
> (int64? 'x.1) #f
> (int61? 0) #t
> (int61? 5) #t
> (int61? (max-int 61)) #t
> (int61? (min-int 61)) #t
> (int61? (- (min-int 61) 1)) #f
> (int61? (+ (min-int 61) 1)) #t
> (int61? 'x) #f
> (int61? 'x.1) #f
procedure
(handle-overflow word-size x) → number?
word-size : exact-nonnegative-integer? x : number?
> (handle-overflow 32 (sub1 (min-int 32))) 2147483647
> (handle-overflow 32 (min-int 32)) -2147483648
> (handle-overflow 32 (max-int 32)) 2147483647
> (handle-overflow 32 (add1 (max-int 32))) -2147483648
procedure
(twos-complement-add word-size n1 n2)
→ (in-range/c (min-int word-size) (max-int word-size)) word-size : exact-nonnegative-integer? n1 : number? n2 : number?
> (twos-complement-add 32 5 10) 15
> (twos-complement-add 32 (sub1 (max-int 32)) 1) 2147483647
> (twos-complement-add 32 (max-int 32) 1) -2147483648
> (twos-complement-add 32 (add1 (min-int 32)) -1) -2147483648
> (twos-complement-add 32 (min-int 32) -1) 2147483647
procedure
(twos-complement-sub word-size n1 n2)
→ (in-range/c (min-int word-size) (max-int word-size)) word-size : exact-nonnegative-integer? n1 : number? n2 : number?
> (twos-complement-sub 32 5 2) 3
> (twos-complement-sub 32 (max-int 32) 1) 2147483646
> (twos-complement-sub 32 (min-int 32) 1) 2147483647
procedure
(twos-complement-mul word-size n1 n2)
→ (in-range/c (min-int word-size) (max-int word-size)) word-size : exact-nonnegative-integer? n1 : number? n2 : number?
> (twos-complement-mul 32 2 5) 10
> (twos-complement-mul 32 (/ (max-int 32) 2) 2) 2147483647
> (twos-complement-mul 32 (max-int 32) 2) -2
3.2.5 Ptrs
This section describes the parameters and procedures for working with ptrs.
parameter
(current-fixnum-shift bits) → void? bits : exact-nonnegative-integer?
= 3
parameter
(current-fixnum-mask bits) → void? bits : int64?
= #b111
parameter
(current-fixnum-tag bits) → void? bits : int64?
= #b000
parameter
(current-boolean-shift bits) → void? bits : exact-nonnegative-integer?
= (current-boolean-shift)
parameter
(current-boolean-mask bits) → void? bits : int64?
= #b11110111
parameter
(current-boolean-tag bits) → void? bits : int64?
= #b110
parameter
(current-true-ptr bits) → void? bits : int64?
= #b1110
parameter
(current-false-ptr bits) → void? bits : int64?
= #b0110
parameter
(current-empty-mask bits) → void? bits : int64?
= #b11111111
parameter
(current-empty-tag bits) → void? bits : int64?
= #b10110
parameter
(current-empty-ptr bits) → void? bits : int64?
= #b10110
parameter
(current-void-mask bits) → void? bits : int64?
= #b11111111
parameter
(current-void-tag bits) → void? bits : int64?
= #b00011110
parameter
(current-void-ptr bits) → void? bits : int64?
= #b00011110
parameter
(current-ascii-char-shift bits) → void? bits : exact-nonnegative-integer?
= 8
parameter
(current-ascii-char-mask bits) → void? bits : int64?
= #b11111111
parameter
(current-ascii-char-tag bits) → void? bits : int64?
= #b00101110
parameter
(current-error-shift bits) → void? bits : exact-nonnegative-integer?
= 8
parameter
(current-error-mask bits) → void? bits : int64?
= #b11111111
parameter
(current-error-tag bits) → void? bits : int64?
= #b00111110
parameter
(current-pair-shift bits) → void? bits : exact-nonnegative-integer?
= 3
parameter
(current-pair-mask bits) → void? bits : int64?
= #b111
parameter
(current-pair-tag bits) → void? bits : int64?
= #b001
parameter
(current-car-displacement i) → void? i : int64?
= 0
parameter
(current-cdr-displacement i) → void? i : int64?
= 8
procedure
(car-offset) → int64?
procedure
(cdr-offset) → int64?
parameter
(current-pair-size i) → void? i : int64?
= 16
parameter
(current-vector-shift bits) → void? bits : exact-nonnegative-integer?
= 3
parameter
(current-vector-mask bits) → void? bits : int64?
= #b111
parameter
(current-vector-tag bits) → void? bits : int64?
= #b011
parameter
(current-vector-length-displacement bytes) → void? bytes : int64?
= 0
parameter
(current-vector-base-displacement bytes) → void? bytes : int64?
= 8
parameter
(current-procedure-shift bits) → void? bits : exact-nonnegative-integer?
= 3
parameter
(current-procedure-mask bits) → void? bits : int64?
= #b111
parameter
(current-procedure-tag bits) → void? bits : int64?
= #b010
parameter
(current-procedure-label-displacement bytes) → void? bytes : int64?
= 0
parameter
(current-procedure-arity-displacement bytes) → void? bytes : int64?
= 8
parameter
(current-procedure-environment-displacement bytes) → void? bytes : int64?
= 16
procedure
(ascii-char-literal? c) → boolean?
c : any/c
> (ascii-char-literal? #\a) #t
> (ascii-char-literal? #\b) #t
> (ascii-char-literal? #\6) #t
> (ascii-char-literal? #\?) #t
> (ascii-char-literal? #\space) #t
> (char? #\λ) #t
> (ascii-char-literal? #\λ) #f
3.2.6 Misc Compiler Helpers
procedure
(make-begin is t) → tail
is : (listof effect) t : tail
Assumes that t has already been constructed using make-begin.
> (make-begin '() '(halt 5)) '(halt 5)
> (make-begin '() '(begin (halt 5))) '(begin (halt 5))
> (make-begin '((set! rax 5)) '(begin (halt 5))) '(begin (set! rax 5) (halt 5))
> (make-begin '((set! rax 5)) '(begin (begin (halt 5)))) '(begin (set! rax 5) (begin (halt 5)))
> (make-begin '((begin (set! rax 5))) '(begin (halt 5))) '(begin (set! rax 5) (halt 5))
> (make-begin '((begin (begin (set! rax 5)))) '(begin (halt 5))) '(begin (set! rax 5) (halt 5))
> (make-begin '((begin (begin (set! rax 5)))) '(begin (begin (halt 5)))) '(begin (set! rax 5) (begin (halt 5)))
> (make-begin '((begin (begin (set! rax 5)))) (make-begin '() (make-begin '() '(halt 5)))) '(begin (set! rax 5) (halt 5))
procedure
(make-begin-effect is) → effect
is : (listof effect)
> (make-begin-effect '()) '(begin)
> (make-begin-effect '((begin (set! rax 5)))) '(begin (set! rax 5))
> (make-begin-effect '((set! rax 5))) '(begin (set! rax 5))
> (make-begin-effect '((begin (begin (set! rax 5))) (begin (set! rax 5)))) '(begin (set! rax 5) (set! rax 5))
procedure
(check-assignment p) → any/c
p : any/c
(let ([loc? (or/c register? fvar?)]) (info/c (assignment ((aloc? loc?) ...)) (conflicts ((aloc? (aloc? ...)) ...)) (locals (aloc? ...))))
Check that the given assignment is sound with respect to the conflicts graph and complete with respect the locals set. Returns p if so, or raises an error.
To be language agnostic, it doesn’t actually check that the program is valid and only depends on the info field.
> (check-assignment `(module ((locals (v.1 w.2 x.3 y.4 z.5 t.6 p.1)) (conflicts ((p.1 (z.5 t.6 y.4 x.3 w.2)) (t.6 (p.1 z.5)) (z.5 (p.1 t.6 w.2 y.4)) (y.4 (z.5 x.3 p.1 w.2)) (x.3 (y.4 p.1 w.2)) (w.2 (z.5 y.4 p.1 x.3 v.1)) (v.1 (w.2)))) (assignment ((v.1 r15) (w.2 r8) (x.3 r14) (y.4 r9) (z.5 r13) (t.6 r14) (p.1 r15))))))
'(module
((locals (v.1 w.2 x.3 y.4 z.5 t.6 p.1))
(conflicts
((p.1 (z.5 t.6 y.4 x.3 w.2))
(t.6 (p.1 z.5))
(z.5 (p.1 t.6 w.2 y.4))
(y.4 (z.5 x.3 p.1 w.2))
(x.3 (y.4 p.1 w.2))
(w.2 (z.5 y.4 p.1 x.3 v.1))
(v.1 (w.2))))
(assignment
((v.1 r15) (w.2 r8) (x.3 r14) (y.4 r9) (z.5 r13) (t.6 r14) (p.1 r15)))))
> (check-assignment `(module ((locals (v.1 w.2 x.3 y.4 z.5 t.6 p.1)) (conflicts ((p.1 (z.5 t.6 y.4 x.3 w.2)) (t.6 (p.1 z.5)) (z.5 (p.1 t.6 w.2 y.4)) (y.4 (z.5 x.3 p.1 w.2)) (x.3 (y.4 p.1 w.2)) (w.2 (z.5 y.4 p.1 x.3 v.1)) (v.1 (w.2)))) (assignment ((w.2 r8) (x.3 r14) (y.4 r9) (z.5 r13) (t.6 r14) (p.1 r15)))))) check-assignment: Some locals not assigned homes:
homeless locals: v.1
> (check-assignment `(module ((locals (v.1 w.2 x.3 y.4 z.5 t.6 p.1)) (conflicts ((p.1 (z.5 t.6 y.4 x.3 w.2)) (t.6 (p.1 z.5)) (z.5 (p.1 t.6 w.2 y.4)) (y.4 (z.5 x.3 p.1 w.2)) (x.3 (y.4 p.1 w.2)) (w.2 (z.5 y.4 p.1 x.3 v.1)) (v.1 (w.2)))) (assignment ((v.1 r8) (w.2 r8) (x.3 r14) (y.4 r9) (z.5 r13) (t.6 r14) (p.1 r15)))))) check-assignment: Produced bad assignment:
w.2 and v.1 both assigned to r8
procedure
n : any/c f : procedure? ls : list?
> (map-n 2 (lambda (x y) (values (add1 x) (sub1 y))) '(1 2 3) '(1 2 3))
'(2 3 4)
'(0 1 2)
> (map-n 3 (lambda (x) (values (add1 x) (sub1 x) (* 2 x))) '(1 2 3))
'(2 3 4)
'(0 1 2)
'(2 4 6)
procedure
f : procedure? ls : list?