COSC4400 – Compiler Construction

Assignment #2

Due: Monday, Feb 11, 2019

GOAL: Modify the translator in InToPostfix to produce stack machine code as output and expand the grammar to include certain statements. In particular, your new translator should process statements which are defined by the following grammar (the productions for expr are the ones InToPostfix currently uses):

stmt → id := expr

| if expr then stmt

| while expr do stmt

| begin stmt_list end

stmt_list → stmt_list ; stmt

| stmt

[ As we discussed in class, the productions given for stmt_list won’t work with a predictive parser. Here’s a set that will:

stmt_list → stmt stmt_rest

stmt_rest → ; stmt stmt_rest

| ε

]

Note, your program should now read a series of statements, not a series of expressions. So, input like 3+4 will no longer be valid. You would need to say something like

x := 3+4

STACK MACHINE: We will generate code for a pretend stack machine whose machine language is describe here. This machine is based on one from the Aho, Sethi, and Ullman book.

The basic stack manipulation instructions are:

push num push numeric constant num onto the stack

rvalue addr push contents of data location addr

lvalue addr push address of data location addr

pop throw away value on the top of the stack

store the r-value on top is placed in the l-value below

it and both are popped

copy push a copy of the top value on the stack

We assume there is a machine instruction for each arithmetic operator. These instructions pop the top two values off of the stack, perform the operation and push the result back onto the stack.

Data locations (addr) are simply identified by symbolic names like “x” or “day” rather than numeric addresses.

We also have the following control flow instructions:

label L target of jumps to L ; has no other effect

goto L next instruction is taken from statement with label L

gofalse L pop the top value; jump if it is zero

gotrue L pop the top value; jump if it is nonzero

For example, the input statement:

day := (1461*y) div 4 + (153*m + 2) div 5 + d

should produce the code:

lvalue day push 2

push 1461 +

rvalue y push 5

* div

push 4 +

div rvalue d

push 153 +

rvalue m store

METHOD: You will need to create new tokens for the additional keywords (in Lexan). The assignment

operator “:=” should be treated as one token (representing the store operation in machine code).

In order to generate the correct code for the statements, you need to decide what that code should look like. For instance, a semantic action for the if statement might be something like:

stmt → if expr then stmt1 { out := newlabel();

stmt.code := expr.code ||

‘gofalse’ out ||

stmt1.code ||

‘label’ out }

where the symbol “||” means to concatenate code fragments and newlabel() is a method which returns a label value that hasn’t been used before. This could be a routine with a simple counter. The labels could be simply numbers or strings like “L5”, “L14”.

This could be implemented using the following translation scheme:

stmt → if

expr {out := newlabel(); emit(‘gofalse’, out); }

then

stmt1 { emit(‘label’, out); }

You should work out similar schemes for the while and begin-end statements.

HAND-IN: Submit the code (.java files) for your modified translator using D2L by 11:59PM Monday. Be sure to include comments saying who wrote the code and why.