GOAL: Write a working compiler for the BabyGustave subset of the Gustave language. The compiler should use your earlier assignments to turn source code into syntax trees, then translate those trees into ARM assembly code that can be assembled and run on the Raspberry Pis. The output of your compiler should therefore be ARM assembly code printed on standard output (the screen).
METHOD: Chapters 5-12 in Appel describe one way to write a
compiler. However, Appel's approach is more complicated than we have
time to absorb this semester. Consequently, I am suggesting a
simpler approach. Rather than translating the syntax trees into intermediate
representation trees and then translating those into assembly instructions,
we will go directly from the syntax trees to assembly instructions.
An example of this approach has been discussed in class, and the code
is available on morbius in the directory ~mike/cosc4400/initcomp.
Here's my best advice on how to work on this:
Extend CodeGenPkg/Coder.java to handle all of the features of BabyGustave.
Start small and try to get pieces working as soon as possible. For instance, if you write the Coder.visit() methods for Program, RootClass, PutInteger, and NumberExpr, you should be able to compile and run the program:
    class ONE
    creation make
    method make is
    do
        io.put_integer(5)
    end
    end -- ONE
If you then add code for PutString, you can put a newline after the 5, and you can compile hello_world.e. Add features one or two at a time, then test and debug them before adding more.
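As a rough illustration of that first milestone, here is a minimal Java sketch of going straight from a syntax tree to assembly text. Everything named here is an assumption made for illustration: the classes ending in Sketch stand in for your own tree and Coder classes, the fmt_int label and the call to printf are one possible way to print an integer, and the rule that an expression leaves its value in r0 is just a convenient convention. The code in initcomp may be organized quite differently.

```java
// Hypothetical stand-ins for the syntax tree classes from your earlier
// assignments; the real classes will have different names and fields.
class NumberExprSketch {
    final int value;
    NumberExprSketch(int value) { this.value = value; }
}

class PutIntegerSketch {
    final NumberExprSketch arg;
    PutIntegerSketch(NumberExprSketch arg) { this.arg = arg; }
}

// Hypothetical miniature of a Coder: each visit() appends assembly text.
class CoderSketch {
    private final StringBuilder out = new StringBuilder();

    private void emit(String line) { out.append(line).append('\n'); }

    // Assumed convention: an expression leaves its value in r0.
    void visit(NumberExprSketch n) {
        emit("    ldr r0, =" + n.value);
    }

    void visit(PutIntegerSketch p) {
        visit(p.arg);                  // value of the argument -> r0
        emit("    mov r1, r0");        // second printf argument: the integer
        emit("    ldr r0, =fmt_int");  // first printf argument: format string
        emit("    bl printf");
    }

    // Plays the role of Program/RootClass: wraps the body in a main routine.
    String compile(PutIntegerSketch body) {
        emit("    .data");
        emit("fmt_int: .asciz \"%d\"");
        emit("    .text");
        emit("    .global main");
        emit("main:");
        emit("    push {fp, lr}");     // save the return address
        visit(body);
        emit("    mov r0, #0");        // exit status 0
        emit("    pop {fp, pc}");      // return to the C runtime
        return out.toString();
    }
}
```

Calling `System.out.print(new CoderSketch().compile(new PutIntegerSketch(new NumberExprSketch(5))))` prints the assembly text on standard output, which is where the assignment expects it.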
I suggest implementing all of the statements and expressions without arrays first. You can then add arrays and test those features.
Arrays involve two steps: allocation and access. To allocate an array in the heap, we will use the C runtime library routine malloc(). malloc() takes one parameter (the number of bytes desired) and returns the address of a block of memory at least that large. In assembly, you call malloc by putting the number of bytes (which is 4 times the number of integer entries) into r0, calling malloc, and then reading the returned address from r0. This returned address is stored as the value of the variable representing the array.
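The allocation sequence might be emitted along these lines. This is only a sketch: ArrayAllocSketch and emitAlloc are hypothetical names, and it assumes the entry count has already been computed into r0 and that each variable lives at a data label, which may not match how your compiler stores variables.

```java
// Sketch only: the label-per-variable layout and the register choices
// are assumptions, not required conventions.
class ArrayAllocSketch {
    // Emits code that allocates an integer array whose entry count is
    // already in r0, then saves the returned address in the variable
    // stored at the (hypothetical) data label varLabel.
    static String emitAlloc(String varLabel) {
        StringBuilder out = new StringBuilder();
        out.append("    mov r0, r0, lsl #2\n");         // bytes = 4 * entries
        out.append("    bl malloc\n");                  // r0 = address of the block
        out.append("    ldr r1, =" + varLabel + "\n");  // r1 = address of the variable
        out.append("    str r0, [r1]\n");               // variable = block address
        return out.toString();
    }
}
```

If your variables live on the stack rather than at labels, only the last two emitted lines would change.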
To access an array location, you need to generate code that puts the value of the array variable (the base address) into a register, adds 4 times the index, and then uses the resulting address to access memory (a load if you are reading from the array, a store if you are assigning to it).
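A possible shape for the access code, again as a hedged sketch: ArrayAccessSketch and its methods are hypothetical, the index is assumed to be in r0 already, a value to be stored is assumed to be in r2, and the array variable is assumed to live at a data label.

```java
// Sketch only: register assignments and the label-per-variable layout
// are assumptions made for illustration.
class ArrayAccessSketch {
    // Shared prefix: r1 = base + 4 * index, with the index already in r0.
    private static String emitAddress(String varLabel) {
        return "    ldr r1, =" + varLabel + "\n"   // r1 = address of the variable
             + "    ldr r1, [r1]\n"                // r1 = base address of the array
             + "    add r1, r1, r0, lsl #2\n";     // r1 = base + 4 * index
    }

    // Reading a[i]: index in r0 on entry, element value in r0 on exit.
    static String emitLoad(String varLabel) {
        return emitAddress(varLabel) + "    ldr r0, [r1]\n";
    }

    // Assigning to a[i]: index in r0 and the new value in r2 on entry.
    static String emitStore(String varLabel) {
        return emitAddress(varLabel) + "    str r2, [r1]\n";
    }
}
```

The load and store paths share the address computation and differ only in the final instruction.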
Sample programs for testing are available here. Don't forget that you can (and should) write your own test programs as needed.