README
------
-This compiler is a big long chain of modules that transform l2 code into
+This compiler is a big long chain of modules that transform l3 code into
x86_64 assembly.
-These modules include:
+Here is a breakdown of the modules and changes from l2:
+
+ * The parser. The parser was mainly brought in from lab 2, and mainly
+ just a straight-forward extension of the l2 parser. We added the
+ ability to parse functions, function calls, and variable declarations.
+
+ * The typechecker. This module is mostly the same as that from l2. It
+ performs function-related typechecking as well now, such as ensuring
+ that the correct number of arguments is supplied in a function call,
+ that there are no multiple definitions of functions, and that there is a
+ main function that takes only one argument.
+
+ * The translator was extended with a CALL.
+
+ * The munch module was also extended with the ability to munch CALL; a
+ major improvement was made when we realized we could determine what
+ expressions had effects and what had fixed registers. Any expressions
+ that use no fixed registers and have no effects can be reordered during
+ evaluation of a function call's arguments. This enabled us to save a
+ bunch of register-register moves. Saving the caller save registers is
+ left to the liveness analyzer, which we believe results in substantially
+ better code than saving and restoring all caller saves.
+
+ * The liveness analyzer remains in more or less the same form, but with
+ substantial performance and cleanliness improvements by replacing lists
+ with maps (via BinaryMapFn) and sets (via ListSetFn). Also, a bug of
+ incredible type A was discovered through much pain and suffering, and
+ promptly fixed; it involved not realizing that a def on one line led to
+ an interference on any succeeding lines. Somehow we got away with this
+ for lab 2. Otherwise, we just explicitly state rules to generate
+ def/use/succ predicates which we then iterate over to find a fixed point
+ for livenesses using the standard rules.
+
+ * The grapher was changed to use the binary map and list set for
+ performance boosts (needed to pass certain large tests, like
+ pine-tree_print.l3). It generates an interference graph from a list of
+ livenesses at each source line.
+
+ * The color orderer had no changes.
+
+ * The coloring module was slightly updated to recognize more fixed-color
+ registers. It implements a greedy coloring algorithm.
+
+ * The solidifier was modified to change the callee save system. Now we
+ only save the registers we need to. This improvement was pushed by
+ excessively slow execution time on one of the tests.
+
+ * The peepholer is upgraded somewhat; it now eliminates more redundant
+ instructions (such as adding/subtracting 0).
- * The parser. The parser was mainly brought in from lab 1, and mainly
- just a straight-forward extension of the l1 parser. We continued to
- mark expressions, and pass marking through as needed so that we could
- produce reasonable error messages up through translation stage. We
- introduced all needed grammar with no shift/reduce conflicts, but for
- one in the IF/ELSE stage, with a construct such as:
- if (x)
- if (z)
- a
- else
- b
- (indentation intentionally omitted; there are at least two legitimate
- ways to parse that!)
- * The typechecker. This module was completely rewritten since lab1. Three
- checks are instituted: a check to see if the program has misplaced break
- or continue statements, a check to see that the program returns in all
- control paths, and a check that all variables are initialized in all
- control paths before usage.
-
- The return and break check is essentially implemented per the rules; the
- only thing of interest for the variable initialization routine is that
- there is a helper that computes all assigns to extend contexts from
- block contents. It was determined that returning 2 accumulators from
- varcheck would lead to returning 17 accumulators, which would lead to
- 1984193248148132 accumulators; and 238547854478 accumulators leads to
- the foldl, and foldl leads to anger, anger leads to hate, and hate leads
- to the Dark Side.
- * The translator is mainly intact; it was determined that the IR will have
- basic control flow instructions of labels, jumps, and jump if not
- conditional, which we deemed sufficient to implement all forms of l2
- control.
- * The munch module was fully rewritten; we now munch directly to
- pseudo-x86_64, in that it has temporaries allowed in it as well. We
- believe that this allows us to generate much more optimal code than
- munching into three op, converting from three to two, then converting
- two to x86_64; in particular, we can run liveness on the x86_64
- directions directly, which makes translation significantly easier (we do
- not have to worry about mashing necessary registers).
- * The liveness analyzer was also fully rewritten; it is now fully
- def-use-succ, giving us very pretty rules, and a lot of very ugly code
- to mash them together. Luckily, the ugly code need not be touched ever
- again.
- * The grapher had about 4 characters of inconsequential change that had
- the useful property of speeding it up by two orders of magnitude. You
- need not worry about it.
- * The orderer and colorer had no changes.
- * A new module was introduced -- in particular, the solidifier. The
- solidifier takes pseudo-x86_64 that is annotated with register locations
- and emits needed spill and unspill operations to get everything into
- real registers that the x86_64 chips can access.
- * The peepholer remains pretty simple; redundant moves are optimized out,
- and hence the code size drops by a factor of 1.5 or so.
* The stringifier is of no interest to you, for it does real things that
interact with the real world, and that is not of interest to people who
write in ML.
-We believe that it's fully functional; we have not had a case in quite some
-time that caused us to generate incorrect code (at least, when we should
-generate code). The internal debug mechanisms are very useful; often a
-line-by-line examination of dumps after each translation phase can narrow
-bugs down into single lines of ML code.
+ * Our internal representation of x86 assembly was changed. In particular,
+ conditional sets and jumps are now SETcc of cc * oper and Jcc of cc *
+ oper, instead of a separate SET or J for each condition code. This
+ simplifies other parts of the code as well.
+
+We believe that it is fully functional. We generate correct code whenever we
+are supposed to, and we pass every test that we can lay our hands on
+(including all of l2, and one of ours that killed the reference compiler).
+Of course, our last bug was caught by only one failing test, so...
\ No newline at end of file