| 1 | README |
| 2 | ------ |
| 3 | |
| 4 | This compiler is a big long chain of modules that transform l2 code into |
| 5 | x86_64 assembly. |
| 6 | |
| 7 | These modules include: |
| 8 | |
| 9 | * The parser. The parser was mainly brought in from lab 1, and mainly |
| 10 | just a straight-forward extension of the l1 parser. We continued to |
| 11 | mark expressions, and pass marking through as needed so that we could |
| 12 | produce reasonable error messages up through translation stage. We |
| 13 | introduced all needed grammar with no shift/reduce conflicts, but for |
| 14 | one in the IF/ELSE stage, with a construct such as: |
| 15 | if (x) |
| 16 | if (z) |
| 17 | a |
| 18 | else |
| 19 | b |
| 20 | (indentation intentionally omitted; there are at least two legitimate |
| 21 | ways to parse that!) |
| 22 | * The typechecker. This module was completely rewritten since lab1. Three |
| 23 | checks are instituted: a check to see if the program has misplaced break |
| 24 | or continue statements, a check to see that the program returns in all |
| 25 | control paths, and a check that all variables are initialized in all |
| 26 | control paths before usage. |
| 27 | |
| 28 | The return and break check is essentially implemented per the rules; the |
| 29 | only thing of interest for the variable initialization routine is that |
| 30 | there is a helper that computes all assigns to extend contexts from |
| 31 | block contents. It was determined that returning 2 accumulators from |
| 32 | varcheck would lead to returning 17 accumulators, which would lead to |
| 33 | 1984193248148132 accumulators; and 238547854478 accumulators leads to |
| 34 | the foldl, and foldl leads to anger, anger leads to hate, and hate leads |
| 35 | to the Dark Side. |
| 36 | * The translator is mainly intact; it was determined that the IR will have |
| 37 | basic control flow instructions of labels, jumps, and jump if not |
| 38 | conditional, which we deemed sufficient to implement all forms of l2 |
| 39 | control. |
| 40 | * The munch module was fully rewritten; we now munch directly to |
| 41 | pseudo-x86_64, in that it has temporaries allowed in it as well. We |
| 42 | believe that this allows us to generate much more optimal code than |
| 43 | munching into three op, converting from three to two, then converting |
| 44 | two to x86_64; in particular, we can run liveness on the x86_64 |
| 45 | directions directly, which makes translation significantly easier (we do |
| 46 | not have to worry about mashing necessary registers). |
| 47 | * The liveness analyzer was also fully rewritten; it is now fully |
| 48 | def-use-succ, giving us very pretty rules, and a lot of very ugly code |
| 49 | to mash them together. Luckily, the ugly code need not be touched ever |
| 50 | again. |
| 51 | * The grapher had about 4 characters of inconsequential change that had |
| 52 | the useful property of speeding it up by two orders of magnitude. You |
| 53 | need not worry about it. |
| 54 | * The orderer and colorer had no changes. |
| 55 | * A new module was introduced -- in particular, the solidifier. The |
| 56 | solidifier takes pseudo-x86_64 that is annotated with register locations |
| 57 | and emits needed spill and unspill operations to get everything into |
| 58 | real registers that the x86_64 chips can access. |
| 59 | * The peepholer remains pretty simple; redundant moves are optimized out, |
| 60 | and hence the code size drops by a factor of 1.5 or so. |
| 61 | * The stringifier is of no interest to you, for it does real things that |
| 62 | interact with the real world, and that is not of interest to people who |
| 63 | write in ML. |
| 64 | |
| 65 | We believe that it's fully functional; we have not had a case in quite some |
| 66 | time that caused us to generate incorrect code (at least, when we should |
| 67 | generate code). The internal debug mechanisms are very useful; often a |
| 68 | line-by-line examination of dumps after each translation phase can narrow |
| 69 | bugs down into single lines of ML code. |