-(* README
- * Author: Frank Pfenning <fp@cs.cmu.edu>
- *)
-
------------------------------------------------------------------------
-Welcome to 15-411 F08!
------------------------------------------------------------------------
-
-This is some starter code for the L1 compiler you have to build for
-the Lab1. It contains a lexer, parser, translator, and even a code
-generator, except that the code generator creates pseudo assembly
-language with fictitious instructions and an unlimited number of
-registers. We took some care to use good style (according to the
-instructor); you may consider this a model for your own coding. Feel
-free to modify any and all of this code as you see fit.
-
-Bug reports to the instructor Frank Pfenning <fp@cs.cmu.edu> are
-particularly welcome and will be noted in the extra credit category.
-
------------------------------------------------------------------------
-SML Notes
------------------------------------------------------------------------
-There are many different compilers for SML, perhaps the most
-popular ones are
-
- SML/NJ -- http://www.smlnj.org/
- MLton -- http://www.mlton.org/
- Poly/ML -- http://www.polyml.org/
-
-In this class we will be using SML/NJ v110.59. Please make sure your
-code compiles under specifically this version on the lab machines
-where it is the default and can be invoked simply with "sml" in a
-shell.
-
-If you develop your implementation on other machines, similar versions
-of SML/NJ are likely to be compatible, but you should certainly check
-your code on the lab machines.
-
-For (almost universal) Standard Basis Libraries, see
-http://www.standardml.org/Basis/index.html. Further resources, such
-as documentation for ML-Lex and ML-Yacc, and documentation for the SML/NJ
-specific libraries which are used in the starter code, can be found at
-
- http://www.cs.cmu.edu/~fp/courses/15411-f08/resources.html
-
-------------------------------------------------------------------------
-Source Files
-------------------------------------------------------------------------
-The following are the source files for the L1 compiler
-
-README -- this file
-
-Makefile -- makefile for the compiler
- For a quick test
-
- % make l1c (generates file bin/l1c.heap.<os-tag>)
- % bin/l1c --verbose ../tests/test1.c
-
- should generate ../tests/test1.s in pseudo assembly
-
- % make clean (removes generated files)
- % make TAGS (creates file TAGS, for Emacs tags commands)
-
-compile-l1c.sml -- SML commands that will create bin/l1c.heap.<os-tag>
-bin/l1c -- the script that will run the exported SML heap
-
-sources.cm -- lists all source files, including libraries,
- and lexer and grammar specifications
- For a quick test
-
- % sml
- - CM.make "sources.cm";
- - Top.test "--verbose ../tests/test1.c";
-
- should generate ../tests/test1.s in pseudo assembly
-
-parse/ast.sml -- definition and printer for abstract syntax trees (AST's)
-parse/l1.lex -- L1 lexer
-parse/l1.grm -- L1 grammar
-parse/parse.sml -- L1 parser
-parse/parsestate.sml -- L1 parser support for error messages
-
-type/typechecker.sml -- (trivial) type-checker for AST
-
-trans/temp.sml -- functions to generate and track temp's
-trans/tree.sml -- definition and pretty printer for IR trees
-trans/trans.sml -- translation from AST to IR trees
-
-codegen/assem.sml -- pseudo assembly format for this starter code
-codegen/codegen.sml -- pseudo code generator
-
-util/errormsg.sml -- error message utilities
-util/flag.sml -- library for defining flags
-util/mark.sml -- library for tracking source file positions
-util/safe-io.sml -- I/O utilities
-util/symbol.sml -- symbol table library
-util/word32.sml -- machine word utilities for two's complement interpretation
-
-top/top.sml -- top level function for export to binary and testing
-
-------------------------------------------------------------------------
-Debugging Hints
-------------------------------------------------------------------------
-You can use
-
- - Top.test "--verbose --dump-ast --dump-ir --dump-assem file.l1";
-
-to print information from all the phases of the current compiler.
-
-If you want to see the internal representations, you can call directly
-on SML's top level:
-
- - val ast = Parse.parse "file.l1";
- - val ir = Trans.translate ast;
- - val assem = Codegen.codgen ir;
-
-This will use SML's internal printing function to print the data
-structures. However, not everything will show.
-
-"-" means that the type is opaque. Sometimes you can replace an opaque
- signature ascription ":>" with a transparent one ":" to see the info.
- For reasons of general hygiene, however, you should change it back
- before handing in.
-
-"#" means that the printing depth is exceeded. Use
-
- - Control.Print.printDepth := 100;
-
- to increase the depth if you need to see more.
-
-"..." means that the printing length is exceeded. Use
-
- - Control.Print.printLength := 1000;
-
- to increase the length if you need to see more.
-
-------------------------------------------------------------------------
-Library Hints
-------------------------------------------------------------------------
-See util/symbol.sml for some uses of libraries provided with SML/NJ
-(and some other SML implementations). BinaryMapFn and
-BinarySetFn are likely of general use. To see their interface,
-you can check http://www.smlnj.org/doc/smlnj-lib/Manual/toc.html.
-I found binary maps and binary sets to be occasionally helpful.
+README
+------
+
+This compiler is a big long chain of modules that transform l2 code into
+x86_64 assembly.
+
+These modules include:
+
+ * The parser. The parser was mainly brought in from lab 1, and mainly
+ just a straight-forward extension of the l1 parser. We continued to
+ mark expressions, and pass marking through as needed so that we could
+ produce reasonable error messages up through translation stage. We
+ introduced all needed grammar with no shift/reduce conflicts, but for
+ one in the IF/ELSE stage, with a construct such as:
+ if (x)
+ if (z)
+ a
+ else
+ b
+ (indentation intentionally omitted; there are at least two legitimate
+ ways to parse that!)
+ * The typechecker. This module was completely rewritten since lab1. Three
+ checks are instituted: a check to see if the program has misplaced break
+ or continue statements, a check to see that the program returns in all
+ control paths, and a check that all variables are initialized in all
+ control paths before usage.
+
+ The return and break check is essentially implemented per the rules; the
+ only thing of interest for the variable initialization routine is that
+ there is a helper that computes all assigns to extend contexts from
+ block contents. It was determined that returning 2 accumulators from
+ varcheck would lead to returning 17 accumulators, which would lead to
+ 1984193248148132 accumulators; and 238547854478 accumulators leads to
+ the foldl, and foldl leads to anger, anger leads to hate, and hate leads
+ to the Dark Side.
+ * The translator is mainly intact; it was determined that the IR will have
+ basic control flow instructions of labels, jumps, and jump if not
+ conditional, which we deemed sufficient to implement all forms of l2
+ control.
+ * The munch module was fully rewritten; we now munch directly to
+ pseudo-x86_64, in that it has temporaries allowed in it as well. We
+ believe that this allows us to generate much more optimal code than
+ munching into three op, converting from three to two, then converting
+ two to x86_64; in particular, we can run liveness on the x86_64
+ directions directly, which makes translation significantly easier (we do
+ not have to worry about mashing necessary registers).
+ * The liveness analyzer was also fully rewritten; it is now fully
+ def-use-succ, giving us very pretty rules, and a lot of very ugly code
+ to mash them together. Luckily, the ugly code need not be touched ever
+ again.
+ * The grapher had about 4 characters of inconsequential change that had
+ the useful property of speeding it up by two orders of magnitude. You
+ need not worry about it.
+ * The orderer and colorer had no changes.
+ * A new module was introduced -- in particular, the solidifier. The
+ solidifier takes pseudo-x86_64 that is annotated with register locations
+ and emits needed spill and unspill operations to get everything into
+ real registers that the x86_64 chips can access.
+ * The peepholer remains pretty simple; redundant moves are optimized out,
+ and hence the code size drops by a factor of 1.5 or so.
+ * The stringifier is of no interest to you, for it does real things that
+ interact with the real world, and that is not of interest to people who
+ write in ML.
+
+We believe that it's fully functional; we have not had a case in quite some
+time that caused us to generate incorrect code (at least, when we should
+generate code). The internal debug mechanisms are very useful; often a
+line-by-line examination of dumps after each translation phase can narrow
+bugs down into single lines of ML code.