d2c was written as a research project. As such, it contains some very sophisticated optimization routines.
For some unknown reason, all of the optimization code found
in the compiler is contained in a single module named
Cheese. Furthermore, this module does not
actually provide an entry point for optimizing a chunk of code.
Instead, it adds methods to the generic function
optimize-component, which is exported by the
Builder-Interface module. This interface
is somewhat less than amusing.
This entry-point weirdness has been cleaned up on the OPTIMIZER_HACKING branch in CVS.
To study this module, begin by reading the method
optimize-component in the file
cheese.dylan. This routine is described in
the Section called Control Flow.
The optimizer's data structures are defined in a number of modules:

The intermediate representation. The optimizer manipulates code in the format defined by the Flow module and the Front module. In the code, this is referred to as FER, which is short for "Front End Representation".

The builder interface. The optimizer normally uses the Builder-Interface module to create new FER structures. If you're familiar with the underlying FER classes, you can pick up the builder interface as you read through the optimizer. Look for calls to make-unknown-call and related functions; they appear throughout the code (see the sketch below).

The optimizer will make a lot more sense if you study the documentation and code for these modules first.
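To give a rough feel for the builder style, here is a toy sketch. The class and function below are simplified stand-ins, not d2c's actual Builder-Interface signatures; conceptually, a builder function like make-unknown-call just allocates a fresh FER operation node and wires up its operands:

    // Simplified stand-ins; the real FER classes carry far more state.
    define class <operation> (<object>)
    end class;

    define class <unknown-call> (<operation>)
      slot function-ref :: <object>, init-keyword: function:;
      slot arguments :: <sequence>, init-keyword: arguments:;
    end class;

    // Builder-style constructor: build the node and hand it back, ready
    // for the caller to splice into the control-flow graph.
    define method make-unknown-call
        (func :: <object>, #rest args) => (call :: <unknown-call>)
      make(<unknown-call>, function: func, arguments: args)
    end method;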
The optimizer has two modes: normal mode and simplification mode. Normal mode is invoked by the compiler driver to optimize top-level functions. Simplification mode is called from several places in the compiler to tidy up newly-defined inline functions. This means inline functions get optimized twice: once when they're defined, and a second time when they're inlined.
The optimizer's main entry point is the function optimize-component. This function contains a weird driver loop. The goal of the loop is to drain two queues.

The initial-variables queue is filled by the FER builder (and by certain routines in the optimizer). At first, the queue contains all the variables in the component. The optimizer drains the queue by passing each variable to the function maybe-convert-to-ssa, which tries to convert the variable to SSA form.
SSA variables are in "static single assignment" form: each SSA variable is defined by a single assignment, and never changes in value. Because the values of these variables never change, the optimizer can make lots of useful assumptions about them.
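Here's the difference in ordinary Dylan terms. (This is source-level pseudocode for intuition only; the optimizer actually works on FER, not source code.)

    // Two assignments: x can never be an SSA variable.
    define method before-ssa (f :: <function>, g :: <function>) => (r :: <object>)
      let x = f();     // first definition of x
      x := g(x);       // second definition; x's value changes over time
      x
    end method;

    // After conversion, each value gets its own single-assignment
    // variable, so the optimizer knows each one never changes.
    define method after-ssa (f :: <function>, g :: <function>) => (r :: <object>)
      let x0 = f();
      let x1 = g(x0);
      x1
    end method;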
maybe-convert-to-ssa isn't very smart. It only does the most trivial kind of conversion: it replaces <initial-variable>s that have exactly one definition with <ssa-variable>s. (There are good algorithms for SSA conversion which can convert every variable. If we need these, check the literature.)
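The logic amounts to something like this minimal sketch; the real <initial-variable> and <ssa-variable> classes carry much more information:

    define class <initial-variable> (<object>)
      slot definitions :: <list>, init-keyword: definitions:;
    end class;

    define class <ssa-variable> (<object>)
      slot definition :: <object>, init-keyword: definition:;
    end class;

    define method maybe-convert-to-ssa
        (var :: <initial-variable>) => (result :: <object>)
      // Only the trivial case: a variable with exactly one definition
      // can never change value, so it is already morally in SSA form.
      if (var.definitions.size = 1)
        make(<ssa-variable>, definition: var.definitions.first)
      else
        var    // multiple assignments: leave it alone
      end if
    end method;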
Once the initial-variables queue has been drained, the optimizer starts work on the reoptimize-queue. This queue was initially set up during conversion (by the FER builder, I think). It contains everything we currently need to optimize.
The optimizer pops the first item off the front of the queue and passes it to the generic function optimize. This function knows how to do special-purpose optimizations on each subclass of <queueable-mixin>. These subclasses include expressions, assignments, and certain basic blocks.
The generic function optimize is implemented by methods strewn throughout the optimizer. Each of these methods knows how to simplify a particular kind of queueable object. The method for <if-region>, for example, checks to see if the test condition is a compile-time value, and if so, replaces the <if-region> with the appropriate branch. Other methods do compile-time evaluation of primitives, dispatch generic functions, and optimize slot access.
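Schematically, the pattern looks like this. These are toy classes; d2c's real optimize methods also take the component as an argument and rewrite the FER dependency graph in place rather than returning a replacement:

    define abstract class <queueable-mixin> (<object>)
    end class;

    define class <if-region> (<queueable-mixin>)
      slot test-value :: <object>, init-keyword: test-value:;  // #"unknown" or a literal
      slot then-body :: <object>, init-keyword: then-body:;
      slot else-body :: <object>, init-keyword: else-body:;
    end class;

    define generic optimize (thing :: <queueable-mixin>) => (replacement :: <object>);

    define method optimize (region :: <if-region>) => (replacement :: <object>)
      // If the test is a compile-time value, keep only the branch that
      // will actually run.
      select (region.test-value)
        #"unknown" => region;            // can't decide at compile time
        #f         => region.else-body;  // statically false
        otherwise  => region.then-body;  // any other literal counts as true
      end select
    end method;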
The methods on
optimize may or may not change
a particular queueable object. But if a given method
does make a change, it may open up new
opportunities to improve various other queueables. So when we
modify a queueable, we must generally reoptimize everything that
depends on it.
To reoptimize dependents, we can't use recursive calls to optimize. Instead, we must re-insert the dependent into the reoptimize-queue (remember that?) using the function reoptimize. If the dependent is already in the queue, reoptimize does nothing. But if the dependent isn't already in the queue, reoptimize adds it to the very front.
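The behavior of reoptimize amounts to this sketch. In d2c the queue actually lives on the component and is threaded through the queueables themselves; here a global list stands in for it:

    define variable *reoptimize-queue* :: <list> = #();

    define class <queueable> (<object>)
      slot queued? :: <boolean> = #f;
    end class;

    define method reoptimize (dependent :: <queueable>) => ()
      unless (dependent.queued?)   // already queued: do nothing
        dependent.queued? := #t;
        // Not queued yet: push it on the very front so it is looked
        // at before older entries.
        *reoptimize-queue* := pair(dependent, *reoptimize-queue*);
      end unless;
    end method;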
Once we've optimized a single queueable, we return to the top-level driver. At this point, we have to drain the initial-variables queue again (which will almost always be empty). Once this is done, we pop the next item off of the reoptimize-queue and make another call to optimize.
Once both queues are finally empty, the driver can move on. It first applies a number of "simplifications": control-flow cleanups, common-subexpression elimination, and things like that. Any of these simplifications might add something to one of our two queues. If this happens, we need to drain our queues again and reapply our simplifications from the beginning. Fortunately, this backtracking happens very rarely, and it doesn't trigger any kind of global reoptimization.
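Putting the pieces together, the driver loop has roughly this shape. The four helpers are hypothetical names for the passes described above, not real d2c functions:

    define method drive-optimizations (component :: <object>) => ()
      block (done)
        while (#t)
          // Alternate between the two queues until both are empty.
          until (queues-empty?(component))
            drain-initial-variables(component);  // maybe-convert-to-ssa on each
            optimize-next-queueable(component);  // pop reoptimize-queue, call
                                                 // optimize (no-op if empty)
          end until;
          // A simplification may queue new work; if so, start over.
          apply-simplifications(component);
          if (queues-empty?(component))
            done();
          end if;
        end while;
      end block;
    end method;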
The next step depends on our simplify-only? argument. If this argument is true, we're doing "pre-optimization" of an inline function, and we should stop at this point. But if simplify-only? is false, we're optimizing a regular function and need to get it ready for the back end.
To prepare a function for the back end, we need to add type checks, replace certain "placeholder" queueables, do environment analysis to find closure variables, and build the external entry points for local functions.
Each of these stages works like a simplification. It may add items to our two queues. If it does, we have to go drain those queues, reapply all our simplifications, and so forth.
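So the tail end of optimize-component looks schematically like this. The stage names are hypothetical stand-ins for the real passes; see the actual code for the details:

    define method finish-component
        (component :: <object>, simplify-only? :: <boolean>) => ()
      unless (simplify-only?)  // pre-optimizing an inline function? stop early
        for (stage in vector(add-type-checks, replace-placeholders,
                             analyze-environments, build-external-entries))
          stage(component);
          // Each stage may queue new work, so drain the queues and
          // reapply the simplifications before the next stage.
          drive-optimizations(component);
        end for;
      end unless;
    end method;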
Dylan, like Lisp, uses function calls to represent most language operations. Arithmetic operators are functions. Slot getters and setters are functions. Even array access is done using functions. Given this, it's clear that the optimizer needs to spend a lot of time improving function calls.
Function calls are represented in FER
using subclasses of
<abstract-call>. You can find
these classes in the file
front/front.dylan. (Go ahead and read the
comments, they're actually informative.)
There are three subclasses of <abstract-call>. Each represents a different kind of function call. Calls to specific, known methods are represented by <known-call>. Calls to a generic function are represented by subclasses of <general-call>. (The most common of these is <unknown-call>.) Calls which are known to be erroneous are represented by <error-call>.
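A skeletal version of the hierarchy, for orientation; the real definitions, with many more slots, are in front/front.dylan:

    define abstract class <abstract-call> (<object>)
      slot callee :: <object>, init-keyword: callee:;  // simplified operand slot
    end class;

    // Kind 1: the exact method being called is known at compile time.
    define class <known-call> (<abstract-call>)
    end class;

    // Kind 2: a call to a generic function, not yet dispatched.
    define abstract class <general-call> (<abstract-call>)
    end class;

    define class <unknown-call> (<general-call>)  // the common case
    end class;

    // Kind 3: a call that is statically known to be an error.
    define class <error-call> (<abstract-call>)
    end class;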
Most function calls start life as an instance of
<unknown-call>. These are processed by the
optimize method for
<unknown-call>. (Look in
callopt.dylan.) This method does some
complicated dispatching. But in several of the most common cases,
we wind up calling optimize-generic.
optimize-generic tries two
different optimization strategies. First, it looks for an
appropriate transformer. Transformers know
how to optimize specific generic functions. If no transformer can
be found, or the transformer declines,
optimize-generic tries to do compile-time dispatch
instead. This is done by looking for applicable methods. If
there's a single applicable method, then we can replace the
<general-call> with a <known-call>.
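Reusing the toy call classes from the sketch above, the two strategies fit together roughly like so. Here *transformers* and applicable-methods are hypothetical stand-ins for d2c's transformer registry and compile-time dispatch machinery:

    define variable *transformers* :: <table> = make(<table>);

    define method optimize-generic
        (call :: <unknown-call>) => (replacement :: <object>)
      // Strategy 1: a transformer registered for this generic function.
      // A transformer returns #f to decline.
      let transformer = element(*transformers*, call.callee, default: #f);
      let transformed = transformer & transformer(call);
      if (transformed)
        transformed
      else
        // Strategy 2: compile-time dispatch.  With exactly one
        // applicable method, the <general-call> becomes a <known-call>.
        let methods = applicable-methods(call);
        if (methods.size = 1)
          make(<known-call>, callee: methods.first)
        else
          call   // leave it for runtime dispatch
        end if
      end if
    end method;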
Simpler optimizations are applied to
<known-call>s. But again, it's possible to
register a transformer. This transformer will be applied to known
calls with exactly matching function signatures.
Most of the transformers live in
trans.dylan. There's also some black magic in
here to handle calls to
instance? by converting them into <instance?> placeholders, and changing them back into real code later, when placeholders are replaced. This was probably done to hide these calls during simplification.