The Compiler-Optimize Library

d2c was written as a research project. As such, it contains some very sophisticated optimization routines.

The Cheese Module

For some unknown reason, all of the optimization code in the compiler is contained in a single module named Cheese. Furthermore, this module does not actually provide an entry point for optimizing a chunk of code. Instead, it adds methods to the generic function optimize-component, which is exported by the Builder-Interface module. This interface is somewhat less than amusing.

This entry-point weirdness has been cleaned up on the OPTIMIZER_HACKING branch in CVS.

To study this module, begin by reading the method optimize-component in the file cheese.dylan. This routine is described in the Section called Control Flow.

Data Formats

The optimizer's data structures are defined in a number of other modules. The optimizer will make a lot more sense if you study the documentation and code for those modules first.

Control Flow

The optimizer has two modes: normal mode and simplification mode. Normal mode is called from the driver in the Compiler-Main library to optimize top-level functions. Simplification mode is called from several places in the Compiler-Convert library to tidy up newly-defined inline functions. This means inline functions get optimized twice: once when they're defined, and a second time when they're inlined.

The optimizer's main entry point is the function optimize-component. This function contains a weird driver loop. The goal of the loop is to drain two queues: component.initial-variables and component.reoptimize-queue.

The initial-variables queue is filled by the FER builder (and by certain routines in optimize). At first, the queue contains all the variables in the component. The optimizer drains the queue by passing each variable to the function maybe-convert-to-ssa.

The function maybe-convert-to-ssa tries to convert each variable to SSA ("static single assignment") form. Each SSA variable is defined by exactly one assignment and never changes in value. Because the values of these variables never change, the optimizer can make lots of useful assumptions about them.

Unfortunately, maybe-convert-to-ssa isn't very smart. It only does the most trivial kind of conversion: it changes <initial-variable>s with exactly one <initial-definition> into <ssa-variable>s. (There are good algorithms for SSA conversion which can convert every variable. If we ever need them, check the literature.)
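
To make the trivial case concrete, here is a rough sketch of the test, using simplified stand-in classes rather than d2c's real FER classes (the actual method in cheese.dylan also has to rewrite every use of the variable, among other details):

    // Simplified stand-ins for the real FER classes.
    define class <initial-variable> (<object>)
      constant slot definitions :: <list>,
        required-init-keyword: definitions:;
    end class;

    define class <ssa-variable> (<object>)
      constant slot definition :: <object>,
        required-init-keyword: definition:;
    end class;

    // The trivial conversion: exactly one definition means the variable
    // can be rewritten in SSA form; anything else is left untouched.
    define method maybe-convert-to-ssa (var :: <initial-variable>)
     => (result :: <object>)
      if (var.definitions.size == 1)
        make(<ssa-variable>, definition: var.definitions.first)
      else
        var
      end if;
    end method;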

Once the initial-variables queue has been drained, the optimizer starts work on the reoptimize-queue. This queue was initially set up during conversion (by the FER builder, I think). It contains everything we currently need to optimize.

The optimizer pops the first item off the front of the queue and passes it to the generic function optimize. This function knows how to do special-purpose optimizations on each subclass of <queueable-mixin>. These subclasses include expressions, assignments and certain basic blocks.

The generic function optimize is implemented by methods strewn throughout the optimizer. Each of these methods knows how to simplify a particular kind of queueable object. The method for <if-region>, for example, checks whether the test condition is a compile-time value and, if so, replaces the <if-region> with the appropriate branch. Other methods do compile-time evaluation of primitives, dispatch calls to generic functions at compile time, and optimize slot access.
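
To give a flavor of what these methods do, here is a toy model of the <if-region> case. The class and function below are stand-ins invented for illustration, not the real optimize method from cheese.dylan:

    // Toy model: an "if" node whose test may already be a literal boolean.
    define class <if-node> (<object>)
      constant slot test-exp,  required-init-keyword: test:;
      constant slot then-body, required-init-keyword: then:;
      constant slot else-body, required-init-keyword: else:;
    end class;

    // If the test is a compile-time boolean, keep only the branch that
    // will actually run; otherwise leave the node alone.
    define method fold-if (node :: <if-node>) => (result :: <object>)
      select (node.test-exp)
        #t        => node.then-body;
        #f        => node.else-body;
        otherwise => node;
      end select;
    end method;

In the real compiler the surviving branch also has to be spliced into the control-flow graph, and anything that depended on the old region gets requeued, as described below.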

The methods on optimize may or may not change a particular queueable object. But if a given method does make a change, it may open up new opportunities to improve various other queueables. So when we modify a queueable, we must generally reoptimize everything that depends on it.

To reoptimize dependents, we can't use recursive calls to optimize. Instead, we must re-insert each dependent into the reoptimize-queue (remember that?) using the function reoptimize. If the dependent is already in the queue, reoptimize does nothing. But if it isn't, reoptimize adds it to the very front of the queue.
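
The queueing discipline itself is simple. Here is a sketch of the idea using a plain <deque> and an illustrative name; the real reoptimize works on the component's own queue and presumably tracks membership more cheaply than a linear scan:

    // Put a dependent back on the front of the queue, but never twice.
    define method requeue (queue :: <deque>, item :: <object>) => ();
      unless (member?(item, queue))
        push(queue, item);   // push adds to the front of a <deque>
      end unless;
    end method;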

Once we've optimized a single queueable, we return to the top-level driver. At this point, we have to drain the initial-variables queue again (which will almost always be empty). Once this is done, we pop the next item off of reoptimize-queue and make another call to optimize.
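Paraphrasing, the driver loop has roughly this shape. This is not the actual code in cheese.dylan; both queues are treated as plain <deque>s here, and the exact argument lists of optimize and maybe-convert-to-ssa are glossed over:

    define function drive-queues (component) => ();
      block (done)
        while (#t)
          // Drain initial-variables first; optimizing a queueable below
          // may have introduced new variables to convert.
          until (empty?(component.initial-variables))
            maybe-convert-to-ssa(pop(component.initial-variables));
          end until;
          // Then optimize exactly one queueable and start over.
          if (empty?(component.reoptimize-queue))
            done();
          else
            optimize(component, pop(component.reoptimize-queue));
          end if;
        end while;
      end block;
    end function;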

Once both queues are finally empty, the driver can move on. It first applies a number of "simplifications": control-flow cleanups, common subexpression elimination, and things like that. Any of these simplifications might add something to one of our two queues. If this happens, we need to drain our queues again and reapply our simplifications from the beginning. Fortunately, this backtracking happens very rarely, and it doesn't trigger any kind of global reoptimization.

The next step depends on our simplify-only? argument. If this argument is true, we're doing "pre-optimization" of an inline function, and we should stop at this point. But if simplify-only? is false, we're optimizing a regular function and need to get it ready for the back end.

To prepare a function for the back end, we need to add type checks, replace certain "placeholder" queueables, do environment analysis to find closure variables, and build the external entry points for local functions.

Each of these stages works like a simplification. It may add items to our two queues. If it does, we have to go drain those queues, reapply all our simplifications, and so forth.
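
Putting the whole sequence together, the control flow described above looks roughly like this. It reuses drive-queues from the earlier sketch, and every other helper name (apply-simplifications, queues-empty?, and the four stage functions) is made up for illustration; the real optimize-component is organized differently in detail:

    define function optimize-component-sketch
        (component, #key simplify-only? = #f) => ();
      local method settle () => ();
              drive-queues(component);            // drain both queues
              apply-simplifications(component);   // control flow, CSE, ...
              // A simplification may queue new work; if so, start over.
              unless (queues-empty?(component))
                settle();
              end unless;
            end method;
      settle();
      unless (simplify-only?)
        // Back-end preparation; each stage may also queue new work.
        for (stage in vector(add-type-checks, replace-placeholders,
                             analyze-environments, build-external-entries))
          stage(component);
          unless (queues-empty?(component))
            settle();
          end unless;
        end for;
      end unless;
    end function;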

Call Optimization & Transformers

Dylan, like Lisp, uses function calls to represent most language operations. Arithmetic operators are functions. Slot getters and setters are functions. Even array access is done using functions. Given this, it's clear that the optimizer needs to spend a lot of time improving function calls.
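
For instance, every operation in this little function is really a function call by the time the compiler is done parsing it:

    // Every operation here is a function call in disguise:
    //   a + b    is  \+(a, b)
    //   v[0]     is  element(v, 0)
    //   v.size   is  size(v)        (dot syntax is just another call)
    define function demo (a :: <integer>, v :: <vector>) => (r)
      a + v[0] + v.size
    end function;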

Function calls are represented in FER using subclasses of <abstract-call>. You can find these classes in the file front/front.dylan. (Go ahead and read the comments, they're actually informative.)

There are three subclasses of <abstract-call>. Each represents a different kind of function call. Calls to specific, known methods are represented by <known-call>. Calls to a generic function are represented by subclasses of <general-call>. (The most common of these is <unknown-call>.) Calls which are known to be erroneous are represented by <error-call>.

Most function calls start life as an instance of <unknown-call>. These are processed by the optimize method for <unknown-call>. (Look in callopt.dylan.) This method does some complicated dispatching. But in several of the most common cases, we wind up calling optimize-generic.

The function optimize-generic tries two different optimization strategies. First, it looks for an appropriate transformer. Transformers know how to optimize specific generic functions. If no transformer can be found, or the transformer declines, optimize-generic tries to do compile-time dispatch instead. This is done by looking for applicable methods. If there's a single applicable method, then we can replace the <general-call> with a <known-call>.
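
Schematically, the strategy looks like this. The helper names (find-transformer, compile-time-applicable-methods, change-to-known-call!) are invented for the sketch; the real code in callopt.dylan is considerably more involved:

    // Sketch: try a transformer first, then compile-time dispatch.
    define function optimize-generic-sketch (component, call) => ();
      // A transformer registered for this generic function gets the
      // first shot.  It returns #t if it rewrote the call, #f if not.
      let transformer = find-transformer(call);
      let handled? = transformer & transformer(component, call);
      unless (handled?)
        // No transformer, or it declined: look for applicable methods.
        let methods = compile-time-applicable-methods(call);
        if (methods.size == 1)
          // Exactly one applicable method: downgrade the <general-call>
          // to a <known-call> on that method.
          change-to-known-call!(component, call, methods.first);
        end if;
      end unless;
    end function;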

Simpler optimizations are applied to <known-call>s. But again, it's possible to register a transformer. This transformer will be applied to known calls with exactly matching function signatures.

Most of the transformers live in trans.dylan. There's also some black magic in here to handle calls to instance? by converting them into <instance?> placeholders, and changing them back into real code later, when replacing placeholders. This was probably done to hide these calls during simplification passes.
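
The placeholder trick, in miniature: early on, the instance? call is replaced by an opaque node that the simplification passes will leave alone, and the placeholder-replacement stage later turns it back into a real type test. The classes and functions below are a toy model of that idea, not the real representation:

    // Toy model of the <instance?> placeholder idea.
    define class <instance?-placeholder> (<object>)
      constant slot tested-value, required-init-keyword: value:;
      constant slot tested-type,  required-init-keyword: type:;
    end class;

    // Early: hide the call behind an opaque node.
    define function hide-instance?-call (value, type)
     => (node :: <instance?-placeholder>)
      make(<instance?-placeholder>, value: value, type: type);
    end function;

    // Late ("replacing placeholders"): expand it back into real code.
    // Here that is just a thunk performing the test; the real compiler
    // emits FER for the type test instead.
    define function expand-placeholder (node :: <instance?-placeholder>)
     => (thunk :: <function>)
      method () instance?(node.tested-value, node.tested-type) end;
    end function;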