April 2021
The backend of the compiler has to map each IR construct into a corresponding and equivalent construct in the target processor's instruction set. Depending on the level of abstraction used, the compiler may have to expose the IR to decide what mapping to use.
Instruction selection is highly complex because a typical processor provides many distinct ways of performing the same computation. Each alternative also has its own costs, which themselves depend on context.
To tackle this problem of large search space, the compiler writer uses techniques that explore this search space in a disciplined manner: either by limiting their searching or precomputing enough information to make a deep search efficient.
Instruction selection can be transformed into a tree-matching problem. Given an AST and a set of operation trees (tree equivalent of the target machine operations), the goal is to map the AST to operations by constructing a tiling of the AST with operation trees.
A tiling is a collection of <AST-node, op-tree>
pairs. A tiling implements an AST if it implements every operation
and each tile connects with its neighbors. Given a tiling, the
compiler can easily generate assembly code in a bottom-up walk.
A set of rewrite rules encodes the relationships between operation trees and subtrees in the AST. Each rewrite rule has a production, cost, and an assembly code template. A tiling algorithm reduces an input AST into the goal symbol of the rule set, and produces the rule sequence for that reduction. A code generator uses this rule sequence and their associated code templates to generate assembly code in a bottom-up walk of the AST.
Another technique for performing instruction selection is to use peephole optimization. A peephole optimizer finds local improvements by examining short sequences of adjacent operations.
Modern peephole optimizers have three components: an expander, a simplifier, and a matcher. The expander recognizes the input and builds an internal representation; the simplifier rewrites this representation; the matcher transforms it into target-machine code.
A peephole optimizer can be repurposed into performing instruction selection by using IR as the input language and assembly as the output language. The peephole system expands the IR into LLIR, simplifies it, and matches the LLIR constructs into target ISA operations.
The core of an instruction selector via peephole optimization is the matcher component, which maps LLIR construct into assembly code. Modern peephole systems require much larger pattern libraries because ISAs today can get quite large. Most systems include a tool to automatically generate a matcher from a description of a target machine's instruction set.
This scheme is attractive because it simplifies the task of retargeting a compiler --- the compiler writer must provide a machine description to the matcher generator, rewrite the LLIR sequences generated by the earlier phases, and modify the instruction scheduler and register allocator to reflect the characteristics of the new ISA. The basic infrastructure remains intact.
Another advantage of using peephole systems is that, if the compiler already uses the LLIR for optimization, then the compiler does not need an explicit expander. Additionally, if the compiler optimized the LLIR, the simplifier need not worry about dead effects. Systemic simplification produces code that is much more efficient than simple hand-coded pass that walks the IR and rewrites it into assembly.