Back

Compilers

March 2021

A compiler is a computer program that translates programs written in one programming language into another language --- typically a lower level language such as the ISA of some processor.

The two fundamental principles of compilation are:

  1. The compiler must preserve the meaning of the program being compiled.
  2. The compiler must improve the input program in some discernible way.

One slight note about compilers and their close cousins, the interpreters. An interpreter accepts an input program and produces the result of executing that program. This is different from a compiler, which emits an output program that can be executed to produce the result.

A compiler is organized into three parts:

  1. The frontend, whose job is to "understand" the input program.
  2. The optimizer, whose job is to improve the input program.
  3. The backend, whose job is to generate the output program.

To "understand" an input program, the compiler has to check the syntax and semantics of that program. The technical terms for these activities are lexical analysis, syntactic analysis, and semantic elaboration. In lexical analysis, the scanner transforms a stream of characters into a sequence of words. In syntactic analysis, the parser accepts a sequence of words and builds a parse tree. In semantic elaboration, the evaluator evaluates the parse tree and checks that the program is meaningful.

Once the compiler has verified that the input program is both syntactically and semantically correct, the next order of business is to generate intermediate representations (IRs) of the input program. Specific choices of IRs affect the compiler's ability to transform and improve the code. Optimization and transformation are performed by the optimizer.

After the compiler has optimized and transformed its definitive version of the IR of the input program, the next step is to the map the IR constructs into a sequence of target machine operations. This step, called instruction selection, is performed by the backend. The backend also performs other activities such as register allocation (mapping virtual registers to physical registers) and instruction scheduling (reodering the instructions to improve the throughput of the target machine by exploiting parallelism).

More on Compilers