Semantics Elaboration

Semantic Elaboration

March 2021

Semantic elaboration (or context-sensitive analysis) deals with analysing and verifying the meaning of a piece of program. Semantic elaboration is typically performed after the parse tree for the input program has been constructed.

Type Systems

A type in a programming language is a collection of properties that some data value has to satisfy. Every languages come with a set of predefined types. Common base types are numbers, characters, booleans, arrays, strings, and pointers. Most languages also have mechanisms for constructing new types via structures.

Type systems serve three purposes:

To ensure runtime safety. The goal of the type system is to reduce the probability of encountering runtime errors.
To improve expressiveness. Type systems allow language designers to specify language behavior more precisely.
To generate better code. Type systems reveal information that can be used to produce more efficient translations.

A type system also includes several rules that use types. Different type systems implement their own mechanisms for determining type equivalences. Two general approaches are common: name equivalence and structural equivalence.

Another key component of any type system is its inference rules. Type inference rules specify, for each operator, the mapping between operand types and the result type. The compiler may or may not perform implicit conversions to comply with the inference rules of the language.

Type inference rules are used to infer the types for expressions. Certain expressions involving procedures require type signatures that specify the types of the formal parameter and return values of the procedures.

Attribute Grammar Framework

One technique for performing type checking is to use the attribute grammar framework. An attribute grammar framework is a formalism for performing context-sensitive analysis (type checking is just one kind of these analysis).

An attribute grammar is a context-free grammar augmented with a set of rules that define a collection of attributes for the symbols in the grammar. Each instance of a grammar symbol that occurs in the parse tree has a set of instances of its attributes.

Each attribute rule defines an attribute in terms of literal constants and the attributes of other symbols in the production. Note that this scheme allows information to be passed in both directions.

A synthesized attribute is an attribute defined in terms of the attributes of the node, its children, and constants. An inherited attribute is an attribute defined in terms of the attributes of the node, its siblings, its parent, and constants. These dependences between the attributes of all symbols in some parse tree form an attribute-dependence graph.

An implementation of a set of attribute rules is called an evaluator. Evaluators are written either manually as ad hoc programs, or automatically using an evaluator generator.

The dependence graph captures the flow of values that an evaluator must respect in evaluating an instance of an attributed tree --- it imposes a partial order on the attributes. In general, evaluation methods are classified into three classes:

Dynamic methods --- "readiness" check, topological sort
Oblivious methods --- repeated left-to-right passes
Rule-Based methods --- analyse statically and construct evaluation order

Cycles in the dependence graph must be avoided because it causes the evaluator to never terminate. One way of avoiding cycles in the dependence graph is to restrict the attribute grammar to a class that cannot give rise to circular dependence graphs (eg. use synthesized attributes only).