Imperative, Environment-Based Type Systems

In this lesson you learn how to define a type system for an imperative language (the IMP++ language defined in Part 4 of the tutorial), using a style based on type environments.

Let us copy the imp.k file from Part 4 of the tutorial, Lesson 7, which holds the semantics of IMP++, and modify it into a type system. The resulting type system, when executed, yields a type checker.

We start by defining the new strictness attributes of the IMP++ syntax. While doing so, remember that programs and fragments of programs now reduce to their types. So types will be the new results of our new (type) semantics. We also clean up the semantics by removing the unnecessary tags, and also use strict instead of seqstrict wherever possible, because strict gives implementations more freedom. Interestingly, note that spawn is strict now, because the code of the child thread should type in the current parent's type environment. Note that this is not always the case for threads, see for example SIMPLE in the languages tutorial, but it works here for our simpler IMP++.

From a typing perspective, the && construct is strict in both its arguments; its short-circuit (concrete) semantics is irrelevant for its (static) type system. Similarly, both the conditional and the while loop are strict constructs when regarded through the typing lenses.

Finally, the sequential composition is now sequentially strict! Indeed, statements are now going to reduce to their type, stmt, and it is critical for sequential composition to type its argument statements left-to-right; for example, imagine that the second argument is a variable declaration (whose type semantics will modify the type environment).

We continue by defining the new results of computations, that is, the actual types. In this simple imperative language, we only have a few constant types: int, bool, string, block and stmt.

We next define the new configuration, which is actually quite simple. Besides the <k/> cell, all we need is a type environment cell, <tenv/>, which will hold a map from identifiers to their types. A type environment is therefore like a state in the abstract domain of type values.

Let us next modify the semantic rules, turning them into a type system. In short, the idea is to reduce the basic values to their types, and then have a rule for each language construct reducing it to its result type whenever its arguments have the expected types.

We write the rules in the order given by the syntax declarations, to make sure we do not forget any construct.

Integers reduce to their type, int.

So do the strings.

Variables are now looked up in the type environment and reduced to their type there. Since we only declare integer variables in IMP++, their type in tenv will always be int. Nevertheless, we write the rule generically, so that we would not have to change it later if we add other type declarations to IMP++. Note that we reject programs which lookup undeclared variables. Rejection, in this case, means rewriting getting stuck.

Variable increment types to int, provided the variable has type int.

Read types to int, because we only allow integer input.

Division is only allowed on integers, so it rewrites to int provided that its arguments rewrite to int. Note, however, that in order to write int / int, we have to explicitly add int to the syntax of arithmetic expressions. Otherwise, the K parser rightfully complains, because / was declared on arithmetic expressions, not on types. One simple and generic way to allow types to appear anywhere, is to define Type as a syntactic subcategory of all the other syntactic categories. Let's do it on a by-need basis, though.

Addition is overloaded, so we add two typing rules for it: one for integers and another for strings.

As discussed, spawn types to stmt provided that its argument types to block.

The assignment construct was strict(2); its typing policy is that the declared type of X should be identical to the type of the assigned value. Like for lookup, we define this rule more generically than needed for IMP++, for any type, not only for int.

The typing rules for Boolean expression constructs are in the same spirit. Note that we need only one rule for &&.

The typing of blocks is a bit trickier. First, note that we still need to recover the environment after the block is typed, because we do not want the block-local variables to be visible in the outer type environment. We recover the type environment only after the block-enclosed statements type; moreover, we also opportunistically yield a block type on the computation when we discard the type environment recovery item. To account for the fact that the block-enclosed statement can itself be a block (e.g., {{S}}), we would need an additional rule. Since we do not like repetition, we instead group the types block and stmt into one syntactic category, BlockOrStmtType, and now we can have only one rule. We also include BlockOrStmtType in Type, as a replacement for the two basic types.

The expression statement types as expected. Recall that we only allow arithmetic expressions, which type to int, to be used as statements in IMP++.

The conditional was declared strict in all its arguments. Its typing policy is that its first argument types to bool and its two branches to block. If that is the case, then it yields a stmt type.

For while, its first argument should type to bool and its second to block.

Variable declarations add new bindings to the type environment. Recall that we can only declare variables of integer type in IMP++.

The typing policy of print is that it can only print integer or string values, and in that case it types to stmt. Like for BlockOrStmtType, to avoid having two similar rules, one for int and another for string, we prefer to introduce an additional syntactic category, PrintableType, which includes both int and string types.

halt types to stmt; so its subsequent code is also typed.

join types to stmt, provided that its argument types to int.

Sequential composition was declared as a whitespace-separated sequentially strict list. Its typing policy is that all the statements in the list should type to stmt or block in order for the list to type to stmt. Since lists are maintained internally as cons-lists, this is probably the simplest way to do it:

rule .Stmts => stmt
rule _:BlockOrStmtType Ss => Ss

Note that the first rule, which types the empty sequence of statements to stmt, is needed anyway, to type empty blocks {} (together with the block rule).

kompile imp.k and krun all the programs in Part 4 of the tutorial. They should all type to stmt.

In the next lesson we will define a substitution-based type system for LAMBDA.

Go to Lesson 2, Type Systems: Substitution-Based Higher-Order Type Systems.

MOVIE (out of date) [10'11"]