Computations, Results, Strictness; Rules Involving Cells

In this lesson we will learn about the syntactic category K of computations, about how strictness attributes are in fact syntactic sugar for rewrite rules over computations, and why it is important to tell the tool which computations are results. We will also see a K rule that involves cells.

K Computations

Computation structures, or more simply computations, extend the abstract syntax of your language with a list structure using ~> (read followed by or and then, and written $\curvearrowright$ in Latex) as a separator. K provides a distinguished sort, K, for computations. The extension of the abstract syntax of your language into computations is done automatically by the K tool when you declare constructs using the syntax keyword, so the K semantic rules can uniformly operate only on terms of sort K. The intuition for computation structures of the form

t1 ~> t2 ~> ... ~> tn

is that the listed tasks are to be processed in order. The initial computation typically contains the original program as its sole task, but rules can then modify it into task sequences, as seen shortly.

Strictness in Theory

The strictness attributes, used as annotations to language constructs, actually correspond to rules over computations. For example, the strict(2) attribute of the assignment statement corresponds to the following two opposite rules (X ranges over Id and A over AExp):

X=A; => A ~> X=[];
A ~> X=[]; => X=A;

The first rule pulls A from the syntactic context X=A; and schedules it for processing. The second rule plugs A back into its context. Inspired from the chemical abstract machine, we call rules of the first type above heating rules and rules of the second type cooling rules. Similar rules are generated for other arguments in which operations are strict. Iterative applications of heating rules eventually bring to the top of the computation atomic tasks, such as a variable lookup, or a builtin operation, which then make computational progress by means of other rules. Once progress is made, cooling rules can iteratively plug the result back into context, so that heating rules can pick another candidate for reduction, and so on and so forth.

When operations are strict only in some of their arguments, the corresponding positions of the arguments in which they are strict are explicitly enumerated in the argument of the strict attribute, e.g., strict(2) like above, or strict(2 3) for an operation strict in its second and third arguments, etc. If an operation is simply declared strict then it means that it is strict in all its arguments. For example, the strictness of addition yields:

A1+A2 => A1 ~> []+A2
A1 ~> []+A2 => A1+A2
A1+A2 => A2 ~> A1+[]
A2 ~> A1+[] => A1+A2

It can be seen that such heating/cooling rules can easily lead to non-determinism, since the same term may be heated many different ways; these different evaluation orders may lead to different behaviors in some languages (not in IMP, because its expressions do not have side effects, but we will experiment with non-determinism in its successor, IMP++).

A similar desugaring applies to sequential strictness, declared with the keyword seqstrict. While the order of arguments of strict is irrelevant, it matters in the case of seqstrict: they are to be evaluated in the specified order; if no arguments are given, then they are assumed by default to be evaluated from left-to-right. For example, the default heating/cooling rules associated to the sequentially strict <= construct above are (A1, A2 range over AExp and I1 over Int):

A1<=A2 => A1 ~> []<=A2
A1 ~> []<=A2 => A1<=A2
I1<=A2 => A2 ~> I1<=[]
A2 ~> I1<=[] => I1<=A2

In other words, A2 is only heated/cooled after A1 is already evaluated.

While the heating/cooling rules give us a nice and uniform means to define all the various allowable ways in which a program can evaluate, all based on rewriting, the fact that they are reversible comes with a serious practical problem: they make the K definitions unexecutable, because they lead to non-termination.

Strictness in Practice; K Results

To break the reversibility of the theoretical heating/cooling rules, and, moreover, to efficiently execute K definitions, the current implementation of the K tool relies on users giving explicit definitions of their languages' results.

The K tool provides a predicate isKResult, which is automatically defined as we add syntactic constructs to KResult (in fact the K tool defines such predicates for all syntactic categories, which are used, for example, as rule side conditions to check user-declared variable memberships, such as V:Val stating that V belongs to Val).

The kompile tool, depending upon what it is requested to do, changes the reversible heating/cooling rules corresponding to evaluation strategy definitions (e.g., those corresponding to strictness attributes) to avoid non-termination. For example, when one is interested in obtaining an executable model of the language (which is the default compilation mode of kompile), then heating is performed only when the to-be-pulled syntactic fragment is not a result, and the corresponding cooling only when the to-be-plugged fragment is a result. In this case, e.g., the heating/cooling rules for assignment are modified as follows:

X=A; => A ~> X=[];  requires notBool isKResult(A)
A ~> X=[]; => X=A;  requires isKResult(A)

Note that non-termination of heating/cooling is avoided now. The only thing lost is the number of possible behaviors that a program can manifest, but this is irrelevant when all we want is one behavior.

As will be discussed in the IMP++ tutorial, the heating/cooling rules are modified differently by kompile when we are interested in other aspects of the language definition, such us, for example, in a search-able model that comprises all program behaviors. This latter model is obviously more general from a theoretical perspective, but, in practice, it is also slower to execute. The kompile tool strives to give you the best model of the language for the task you are interested in.

Can't Results be Inferred Automatically?

This is a long story, but the short answer is: No!. Maybe in some cases it is possible, but we prefer to not attempt it in the K tool. For example, you most likely do not want any stuck computation to count as a result, since some of them can happen simply because you forgot a semantic rule that could have further reduce it! Besides, in our experience with defining large languages, it is quite useful to take your time and think of what the results of your language's computations are. This fact in itself may help you improve your overall language design. We typically do it at the same time with defining the evaluation strategies of our languages. Although in theory K could infer the results of your language as the stuck computations, based on the above we have deliberately decided to not provide this feature, in spite of requests from some users. So you currently do have to explicitly define your K results if you want to effectively use the K tool. Note, however, that theoretical definitions, not meant to be executed, need not worry about defining results (that's because in theory semantic rules apply modulo the reversible heating/cooling rules, so results are not necessary).

A K Rule Involving Cells

All our K rules so far in the tutorial were of the form

rule left => right requires condition

where left and right were syntactic, or more generally computation, terms.

Here is our first K rule explicitly involving cells:

rule <k> X:Id => I ...</k> <state>... X |-> I ...</state>

Recall that the k cell holds computations, which are sequences of tasks separated by ~>. Also, the state cell holds a map, which is a set of bindings, each binding being a pair of computations (currently, the K builtin data-structures, like maps, are untyped; or, said differently, they are all over the type of computations, K).

Therefore, the two cells mentioned in the rule above hold collections of things, ordered or not. The ...s, which we also call cell frames, stand for more stuff there, which we do not care about.

The rewrite relation => is allowed in K to appear anywhere in a term, its meaning being that the corresponding subterm is rewritten as indicated in the shown context. We say that K's rewriting is local.

The rule above says that if the identifier X is the first task in the k cell, and if X is bound to I somewhere in the state, then X rewrites to I locally in the k cell. Therefore, IMP variables need to be already declared when looked up.

Of course, the K rule above can be translated into an ordinary rewrite rule of the form

rule <k> X ~> Rest </k> <state> Before (X |-> I) After </state>
  => <k> I ~> Rest </k> <state> Before (X |-> I) After </state>

Besides being more verbose and thus tedious to write, this ordinary rule is also more error-prone; for example, we may forget the Rest variable in the right-hand-side, etc. Moreover, the concurrent semantics of K allows for its rules to be interpreted as concurrent transactions, where the context is the read-only component of the transaction, while the subterms which are rewritten are read/write component of the transaction; thus, K rule instances can apply concurrently if they only overlap on read-only parts, while they cannot if regarded as ordinary rewrite logic rules. Note: our current implementation of the K tool is not concurrent, so K rules are in fact desugared as normal rewrite rules in the K tool.

Kompile imp.k using a documentation option and check out how the K rule looks in the generated document. The ... frames are displayed as cell tears, metaphorically implying that those parts of the cells that we do not care about are torn away. The rewrite relation is replaced by a horizontal line: specifically, the subterm which rewrites, X, is underlined, and its replacement is written underneath the line.

In the next lesson we define the complete K semantics of IMP and run the programs we parsed in the first lesson.

Go to Lesson 4, IMP: Configuration Abstraction, Part 1; Types of Rules.

MOVIE (out of date) [10'30"]