Lesson 1.13: Basics of K Rewriting

The purpose of this lesson is to explain how rewrite rules that are not the definition of a function behave, and how, using these rules, you can construct a semantics of programs in a programming language in K.

Recap: Function rules in K

Recall from Lesson 1.2 that we have, thus far, introduced two types of productions in K: constructors and functions. A function is identified by the function attribute placed on the production. As you may recall, when we write a rule with a function on the left-hand side of the => operator, we are defining the meaning of that function for inputs which match the patterns on the left-hand side of the rule. If the argument to the function match the patterns, then the function is evaluated to the value constructed by substituting the bindings for the variables into the right-hand side of the rule.

Top-level rules

However, function rules are not the only type of rule permissible in K, nor even the most frequently used. K also has a concept of a top-level rewrite rule. The simplest way to ensure that a rule is treated as a top-level rule is for the left-hand side of the rule to mention one or more cells. We will cover how cells work and are declared in more detail in a later lesson, but for now, what you should know is that when we ran krun in our very first example in Lesson 1.2 and got the following output:

<k>
  Yellow ( ) ~> .
</k>

<k> is a cell, known by convention as the K cell. This cell is available by default in any definition without needing to be explicitly declared.

The K cell contains a single term of sort K. K is a predefined sort in K with two constructors, that can be roughly represented by the following grammar:

  syntax K ::= KItem "~>" K
             | "."

As a syntactic convenience, K allows you to treat ~> like it is an associative list (i.e., as if it were defined as syntax K ::= K "~>" K). When a definition is compiled, it will automatically transform the rules you write so that they treat the K sort as a cons-list. Another syntactic convenience is that, for disambiguation purposes, you can write .K anywhere you would otherwise write . and the meaning is identical.

Now, you may notice that the above grammar mentions the sort KItem. This is another built-in sort in K. For every sort S declared in a definition (with the exception of K and KItem), K will implicitly insert the following production:

  syntax KItem ::= S

In other words, every sort is a subsort of the sort KItem, and thus a term of any sort can be injected as an element of a term of sort K, also called a K sequence.

By default, when you krun a program, the AST of the program is inserted as the sole element of a K sequence into the <k> cell. This explains why we saw the output we did in Lesson 1.2.

With these preliminaries in mind, we can now explain how top-level rewrite rules work in K. Put simply, any rule where there is a cell (such as the K cell) at the top on the left-hand side will be a top-level rewrite rule. Once the initial program has been inserted into the K cell, the resulting term, called the configuration, will be matched against all the top-level rewrite rules in the definition. If only one rule matches, the substitution generated by the matching will be applied to the right-hand side of the rule and the resulting term is rewritten to be the new configuration. Rewriting proceeds by iteratively applying rules, also called taking steps, until no top-level rewrite rule can be applied. At this point the configuration becomes the final configuration and is output by krun.

If more than one top-level rule applies, by default, K will pick just one of those rules, apply it, and continue rewriting. However, it is non-deterministic which rule applies. In theory, it could be any of them. By passing the --search flag to krun, you are able to tell krun to explore all possible non-deterministic choices, and generate a complete list of all possible final configurations reachable by each nondeterminstic choice that can be made. Note that the --search flag to krun only works if you pass --enable-search to kompile first.

Unlike top-level rewrite rules, function rules are not associated with any particular set of cells in the configuration (although they can contain cells in their function arguments and return value). While top-level rewrite rules apply to the entire term being rewritten, function rules apply anywhere a function application for that function appears, and are immediately rewritten to their return value in that position.

Another key distinction between top-level rules and function rules is that function symbols, i.e., productions with the function attribute, are mathematical functions rather than constructors. While a constructor is logically distinct from any other constructor of the same sort, and can be matched against unconditionally, a function does not necessaraily have the same restriction unless it happens to be an injective function. Thus, two function symbols with different arguments may still ultimately produce the same value and thus compare equal to one another. Due to this, concrete execution (i.e., all K definitions introduced thus far; see Lesson 1.21) introduces the restriction that you cannot match on a function symbol on the left-hand side of a rule, except as the top symbol on the left-hand side of a function rule. This restriction will be later lifted when we introduce the Haskell Backend which performs symbolic execution.

Exercise

Pass a program containing no functions to krun. You can use a term of sort Exp from LESSON-11-E. Observe the output and try to understand why you get the output you do. Then write two rules that rewrite that program to another. Run krun --search on that program and observe both results. Then add a third rule that rewrites one of those results again. Test that that rule applies as well.

Using top-level rules to evaluate expressions

Thus far, we have focused primarily on defining functions over constructors in K. However, now that we have a basic understanding of top-level rules, it is possible to introduce a rewrite system to our definitions. A rewrite system is a collection of top-level rewrite rules which performs an organized transformation of a particular program into a result which expresses the meaning of that program. For example, we might rewrite an expression in a programming language into a value representing the result of evaluating that expression.

Recall in Lesson 1.11, we wrote a simple grammar of Boolean and integer expressions that looked roughly like this (lesson-13-a.k):

k
module LESSON-13-A
  imports INT

  syntax Exp ::= Int
               | Bool
               | Exp "+" Exp
               | Exp "&&" Exp
endmodule

In that lesson, we defined a function eval which evaluated such expressions to either an integer or Boolean.

However, it is more idiomatic to evaluate such expressions using top-level rewrite rules. Here is how one might do so in K (lesson-13-b.k):

k
module LESSON-13-B-SYNTAX
  imports UNSIGNED-INT-SYNTAX
  imports BOOL-SYNTAX

  syntax Val ::= Int | Bool
  syntax Exp ::= Val
               > left: Exp "+" Exp
               > left: Exp "&&" Exp
endmodule

module LESSON-13-B
  imports LESSON-13-B-SYNTAX
  imports INT
  imports BOOL

  rule <k> I1:Int + I2:Int ~> K:K </k> => <k> I1 +Int I2 ~> K </k>
  rule <k> B1:Bool && B2:Bool ~> K:K </k> => <k> B1 andBool B2 ~> K </k>

  syntax KItem ::= freezer1(Val) | freezer2(Exp)
                 | freezer3(Val) | freezer4(Exp)

  rule <k> E1:Val + E2:Exp ~> K:K </k> => <k> E2 ~> freezer1(E1) ~> K </k> [priority(51)]
  rule <k> E1:Exp + E2:Exp ~> K:K </k> => <k> E1 ~> freezer2(E2) ~> K </k> [priority(52)]
  rule <k> E1:Val && E2:Exp ~> K:K </k> => <k> E2 ~> freezer3(E1) ~> K </k> [priority(51)]
  rule <k> E1:Exp && E2:Exp ~> K:K </k> => <k> E1 ~> freezer4(E2) ~> K </k> [priority(52)]

  rule <k> E2:Val ~> freezer1(E1) ~> K:K </k> => <k> E1 + E2 ~> K </k>
  rule <k> E1:Val ~> freezer2(E2) ~> K:K </k> => <k> E1 + E2 ~> K </k>
  rule <k> E2:Val ~> freezer3(E1) ~> K:K </k> => <k> E1 && E2 ~> K </k>
  rule <k> E1:Val ~> freezer4(E2) ~> K:K </k> => <k> E1 && E2 ~> K </k>
endmodule

This is of course rather cumbersome currently, but we will soon introduce syntactic convenience which makes writing definitions of this type considerably easier. For now, notice that there are roughly 3 types of rules here: the first matches a K cell in which the first element of the K sequence is an Exp whose arguments are values, and rewrites the first element of the sequence to the result of that expression. The second also matches a K cell with an Exp in the first element of its K sequence, but it matches when one or both arguments of the Exp are not values, and replaces the first element of the K sequence with two new elements: one being an argument to evaluate, and the other being a special constructor called a freezer. Finally, the third matches a K sequence where a Val is first, and a freezer is second, and replaces them with a partially evaluated expression.

This general pattern is what is known as heating an expression, evaluating its arguments, cooling the arguments into the expression again, and evaluating the expression itself. By repeatedly performing this sequence of actions, we can evaluate an entire AST containing a complex expression down into its resulting value.

Exercise

Write an addition expression with integers. Use krun --depth 1 to see the result of rewriting after applying a single top-level rule. Gradually increase the value of --depth to see successive states. Observe how this combination of rules is eventually able to evaluate the entire expression.

Simplifying the evaluator: Local rewrites and cell ellipses

As you saw above, the definition we wrote is rather cumbersome. Over the remainder of Lessons 1.13 and 1.14, we will greatly simplify it. The first step in doing so is to teach a bit more about the rewrite operator, =>. Thus far, all the rules we have written look like rule LHS => RHS. However, this is not the only way the rewrite operator can be used. It is actually possible to place a constructor or function at the very top of the rule, and place rewrite operators inside that term. While a rewrite operator cannot appear nested inside another rewrite operator, by doing this, we can express that some parts of what we are matching are not changed by the rewrite operator. For example, consider the following rule from above:

  rule <k> I1:Int + I2:Int ~> K:K </k> => <k> I1 +Int I2 ~> K </k>

We can equivalently write it like following:

  rule <k> (I1:Int + I2:Int => I1 +Int I2) ~> _:K </k>

When you put a rewrite inside a term like this, in essence, you are telling the rule to only rewrite part of the left-hand side to the right-hand side. In practice, this is implemented by lifting the rewrite operator to the top of the rule by means of duplicating the surrounding context.

There is a way that the above rule can be simplified further, however. K provides a special syntax for each cell containing a term of sort K, indicating that we want to match only on some prefix of the K sequence. For example, the above rule can be simplified further like so:

  rule <k> I1:Int + I2:Int => I1 +Int I2 ...</k>

Here we have placed the symbol ... immediately prior to the </k> which ends the cell. What this tells the compiler is to take the contents of the cell, treat it as the prefix of a K sequence, and insert an anonymous variable of sort K at the end. Thus we can think of ... as a way of saying we don't care about the part of the K sequence after the beginning, leaving it unchanged.

Putting all this together, we can rewrite LESSON-13-B like so (lesson-13-c.k):

k
module LESSON-13-C-SYNTAX
  imports UNSIGNED-INT-SYNTAX
  imports BOOL-SYNTAX

  syntax Val ::= Int | Bool
  syntax Exp ::= Val
               > left: Exp "+" Exp
               > left: Exp "&&" Exp
endmodule

module LESSON-13-C
  imports LESSON-13-C-SYNTAX
  imports INT
  imports BOOL

  rule <k> I1:Int + I2:Int => I1 +Int I2 ...</k>
  rule <k> B1:Bool && B2:Bool => B1 andBool B2 ...</k>

  syntax KItem ::= freezer1(Val) | freezer2(Exp)
                 | freezer3(Val) | freezer4(Exp)

  rule <k> E1:Val + E2:Exp => E2 ~> freezer1(E1) ...</k> [priority(51)]
  rule <k> E1:Exp + E2:Exp => E1 ~> freezer2(E2) ...</k> [priority(52)]
  rule <k> E1:Val && E2:Exp => E2 ~> freezer3(E1) ...</k> [priority(51)]
  rule <k> E1:Exp && E2:Exp => E1 ~> freezer4(E2) ...</k> [priority(52)]

  rule <k> E2:Val ~> freezer1(E1) => E1 + E2 ...</k>
  rule <k> E1:Val ~> freezer2(E2) => E1 + E2 ...</k>
  rule <k> E2:Val ~> freezer3(E1) => E1 && E2 ...</k>
  rule <k> E1:Val ~> freezer4(E2) => E1 && E2 ...</k>
endmodule

This is still rather cumbersome, but it is already greatly simplified. In the next lesson, we will see how additional features of K can be used to specify heating and cooling rules much more compactly.

Exercises

Modify LESSON-13-C to add rules to evaluate integer subtraction.

Next lesson

Once you have completed the above exercises, you can continue to Lesson 1.14: Defining Evaluation Order.