Lesson 1.13: Basics of K Rewriting
The purpose of this lesson is to explain how rewrite rules that are not the definition of a function behave, and how, using these rules, you can construct a semantics of programs in a programming language in K.
Recap: Function rules in K
Recall from Lesson 1.2 that we have, thus far,
introduced two types of productions in K: constructors and functions.
A function is identified by the function attribute placed on the
production. As you may recall, when we write a rule with a function on the
left-hand side of the => operator, we are defining the meaning of that
function for inputs which match the patterns on the left-hand side of the rule.
If the arguments to the function match the patterns, then the function is
evaluated to the value constructed by substituting the bindings for the
variables into the right-hand side of the rule.
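As a reminder of the shape of such rules, here is a minimal sketch modeled on the fruit-and-color example from Lesson 1.2. The module name FUNCTION-RECAP is hypothetical, and the productions are reconstructed for illustration rather than copied from that lesson.

```k
module FUNCTION-RECAP
  // Constructors: plain productions with no attribute.
  syntax Color ::= Yellow() | Blue()
  syntax Fruit ::= Banana() | Blueberry()

  // A function: the `function` attribute marks the production.
  syntax Color ::= colorOf(Fruit) [function]

  // Function rules: each one defines colorOf for the inputs its
  // left-hand side matches, so colorOf(Banana()) evaluates to Yellow().
  rule colorOf(Banana()) => Yellow()
  rule colorOf(Blueberry()) => Blue()
endmodule
```

Neither rule mentions a cell, and colorOf is the top symbol on the left-hand side of both, which is what makes them function rules.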
Top-level rules
However, function rules are not the only type of rule permissible in K, nor
even the most frequently used. K also has a concept of a
top-level rewrite rule. The simplest way to ensure that a rule is treated
as a top-level rule is for the left-hand side of the rule to mention one or
more cells. We will cover how cells work and are declared in more detail
in a later lesson, but for now, what you should know is that when we ran krun
in our very first example in Lesson 1.2 and got the following output:
<k>
Yellow ( ) ~> .
</k>
<k> is a cell, known by convention as the K cell. This cell is available
by default in any definition without needing to be explicitly declared.
The K cell contains a single term of sort K. K is a predefined sort in K
with two constructors, that can be roughly represented by the following
grammar:
syntax K ::= KItem "~>" K
| "."
As a syntactic convenience, K allows you to treat ~> as if it were an
associative list (i.e., as though it were defined as syntax K ::= K "~>" K).
When a definition is compiled, the compiler automatically transforms the rules
you write so that they treat the K sort as a cons-list. Another syntactic
convenience is that, for disambiguation purposes, you can write .K anywhere
you would otherwise write . and the meaning is identical.
Now, you may notice that the above grammar mentions the sort KItem. This is
another built-in sort in K. For every sort S declared in a definition (with
the exception of K and KItem), K will implicitly insert the following
production:
syntax KItem ::= S
In other words, every sort is a subsort of the sort KItem, and thus a term
of any sort can be injected as an element of a term of sort K, also called
a K sequence.
By default, when you krun a program, the AST of the program is inserted as
the sole element of a K sequence into the <k> cell. This explains why we
saw the output we did in Lesson 1.2.
With these preliminaries in mind, we can now explain how top-level rewrite
rules work in K. Put simply, any rule where there is a cell (such as the K
cell) at the top on the left-hand side will be a top-level rewrite rule. Once
the initial program has been inserted into the K cell, the resulting term,
called the configuration, will be matched against all the top-level
rewrite rules in the definition. If only one rule matches, the substitution
generated by the matching will be applied to the right-hand side of the rule
and the resulting term is rewritten to be the new configuration. Rewriting
proceeds by iteratively applying rules, also called taking steps, until
no top-level rewrite rule can be applied. At this point the configuration
becomes the final configuration and is output by krun.
If more than one top-level rule applies, by default, K will pick just one
of those rules, apply it, and continue rewriting. However, it is
non-deterministic which rule applies. In theory, it could be any of them.
By passing the --search flag to krun, you are able to tell krun to
explore all possible non-deterministic choices, and generate a complete list of
all possible final configurations reachable by each non-deterministic choice that
can be made. Note that the --search flag to krun only works if you pass
--enable-search to kompile first.
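To make the non-determinism concrete, here is a minimal sketch in which two top-level rules match the same configuration. The module name NONDET-SKETCH, the sort Coin, its constructors, and the file names in the comments are all hypothetical and chosen only for this illustration.

```k
module NONDET-SKETCH
  // Hypothetical constructors used only in this sketch.
  syntax Coin ::= "flip" | "heads" | "tails"

  // Both rules match a K cell whose first element is `flip`,
  // so an ordinary run may apply either one.
  rule <k> flip ~> K:K </k> => <k> heads ~> K </k>
  rule <k> flip ~> K:K </k> => <k> tails ~> K </k>
endmodule

// Assumed invocation, using only the flags described above
// (nondet-sketch.k holds this module, flip.txt holds the program `flip`):
//   kompile nondet-sketch.k --enable-search
//   krun flip.txt --search
```

Without --search, a run ends with either heads or tails in the K cell. With --search, both final configurations should be reported, since each is reachable through one of the two choices.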
Unlike top-level rewrite rules, function rules are not associated with any particular set of cells in the configuration (although they can contain cells in their function arguments and return value). While top-level rewrite rules apply to the entire term being rewritten, function rules apply anywhere an application of that function appears, and the application is immediately rewritten to its return value in that position.
Another key distinction between top-level rules and function rules is that
function symbols, i.e., productions with the function attribute, are
mathematical functions rather than constructors. While a constructor is
logically distinct from any other constructor of the same sort, and can be
matched against unconditionally, a function does not necessarily have the
same property unless it happens to be an injective function. Thus, two
function symbols with different arguments may still ultimately produce the
same value and thus compare equal to one another. Due to this, concrete
execution (i.e., all K definitions introduced thus far; see Lesson 1.21)
introduces the restriction that you cannot match on a function symbol on the
left-hand side of a rule, except as the top symbol on the left-hand side of
a function rule. This restriction will later be lifted when we introduce the
Haskell Backend which performs symbolic execution.
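The following sketch illustrates the restriction under the assumptions stated in its comments. The module name MATCH-RESTRICTION-SKETCH and the function inc are hypothetical, and the disallowed rule is left commented out so that the rest of the module stays a valid definition.

```k
module MATCH-RESTRICTION-SKETCH
  imports INT

  // `inc` is a hypothetical function introduced only for this sketch.
  syntax Int ::= inc(Int) [function]

  // Allowed: inc is the top symbol on the left-hand side of its own
  // function rule.
  rule inc(I:Int) => I +Int 1

  // Not allowed under concrete execution: inc(...) is matched inside the
  // left-hand side of a top-level rule. Left commented out on purpose.
  // rule <k> inc(I:Int) ~> K:K </k> => <k> I ~> K </k>
endmodule
```

The intuition is that inc(1) and 2 denote the same value, so there is no distinct inc(1) term left in the configuration for a top-level rule to match against.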
Exercise
Pass a program containing no functions to krun. You can use a term of sort
Exp from LESSON-11-E. Observe the output and try to understand why you get
the output you do. Then write two rules that each rewrite that program to a
different one. Run krun --search on that program and observe both results.
Then add a third rule that rewrites one of those results again. Test that
this rule applies as well.
Using top-level rules to evaluate expressions
Thus far, we have focused primarily on defining functions over constructors in K. However, now that we have a basic understanding of top-level rules, it is possible to introduce a rewrite system to our definitions. A rewrite system is a collection of top-level rewrite rules which performs an organized transformation of a particular program into a result which expresses the meaning of that program. For example, we might rewrite an expression in a programming language into a value representing the result of evaluating that expression.
Recall that in Lesson 1.11 we wrote a simple grammar of Boolean and integer
expressions that looked roughly like this (lesson-13-a.k):
module LESSON-13-A
  imports INT

  syntax Exp ::= Int
               | Bool
               | Exp "+" Exp
               | Exp "&&" Exp
endmodule
In that lesson, we defined a function eval which evaluated such expressions
to either an integer or Boolean.
However, it is more idiomatic to evaluate such expressions using top-level
rewrite rules. Here is how one might do so in K (lesson-13-b.k):
module LESSON-13-B-SYNTAX
  imports UNSIGNED-INT-SYNTAX
  imports BOOL-SYNTAX

  syntax Val ::= Int | Bool
  syntax Exp ::= Val
               > left: Exp "+" Exp
               > left: Exp "&&" Exp
endmodule

module LESSON-13-B
  imports LESSON-13-B-SYNTAX
  imports INT
  imports BOOL

  rule <k> I1:Int + I2:Int ~> K:K </k> => <k> I1 +Int I2 ~> K </k>
  rule <k> B1:Bool && B2:Bool ~> K:K </k> => <k> B1 andBool B2 ~> K </k>

  syntax KItem ::= freezer1(Val) | freezer2(Exp)
                 | freezer3(Val) | freezer4(Exp)

  rule <k> E1:Val + E2:Exp ~> K:K </k> => <k> E2 ~> freezer1(E1) ~> K </k> [priority(51)]
  rule <k> E1:Exp + E2:Exp ~> K:K </k> => <k> E1 ~> freezer2(E2) ~> K </k> [priority(52)]
  rule <k> E1:Val && E2:Exp ~> K:K </k> => <k> E2 ~> freezer3(E1) ~> K </k> [priority(51)]
  rule <k> E1:Exp && E2:Exp ~> K:K </k> => <k> E1 ~> freezer4(E2) ~> K </k> [priority(52)]

  rule <k> E2:Val ~> freezer1(E1) ~> K:K </k> => <k> E1 + E2 ~> K </k>
  rule <k> E1:Val ~> freezer2(E2) ~> K:K </k> => <k> E1 + E2 ~> K </k>
  rule <k> E2:Val ~> freezer3(E1) ~> K:K </k> => <k> E1 && E2 ~> K </k>
  rule <k> E1:Val ~> freezer4(E2) ~> K:K </k> => <k> E1 && E2 ~> K </k>
endmodule
This is of course rather cumbersome currently, but we will soon introduce
syntactic conveniences which make writing definitions of this type considerably
easier. For now, notice that there are roughly three types of rules here: the
first matches a K cell in which the first element of the K sequence is an Exp
whose arguments are values, and rewrites the first element of the sequence to
the result of that expression. The second also matches a K cell with an Exp in
the first element of its K sequence, but it matches when one or both arguments
of the Exp are not values, and replaces the first element of the K sequence
with two new elements: one being an argument to evaluate, and the other being
a special constructor called a freezer. Finally, the third matches a K
sequence where a Val is first, and a freezer is second, and replaces them
with a partially evaluated expression.
This general pattern is what is known as heating an expression, evaluating its arguments, cooling the arguments into the expression again, and evaluating the expression itself. By repeatedly performing this sequence of actions, we can evaluate an entire AST containing a complex expression down into its resulting value.
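As a worked illustration, the following is a hand trace of the K cell for the program 1 + 2 + 3 under the rules of LESSON-13-B. It is a sketch derived by reading the rules and their priorities, not captured krun output, so the exact formatting krun prints may differ.

```k
// The program parses as (1 + 2) + 3 because "+" is declared left-associative.
// Each line is the content of the K cell; the comment names the rule applied next.

<k> 1 + 2 + 3 ~> . </k>              // heat:     E1:Exp + E2:Exp, priority(52)
<k> 1 + 2 ~> freezer2(3) ~> . </k>   // evaluate: I1:Int + I2:Int => I1 +Int I2
<k> 3 ~> freezer2(3) ~> . </k>       // cool:     E1:Val ~> freezer2(E2) => E1 + E2
<k> 3 + 3 ~> . </k>                  // evaluate: I1:Int + I2:Int => I1 +Int I2
<k> 6 ~> . </k>                      // no rule applies: final configuration
```

In the first step neither the evaluation rule nor the priority(51) heating rule can match, because the left operand 1 + 2 is an Exp but not a Val. That leaves the priority(52) rule, which moves the left operand to the front of the K sequence and stores the right operand in freezer2.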
Exercise
Write an addition expression with integers. Use krun --depth 1 to see the
result of rewriting after applying a single top-level rule. Gradually increase
the value of --depth to see successive states. Observe how this combination
of rules is eventually able to evaluate the entire expression.
Simplifying the evaluator: Local rewrites and cell ellipses
As you saw above, the definition we wrote is rather cumbersome. Over the
remainder of Lessons 1.13 and 1.14, we will greatly simplify it. The first step
in doing so is to explain a bit more about the rewrite operator, =>. Thus far,
all the rules we have written look like rule LHS => RHS. However, this is not
the only way the rewrite operator can be used. It is actually possible to place
a constructor or function at the very top of the rule, and place rewrite
operators inside that term. While a rewrite operator cannot appear nested
inside another rewrite operator, by doing this, we can express that some parts
of what we are matching are not changed by the rewrite operator. For
example, consider the following rule from above:
rule <k> I1:Int + I2:Int ~> K:K </k> => <k> I1 +Int I2 ~> K </k>
We can equivalently write it as follows:
rule <k> (I1:Int + I2:Int => I1 +Int I2) ~> _:K </k>
When you put a rewrite inside a term like this, in essence, you are telling the rule to only rewrite part of the left-hand side to the right-hand side. In practice, this is implemented by lifting the rewrite operator to the top of the rule by means of duplicating the surrounding context.
There is a way that the above rule can be simplified further, however. K provides a special syntax for each cell containing a term of sort K, indicating that we want to match only on some prefix of the K sequence. For example, the above rule can be simplified further like so:
rule <k> I1:Int + I2:Int => I1 +Int I2 ...</k>
Here we have placed the symbol ... immediately prior to the </k> which ends
the cell. What this tells the compiler is to take the contents of the cell,
treat it as the prefix of a K sequence, and insert an anonymous variable of
sort K at the end. Thus we can think of ... as a way of saying we
don't care about the part of the K sequence after the beginning, leaving
it unchanged.
Putting all this together, we can rewrite LESSON-13-B like so
(lesson-13-c.k):
module LESSON-13-C-SYNTAX
  imports UNSIGNED-INT-SYNTAX
  imports BOOL-SYNTAX

  syntax Val ::= Int | Bool
  syntax Exp ::= Val
               > left: Exp "+" Exp
               > left: Exp "&&" Exp
endmodule

module LESSON-13-C
  imports LESSON-13-C-SYNTAX
  imports INT
  imports BOOL

  rule <k> I1:Int + I2:Int => I1 +Int I2 ...</k>
  rule <k> B1:Bool && B2:Bool => B1 andBool B2 ...</k>

  syntax KItem ::= freezer1(Val) | freezer2(Exp)
                 | freezer3(Val) | freezer4(Exp)

  rule <k> E1:Val + E2:Exp => E2 ~> freezer1(E1) ...</k> [priority(51)]
  rule <k> E1:Exp + E2:Exp => E1 ~> freezer2(E2) ...</k> [priority(52)]
  rule <k> E1:Val && E2:Exp => E2 ~> freezer3(E1) ...</k> [priority(51)]
  rule <k> E1:Exp && E2:Exp => E1 ~> freezer4(E2) ...</k> [priority(52)]

  rule <k> E2:Val ~> freezer1(E1) => E1 + E2 ...</k>
  rule <k> E1:Val ~> freezer2(E2) => E1 + E2 ...</k>
  rule <k> E2:Val ~> freezer3(E1) => E1 && E2 ...</k>
  rule <k> E1:Val ~> freezer4(E2) => E1 && E2 ...</k>
endmodule
This is still rather cumbersome, but it is already greatly simplified. In the next lesson, we will see how additional features of K can be used to specify heating and cooling rules much more compactly.
Exercises
- Modify LESSON-13-C to add rules to evaluate integer subtraction.
Next lesson
Once you have completed the above exercises, you can continue to Lesson 1.14: Defining Evaluation Order.