Lesson 1.21: Unification and Symbolic Execution
The purpose of this lesson is to teach the basic concepts of symbolic execution in order to introduce the unique capabilities of the Haskell Backend at a conceptual level.
Symbolic Execution
Thus far, all of the programs we have run using K have been concrete configurations. What this means is that the configuration we use to initialize the K rewrite engine is concrete; in other words, contains no logical variables. The LLVM Backend is a concrete execution engine, meaning that it is only capable of rewriting concrete configurations.
By contrast, the Haskell Backend performs symbolic execution, which is capable of rewriting any configuration, including those where parts of the configuration are symbolic, ie, contain variables or uninterpreted functions.
Unification
Previously, we have introduced the concept that K rewrite rules operate by means of pattern matching: the current configuration being rewritten is pattern matched against the left-hand side of the rewrite rule, and the substitution is used in order to construct a new term from the right-hand side. In symbolic execution, we use unification instead of pattern matching. To summarize, unification behaves akin to a two-way pattern matching where both the configuration and the left-hand side of the rule can contain variables, and the algorithm generates a most general unifier containing substitutions for the variables in both which will make both terms equal.
Feasibility
Unification by itself cannot completely solve the problem of symbolic execution. One task symbolic execution must perform is to identify whether a particular symbolic term is feasible, that is to say, that there actually exists a concrete instantiation of that term such that all the logical constraints on that term can actually be satisfied. The Haskell Backend delegates this task to Z3, an SMT solver. This solver is used to periodically trim configurations that are determined to be mathematically infeasible.
Symbolic terms
The final component of symbolic execution consists of the task of introducing
symbolic terms into the configuration. This can be done one of two different
ways. First, the term being passed to krun
can actually be symbolic. This
is less frequently used because it requires the user to construct an AST
that contains variables, something which our current parsing capabilities are
not well-equipped to do. The second, more common, way of introducing symbolic
terms into a configuration consists of writing rules where there exists an
existentially qualified variable on the right-hand side of the rule that does
not exist on the left-hand side of the rule.
In order to prevent users from writing such rules by accident, K requires
that such variables begin with the ?
prefix. For example, here is a rule
that rewrites a constructor foo
to a symbolic integer:
rule <k> foo => ?X:Int ...</k>
When this rule applies, a fresh variable is introduced to the configuration, which then is unified against the rules that might apply in order to symbolically execute that configuration.
ensures
clauses
We also introduce here a new feature of K rules that applies when a rule
has this type of variable on the right-hand side: the ensures
clause.
An ensures
clause is similar to a requires
clause and can appear after
a rule body, or after a requires
clause. The ensures
clause is used to
introduce constraints that might apply to the variable that was introduced by
that rule. For example, we could write the rule above with the additional
constraint that the symbolic integer that was introduced must be less than
five, by means of the following rule:
rule <k> foo => ?X:Int ...</k> ensures ?X <Int 5
Putting it all together
Putting all these pieces together, it is possible to use the Haskell Backend to perform symbolic reasoning about a particular K module, determining all the possible states that can be reached by a symbolic configuration.
For example, consider the following K definition (lesson-21.k
):
kmodule LESSON-21 imports INT rule <k> 0 => ?X:Int ... </k> ensures ?X =/=Int 0 rule <k> X:Int => 5 ... </k> requires X >=Int 10 endmodule
When we symbolically execute the program 0
, we get the following output
from the Haskell Backend:
<k>
5 ~> .
</k>
#And
{
true
#Equals
?X:Int >=Int 10
}
#And
#Not ( {
?X:Int
#Equals
0
} )
#Or
<k>
?X:Int ~> .
</k>
#And
#Not ( {
true
#Equals
?X:Int >=Int 10
} )
#And
#Not ( {
?X:Int
#Equals
0
} )
Note some new symbols introduced by this configuration: #And
, #Or
, and
#Equals
. While andBool
, orBool
, and ==K
represent functions of sort
Bool
, #And
, #Or
, and #Equals
are matching logic connectives. We
will discuss matching logic in more detail later in the tutorial, but the basic
idea is that these symbols represent Boolean operators over the domain of
configurations and constraints, as opposed to over the Bool
sort.
Notice that the configuration listed above is a disjunction of conjunctions. This is the most common form of output that can be produced by the Haskell Backend. In this case, each conjunction consists of a configuration and a set of constraints. What this conjunction describes, essentially, is a configuration and a set of information that was derived to be true while rewriting that configuration.
Similar to how we saw --search
in a previous lesson, the reason we have
multiple disjuncts is because there are multiple possible output states
for this program, depending on whether or not the second rule applied. In the
first case, we see that ?X
is greater than or equal to 10, so the second rule
applied, rewriting the symbolic integer to the concrete integer 5. In the
second case, we see that the second rule did not apply because ?X
is less
than 10. Moreover, because of the ensures
clause on the first rule, we know
that ?X
is not zero, therefore the first rule will not apply a second time.
If we had omitted this constraint, we would have ended up infinitely applying
the first rule, leading to krun
not terminating.
In the next lesson, we will cover how symbolic execution forms the backbone of deductive program verification in K and how we can use K to prove programs correct against a specification.
Exercises
- Create another rule in
LESSON-21
that rewrites odd integers greater than ten to a symbolic even integer less than 10 and greater than 0. This rule will now apply nondeterministically along with the existing rules. Predict what the resulting output configuration will be from rewriting0
after adding this rule. Then run the program and see whether your prediction is correct.
Once you have completed the above exercises, you can continue to Lesson 1.22: Basics of Deductive Program Verification using K.