Lesson 1.16: Maps, Semantic Lists, and Sets
The purpose of this lesson is to explain how to use the data structure sorts provided by K: maps, lists, and sets.
Maps
The most frequently used type of data structure in K is the map. The sort
provided by K for this purpose is the Map
sort, and it is provided in
domains.md in the MAP
module. This type is not (currently) polymorphic. All Map
terms are maps that
map terms of sort KItem
to other terms of sort KItem
. A KItem
can contain
any sort except a K
sequence. If you need to store such a term in a
map, you can always use a wrapper such as syntax KItem ::= kseq(K)
.
A Map
pattern consists of zero or more map elements (as represented by the
symbol syntax Map ::= KItem "|->" KItem
), mixed in any order, separated by
whitespace, with zero or one variables of sort Map
. The empty map is
represented by .Map
. If all of the bindings for the variables in the keys
of the map can be deterministically chosen, these patterns can be matched in
O(1)
time. If they cannot, then each map element that cannot be
deterministically constructed contributes a single dimension of polynomial
time to the cost of the matching. In other words, a single such element is
linear, two are quadratic, three are cubic, etc.
Patterns like the above are the only type of Map
pattern that can appear
on the left-hand-side of a rule. In other words, you are not allowed to write
a Map
pattern on the left-hand-side with more than one variable of sort Map
in it. You are, however, allowed to write such patterns on the right-hand-side
of a rule. You can also write a function pattern in the key of a map element
so long as all the variables in the function pattern can be deterministically
chosen.
Note the meaning of matching on a Map
pattern: a map pattern with no
variables of sort Map
will match if the map being matched has exactly as
many bindings as |->
symbols in the pattern. It will then match if each
binding in the map pattern matches exactly one distinct binding in the map
being matched. A map pattern with one Map
variable will also match any map
that contains such a map as a subset. The variable of sort Map
will be bound
to whatever bindings are left over (.Map
if there are no bindings left over).
Here is an example of a simple definition that implements a very basic
variable declaration semantics using a Map
to store the value of variables
(lesson-16-a.k
):
kmodule LESSON-16-A-SYNTAX imports INT-SYNTAX imports ID-SYNTAX syntax Exp ::= Id | Int syntax Decl ::= "int" Id "=" Exp ";" [strict(2)] syntax Pgm ::= List{Decl,""} endmodule module LESSON-16-A imports LESSON-16-A-SYNTAX imports BOOL configuration <T> <k> $PGM:Pgm </k> <state> .Map </state> </T> // declaration sequence rule <k> D:Decl P:Pgm => D ~> P ...</k> rule <k> .Pgm => . ...</k> // variable declaration rule <k> int X:Id = I:Int ; => . ...</k> <state> STATE => STATE [ X <- I ] </state> // variable lookup rule <k> X:Id => I ...</k> <state>... X |-> I ...</state> syntax Bool ::= isKResult(K) [symbol, function] rule isKResult(_:Int) => true rule isKResult(_) => false [owise] endmodule
There are several new features in this definition. First, note we import
the module ID-SYNTAX
. This module is defined in domains.md
and provides a
basic syntax for identifiers. We are using the Id
sort provided by this
module in this definition to implement the names of program variables. This
syntax is only imported when parsing programs, not when parsing rules. Later in
this lesson we will see how to reference specific concrete identifiers in a
rule.
Second, we introduce a single new function over the Map
sort. This function,
which is represented by the symbol
syntax Map ::= Map "[" KItem "<-" KItem "]"
, represents the map update
operation. Other functions over the Map
sort can be found in domains.md
.
Finally, we have used the ...
syntax on a cell containing a Map. In this
case, the meaning of <state>... Pattern ...</state>
,
<state>... Pattern </state>
, and <state> Pattern ...</state>
are the same:
it is equivalent to writing <state> (Pattern) _:Map </state>
.
Consider the following program (a.decl
):
int x = 0;
int y = 1;
int a = x;
If we run this program with krun
, we will get the following result:
<T>
<k>
.
</k>
<state>
a |-> 0
x |-> 0
y |-> 1
</state>
</T>
Note that krun
has automatically sorted the collection for you. This doesn't
happen at runtime, so you still get the performance of a hash map, but it will
help make the output more readable.
Exercise
Create a sort Stmt
that is a subsort of Decl
. Create a production of sort
Stmt
for variable assignment in addition to the variable declaration
production. Feel free to use the syntax syntax Stmt ::= Id "=" Exp ";"
. Write
a rule that implements variable assignment using a map update function. Then
write the same rule using a map pattern. Test your implementations with some
programs to ensure they behave as expected.
Semantic Lists
In a previous lesson, we explained how to represent lists in the AST of a program. However, this is not the only context where lists can be used. We also frequently use lists in the configuration of an interpreter in order to represent certain types of program state. For this purpose, it is generally useful to have an associative-list sort, rather than the cons-list sorts provided in Lesson 1.12.
The type provided by K for this purpose is the List
sort, and it is also
provided in domains.md
, in the LIST
module. This type is also not
(currently) polymorphic. Like Map
, all List
terms are lists of terms of the
KItem
sort.
A List
pattern in K consists of zero or more list elements (as represented by
the ListItem
symbol), followed by zero or one variables of sort List
,
followed by zero or more list elements. An empty list is represented by
.List
. These patterns can be matched in O(log(N))
time. This is the only
type of List
pattern that can appear on the left-hand-side of a rule. In
other words, you are not allowed to write a List
pattern on the
left-hand-side with more than one variable of sort List
in it. You are,
however, allowed to write such patterns on the right-hand-side of a rule.
Note the meaning of matching on a List
pattern: a list pattern with no
variables of sort List
will match if the list being matched has exactly as
many elements as ListItem
symbols in the pattern. It will then match if each
element in sequence matches the pattern contained in the ListItem
symbol. A
list pattern with one variable of sort List
operates the same way, except
that it can match any list with at least as many elements as ListItem
symbols, so long as the prefix and suffix of the list match the patterns inside
the ListItem
symbols. The variable of sort List
will be bound to whatever
elements are left over (.List
if there are no elements left over).
The ...
syntax is allowed on cells containing lists as well. In this case,
the meaning of <cell>... Pattern </cell>
is the same as
<cell> _:List (Pattern) </cell>
, the meaning of <cell> Pattern ...</cell>
is the same as <cell> (Pattern) _:List</cell>
. Because list patterns with
multiple variables of sort List
are not allowed, it is an error to write
<cell>... Pattern ...</cell>
.
Here is an example of a simple definition that implements a very basic
function-call semantics using a List
as a function stack (lesson-16-b.k
):
kmodule LESSON-16-B-SYNTAX imports INT-SYNTAX imports ID-SYNTAX syntax Exp ::= Id "(" ")" | Int syntax Stmt ::= "return" Exp ";" [strict] syntax Decl ::= "fun" Id "(" ")" "{" Stmt "}" syntax Pgm ::= List{Decl,""} syntax Id ::= "main" [token] endmodule module LESSON-16-B imports LESSON-16-B-SYNTAX imports BOOL imports LIST configuration <T> <k> $PGM:Pgm ~> main () </k> <functions> .Map </functions> <fstack> .List </fstack> </T> // declaration sequence rule <k> D:Decl P:Pgm => D ~> P ...</k> rule <k> .Pgm => . ...</k> // function definitions rule <k> fun X:Id () { S } => . ...</k> <functions>... .Map => X |-> S ...</functions> // function call syntax KItem ::= stackFrame(K) rule <k> X:Id () ~> K => S </k> <functions>... X |-> S ...</functions> <fstack> .List => ListItem(stackFrame(K)) ...</fstack> // return statement rule <k> return I:Int ; ~> _ => I ~> K </k> <fstack> ListItem(stackFrame(K)) => .List ...</fstack> syntax Bool ::= isKResult(K) [function, symbol] rule isKResult(_:Int) => true rule isKResult(_) => false [owise] endmodule
Notice that we have declared the production syntax Id ::= "main" [token]
.
Since we use the ID-SYNTAX
module, this declaration is necessary in order to
be able to refer to the main
identifier directly in the configuration
declaration. Our <k>
cell now contains a K
sequence initially: first we
process all the declarations in the program, then we call the main
function.
Consider the following program (foo.func
):
fun foo() { return 5; }
fun main() { return foo(); }
When we krun
this program, we should get the following output:
<T>
<k>
5 ~> .
</k>
<functions>
foo |-> return 5 ;
main |-> return foo ( ) ;
</functions>
<fstack>
.List
</fstack>
</T>
Note that we have successfully put on the <k>
cell the value returned by the
main
function.
Exercise
Add a term of sort Id
to the stackFrame
operator to keep track of the
name of the function in that stack frame. Then write a function
syntax String ::= printStackTrace(List)
that takes the contents of the
<fstack>
cell and pretty prints the current stack trace. You can concatenate
strings with +String
in the STRING
module in domains.md
, and you can
convert an Id
to a String
with the Id2String
function in the ID
module.
Test this function by creating a new expression that returns the current stack
trace as a string. Make sure to update isKResult
and the Exp
sort as
appropriate to allow strings as values.
Sets
The final primary data structure sort in K is a set, i.e., an idempotent
unordered collection where elements are deduplicated. The sort provided by K
for this purpose is the Set
sort and it is provided in domains.md
in the
SET
module. Like maps and lists, this type is not (currently) polymorphic.
Like Map
and List
, all Set
terms are sets of terms of the KItem
sort.
A Set
pattern has the exact same restrictions as a Map
pattern, except that
its elements are treated like keys, and there are no values. It has the same
performance characteristics as well. However, syntactically it is more similar
to the List
sort: An empty Set
is represented by .Set
, but a set element
is represented by the SetItem
symbol.
Matching behaves similarly to the Map
sort: a set pattern with no variables
of sort Set
will match if the set has exactly as many bindings as SetItem
symbols, and if each element pattern matches one distinct element in the set.
A set with a variable of sort Set
also matches any superset of such a set.
As with map, the elements left over will be bound to the Set
variable (or
.Set
if no elements are left over).
Like Map
, the ...
syntax on a set is syntactic sugar for an anonymous
variable of sort Set
.
Here is an example of a simple modification to LESSON-16-A
which uses a Set
to ensure that variables are never declared more than once. In practice, you
would likely just use the in_keys
symbol over maps to test for this, but
it's still useful as an example of sets in practice:
kmodule LESSON-16-C-SYNTAX imports LESSON-16-A-SYNTAX endmodule module LESSON-16-C imports LESSON-16-C-SYNTAX imports BOOL imports SET configuration <T> <k> $PGM:Pgm </k> <state> .Map </state> <declared> .Set </declared> </T> // declaration sequence rule <k> D:Decl P:Pgm => D ~> P ...</k> rule <k> .Pgm => . ...</k> // variable declaration rule <k> int X:Id = I:Int ; => . ...</k> <state> STATE => STATE [ X <- I ] </state> <declared> D => D SetItem(X) </declared> requires notBool X in D // variable lookup rule <k> X:Id => I ...</k> <state>... X |-> I ...</state> <declared>... SetItem(X) ...</declared> syntax Bool ::= isKResult(K) [symbol, function] rule isKResult(_:Int) => true rule isKResult(_) => false [owise] endmodule
Now if we krun
a program containing duplicate declarations, it will get
stuck on the declaration.
Exercises
- Modify your solution to Lesson 1.14, Exercise 2 and introduce the sorts
Decls
,Decl
, andStmt
which include variable and function declaration (without function parameters), and return and assignment statements, as well as call expressions. UseList
andMap
to implement these operators, making sure to consider the interactions between components, such as saving and restoring the environment of variables at each call site. Don't worry about local function definitions or global variables for now. Make sure to test the resulting interpreter.
Next lesson
Once you have completed the above exercises, you can continue to Lesson 1.17: Cell Multiplicity and Cell Collections.