%************************************************
\chapter{Rewriting Constraint Modelling Languages}\label{ch:rewriting}
%************************************************
In this chapter we describe a new \textbf{systematic view
of the execution of \cmls}, and build on this to propose a new tool chain. We
show how this tool chain allows us to perform flattening more efficiently and to
further simplify the resulting programs.
The new architecture is shown in \Cref{sfig:4-newarch}. A constraint model is
first compiled to bytecode (called \microzinc), independent of the data. The
bytecode is interpreted with the data to produce \nanozinc\ code, an extension
of the existing \flatzinc\ format. The interpreter can even be made incremental:
in \cref{ch:incremental} we discuss how meta-optimisation can be performed
without any recompilation.
We have developed a prototype of this tool chain, and present experimental
validation of these advantages. The prototype is still very experimental, but
preliminary results suggest the new tool chain can perform flattening much
faster, and produce better models, than the current \minizinc\ compiler.
This chapter is organised as follows. \Cref{sec:4-micronano} introduces the
\microzinc\ and \nanozinc\ languages, the new intermediate representation we
propose that enables more efficient flattening. \Cref{sec:4-simplification}
describes how we can perform various processing and simplification steps on this
representation, and in \cref{sec:4-experiments} we report on the experimental
results of the prototype implementation. Finally, \Cref{sec:4-conclusion}
presents our conclusions.
\begin{figure}[ht]
\centering
\begin{subfigure}[b]{.45\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_old_process}
\caption{\label{sfig:4-oldarch}}
\end{subfigure}%
\hspace{0.05\columnwidth}%
\begin{subfigure}[b]{.50\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_new_process}
\caption{\label{sfig:4-newarch}}
\end{subfigure}
\caption{\label{fig:4-arch} (a) Current \minizinc\ architecture, and (b) new
\minizinc\ architecture}
\end{figure}
\section{\glsentrytext{microzinc} and
\glsentrytext{nanozinc}}\label{sec:4-micronano}
This section introduces the two sub-languages of \minizinc\ at the core of the
new abstract machine model we have developed.
\microzinc\ is generally a simplified subset of \minizinc\ that only contains
function declarations and a restricted expression language, but extends
\minizinc{}'s type system with tuples. Models written in \minizinc\ can be
translated into \microzinc. \nanozinc\ is similar to \flatzinc\ in that it
contains only simple variable declarations and flat constraints. However, while
all function calls in \flatzinc\ need to be solver-level primitives, \nanozinc\
calls can refer to functions implemented in \microzinc.
The core of the syntax of \microzinc\ is defined in \cref{fig:4-uzn-syntax}. We
have omitted the definitions of \syntax{<mzn-type-inst>} and \syntax{<ident>},
which can be assumed to be the same as the definitions of \syntax{<type-inst>}
and \syntax{<ident>} in the \minizinc\ grammar (see \cref{ch:minizinc-grammar}).
While we omit the typing rules here for space reasons, we will assume that in
well-typed \microzinc\ programs, the conditions \syntax{<exp>} in if-then-else
and where expressions have \mzninline{par} type.
\begin{figure}
\begin{grammar}
<model> ::= <fun-decl>*
<fun-decl> ::= "function" <type-inst>":" <ident>"(" <param> ["," <param>]* ")" [ "=" <exp> ]";"
<literal> ::= <constant> | <ident>
<exp> ::= <literal>
\alt <ident>"(" <literal> ["," <literal>]* ")"
\alt "let" "{" <item>* "}" "in" <exp>
\alt "if" <exp> "then" <exp> "else" <exp> "endif"
\alt "[" <exp> "|" <gen-exp> "where" <exp> "]"
\alt "(" <exp> ["," <exp>]* ")"
<item> ::= <param> [ "=" <exp> ]";"
\alt "(" <param> ["," <param>]* ")" "=" <exp>";"
\alt "constraint" <exp>";"
<param> ::= <type-inst>":" <ident>
<type-inst> ::= <mzn-type-inst> | "tuple(" <mzn-type-inst> ["," <mzn-type-inst>]* ")"
<gen-exp> ::= <ident> "in" <exp> ["," <ident> "in" <exp> ]*
\end{grammar}
\caption{\label{fig:4-uzn-syntax}Syntax of \microzinc.}
\end{figure}
\newcommand{\sep}{\rhd}
A \nanozinc\ program, defined in \cref{fig:4-nzn-syntax}, is simply a list of
variable declarations and constraints in the form of calls. The marker
``\texttt{└──}'' is used to attach dependent constraints to a declaration (this
is explained in detail in \cref{sec:4-nanozinc}; for now these markers can be
ignored).
\begin{figure}
\begin{grammar}
<nzn> ::= <nzn-item>*
<nzn-item> ::= <nzn-def> | <nzn-con>
<nzn-con> ::= "constraint" <ident>"(" <literal> ["," <literal>]* ")"";"
<nzn-def> ::= "var" <nzn-dom> ":" <ident> ";" <nzn-bind>*
<nzn-dom> ::= <constant> ".." <constant> | <constant>
\alt "bool" | "int" | "float" | "set of int"
<nzn-bind> ::= "└──" <nzn-con>
\end{grammar}
\caption{\label{fig:4-nzn-syntax}Syntax of \nanozinc.}
\end{figure}
\subsection{\glsentrytext{minizinc} to
\glsentrytext{microzinc}/\glsentrytext{nanozinc}}
The translation from \minizinc\ to \microzinc\ involves the following steps:
\begin{enumerate}
\item Transform all expressions that are valid in \minizinc\ but not in
\microzinc. This includes simple cases such as replacing operator
expressions like \mzninline{x+y} by function calls
\mzninline{int_plus(x,y)}, and more complex rules for rewriting
conditionals with variables (as mentioned above), optional variables
into non-optional forms \autocite{mears-2014-option}, lifting partiality
in expressions into the surrounding context
\autocite{stuckey-2013-functions}, and statically resolving
subtype-based overloading.
\item Transform all function definitions so that they do not refer to
top-level variables, by adding those variables as extra arguments to the
function declaration and to each call site.
\item Move all top-level variable declarations and constraints into a let
expression in the body of a new function called \mzninline{main}, whose
arguments are the parameters of the model.
\end{enumerate}
\begin{example}
Consider the following \minizinc\ model, a simple unsatisfiable ``pigeonhole
problem'', trying to assign \(n\) pigeons to \(n-1\) holes:
\begin{mzn}
predicate all_different(array[int] of var int: x) =
  forall (i, j in 1..length(x) where i < j) (
    x[i] != x[j]
  );

int: n;
array[1..n] of var 1..n-1: pigeon;
constraint all_different(pigeon);
\end{mzn}
The translation to \microzinc\ turns all expressions into function calls,
replaces the special generator syntax in the \mzninline{forall} with an array
comprehension, and creates the \mzninline{main} function. Note how the domain
\(1..n-1\) of the \mzninline{pigeon} array is replaced by a \mzninline{set_in}
constraint for each element of the array.
\begin{mzn}
predicate all_different(array[int] of var int: x) =
  forall([ int_neq(element(x,i), element(x,j))
         | i in range(1,length(x)),
           j in range(1,length(x))
           where int_lt(i,j)
         ]);

predicate main(int: n) = let {
  array[range(1,n)] of var int: pigeon;
  constraint forall([
    set_in(element(pigeon,i), range(1,int_minus(n,1)))
    | i in range(1,n)
  ]);
  constraint all_different(pigeon);
} in true;
\end{mzn}
For a concrete instance of the model, we can then create a simple \nanozinc\
program that calls \mzninline{main}, for example for \(n=10\):
\begin{nzn}
constraint main(10);
\end{nzn}
\end{example}
\subsection{Partial Evaluation of \glsentrytext{nanozinc}}\label{sub:4-partial}
Each call in a \nanozinc\ program is either a primitive, or it has a
corresponding \microzinc\ definition. The goal of partial evaluation is to
transform the \nanozinc\ program into (essentially) \flatzinc, i.e., a program
where all calls are either solver-level constraint primitives, or defining
variables.
To achieve this, we define the semantics of \microzinc\ as a rewriting system
that produces \nanozinc\ calls. Each non-primitive call in a \nanozinc\ program
can then be replaced with the result of evaluating the corresponding \microzinc\
function.
A key element of this rewriting process, then, is the transformation of
functional expressions into relational form. For this to happen, the rewriting
process introduces fresh variables to represent intermediate expressions, and
keeps track of constraints added during the rewriting.
\begin{example}\label{ex:4-abs}
Consider the following (reasonably typical) \minizinc\ encoding for the
absolute value function:
\begin{mzn}
function var int: abs(var int: x) =
  let {
    var int: z;
    constraint int_abs(x, z);
  } in z;
\end{mzn}
When an occurrence of \mzninline{abs(x)} is encountered, the call is replaced
by a variable \mzninline{z}. Before that can happen, however, the variable
must be introduced and the constraint \mzninline{int_abs(x, z)} added to
the model to enforce its value.
\end{example}
The function \mzninline{abs(x)} above is a total function: for every value of
\mzninline{x}, there is always exactly one valid value of \mzninline{z}. In
this case, the constraint introduced in the \mzninline{let} expression may be
enforced globally, and multiple occurrences of \mzninline{abs(x)} may be
replaced with the same \mzninline{z}. However, the introduced
\mzninline{int_abs} constraint is only needed so long as any \emph{other}
constraint refers to \mzninline{z}. If \mzninline{z} is unused in the rest of
the model, both \mzninline{z} and the \mzninline{int_abs} constraint can be
removed.
As we shall see in \cref{sec:4-simplification}, certain optimisations and
simplifications during the evaluation can lead to many expressions becoming
unused. It is therefore important to track which constraints were merely
introduced to define results of function calls. In order to track these
dependencies, \nanozinc\ attaches constraints that define a variable to the item
that introduces the variable.
\subsection{\glsentrytext{nanozinc}}\label{sec:4-nanozinc}
A \nanozinc\ program (\cref{fig:4-nzn-syntax}) consists of a topologically-ordered
list of variable declarations and constraints. Each declaration introduces a
variable together with its domain, and may have a list of auxiliary constraints
attached to it (each marked ``\texttt{└──}''), which define the value of that
variable. Constraints that appear as top-level items are the ``root-context''
constraints of the model, i.e., those that have to hold globally and are not
just used to define an auxiliary variable. Only root-context constraints can
effectively constrain the overall problem. Attached constraints originate from
function calls, and since all functions are guaranteed to be total, attached
constraints can only define the function result.
\begin{example}\label{ex:4-absnzn}
Consider the \minizinc\ fragment
\begin{mzn}
constraint abs(x) > y;
\end{mzn}
where \mzninline{x} and \mzninline{y} have domain \mzninline{-10..10}. After
rewriting all definitions, the resulting \nanozinc\ program could look like
this:
\begin{nzn}
var -10..10: x;
var -10..10: y;
var int: z;
└── constraint int_abs(x, z);
constraint int_gt(z, y);
\end{nzn}
The variables \mzninline{x} and \mzninline{y} are declared with their domains
\mzninline{-10..10}. Evaluating the call \mzninline{abs(x)} introduces a new
variable \mzninline{z} (with unrestricted domain) together with the
\mzninline{int_abs} constraint that defines its value; this defining
constraint is attached to the declaration of \mzninline{z}, marked by
``\texttt{└──}''. The top-level constraint \mzninline{abs(x) > y} is rewritten
into \mzninline{int_gt(z, y)}, which is added as a root-context constraint.
\end{example}
\subsection{Rewriting Rules}
The core of the proposed \minizinc\ architecture is an abstract machine that
performs the rewriting of \nanozinc\ items using the corresponding \microzinc\
function definitions. We can now formally define the rewriting rules for this
abstract machine.
The full set of rules appears in
\cref{fig:4-rewrite-to-nzn-calls,fig:4-rewrite-to-nzn-let,fig:4-rewrite-to-nzn-other}.
To simplify presentation, we assume that all rules that introduce new
identifiers into the \nanozinc\ program do so in a way that ensures those
identifiers are fresh (\ie\ not used in the rest of the \nanozinc\ program), by
suitable alpha renaming.
\begin{figure*}
\centering
\begin{prooftree}
\hypo{\ptinline{F(\(p_1, \ldots, p_k\)) = E;} \in \Prog \text{~where the~} p_i \text{~are fresh}}
\infer[no rule]1{\Sem{\(a_i\)}{\Prog, \Env_{i-1}} \Rightarrow \tuple{v_i,
\Env_i'}, \ \Env_0=\Env, \Env_i=\Env_i'\cup\{p_i\mapsto v_i\sep[]\}, \forall 1 \leq i \leq k}
\infer[no rule]1{\Sem{E}{\Prog, \Env_k} \Rightarrow \tuple{v, \Env'}}
\infer1[(Call)]{\Sem{F(\(a_1, \ldots, a_k\))}{\Prog, \Env_{0}} \Rightarrow \tuple{v, \Env'}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\ptinline{F} \in \text{Builtins}}
\hypo{\Sem{\(a_i\)}{\Prog, \Env} \Rightarrow \tuple{v_i, \Env}, \forall{} 1 \leq{} i \leq{} k}
\infer2[(CallBuiltin)]{\Sem{F(\(a_1, \ldots, a_k\))}{\Prog, \Env} \Rightarrow \tuple{ \mathit{eval}(\ptinline{F}(v_1,\ldots, v_k)), \Env}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\ptinline{F(\(p_1, \ldots, p_k\));} \in \Prog}
\hypo{\Sem{\(a_i\)}{\Prog, \Env_{i-1}} \Rightarrow \tuple{v_i, \Env_i}, \forall 1 \leq i \leq k}
\infer2[(CallPrimitive)]{\Sem{F(\(a_1, \ldots, a_k\))}{\Prog, \Env_0} \Rightarrow
\tuple{ x, \{ x \mapsto \ptinline{F}(v_1,\ldots, v_k) \sep [] \} \cup \Env_k}}
\end{prooftree}
\caption{\label{fig:4-rewrite-to-nzn-calls} Rewriting rules for partial evaluation
of \microzinc\ calls to \nanozinc.}
\end{figure*}
The rules in \cref{fig:4-rewrite-to-nzn-calls} handle function calls. The first
rule (Call) evaluates a function call expression in the context of a \microzinc\
program \(\Prog\) and \nanozinc\ program \(\Env\). The rule first evaluates all
actual parameter expressions \(a_i\), creating new contexts where the evaluation
results are bound to the formal parameters \(p_i\). It then evaluates the
function body \(\ptinline{E}\) in this context, and returns the result.
The (CallBuiltin) rule applies to ``built-in'' functions that can be evaluated
directly, such as arithmetic and Boolean operations on fixed values. The rule
simply returns the result of evaluating the built-in function on the evaluated
parameter values.
The (CallPrimitive) rule applies to non-built-in functions that are defined
without a function body. These are considered primitives, and as such simply
need to be transferred into the \nanozinc\ program. The rule evaluates all
actual parameters, and creates a new variable \(x\) to store the result, which is
simply the function applied to the evaluated parameters.
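For instance, the \mzninline{int_abs} constraint used in \cref{ex:4-abs} would be
declared in the solver library as a \microzinc\ predicate without a body (a
sketch; the exact library declaration may differ), so that any call to it falls
under (CallPrimitive) and is simply copied into the \nanozinc\ program:
\begin{mzn}
% no body: int_abs is a solver-level primitive, so calls to it are
% transferred to the NanoZinc program by the (CallPrimitive) rule
predicate int_abs(var int: x, var int: z);
\end{mzn}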
\begin{figure*}
\centering
\begin{prooftree}
\hypo{\Sem{\(\mathbf{I}\)}{\Prog, \Env} \Rightarrow (\Ctx, \Env')}
\hypo{\Sem{\(E\)}{\Prog, \Env'} \Rightarrow \tuple{v, \Env''}}
\hypo{x \text{~fresh}}
\infer3[(LetC)]{\Sem{let \{ \(\mathbf{I}\) \} in \(E\)}{\Prog, \Env} \Rightarrow
\tuple{x, \{ x \mapsto v \sep \Ctx \} \cup \Env''}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{}
\infer1[(Item0)]{\Sem{\(\epsilon\)}{\Prog, \Env} \Rightarrow \tuple{\ptinline{true}, \Env}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{\(\mathbf{I}\)}{\Prog, \Env\ \cup\ \{x \mapsto\ \texttt{mkvar()} \sep\ []\}} \Rightarrow \tuple{\Ctx, \Env'}}
\infer1[(ItemT)]{\Sem{\(t:x, \mathbf{I}\)}{\Prog, \Env} \Rightarrow \tuple{\Ctx, \Env'}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{\(E\)}{\Prog, \Env} \Rightarrow \tuple{v, \Env'}}
\hypo{\Sem{\(\mathbf{I}\)}{\Prog, \Env' \cup\ \{x \mapsto\ v \sep\ [] \}} \Rightarrow \tuple{\Ctx, \Env''}}
\infer2[(ItemTE)]{\Sem{\(t:x = E, \mathbf{I}\)}{\Prog, \Env} \Rightarrow \tuple{\Ctx, \Env''}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{\(E\)}{\Prog, \Env} \Rightarrow \tuple{\left(v_{1}, \ldots, v_{n}\right), \Env'}}
\infer[no rule]1{\Sem{\(\mathbf{I}\)}{\Prog, \Env' \cup\ \{x_{1} \mapsto\ v_{1} \sep\ [] \} \cup\ \ldots\ \cup\ \{x_{n} \mapsto\ v_{n} \sep\ [] \}} \Rightarrow \tuple{\Ctx, \Env''}}
\infer1[(ItemTD)]{\Sem{\(\left(t_{1}: x_{1}, \ldots, t_{n}: x_{n}\right) = E, \mathbf{I}\)}{\Prog, \Env} \Rightarrow
\tuple{\Ctx, \Env''}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{\(C\)}{\Prog, \Env} \Rightarrow \tuple{v, \Env'}}
\hypo{\Sem{\(\mathbf{I}\)}{\Prog, \Env'} \Rightarrow \tuple{\Ctx, \Env''}}
\infer2[(ItemC)]{\Sem{\(\mbox{constraint~} C, \mathbf{I}\)}{\Prog, \Env} \Rightarrow
\tuple{\{v\}\cup\Ctx, \Env''}}
\end{prooftree}
\caption{\label{fig:4-rewrite-to-nzn-let} Rewriting rules for partial evaluation
of \microzinc\ let expressions to \nanozinc.}
\end{figure*}
The rules in \cref{fig:4-rewrite-to-nzn-let} consider let expressions. The
(LetC) rule evaluates the let items \textbf{I}, which yields the collected
constraints \(\Ctx\) and the extended \nanozinc\ program \(\Env'\), and then
evaluates the body expression \(E\) in this new context. It finally binds the
result \(v\) to a fresh variable \(x\), attaching the constraints \(\Ctx\) that
arose from the let items: \(x\mapsto v\sep\Ctx\). The (Item0) rule is the base
case for a list of let items. The (ItemT) rule handles the introduction of a
new variable by adding it to the context. The (ItemTE) rule handles the
introduction of a new variable with a defining equation: the right-hand side is
evaluated in the current context and its result is bound to the new variable.
The (ItemTD) rule does the same for a tuple of variables defined by a single
expression, binding each component of the resulting tuple to the corresponding
variable. The (ItemC) rule evaluates a constraint item and adds the result to
the collected constraints.
\begin{figure*}
\centering
\begin{prooftree}
\hypo{x \in \syntax{<ident>}}
\hypo{v \in \syntax{<literal>}}
\infer2[(IdC)]{\Sem{\(x\)}{\Prog, \{x \mapsto\ v \sep\ []\} \cup\ \Env} \Rightarrow \tuple{v, \{x \mapsto v \sep [] \} \cup \Env}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{x \in \syntax{<ident>}}
\hypo{v}
\hypo{\phi \text{~otherwise}}
\infer3[(IdX)]{\Sem{\(x\)}{\Prog, \{x \mapsto\ v \sep\ \phi\ \} \cup\ \Env} \Rightarrow \tuple{x, \{x \mapsto v \sep \phi \} \cup \Env}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{c \text{~constant}}
\infer1[(Const)]{\Sem{c}{\Prog, \Env} \Rightarrow \tuple{c, \Env}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{\(E_{i}\)}{\Prog,\Env^{i-1}} \Rightarrow \tuple{v_{i}, \Env^{i}}, \forall{} 1 \leq{} i \leq{} k}
\infer1[(Tuple)]{\Sem{\(\left(E_{1}, \ldots, E_{k}\right)\)}{\Prog, \Env^{0}} \Rightarrow \tuple{x, \{x \mapsto \left(v_{1}, \ldots, v_{n}\right) \sep [] \} \cup \Env^{k}}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{\(C\)}{\Prog, \Env} \Rightarrow \tuple{\ptinline{true}, \_}}
\infer1[(If\(_T\))]{\Sem{if \(C\) then \(E_t\) else \(E_e\) endif}{\Prog, \Env} \Rightarrow \Sem{\(E_t\)}{\Prog, \Env}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{\(C\)}{\Prog, \Env} \Rightarrow \tuple{\ptinline{false}, \_}}
\infer1[(If\(_F\))]{\Sem{if \(C\) then \(E_t\) else \(E_e\) endif}{\Prog, \Env} \Rightarrow \Sem{\(E_e\)}{\Prog, \Env}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{G}{\Prog,\Env} \Rightarrow \tuple{ \ptinline{true}, \_}}
\hypo{\Sem{\(E\)}{\Prog, \Env} \Rightarrow \tuple {v, \Env'}}
\infer2[(WhereT)]{\Sem{\([E ~|~ \mbox{where~} G]\)}{\Prog, \Env} \Rightarrow \tuple{[v], \Env'}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{G}{\Prog,\Env} \Rightarrow \tuple{\ptinline{false}, \_}}
\infer1[(WhereF)]{\Sem{\([E ~|~ \mbox{where~} G]\)}{\Prog, \Env} \Rightarrow \tuple{[], \Env}}
\end{prooftree} \\
\medskip
\begin{prooftree}
\hypo{\Sem{\(\mathit{PE}\)}{\Prog,\Env} \Rightarrow \tuple{S, \_}}
\infer[no rule]1{\Sem{\([E ~|~ \mathit{GE} \mbox{~where~} G]\)}{\Prog, \Env\ \cup\ \{ x\mapsto\
v \sep\ []\}}}
\infer[no rule]1{\hspace{8em} \Rightarrow \tuple{A_v, \Env \cup \{x \mapsto v \sep []\} \cup \Env_v}, \forall
v \in S }
\infer1[(ListG)]{\Sem{\([E~|~ x \mbox{~in~} \mathit{PE}, \mathit{GE} \mbox{~where~} G]\)}{\Prog, \Env} \Rightarrow
\tuple{\mbox{concat} [ A_v ~|~ v \in S ], \Env \cup \bigcup_{v \in S} \Env_v}}
\end{prooftree}
\caption{\label{fig:4-rewrite-to-nzn-other} Rewriting rules for partial evaluation
of other \microzinc\ expressions to \nanozinc.}
\end{figure*}
Finally, the rules in \cref{fig:4-rewrite-to-nzn-other} handle the evaluation of
variables, constants, conditionals and comprehensions. For completeness, we give
the details here.
The (IdX) and (IdC) rules look up a variable in the environment and return its
bound value (for constants) or the variable itself. The (Const) rule simply
returns the constant. The (If\(_T\)) and (If\(_F\)) rules evaluate the if condition
and then return the result of the then or else branch appropriately. The
(WhereT) rule evaluates the guard of a where comprehension and if true returns
the resulting expression in a list. The (WhereF) rule evaluates the guard and
when false returns an empty list. The (ListG) rule evaluates the iteration set
\(PE\) to get its value \(S\), which must be fixed. It then evaluates the expression
\(E\) with all the different possible values \(v \in S\) of the generator \(x\),
collecting the additional \nanozinc\ definitions \(\Env_v\). It returns the
concatenation of the resulting lists with all the additional \nanozinc\ items.
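As a small illustration of these rules in combination, consider the following
\microzinc\ comprehension over fixed values (a sketch, assuming that
\mzninline{range}, \mzninline{int_times}, and \mzninline{int_lt} on fixed
arguments are built-ins handled by (CallBuiltin)):
\begin{mzn}
% the generator i ranges over 1..4; the guard keeps only i = 1 and i = 2
[ int_times(i, i) | i in range(1, 4) where int_lt(i, 3) ]
\end{mzn}
The (ListG) rule binds \mzninline{i} to each element of \mzninline{range(1, 4)}
in turn, (WhereT) and (WhereF) keep only the elements for which the guard
holds, and the built-in evaluation yields the fixed list \mzninline{[1, 4]},
without adding any items to the \nanozinc\ program.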
\subsection{Prototype Implementation}
Our prototype implementation of the \microzinc/\nanozinc\ framework consists of
the following components.
The \textbf{compiler} translates \minizinc\ into a \textbf{bytecode} encoding of
\microzinc. The compiler currently supports a significant subset of the full
\minizinc\ language, with the missing features (such as multi-dimensional arrays
and complex output statements) requiring additional engineering effort but no
new technology.
The \textbf{interpreter} evaluates \microzinc\ bytecode and \nanozinc\ programs
based on the rewrite system introduced in this section. It can read parameter
data from a file or via an API\@. The interpreter constructs the call to the
\mzninline{main} function, and then performs the rewriting into flat \nanozinc.
Memory management is implemented using reference counting, which lends itself
well to the optimisations discussed in the next section. The interpreter is
written in around 3.5kLOC of C++ to enable easy embedding into applications and
other programming languages.
The framework includes \textbf{interfaces} to multiple solvers. It can
pretty-print the \nanozinc\ code to standard \flatzinc, so that any solver
currently compatible with \minizinc\ can be used. Additionally, a direct C++ API
can be used to connect solvers directly, in order to enable incremental
solving, as described in \cref{sec:4-incremental}.
\textbf{Python bindings} to the framework are provided to enable easy
scripting.
The source code for the complete system will be made available under an open
source license.
\section{\glsentrytext{nanozinc} Simplification}\label{sec:4-simplification}
The previous section introduced the basic evaluation model of \nanozinc: each
call that has a corresponding \microzinc\ definition can be rewritten into a set
of calls by evaluating the \microzinc\ code. This section looks at further steps
that the system can perform in order to produce better, more compact \nanozinc\
code.
\subsection{Removing Dead Variables and Constraints}
The most basic form of simplification is to identify variables that are not
used by any call, and remove them. This is correct because the semantics of
\nanozinc\ requires us to find values for all variables such that all
constraints are satisfied --- but if a variable does not occur in any
constraint, we can pick any value for it. The model without the variable is
therefore equisatisfiable with the original model.
Consider now the case where a variable in \nanozinc\ is only used in its own
auxiliary definitions (the constraints directly following its declaration, each
marked with \texttt{└──}).
\begin{example}\label{ex:4-absreif}
The following is a slight variation on the \minizinc\ fragment in
\cref{ex:4-absnzn}:
\begin{mzn}
constraint abs(x) > y \/ c;
constraint c;
\end{mzn}
The constraint \mzninline{abs(x) > y} is now used in a disjunction. This means
that instead of enforcing the constraint globally, the model needs to reason
about whether the constraint holds or not. This is called \gls{reification}
(we will come back to this in \cref{sec:4-cse}). Following the rewriting
rules, the generated \nanozinc\ code will look very similar to the code
generated in \cref{ex:4-absnzn}, but with an additional \mzninline{bool_or}
constraint for the disjunction, which uses the variable \mzninline{b1} that
was introduced for \mzninline{abs(x) > y}:
\begin{nzn}
var true: c;
var -10..10: x;
var -10..10: y;
var int: z;
└── constraint int_abs(x, z);
var bool: b1;
└── constraint int_gt(z, y);
constraint bool_or(b1, c);
\end{nzn}
Since \mzninline{c} is bound to \mzninline{true}, the disjunction
\mzninline{bool_or(b1, c)} trivially holds, independent of the value of
\mzninline{b1}. The constraint \mzninline{abs(x) > y} is therefore not
required. However, the rewriting has to choose a particular order in which
arguments are evaluated, so it is always possible that it already evaluated
the left hand side before ``noticing'' that the disjunction is true.
If the system now simplifies the \mzninline{bool_or} constraint to
\mzninline{true} and removes it, the variable \mzninline{b1} becomes unused,
since it was only referenced from the \mzninline{bool_or} call. Removing
\mzninline{b1} also removes its attached constraint \mzninline{int_gt(z, y)},
which was the only constraint referring to \mzninline{z}. Consequently,
\mzninline{z} and its attached \mzninline{int_abs} constraint can be removed
as well. The resulting \nanozinc\ program is much more compact:
\begin{nzn}
var true: c;
var -10..10: x;
var -10..10: y;
\end{nzn}
We assume here that \mzninline{c}, \mzninline{x} and \mzninline{y} are used in
other parts of the model not shown here and therefore not removed.
\end{example}
Note how the defining constraint \mzninline{int_abs(x, z)} is attached to the
declaration of \mzninline{z} itself: an attached constraint does not keep its
own variable alive, so \mzninline{z} can be removed as soon as no other
constraint refers to it. Had \mzninline{int_abs(x, z)} instead been added as a
root-context constraint, it would have kept \mzninline{z} alive.
It is worth noting the difference between the two attached constraints in this
example. The constraint \mzninline{int_abs(x, z)} defines the value of the
integer variable \mzninline{z} and can be enforced globally (it must hold,
under all circumstances). The constraint \mzninline{int_gt(z, y)}, on the other
hand, defines the Boolean variable \mzninline{b1}, which is used as an argument
to the \mzninline{bool_or} call. This means that the constraint does not have
to hold globally; rather, we require the variable \mzninline{b1} to reflect
whether the constraint holds or not, so that the \mzninline{bool_or} can
implement the disjunction. These types of constraints are typically called
\emph{reified} in constraint modelling. \nanozinc\ thus uses the same syntax
for constraints that define the result of a total function and for reified
constraints: both are attached to the declaration of the variable they define.
Whether a constraint acts as a reification depends on how that variable is
used in the rest of the program.
\begin{example}
The importance of the use of auxiliary definitions when removing dead
variables and constraints is demonstrated in the following example:
\begin{mzn}
int: n;
var int: x;

function var int: f(var int: x, int: i) = let {
  var int: y;
  constraint y = int_minus(x, i);
} in y;

constraint [f(x,i) | i in 1..n][1] == 10;
\end{mzn}
The rewriting of this instance will initially create the \mzninline{n}
elements for the array comprehension. Each element is evaluated as a new
variable \mzninline{y} and a constraint to ensure that \mzninline{y} is the
subtraction of \mzninline{x} and \mzninline{i}. Although \mzninline{n}
variables are created, only the first variable is constrained to be 10. All
other variables can thus be removed and the model will stay equisatisfiable.
In the \nanozinc\ version of this model, each of the subtraction constraints
will be added to the list of auxiliary constraints of its corresponding
\mzninline{y} variable. When, after evaluating the array access, all of those
variables except the first are detected to be unused, their removal triggers
the removal of the subtraction constraints.
For this example the current \minizinc\ compiler, version 2.3.2, will produce
a \flatzinc\ program that contains the variable \mzninline{x}, \mzninline{n-1}
variables \mzninline{y}, and \mzninline{n} constraints for the subtraction.
The first \mzninline{y} variable is directly replaced with the constant 10.
Using the \nanozinc\ system, only one variable, \mzninline{x}, and one
constraint, \mzninline{10 = int_minus(x,1)}, are produced.
\end{example}
\paragraph{Implementation} The removal of unused identifiers is taken care of
by garbage collection in the interpreter. Since our prototype interpreter uses
reference counting, it already accurately tracks the liveness of each
identifier.
\subsection{Constraint Propagation}
Another way that variables can become unused is if a constraint can be detected
to be true based on its semantics, and the known domains of the variables. For
example, consider the constraint \mzninline{3*x + 7*y > 8}, and assume that
both \mzninline{x} and \mzninline{y} cannot be smaller than 1. In this case, we
can determine that the constraint is always satisfied, and remove it from the
model without changing satisfiability. This, in turn, may result in
\mzninline{x} or \mzninline{y} becoming unused and also removed (including
their auxiliary definitions, if any).
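Concretely, since \mzninline{x} and \mzninline{y} are both at least 1, the
smallest value the left-hand side can take is \(3\cdot{}1+7\cdot{}1=10>8\), so
the constraint can never be violated.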
This is a simple form of \gls{propagation}, the main inference method used in
constraint solvers \autocite{schulte-2008-propagation}. Constraint propagation,
in general, can not only detect when constraints are trivially true (or false),
but also tighten variable domains. For instance, the constraint
\mzninline{x>y}, with initial domains \mzninline{x in 1..10, y in 1..10}, would
result in the domains being tightened to \mzninline{x in 2..10, y in 1..9}.
These tightened domains would be reflected in the domains of the corresponding
variable declarations. Note, however, that this may now prevent us from
removing \mzninline{x} or \mzninline{y}: even if they later become unused, the
tightened domains may impose a constraint on their definition. For instance, if
\mzninline{x} is defined as \mzninline{abs(z)}, then any restriction on the
domain of \mzninline{x} constrains the possible values of \mzninline{z} through
the \mzninline{abs} function. We therefore need to track whether a variable
domain has been enforced through external constraints, or is the consequence of
its own definition. We omit the details of this mechanism for space reasons.
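To see why this tracking is needed, suppose the domain of \mzninline{x}, which
is defined as \mzninline{abs(z)}, has been tightened to \mzninline{3..10} by
some external constraint: removing \mzninline{x} once it becomes otherwise
unused would then silently drop the implied requirement that \mzninline{abs(z)}
be at least 3.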
A further effect of constraint propagation is the replacement of constraints by
simpler versions. A constraint \mzninline{x=y+z}, where the domain of
\mzninline{z} has been tightened to the singleton \mzninline{0..0}, can be
rewritten to \mzninline{x=y}. Propagating the effect of one constraint can thus
trigger the propagation of other constraints, and constraint solvers typically
run all propagation rules or algorithms until they reach a fix-point.
The \nanozinc\ interpreter applies propagation rules for the most important
constraints, such as linear and Boolean constraints, which can lead to
significant reductions in generated code size. It has been shown previously that
applying full constraint propagation during compilation can be beneficial
\autocite{leo-2015-multipass}, in particular when the target solver requires
complicated reformulations (such as linearisation for MIP solvers
\autocite{belov-2016-linearisation}).
Returning to the example above, constraint propagation (and other model
simplifications) can result in \emph{equality constraints} \mzninline{x=y} being
introduced. For certain types of target solvers, in particular those based on
constraint programming, equality constraints can lead to exponentially larger
search trees if not detected and simplified. Equalities therefore present a
valuable opportunity for model simplification, but also pose some challenges.
Consider the simple case of \mzninline{x=y} where one of the variables, say
\mzninline{x}, was a top-level variable introduced by the modeller, and not
defined as the result of a function call. In that case, we can propagate the
equality constraint by replacing all occurrences of \mzninline{x} with
\mzninline{y}, and remove the definition of \mzninline{x}.
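As a minimal sketch of this case (assuming \mzninline{int_lt} is a solver-level
primitive and both variables have the same domain), the fragment
\begin{nzn}
var -10..10: x;
var -10..10: y;
constraint int_eq(x, y);
constraint int_lt(x, 5);
\end{nzn}
can be simplified by substituting \mzninline{y} for \mzninline{x} and removing
the declaration of \mzninline{x}:
\begin{nzn}
var -10..10: y;
constraint int_lt(y, 5);
\end{nzn}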
The more complicated case for \mzninline{x=y} is when both \mzninline{x} and
\mzninline{y} are the result of function calls, \eg\ from a model fragment such
as \mzninline{constraint f(a)=g(b)}. Let us assume that part of the
corresponding \nanozinc\ code looks like this:
\begin{nzn}
var int: x;
└── constraint f_rel(a, x);
var int: y;
└── constraint g_rel(b, y);
constraint int_eq(x, y);
\end{nzn}
In this case, simply removing either \mzninline{x} or \mzninline{y} would not be
correct, since that would trigger the removal of the corresponding definition
\mzninline{f_rel(a, x)} or \mzninline{g_rel(b, y)}, leaving nothing to be
substituted. Rather, the interpreter needs to pick one of the two definitions
and \emph{promote} it to top-level status. Let us assume that the interpreter
decides to replace \mzninline{y} by \mzninline{x}. \mzninline{y} is then
globally replaced by \mzninline{x}, and its declaration is removed. The
interpreter moves all auxiliary constraints from \mzninline{y} into the list of
top-level constraints, and then removes the \mzninline{int_eq} constraint. The
resulting \nanozinc\ looks like this:
\begin{nzn}
var int: x;
└── constraint f_rel(a, x);
constraint g_rel(b, x);
\end{nzn}
The process of equality propagation is similar to unification in logic
programming \autocite{warren-1983-wam}. However, since equations in \minizinc\
always concern scalar variables rather than terms, we can restrict ourselves to
this simple substitution scheme and leave the further checking of the resulting
equalities to the target constraint solver.
\paragraph{Implementation}
Rewriting a function call that has a \microzinc\ definition and rewriting a
constraint due to propagation are very similar. The interpreter therefore simply
interleaves both forms of rewriting. Efficient constraint propagation engines
``wake up'' a constraint only when one of its variables has received an update
(i.e., when its domain has been shrunk). To enable this, the interpreter needs
to keep a data structure linking variables to the constraints where they appear
(in addition to the reverse links from calls to the variables in their
arguments). These links do not take part in the reference counting.
\subsection{Delayed Rewriting}
Adding constraint propagation and simplification into the rewriting system means
that the system becomes non-confluent, and some orders of execution may produce
``better'', i.e., more compact or more efficient, \nanozinc.
\begin{example}
The following example is similar to code found in the \minizinc\ libraries of
MIP solvers.
\begin{mzn}
predicate lq_zero_if_b(var int: x, var bool: b) =
  x <= ub(x)*(1-b);
\end{mzn}
This predicate expresses the constraint \mzninline{b -> x<=0}, using a
well-known method called ``big-M transformation''. The expression
\mzninline{ub(x)} returns a valid upper bound for \mzninline{x}, \ie\ a fixed
value known to be greater than or equal to \mzninline{x}. This could be the
initial upper bound \mzninline{x} was declared with, or the current value in
the corresponding \nanozinc\ \mzninline{mkvar} call. If \mzninline{b} takes
the value 0, the expression \mzninline{ub(x)*(1-b)} is equal to
\mzninline{ub(x)}, and the constraint \mzninline{x <= ub(x)} holds trivially.
If \mzninline{b} takes the value 1, \mzninline{ub(x)*(1-b)} is equal to 0,
enforcing the constraint \mzninline{x <= 0}.
\end{example}
For MIP solvers, it is quite important to enforce \emph{tight} bounds in order
to improve efficiency and sometimes even numerical stability. It would therefore
be useful to rewrite the \mzninline{lq_zero_if_b} predicate only after all the
constraint propagation has been performed, in order to take advantage of the
tightest possible bounds. On the other hand, evaluating a predicate may also
\emph{impose} new bounds on variables, so it is not always clear which order of
evaluation is best.
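For instance, if \mzninline{x} is declared with domain \mzninline{-5..10} but
propagation later tightens its upper bound to 2, then delaying the rewriting of
\mzninline{lq_zero_if_b} means the generated constraint is
\mzninline{x <= 2*(1-b)} rather than the much weaker \mzninline{x <= 10*(1-b)}.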
In our \microzinc/\nanozinc\ system, we allow predicates and functions to be
\emph{annotated} as potential candidates for \emph{delayed rewriting}. Any
annotated constraint is handled by the (CallPrimitive) rule rather than the
(Call) rule, which means that it is simply added as a call into the \nanozinc\
code, without evaluating its body. When propagation reaches a fix-point, all
annotations are removed from the current \nanozinc\ program, and evaluation
resumes with the predicate bodies.
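As a sketch of what this looks like at the library level, the big-M predicate
from the previous example could be marked as a candidate for delayed rewriting
as follows (the annotation name \mzninline{delay_rewriting} is purely
illustrative):
\begin{mzn}
% ::delay_rewriting is an illustrative annotation name; it marks the
% predicate so that calls are initially kept as NanoZinc calls and only
% rewritten once propagation has reached a fix-point
predicate lq_zero_if_b(var int: x, var bool: b) ::delay_rewriting =
  x <= ub(x)*(1-b);
\end{mzn}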
This crucial optimisation enables rewriting in multiple \emph{phases}. For
example, a model targeted at a MIP solver is rewritten into Boolean and reified
constraints, whose definitions are annotated to be delayed. The \nanozinc\ model
can then be fully simplified by constraint propagation, before the Boolean and
reified constraints are rewritten into lower-level linear primitives suitable
for MIP\@.
\subsection{Common Subexpression Elimination}\label{sec:4-cse}
The simplifications introduced above remove definitions from the model when we
can detect that they are no longer needed. In some cases, it is even possible to
not generate definitions in the first place through the use of common
subexpression elimination (CSE). This simplification is a well understood
technique that originates from compiler optimisation \autocite{cocke-1970-cse}
and has proven to be very effective in discrete optimisation
\autocite{marinov-2005-sat-optimisations, araya-2008-cse-numcsp}, including
during the evaluation of constraint modelling languages such as
\minizinc\ \autocite{rendl-2009-enhanced-tailoring}.
For instance, in the constraint\\
\mzninline{(abs(x)*2 >= 20) \/ (abs(x)+5 >= 15)}\\
the expression \mzninline{abs(x)} is evaluated twice. Since functions in
\microzinc\ are guaranteed to be pure and total, any expressions that are
syntactically equal can be detected by a static compiler analysis before
evaluation even begins, and hoisted into a let expression:
\begin{mzn}
constraint let { var int: y = abs(x) } in
  (y*2 >= 20) \/ (y+5 >= 15);
\end{mzn}
%
However, some expressions only become syntactically equal during evaluation, as
in the following example.
%
\begin{example}
Consider the fragment:
\begin{mzn}
function var float: pythagoras(var float: a, var float: b) =
  let {
    var float: x = pow(a, 2);
    var float: y = pow(b, 2);
  } in sqrt(x + y);

constraint pythagoras(i, i) >= 5;
\end{mzn}
Although the expressions \mzninline{pow(a, 2)} and \mzninline{pow(b, 2)} are
not syntactically equal, the function call \mzninline{pythagoras(i,i)} using
the same variable for \mzninline{a} and \mzninline{b} makes them equivalent.
A straightforward approach to ensure that the same instantiation of a
function is only evaluated once is through memoisation. After the evaluation
of a \microzinc\ call, the instantiated call and its result are stored in a
lookup table, the CSE table. Before a call is evaluated, the CSE table is
consulted. If the call is found, then the accompanying result is used;
otherwise, the evaluation proceeds as normal.
In our example, the evaluation of \mzninline{pythagoras(i, i)} would proceed
as normal to evaluate \mzninline{x = pow(i, 2)}. However, the expression
defining \mzninline{y}, \mzninline{pow(i, 2)}, will be found in the CSE table
and replaced by the earlier stored result: \mzninline{y = x}.
\end{example}
It is worth noting that the CSE approach based on memoisation also covers the
static example above, where the compiler can enforce sharing of common
subexpressions before evaluation begins. However, since the static analysis is
cheap and can potentially avoid many dynamic table lookups, it is valuable to
combine both approaches. The static approach can be further improved by
\emph{inlining} function calls, since that may expose more calls as being
syntactically equal.
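For example (a small sketch using a hypothetical helper function), inlining the
call to \mzninline{double_abs} below exposes a second, syntactically equal
occurrence of \mzninline{abs(x)} that the static analysis can then share:
\begin{mzn}
function var int: double_abs(var int: x) = 2*abs(x);

constraint double_abs(x) >= 20 \/ abs(x) >= 7;
\end{mzn}
After inlining, the constraint becomes
\mzninline{2*abs(x) >= 20 \/ abs(x) >= 7}, in which both occurrences of
\mzninline{abs(x)} can be hoisted into a let expression as before.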
\paragraph{Reification}
Many constraint modelling languages, including \minizinc, not only enable
constraints to be globally enforced, but typically also support constraints to
be \emph{reified} into a Boolean variable. Reification means that a variable
\mzninline{b} is constrained to be true if and only if a corresponding
constraint \mzninline{c(...)} holds. We have already seen reification in
\cref{ex:4-absreif}: the truth of constraint \mzninline{abs(x) > y} was
bound to a Boolean variable \mzninline{b1}, which was then used in a
disjunction. We say that the same constraint can be used in \emph{root context}
as well as in a \emph{reified context}. In \minizinc, almost all constraints
can be used in both contexts. However, reified constraints are often defined in
the library in terms of complicated decompositions into simpler constraints, or
require specialised algorithms in the target solvers. In either case, it can be
very beneficial for the efficiency of the generated \nanozinc\ program if we can
detect that a reified constraint is in fact not required.
If a constraint is present in the root context, it means that it must hold
globally. If the same constraint is used in a reified context, it can therefore
be replaced with the constant \mzninline{true}, avoiding the need for
reification (or in fact any evaluation). We can harness CSE to store the
evaluation context when a constraint is added, and detect when the same
constraint is used in both contexts. Whenever a lookup in the CSE table is
successful, action can be taken depending on both the current and stored
evaluation context. If the stored expression was in root context, the constant
\mzninline{true} can be used, independent of the current context. If the stored
expression was in reified context and the current context is reified, the
stored result variable can be used. If the stored expression was in reified
context and the current context is root context, then the previous result can
be replaced by the constant \mzninline{true} (and all its dependencies removed)
and the evaluation will proceed as normal with the root context constraint.
\begin{example}
Consider the fragment:
\begin{mzn}
predicate p(var int: x, var int: y) = q(x) /\ r(y);
constraint b0 <-> q(x);
constraint b1 <-> t(x,y);
constraint b1 <-> p(x,y);
\end{mzn}
If we process the top-level constraints in order, we create a reified call to
\mzninline{q(x)} for the first constraint. Suppose that, while processing the
second constraint, we discover that \mzninline{t(x,y)} is \mzninline{true},
which fixes \mzninline{b1} to \mzninline{true}. When we then process
\mzninline{q(x)} inside the call \mzninline{p(x,y)}, we find that it now occurs
in root context, since \mzninline{p(x,y)} must hold globally. CSE therefore
needs to set \mzninline{b0} to \mzninline{true}.
\end{example}
\minizinc\ distinguishes not only between root and reified contexts, but in
addition recognises constraints in \emph{positive} contexts, also known as
\emph{half-reified} \autocite{feydy-2011-half-reif}. We omit the details here
for brevity, but the CSE approach generalises to this extension.
\section{Experiments}\label{sec:4-experiments}
We have created a prototype implementation of the architecture presented in the
preceding sections. It consists of a compiler from \minizinc\ to \microzinc, and
an incremental \microzinc\ interpreter producing \nanozinc. The system supports
a significant subset of the full \minizinc\ language; notable features that are
missing are support for set and float variables, option types, and compilation
of model output expressions and annotations. We will release our implementation
under an open-source license.
The implementation is not optimised for performance yet, but was created as a
faithful implementation of the developed concepts, in order to evaluate their
suitability and provide a solid baseline for future improvements. In the
following we present experimental results on basic flattening performance as
well as incremental flattening and solving that demonstrate the efficiency
gains that are possible thanks to the new architecture.
\subsection{Basic Flattening}
As a first experiment, we selected 20 models from the annual \minizinc\
challenge and compiled 5 instances of each model to \flatzinc, using the current
\minizinc\ release version 2.4.3 and the new prototype system. In both cases we
use the standard \minizinc\ library of global constraints (i.e., we decompose
those constraints rather than using solver built-ins, in order to stress-test
the flattening). We measured pure flattening time, i.e., without time required
to parse and typecheck in version 2.4.3, and without time required for
compilation to \microzinc\ in the new system (compilation is usually very fast).
Times are averages of 10 runs.\footnote{All models obtained from
\url{https://github.com/minizinc/minizinc-benchmarks}:
\texttt{\justify{}accap, amaze, city-position, community-detection,
depot-placement, freepizza, groupsplitter, kidney-exchange, median-string,
multi-knapsack, nonogram, nside, problem, rcpsp-wet, road-cons, roster,
stack-cuttingstock, steelmillslab, train, triangular, zephyrus}.}
\Cref{sfig:4-compareruntime} compares the flattening time for each of the 100
instances. Points below the line indicate that the new system is faster. On
average, the new system achieves a speed-up of \(2.3\), with very few instances
not achieving any speedup. In terms of memory performance
(\cref{sfig:4-comparemem}), version 2.4.3 can sometimes still outperform the new
prototype. We have identified that the main memory bottlenecks are our currently
unoptimised implementations of CSE lookup tables and argument vectors.
These are very encouraging results, given that we are comparing a largely
unoptimised prototype to a mature piece of software.
\begin{figure}[t]
\centering
\begin{subfigure}[b]{.48\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_compare_runtime}
\caption{\label{sfig:4-compareruntime}flattening run time (ms)}
\end{subfigure}%
\hspace{0.04\columnwidth}%
\begin{subfigure}[b]{.48\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_compare_memory}
\caption{\label{sfig:4-comparemem}flattening memory (MB)}
\end{subfigure}
\caption{\label{fig:4-runtime}Performance on flattening 100 MiniZinc Challenge
instances. \minizinc\ 2.4.3 (x-axis) vs new architecture (y-axis), log-log
plot. Dots below the line indicate the new system is better.}
\end{figure}