dekker-phd-thesis/chapters/3_rewriting.tex
2021-05-22 14:38:14 +10:00


%************************************************
\chapter{Rewriting Constraint Modelling Languages}\label{ch:rewriting}
%************************************************
Rewriting a high-level constraint model into an equivalent solver-level
constraint model may seem like a simple term rewriting process. In reality,
however, naive rewriting of the model often results in a sub-optimal
solver-level model, which can in turn cause exponentially more work for the
solver. To combat this problem, many techniques have been developed to create
more efficient solver-level models, such as: continuously updating variable
domains according to the constraints, correctly resolving constraint sub-typing
when variables become fixed, removing variables and constraints that have
become unused, detecting duplicate constraints, and reusing duplicate functional
definitions.
Applying all these optimisations can, however, be time intensive during the
rewriting process. Although this was not a problem when high-level \cmls\
targeted \gls{cp} solvers, for which the solver-level constraint model stays
relatively small, it poses a big problem for \gls{mip} and \gls{sat} solvers,
whose solver-level constraint models are significantly larger.
In this chapter, we revisit the rewriting of high-level \cmls\ into solver-level
constraint models. We describe a new \textbf{systematic view of the execution of
\minizinc{}} and build on this to propose a new tool chain. We show how this
tool chain allows us to:
\begin{itemize}
\item efficiently rewrite high-level constraint models with \textbf{minimal
overhead},
\item easily integrate a range of \textbf{optimisation and simplification}
techniques,
\item and effectively \textbf{detect and eliminate dead code} introduced by
functional definitions.
\end{itemize}
The new architecture is shown in \Cref{fig:rew-comp}. A constraint model is
first compiled into a smaller constraint language called \microzinc{},
independent of the data. The \microzinc\ is then transformed into byte code,
which is interpreted together with the data to produce \nanozinc\ code, an
extension of the existing \flatzinc\ format. The interpreter can even be made
incremental: in \cref{ch:incremental} we discuss how meta-optimisation can be
performed without any recompilation.
We have developed a prototype of this tool chain, and present experimental
validation of these advantages. The prototype is still very experimental, but
preliminary results suggest the new tool chain can perform flattening much
faster, and produce better models, than the current \minizinc\ compiler.
This chapter is organised as follows. \Cref{sec:4-micronano} introduces the
\microzinc\ and \nanozinc\ languages, the new intermediate representation we
propose that enables more efficient flattening. \Cref{sec:4-simplification}
describes how we can perform various processing and simplification steps on this
representation, and in \cref{sec:4-experiments} we report on the experimental
results of the prototype implementation. Finally, \Cref{sec:4-conclusion}
presents our conclusions.
\begin{figure}
\centering
\includegraphics[width=\linewidth]{assets/img/rew_compilation_structure}
\caption{\label{fig:rew-comp} The proposed process for the compilation of
\minizinc\ instances.}
\end{figure}
\section{\glsentrytext{microzinc} and
\glsentrytext{nanozinc}}\label{sec:4-micronano}
This section introduces the two sub-languages of \minizinc\ at the core of the
new abstract machine model we have developed.
\microzinc{} is a simplified subset of \minizinc\ that only contains function
declarations and a restricted expression language, but it extends \minizinc{}'s
type system with tuples. Models written in \minizinc\ can be translated into
\microzinc{}.
\nanozinc{} is similar to \flatzinc\ in that it contains only simple variable
declarations and flat constraints. However, while all function calls in
\flatzinc\ need to be solver-level primitives, calls in \nanozinc\ can refer to
functions implemented in \microzinc{}. Furthermore, \nanozinc\ allows
constraints to be bound to a specific variable, and constraint and variable
declarations can be interleaved.
The core of the syntax of \microzinc\ is defined in \cref{fig:4-uzn-syntax}. We
have omitted the definitions of \syntax{<mzn-type-inst>} and \syntax{<ident>},
which you can assume to be the same as the definition of \syntax{<type-inst>}
and \syntax{<ident>} in the \minizinc\ grammar, see \cref{ch:minizinc-grammar}.
While we omit the typing rules here for space reasons, we will assume that in
well-typed \microzinc\ programs, the conditions \syntax{<exp>} in if-then-else
and where expressions have \mzninline{par} type.
\begin{figure}
\begin{grammar}
<model> ::= <pred-decl>*<fun-decl>*
<pred-decl> ::= "predicate" <ident>"(" <param> ["," <param>]* ");"
<fun-decl> ::= "function" <type-inst>":" <ident>"(" <param> ["," <param>]* ")" "=" <exp>";"
<literal> ::= <constant> | <ident>
<exp> ::= <literal>
\alt <ident>"(" <exp> ["," <exp>]* ")"
\alt "let" "{" <item>* "}" "in" <literal>
\alt "if" <exp> "then" <exp> "else" <exp> "endif"
\alt "[" <exp> "|" <gen-exp> "where" <exp> "]"
\alt <ident>"[" <literal> "]"
<item> ::= <param> [ "=" <exp> ]";"
\alt "tuple(" <type-inst> ["," <type-inst>]* "):" <ident> = "(" <exp> ["," <exp>]* ")";
\alt "(" <param> ["," <param>]* ")" "=" <exp>";"
\alt "constraint" <exp>";"
<param> ::= <type-inst>":" <ident>
<gen-exp> ::= <ident> "in" <exp> ["," <ident> "in" <exp> ]*
\end{grammar}
\caption{\label{fig:4-uzn-syntax}Syntax of \microzinc{}.}
\end{figure}
\newcommand{\sep}{\rhd}
A \nanozinc\ program, defined in \cref{fig:4-nzn-syntax}, is simply a list of
variable declarations and constraints in the form of calls. The syntax
``\texttt{↳}'' will be used to track dependent constraints (this will be
explained in detail in \cref{sec:4-nanozinc}; for now you can ignore it).
\begin{figure}
\begin{grammar}
<nzn> ::= <nzn-item>*
<nzn-item> ::= <nzn-var> | <nzn-con>
<nzn-con> ::= "constraint" <ident>"(" <literal> ["," <literal>]* ")"";"
<nzn-var> ::= "var" <nzn-dom> ":" <ident> ";" <nzn-bind>*
<nzn-dom> ::= <constant> ".." <constant> | <constant>
\alt "bool" | "int" | "float" | "set of int"
<nzn-bind> ::= "↳" <nzn-con>
\end{grammar}
\caption{\label{fig:4-nzn-syntax}Syntax of \nanozinc{}.}
\end{figure}
\subsection{\glsentrytext{minizinc} to
\glsentrytext{microzinc}/\glsentrytext{nanozinc}}
The translation from \minizinc\ to \microzinc{} involves the following steps:
\begin{enumerate}
\item Transform all expressions that are valid in \minizinc\ but not in
\microzinc{}. This includes simple cases, such as replacing operator
expressions like \mzninline{x+y} by function calls
\mzninline{int_plus(x,y)}. More complex expressions involving
variables, such as \mzninline{if x then A
else B endif} and \mzninline{A[x]} where \mzninline{x} is a variable,
are also replaced by function calls (\eg\
\mzninline{if_then_else([x, true], [A, B])} and
\mzninline{element(A, x)} respectively).
\item Replace optional variables with non-optional forms. As shown by Mears et
al., optional variable types can be represented using a variable of the
same non-optional type together with a Boolean variable
\autocite*{mears-2014-option}. The Boolean variable represents whether
the optional variable has a value or is absent, while the non-optional
variable of the same type represents its value when it exists. In
\microzinc\ this can be represented as a tuple of the two variables. The
function definitions for optional types then take both variables into
account. For example, the definition of \mzninline{int_plus(x, y)} for
optional types adds an argument's value variable to the result of the
function only when its Boolean variable indicates the value is present.
\item Lift the partiality in expressions into the surrounding context to
implement \minizinc{}'s relational semantics. In contrast to
\minizinc{}, which uses relational semantics, the evaluation of
\microzinc\ implements strict semantics for partial expressions. Stuckey
and Tack have extensively examined these problems and describe how to
transform any partial function into a total function
\autocite*{stuckey-2013-functions}. A general approach for describing
the partiality of \microzinc\ functions is to make the function return
both a Boolean variable that indicates the totality of the input
variables and the resulting value of the function, adjusted to return a
predefined value when it would normally be undefined. For example, the
function
\begin{mzn}
function var int: element(array of int: a, var int: x);
\end{mzn}
is only defined when \mzninline{x in index_set(a)}. A total version to
be used in \microzinc\ could look like\footnote{For brevity's sake, we
will still use \minizinc\ operators to write the \microzinc\
definition.}:
\begin{mzn}
function tuple(var int, var bool): element_safe(array of int: a, var int: x) =
  let {
    var bool: total = x in index_set(a);
    var index_set(a): xs;
    constraint total -> x = xs; % xs takes the value of x when the result is total
    var int: res = element(a, xs); % use of element that is guaranteed to be total
    tuple(var int, var bool): ret = (res, total);
  } in ret;
\end{mzn}
Similar to the replacement of optional values, all usages of the
\mzninline{element} function are replaced by \mzninline{element_safe},
and the usage of its result is guarded by the returned totality
variable in the surrounding Boolean context. This means that, for
example, the expression \mzninline{element(e, v) = 5 \/ b} would be
transformed into:
\begin{mzn}
let {
(var int: res, var bool: total) = element(e, v);
} in (total /\ res = 5) \/ b;
\end{mzn}
\item Resolve sub-type based overloading statically. To prevent the runtime
overhead of dynamically looking up overloaded function definitions for
every call, function dispatch in \microzinc\ is determined
statically.
It is important, however, that the correct version of a function is
chosen when it accepts both variable and parameter arguments. And,
as we will later discuss in more detail, it is possible for variables to
become fixed to a single value, at which point they can be treated as
parameters. To ensure the correct version of the function is used,
functions are changed to check, at runtime, whether a variable is fixed
and, if so, dispatch to the parameter version of the function. If, for
example, we have a \minizinc\ function \mzninline{f} that is overloaded
on \mzninline{var int} and \mzninline{int}, then the \microzinc\
transformation might look like:
\begin{mzn}
function int: f_int(int: x) = /* original body */;
function var int: f_varint(var int: x) =
  if is_fixed(x) then
    f_int(fix(x))
  else
    /* original body */
  endif;
\end{mzn}
\item Transform all function definitions so that they do not refer to
top-level variables, by adding those variables as extra arguments to the
function declaration and each call site.
\item Move all top-level variable declarations and constraints into a let
expression in the body of a new function called \mzninline{main}, whose
arguments are the parameters of the model and which returns a fresh
Boolean variable.
\end{enumerate}
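To sketch the effect of the last two steps, consider a hypothetical function
\mzninline{g} that refers to a top-level variable \mzninline{y} (the names here
are purely illustrative):
\begin{mzn}
% Before lifting: g captures the top-level variable y
var int: y;
function var int: g(var int: x) = int_plus(x, y);

% After lifting: y is an explicit argument at every call site,
% and the top-level declaration moves into the let of main
function var int: g(var int: x, var int: y) = int_plus(x, y);
\end{mzn}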
\begin{example}
Consider the following \minizinc\ model, a simple unsatisfiable ``pigeonhole
problem'', trying to assign \(n\) pigeons to \(n-1\) holes:
\begin{mzn}
predicate all_different(array[int] of var int: x) =
forall (i,j in 1..length(x) where i<j) (
x[i] != x[j]
);
int: n;
array[1..n] of var 1..n-1: pigeon;
constraint all_different(pigeon);
\end{mzn}
The translation to \microzinc\ turns all expressions into function calls,
replaces the special generator syntax in the \mzninline{forall} with an array
comprehension, and creates the \mzninline{main} function. Note how the domain
\(1..n-1\) of the \mzninline{pigeon} array is replaced by a \mzninline{set_in}
constraint for each element of the array.
\begin{mzn}
function var bool: all_different(array[int] of var int: x) =
forall([
int_neq(element(x, i),element(x, j))
| i in range(1, length(x)),
j in range(1, length(x))
where int_lt(i,j)
]);
function var bool: main(int: n) = let {
var bool: root_ctx;
array[range(1, n)] of var int: pigeon;
constraint forall([
set_in(element(pigeon, i), range(1, int_minus(n,1)))
| i in range(1, n)
]);
constraint all_different(pigeon);
} in root_ctx;
\end{mzn}
For a concrete instance of the model, we can then create a simple \nanozinc\
program that calls \mzninline{main}, for example for \(n=10\):
\begin{nzn}
constraint main(10);
\end{nzn}
\end{example}
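For illustration, rewriting the call \mzninline{main(3)} for the smaller
instance \(n = 3\) would produce \nanozinc\ along the following lines (a
sketch; the exact identifiers are chosen by the implementation):
\begin{nzn}
var 1..2: pigeon_1;
var 1..2: pigeon_2;
var 1..2: pigeon_3;
constraint int_neq(pigeon_1, pigeon_2);
constraint int_neq(pigeon_1, pigeon_3);
constraint int_neq(pigeon_2, pigeon_3);
\end{nzn}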
\subsection{Partial Evaluation of \glsentrytext{nanozinc}}\label{sub:4-partial}
Each call in a \nanozinc\ program is either a call to a solver-level predicate,
or it has a corresponding \microzinc\ function definition. The goal of partial
evaluation is to transform the \nanozinc\ program into (essentially)
\flatzinc{}, \ie\ a program where all calls are calls to solver-level
predicates.
To achieve this, we define the semantics of \microzinc\ as a rewriting system
that produces \nanozinc\ calls. Each non-primitive call in a \nanozinc\ program
can then be replaced with the result of evaluating the corresponding \microzinc\
function.
A key element of this rewriting process, then, is the transformation of
functional expressions into relational form. For this to happen, the rewriting
process introduces fresh variables to represent intermediate expressions, and
keeps track of constraints added during the rewriting.
\begin{example}\label{ex:4-abs}
Consider the following (reasonably typical) \minizinc\ encoding for the
absolute value function:
\begin{mzn}
function var int: abs(var int: x) =
let {
var int: z;
constraint int_abs(x, z);
} in z;
\end{mzn}
When an occurrence of \mzninline{abs} is encountered, the call will be
replaced by a variable \mzninline{z}. But before that can occur, the variable
must be introduced, and the constraint \mzninline{int_abs(x, z)} is added to
the model, enforcing its value.
\end{example}
The function \mzninline{abs(x)} above is a total function: for every value of
\mzninline{x}, there is always exactly one valid value of \mzninline{z}. In
this case, the constraint introduced in the \mzninline{let} expression may be
enforced globally, and multiple occurrences of \mzninline{abs(x)} may be
replaced with the same \mzninline{z}. However, the introduced
\mzninline{int_abs} constraint is only needed so long as any \emph{other}
constraint refers to \mzninline{z}. If \mzninline{z} is unused in the rest of
the model, both \mzninline{z} and the \mzninline{int_abs} constraint can be
removed.
As we shall see in \cref{sec:4-simplification}, certain optimisations and
simplifications during the evaluation can lead to many expressions becoming
unused. It is therefore important to track which constraints were merely
introduced to define results of function calls. In order to track these
dependencies, \nanozinc\ attaches constraints that define a variable to the item
that introduces the variable.
\subsection{\glsentrytext{nanozinc}}\label{sec:4-nanozinc}
A \nanozinc\ model (\cref{fig:4-nzn-syntax}) consists of a topologically-ordered
list of variable declarations and constraints. Variables are declared with an
identifier and a domain, and are associated with a list of
auxiliary constraints. Constraints can also occur at the top level of the
\nanozinc\ model. These are said to be the ``root-context'' constraints of the
model, \ie\ those that have to hold globally and are not just used to define an
auxiliary variable. Only root-context constraints can effectively constrain the
overall problem. Constraints attached to variables originate from function
calls, and since all functions are guaranteed to be total, attached constraints
can only define the function result.
\begin{example}\label{ex:4-absnzn}
Consider the \minizinc\ fragment
\begin{mzn}
constraint abs(x) > y;
\end{mzn}
where \mzninline{x} and \mzninline{y} have domain \mzninline{-10..10}. After
rewriting all definitions, the resulting \nanozinc\ program could look like
this:
\begin{nzn}
var -10..10: x;
var -10..10: y;
var int: z;
↳ constraint int_abs(x, z);
constraint int_gt(z, y);
\end{nzn}
Evaluating the call \mzninline{abs(x)} introduces a new variable \mzninline{z}
(with unrestricted domain) and the \mzninline{int_abs} constraint that is used
to define \mzninline{z}. The top-level constraint \mzninline{abs(x) > y} is
rewritten into \mzninline{int_gt(z, y)} and added as a top-level constraint.
\end{example}
\subsection{Rewriting Rules}
The core of the proposed \minizinc\ architecture is an abstract machine that
performs the rewriting of \nanozinc\ items using the corresponding \microzinc\
function definitions. We can now formally define the rewriting rules for this
abstract machine.
The full set of rules appears in
\cref{fig:4-rewrite-to-nzn-calls,fig:4-rewrite-to-nzn-let,fig:4-rewrite-to-nzn-other}.
To simplify presentation, we assume that all rules that introduce new
identifiers into the \nanozinc\ program do so in a way that ensures those
identifiers are fresh (\ie\ not used in the rest of the \nanozinc\ program), by
suitable alpha renaming.
\begin{figure*}
\centering
\begin{prooftree}
\hypo{\texttt{function} ~t\texttt{:}~\ptinline{F(\(p_1, \ldots{}, p_k\)) = E;} \in{} \Prog{} \text{~where the~} p_i \text{~are fresh}}
\infer[no rule]1{\Sem{\(\ptinline{E}_{[p_i \mapsto a_i, \forall{} 1 \leq{} i \leq{} k]}\)}{\Prog, \Env} \Rightarrow{} \tuple{v, \Env'}}
\infer1[(Call)]{\Sem{F(\(a_1, \ldots, a_k\))}{\Prog, \Env} \Rightarrow{} \tuple{v, \Env'}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\ptinline{F} \in \text{Builtins}}
\infer1[(CallBuiltin)]{\Sem{F(\(a_1, \ldots, a_k\))}{\Prog, \Env} \Rightarrow{} \tuple{\ensuremath{\mathit{eval}(\ptinline{F}(a_1,\ldots, a_k))}, \Env}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\texttt{predicate}~\ptinline{F(\(p_1, \ldots, p_k\));} \in \Prog}
\infer1[(CallPredicate)]{\Sem{F(\(a_1, \ldots, a_k\))}{\Prog, \Env} \Rightarrow{}
\tuple{\ensuremath{\texttt{constraint}~\ptinline{F}(a_1, \ldots, a_k), \Env}}}
\end{prooftree}
\caption{\label{fig:4-rewrite-to-nzn-calls} Rewriting rules for partial evaluation
of \microzinc\ calls to \nanozinc{}.}
\end{figure*}
The rules in \cref{fig:4-rewrite-to-nzn-calls} handle function calls. The first
rule (Call) evaluates a function call expression in the context of a \microzinc\
program \(\Prog{}\) and \nanozinc\ program \(\Env{}\). The rule evaluates the
function body \(\ptinline{E}\) where the formal parameters \(p_{i}\) are
substituted by the call arguments \(a_{i}\).\footnote{We use
\(E_{[a \mapsto b]}\) as a notation for the expression \(E\) where \(a\) is
substituted by \(b\).} The result of this evaluation is the result of the
function call.
The (CallBuiltin) rule applies to ``built-in'' functions that can be evaluated
directly, such as arithmetic and Boolean operations on fixed values. The rule
simply returns the result of evaluating the built-in function on the evaluated
parameter values.
The (CallPredicate) rule applies to calls to solver predicates. These are
considered primitives, and as such simply need to be transferred into the
\nanozinc\ program.
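Connecting these rules to \cref{ex:4-abs}: when \mzninline{abs(x)} is
rewritten, (Call) substitutes \mzninline{x} into the function body, while the
\mzninline{int_abs} call inside that body matches a solver predicate and is
emitted by (CallPredicate). A sketch of the emitted \nanozinc{}:
\begin{nzn}
var int: z;
constraint int_abs(x, z); % emitted by (CallPredicate)
\end{nzn}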
\begin{figure*}
\centering
\begin{prooftree}
\hypo{\Sem{\(\mathbf{I} \mid{} \mathbf{X}\)}{\Prog, \Env, \emptyset} \Rightarrow \tuple{x, \Env'}}
\infer1[(Let)]{\Sem{let \{ \(\mathbf{I}\) \} in \(\mathbf{X}\)}{\Prog, \Env} \Rightarrow\ \tuple{x, \Env'}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(\mathbf{X}\)}{\Prog, \{ t : x \} \cup{} \Env} \Rightarrow \tuple{x, \{ t: x \} \cup{} \Env'}}
\infer1[(Item0)]{\Sem{\(\epsilon{} \mid{} \mathbf{X}\)}{\Prog, \{ t: x \}\cup{}\Env, \Ctx} \Rightarrow{} \tuple{x, \{ t: x \texttt{~↳~} \Ctx{} \} \cup{} \Env'}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(\mathbf{I} \mid{} \mathbf{X}\)}{\Prog, \Env\ \cup\ \{t : x\}, \Ctx} \Rightarrow \tuple{x, \Env'}}
\infer1[(ItemT)]{\Sem{\(t\texttt{:}~x\texttt{;} \mathbf{I} \mid{} \mathbf{X}\)}{\Prog, \Env, \Ctx} \Rightarrow{} \tuple{x, \Env'}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(E\)}{\Prog, \Env} \Rightarrow \tuple{v, \Env'}}
\hypo{\Sem{\(\mathbf{I}_{[x \mapsto v]} \mid{} \mathbf{X}_{[x \mapsto v]}\)}{\Prog, \Env', \Ctx} \Rightarrow\ \tuple{x, \Env''}}
\infer2[(ItemTE)]{\Sem{\(t\texttt{:}~x = E\texttt{;} \mathbf{I} \mid{} \mathbf{X}\)}{\Prog, \Env, \Ctx} \Rightarrow\ \tuple{x, \Env''}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(E_{i}\)}{\Prog,\Env^{i-1}} \Rightarrow \tuple{v_{i}, \Env^{i}}, \forall{} 1 \leq{} i \leq{} k}
\infer[no rule]1{\Sem{\(\mathbf{I}_{[x_{\langle{}i\rangle} \mapsto v_{i}, \forall{} 1 \leq{} i \leq{} k]} \mid{} \mathbf{X}_{[x_{\langle{}i\rangle} \mapsto v_{i}, \forall{} 1 \leq{} i \leq{} k]}\)}{\Prog, \Env^{k}, \Ctx} \Rightarrow\ \tuple{x, \Env'}}
\infer1[(ItemTupC)]{\Sem{\( \texttt{tuple(}t_{1}, \ldots, t_{k}\texttt{):}~x = \left(E_{1}, \ldots, E_{k}\right)\)}{\Prog, \Env^{0}, \Ctx} \Rightarrow{} \tuple{x, \Env'}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(E\)}{\Prog, \Env} \Rightarrow \tuple{\left(v_{\langle{}1\rangle}, \ldots, v_{\langle{}k\rangle}\right), \Env'}}
\infer[no rule]1{\Sem{\(\mathbf{I}_{[x_{i} \mapsto v_{\langle{}i\rangle}, \forall{} 1 \leq{} i \leq{} k]} \mid{} \mathbf{X}_{[x_{i} \mapsto v_{\langle{}i\rangle}, \forall{} 1 \leq{} i \leq{} k]}\)}{\Prog, \Env', \Ctx} \Rightarrow\ \tuple{x, \Env''}}
\infer1[(ItemTupD)]{\Sem{\(\left(t_{1}\texttt{:}~x_{1}, \ldots, t_{k}\texttt{:}~x_{k}\right) = E\texttt{;} \mathbf{I} \mid{} \mathbf{X}\)}{\Prog, \Env, \Ctx} \Rightarrow\ \tuple{x, \Env''}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(C\)}{\Prog, \Env} \Rightarrow \tuple{c, \Env'}}
\hypo{\Sem{\(\mathbf{I} \mid{} \mathbf{X}\)}{\Prog, \Env', \{c\}\cup\Ctx} \Rightarrow \tuple{x, \Env''}}
\infer2[(ItemC)]{\Sem{\(\mbox{constraint}~C\texttt{;} \mathbf{I} \mid{} \mathbf{X}\)}{\Prog, \Env, \Ctx} \Rightarrow\ \tuple{x, \Env''}}
\end{prooftree}
\caption{\label{fig:4-rewrite-to-nzn-let} Rewriting rules for partial evaluation
of \microzinc\ let expressions to \nanozinc{}.}
\end{figure*}
The rules in \cref{fig:4-rewrite-to-nzn-let} consider let expressions. Starting
from the main rule, (Let), the let items \textbf{I} are evaluated one at a time.
When all items have been evaluated, the constraints generated by the (ItemC)
rule are bound to the literal returned by the body of the let expression
\textbf{X} in (Item0). The (ItemT) rule handles the introduction of new
variables by adding them to the context. The (ItemTE) rule handles the
introduction of new variables with a defining equation by evaluating the
equation in the current context and substituting the name of the new variable
by the result of the evaluation in the remainder of the let expression. The
rules (ItemTupC) and (ItemTupD) handle the construction and deconstruction of
tuple objects. We use the notation \(x_{\langle{}i\rangle}\) to denote the
\(i^{\text{th}}\) member of the tuple \(x\) in our substitution notation.
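As a small worked sketch of these rules, consider the following hypothetical
let expression, annotated with the rule that handles each item:
\begin{mzn}
let {
  var int: z;               % (ItemT): z is added to the environment
  int: c = 3;               % (ItemTE): c is evaluated and substituted
  constraint int_abs(x, z); % (ItemC): collected into the context
} in z                      % (Item0): the context is attached to z
\end{mzn}
After evaluation, the collected \mzninline{int_abs} constraint is attached to
the returned variable \mzninline{z} in the resulting \nanozinc{}.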
\begin{figure*}
\centering
\begin{prooftree}
\hypo{x \in \langle \text{ident} \rangle}
\hypo{\{t: x \texttt{~↳~} \phi\ \} \in \Env}
\infer2[(IdX)]{\Sem{\(x\)}{\Prog, \Env} \Rightarrow{} \tuple{x, \Env}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(x\)}{\Prog, \Env} \Rightarrow \tuple{[x_{1}, \ldots, x_{n}], \Env'}}
\hypo{\Sem{\(i\)}{\Prog, \Env'} \Rightarrow \tuple{v, \Env''}}
\hypo{v \in [1 \ldots{} n]}
\infer3[(Access)]{\Sem{\(x\texttt{[}i\texttt{]}\)}{\Prog, \Env} \Rightarrow{} \tuple{x_{v}, \Env''}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{c \text{~constant}}
\infer1[(Const)]{\Sem{\(c\)}{\Prog, \Env} \Rightarrow{} \tuple{c, \Env}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(C\)}{\Prog, \Env} \Rightarrow \tuple{\ptinline{true}, \_}}
\infer1[(If\(_T\))]{\Sem{if \(C\) then \(E_t\) else \(E_e\) endif}{\Prog, \Env} \Rightarrow{} \Sem{\(E_t\)}{\Prog, \Env}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(C\)}{\Prog, \Env} \Rightarrow \tuple{\ptinline{false}, \_}}
\infer1[(If\(_F\))]{\Sem{if \(C\) then \(E_t\) else \(E_e\) endif}{\Prog, \Env} \Rightarrow{} \Sem{\(E_e\)}{\Prog, \Env}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{G}{\Prog,\Env} \Rightarrow \tuple{ \ptinline{true}, \_}}
\hypo{\Sem{\(E\)}{\Prog, \Env} \Rightarrow \tuple {v, \Env'}}
\infer2[(WhereT)]{\Sem{\([E ~|~ \mbox{where~} G]\)}{\Prog, \Env} \Rightarrow{} \tuple{[v], \Env'}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{G}{\Prog,\Env} \Rightarrow \tuple{\ptinline{false}, \_}}
\infer1[(WhereF)]{\Sem{\([E ~|~ \mbox{where~} G]\)}{\Prog, \Env} \Rightarrow{} \tuple{[], \Env}}
\end{prooftree} \\
\bigskip
\begin{prooftree}
\hypo{\Sem{\(\mathit{PE}\)}{\Prog,\Env} \Rightarrow \tuple{S, \_}}
\infer[no rule]1{\Sem{\([E ~|~ \mathit{GE} \mbox{~where~} G]\)}{\Prog, \Env_{[x \mapsto{} v]}} \Rightarrow{} \tuple{A_v, \Env_{[x \mapsto{} v]} \cup{} \Env{}_{v}}, \forall{} v \in{} S }
\infer1[(ListG)]{\Sem{\([E~|~ x \mbox{~in~} \mathit{PE}, \mathit{GE} \mbox{~where~} G]\)}{\Prog, \Env} \Rightarrow{}
\tuple{\mbox{concat} [ A_v~|~v \in{} S ], \Env{} \cup{} \bigcup{}_{v \in{} S} \Env{}_{v}}}
\end{prooftree}
\caption{\label{fig:4-rewrite-to-nzn-other} Rewriting rules for partial evaluation
of other \microzinc\ expressions to \nanozinc{}.}
\end{figure*}
Finally, the rules in \cref{fig:4-rewrite-to-nzn-other} handle the evaluation of
variables, constants, conditionals and comprehensions. For completeness, we give
the details here.
The (IdX) and (IdC) rules look up a variable in the environment and return its
bound value (for constants) or the variable itself. The (Const) rule simply
returns the constant. The (If\(_T\)) and (If\(_F\)) rules evaluate the if condition
and then return the result of the then or else branch appropriately. The
(WhereT) rule evaluates the guard of a where comprehension and if true returns
the resulting expression in a list. The (WhereF) rule evaluates the guard and
when false returns an empty list. The (ListG) rule evaluates the iteration set
\(PE\) to get its value \(S\), which must be fixed. It then evaluates the expression
\(E\) with all the different possible values \(v \in S\) of the generator \(x\),
collecting the additional \nanozinc\ definitions \(\Env_v\). It returns the
concatenation of the resulting lists with all the additional \nanozinc\ items.
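As a small sketch, consider the following comprehension over fixed values,
written using the \microzinc\ primitives introduced earlier:
\begin{mzn}
[ int_times(i, i) | i in range(1, 3) where int_lt(i, 3) ]
\end{mzn}
(ListG) iterates \mzninline{i} over the fixed set \(\{1, 2, 3\}\); (WhereT)
keeps the elements for \mzninline{i = 1} and \mzninline{i = 2}, while (WhereF)
drops \mzninline{i = 3}, so the comprehension evaluates to the fixed list
\mzninline{[1, 4]}.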
\subsection{Prototype Implementation}
Our prototype implementation of the \microzinc/\nanozinc\ framework consists of
the following components.
The \textbf{compiler} translates \minizinc\ into a \textbf{byte code} encoding of
\microzinc{}. The compiler currently supports a significant subset of the full
\minizinc\ language, with the missing features (such as multi-dimensional arrays
and complex output statements) requiring additional engineering effort but no
new technology.
The \textbf{interpreter} evaluates \microzinc\ byte code and \nanozinc\ programs
based on the rewrite system introduced in this section. It can read parameter
data from a file or via an API\@. The interpreter constructs the call to the
\mzninline{main} function, and then performs the rewriting into flat \nanozinc{}.
Memory management is implemented using reference counting, which lends itself
well to the optimisations discussed in the next section. The interpreter is
written in around 3.5kLOC of C++ to enable easy embedding into applications and
other programming languages.
The framework includes \textbf{interfaces} to multiple solvers. It can
pretty-print the \nanozinc\ code to standard \flatzinc{}, so that any solver
currently compatible with \minizinc\ can be used. Additionally, a direct C++ API
can be used to connect solvers directly, in order to enable incremental
solving. This topic is revisited in \cref{ch:incremental}.
\textbf{Python bindings} to the framework are provided to enable easy
scripting.
The source code for the complete system will be made available under an open
source license.
\section{NanoZinc Simplification}\label{sec:4-simplification}
The previous section introduced the basic evaluation model of \nanozinc: each
call that has a corresponding \microzinc\ definition can be rewritten into a set
of calls by evaluating the \microzinc\ code. This section looks at further steps
that the system can perform in order to produce better, more compact \nanozinc\
code.
\subsection{Removing dead variables and constraints}
The most basic form of simplification is to identify variables that are not
used by any call, and remove them. This is correct because the semantics of
\nanozinc\ requires us to find values for all variables such that all
constraints are satisfied --- but if a variable does not occur in any
constraint, we can pick any value for it. The model without the variable is
therefore equisatisfiable with the original model.
Consider now the case where a variable in \nanozinc\ is only used in its own
auxiliary definitions (the constraints directly succeeding the declaration,
prefixed by \texttt{↳}).
\begin{example}\label{ex:4-absreif}
The following is a slight variation on the \minizinc\ fragment in
\cref{ex:4-absnzn}:
\begin{mzn}
constraint abs(x) > y \/ c;
constraint c;
\end{mzn}
The constraint \mzninline{abs(x) > y} is now used in a disjunction. This means
that instead of enforcing the constraint globally, the model needs to reason
about whether the constraint holds or not. This is called \gls{reification}
(we will come back to this in \cref{sec:4-cse}). Following the rewriting
rules, the generated \nanozinc\ code will look very similar to the code
generated in \cref{ex:4-absnzn}, but with an additional \mzninline{bool_or}
constraint for the disjunction, which uses a variable \mzninline{b} that will
be introduced for \mzninline{abs(x) > y}:
\begin{nzn}
var true: c;
var -10..10: x;
var -10..10: y;
var int: z;
↳ constraint int_abs(x, z);
var bool: b;
↳ constraint int_gt_reif(z, y, b);
constraint bool_or(b, c);
\end{nzn}
Evaluating \mzninline{constraint c} will result in the domain of \mzninline{c}
being bound to \mzninline{true}. The disjunction then trivially holds,
independent of the value of \mzninline{b}, and the constraint
\mzninline{abs(x) > y} is therefore not required. However, the rewriting has to
choose a particular order in which arguments are evaluated, so it is always
possible that it has already evaluated the left-hand side before ``noticing''
that the disjunction is true.
If the system now simplifies the constraint \mzninline{bool_or(b, c)} to
\mzninline{true}, the identifier \mzninline{b} becomes unused outside
its auxiliary definitions, since it was only referenced from the
\mzninline{bool_or} call. Removing \mzninline{b} also leads to its defining
constraint, \mzninline{int_gt_reif}, being removed. This in turn means that
\mzninline{z} is not referenced anywhere outside its auxiliary definitions and
can be removed together with its definition. The resulting \nanozinc\ program
is much more compact:
\begin{nzn}
var true: c;
var -10..10: x;
var -10..10: y;
\end{nzn}
We assume here that \mzninline{c}, \mzninline{x} and \mzninline{y} are used in
other parts of the model not shown here and therefore not removed.
\end{example}
Note how counting the usage of variables outside their auxiliary definitions
allows us to remove the \mzninline{z} variable --- we could not simply have
counted the usage of \mzninline{z} in all constraints, since the constraint
used to define it, \mzninline{int_abs(x, z)}, would have kept it alive.
It is important to note the difference between constraints that appear at the
top level and constraints that appear as part of the auxiliary definitions of a
variable. Top-level constraints help define the problem; they are globally
enforced and cannot be removed from the problem unless it can be proved that
the relationship they enforce already holds. Constraints in auxiliary
definitions, on the other hand, are used to define a variable. Auxiliary
definitions generally arise from a functional relationship, like
\mzninline{abs(x)} in the example, or from the reflection of whether a
constraint holds or not, like \mzninline{abs(x) > y} when used in the
disjunction with \mzninline{c}. These reflection constraints are typically
called \glspl{reification}. In this thesis we follow the same naming standard
as \minizinc{}, where the \gls{reification} of a constraint has the same name
as the original constraint with \texttt{\_reif} appended.
\begin{example}
The importance of the use of auxiliary definitions when removing dead
variables and constraints is demonstrated in the following example:
\begin{mzn}
int: n;
var int: x;
function var int: f(var int: x, int: i) = let {
var int: y;
constraint y = x div i;
} in y;
constraint [f(x,i) | i in 1..n][1] == 10;
\end{mzn}
The rewriting of this instance will initially create the \mzninline{n}
elements for the array comprehension. Each element is evaluated as a new
variable \mzninline{y} and a constraint to ensure that \mzninline{y} is the
division of \mzninline{x} and \mzninline{i}. Although \mzninline{n} variables
are created, only the first variable is constrained to be 10. All other
variables can thus be removed and the model will stay equisatisfiable.
In the \nanozinc\ version of this model, each of the division constraints will
be added to the list of auxiliary constraints of its corresponding
\mzninline{y} variable. When, after evaluating the array access, all of those
variables except the first are detected to be unused, their removal triggers
the removal of the division constraints.
For this example the current \minizinc\ compiler, version 2.5.5, will produce
a \flatzinc\ program that contains the variable \mzninline{x}, \mzninline{n-1}
variables \mzninline{y}, and \mzninline{n} constraints for the division.
The first \mzninline{y} variable is directly replaced with the constant 10.
Using the \nanozinc\ system, only 1 variable, \mzninline{x}, and 1 constraint,
\mzninline{constraint int_div(x, 1, 10);}, are produced.
\end{example}
\paragraph{Implementation} The removal of unused identifiers is taken care of
by garbage collection in the interpreter. Since our prototype interpreter uses
reference counting, it already accurately tracks the liveness of each
identifier.
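The cascading removal described above can be sketched in a few lines. The following Python fragment is an illustrative model of the mechanism only, not the prototype's actual implementation; the \texttt{Var} class and \texttt{release} helper are hypothetical names. The key point is that a variable's auxiliary constraints do not contribute to its reference count, so deleting the last outside reference cascades through the whole chain of definitions.

```python
# Illustrative sketch: reference-counted removal of NanoZinc variables,
# where each variable carries its own auxiliary (defining) constraints.
# Only references from *outside* those definitions are counted.

class Var:
    def __init__(self, name, aux=None):
        self.name = name
        self.aux = aux or []   # auxiliary constraints defining this variable
        self.refs = 0          # references from outside the definitions

def release(var, model):
    """Drop one outside reference; on zero, remove the variable and
    release every variable mentioned in its auxiliary constraints."""
    var.refs -= 1
    if var.refs <= 0 and var.name in model:
        del model[var.name]
        for constraint in var.aux:
            for arg in constraint["args"]:
                if isinstance(arg, Var):
                    release(arg, model)

# The running example: z is only referenced by b's defining constraint,
# and b only by the bool_or. Simplifying the bool_or away removes both.
z = Var("z")
b = Var("b", aux=[{"name": "int_gt_reif", "args": [z, "y", "b"]}])
z.refs, b.refs = 1, 1
model = {"z": z, "b": b}
release(b, model)       # bool_or(b, c) was simplified to true
print(sorted(model))    # both b and z have been garbage collected
```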
\subsection{Constraint Propagation}
Another way that variables can become unused is if a constraint can be detected
to be true based on its semantics, and the known domains of the variables. For
example, consider the constraint \mzninline{3*x + 7*y > 8}, and assume that
both \mzninline{x} and \mzninline{y} cannot be smaller than 1. In this case, we
can determine that the constraint is always satisfied, and remove it from the
model without changing satisfiability. This, in turn, may result in
\mzninline{x} or \mzninline{y} becoming unused and also removed (including
their auxiliary definitions, if any).
This is a simple form of \gls{propagation}, the main inference method used in
constraint solvers \autocite{schulte-2008-propagation}. \gls{propagation},
in general, can not only detect when constraints are trivially true (or false),
but also tighten variable domains. For instance, propagating the constraint
\mzninline{x>y}, with initial domains \mzninline{x in 1..10, y in 1..10},
results in the domains being tightened to \mzninline{x in 2..10, y in 1..9}.
These tightened domains would be reflected in the corresponding
\mzninline{mkvar(...)} calls. Note, however, that this may now prevent us from
removing \mzninline{x} or \mzninline{y}: even if they later become unused, the
tightened domains may impose a constraint on their definition. For instance, if
\mzninline{x} is defined as \mzninline{abs(z)}, then any restriction on the
domain of \mzninline{x} constrains the possible values of \mzninline{z} through
the \mzninline{abs} function. We therefore need to track whether a variable
domain has been enforced through external constraints, or is the consequence of
its own definition. We omit the details of this mechanism for space reasons.
A further effect of \gls{propagation} is the replacement of constraints by
simpler versions. A constraint \mzninline{x=y+z}, where the domain of
\mzninline{z} has been tightened to the singleton \mzninline{0..0}, can be
rewritten to \mzninline{x=y}. Propagating the effect of one constraint can thus
trigger the propagation of other constraints, and constraint solvers typically
run all propagation rules or algorithms until they reach a fix-point.
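The fix-point behaviour can be illustrated with a small sketch. The Python below is a hypothetical \texttt{propagate\_gt} helper for bounds propagation of strict inequalities, not the interpreter's actual propagation engine; it merely shows the repeat-until-no-change loop and reproduces the \mzninline{x>y} example above.

```python
# Illustrative sketch: bounds propagation for constraints x > y,
# run until a fix-point is reached (no domain changes any more).
def propagate_gt(domains, constraints):
    """domains: {name: (lb, ub)}; constraints: list of (x, y) meaning x > y."""
    changed = True
    while changed:
        changed = False
        for x, y in constraints:
            xl, xu = domains[x]
            yl, yu = domains[y]
            new_x = (max(xl, yl + 1), xu)   # x must exceed y's lower bound
            new_y = (yl, min(yu, xu - 1))   # y must stay below x's upper bound
            if new_x != domains[x] or new_y != domains[y]:
                domains[x], domains[y] = new_x, new_y
                changed = True
    return domains

doms = propagate_gt({"x": (1, 10), "y": (1, 10)}, [("x", "y")])
print(doms)  # x tightened to 2..10, y to 1..9
```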
The \nanozinc\ interpreter applies propagation rules for the most important
constraints, such as linear and Boolean constraints, which can lead to
significant reductions in generated code size. It has been shown previously that
applying full \gls{propagation} during compilation can be beneficial
\autocite{leo-2015-multipass}, in particular when the target solver requires
complicated reformulations (such as linearisation for MIP solvers
\autocite{belov-2016-linearisation}).
Returning to the example above, \gls{propagation} (and other model
simplifications) can result in \emph{equality constraints} \mzninline{x=y} being
introduced. For certain types of target solvers, in particular those based on
constraint programming, equality constraints can lead to exponentially larger
search trees if not detected and simplified. Equalities therefore present a
valuable opportunity for model simplification, but also pose some challenges.
Consider the simple case of \mzninline{x=y} where one of the variables, say
\mzninline{x}, was a top-level variable introduced by the modeller, and not
defined as the result of a function call. In that case, we can propagate the
equality constraint by replacing all occurrences of \mzninline{x} with
\mzninline{y}, and remove the definition of \mzninline{x}.
The more complicated case for \mzninline{x=y} is when both \mzninline{x} and
\mzninline{y} are the result of function calls, \eg\ from a model fragment such
as \mzninline{constraint f(a)=g(b)}. Let us assume that part of the
corresponding \nanozinc\ code looks like this:
\begin{nzn}
var int: x;
↳ constraint f_rel(a, x);
var int: y;
↳ constraint g_rel(b, y);
constraint int_eq(x, y);
\end{nzn}
In this case, simply removing either \mzninline{x} or \mzninline{y} would not be
correct, since that would trigger the removal of the corresponding definition
\mzninline{f_rel(a, x)} or \mzninline{g_rel(b, y)}, leaving nothing to be
substituted. Rather, the interpreter needs to pick one of the two definitions
and \emph{promote} it to top-level status. Let us assume that the interpreter
decides to replace \mzninline{y} by \mzninline{x}. \mzninline{y} is then
globally replaced by \mzninline{x}, and its declaration is removed. The
interpreter moves all auxiliary constraints from \mzninline{y} into the list of
top-level constraints, and then removes the \mzninline{int_eq} constraint. The
resulting \nanozinc\ looks like this:
\begin{nzn}
var int: x;
↳ constraint f_rel(a, x);
constraint g_rel(b, x);
\end{nzn}
The process of equality propagation is similar to unification in logic
programming \autocite{warren-1983-wam}. However, since equations in \minizinc\
always concern scalar variables rather than terms, we can restrict ourselves to
this simple substitution scheme and leave the further checking of the resulting
equalities to the target constraint solver.
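The substitution scheme above can be sketched as follows. This Python fragment is a simplified illustration with a hypothetical \texttt{propagate\_eq} helper and a toy model representation, not the interpreter's data structures: one definition is promoted to the top level, the other variable is substituted away, and the \mzninline{int_eq} disappears.

```python
# Illustrative sketch: propagating x = y when both are defined by calls.
def propagate_eq(model, x, y):
    # Promote y's auxiliary constraints to the top level, rewritten over x...
    for c in model["aux"].pop(y, []):
        model["top"].append([x if a == y else a for a in c])
    # ...substitute y by x in the remaining top-level constraints and
    # drop the int_eq constraint itself...
    model["top"] = [[x if a == y else a for a in c]
                    for c in model["top"]
                    if c != ["int_eq", x, y] and c != ["int_eq", y, x]]
    # ...and remove y's declaration.
    model["vars"].remove(y)
    return model

m = {"vars": ["x", "y"],
     "aux": {"x": [["f_rel", "a", "x"]], "y": [["g_rel", "b", "y"]]},
     "top": [["int_eq", "x", "y"]]}
propagate_eq(m, "x", "y")
print(m["vars"], m["top"])  # only x remains; g_rel(b, x) is now top-level
```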
\paragraph{Implementation}
Rewriting a function call that has a \microzinc\ definition and rewriting a
constraint due to \gls{propagation} are very similar. The interpreter therefore simply
interleaves both forms of rewriting. Efficient \gls{propagation} engines
``wake up'' a constraint only when one of its variables has received an update
(\ie\ when its domain has been shrunk). To enable this, the interpreter needs
to keep a data structure linking variables to the constraints where they appear
(in addition to the reverse links from calls to the variables in their
arguments). These links do not take part in the reference counting.
\subsection{Delayed Rewriting}
Adding \gls{propagation} and simplification into the rewriting system means
that the system becomes non-confluent, and some orders of execution may produce
``better'', \ie\ more compact or more efficient, \nanozinc{}.
\begin{example}
The following example is similar to code found in the \minizinc\ libraries of
MIP solvers.
\begin{mzn}
function var int: lq_zero_if_b(var int: x, var bool: b) =
x <= ub(x)*(1-b);
\end{mzn}
This predicate expresses the constraint \mzninline{b -> x<=0}, using a
well-known method called ``big-M transformation''. The expression
\mzninline{ub(x)} returns a valid upper bound for \mzninline{x}, \ie\ a fixed
value known to be greater than or equal to \mzninline{x}. This could be the
initial upper bound \mzninline{x} was declared with, or the current value in
the corresponding \nanozinc\ \mzninline{mkvar} call. If \mzninline{b} takes
the value 0, the expression \mzninline{ub(x)*(1-b)} is equal to
\mzninline{ub(x)}, and the constraint \mzninline{x <= ub(x)} holds trivially.
If \mzninline{b} takes the value 1, \mzninline{ub(x)*(1-b)} is equal to 0,
enforcing the constraint \mzninline{x <= 0}.
\end{example}
For MIP solvers, it is quite important to enforce \emph{tight} bounds in order
to improve efficiency and sometimes even numerical stability. It would therefore
be useful to rewrite the \mzninline{lq_zero_if_b} predicate only after all the
\gls{propagation} has been performed, in order to take advantage of the
tightest possible bounds. On the other hand, evaluating a predicate may also
\emph{impose} new bounds on variables, so it is not always clear which order of
evaluation is best.
In our \microzinc/\nanozinc\ system, we allow predicates and functions to be
\emph{annotated} as potential candidates for \emph{delayed rewriting}. Any
annotated constraint is handled by the (CallPrimitive) rule rather than the
(Call) rule, which means that it is simply added as a call into the \nanozinc\
code, without evaluating its body. When \gls{propagation} reaches a fix-point,
all annotations are removed from the current \nanozinc\ program, and evaluation
resumes with the predicate bodies.
This crucial optimisation enables rewriting in multiple \emph{phases}. For
example, a model targeted at a MIP solver is rewritten into Boolean and reified
constraints, whose definitions are annotated to be delayed. The \nanozinc\ model
can then be fully simplified by \gls{propagation}, before the Boolean and
reified constraints are rewritten into lower-level linear primitives suitable
for MIP\@.
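The two-phase behaviour can be sketched as below. The \texttt{rewrite} function and the shape of the expanded constraint are hypothetical simplifications, not the actual \microzinc{} machinery; the point is only that a delayed call survives phase one as-is and is expanded in phase two, once \gls{propagation} has produced the tightest bound.

```python
# Illustrative sketch: two-phase rewriting. Calls whose definitions are
# annotated as "delayed" are kept as calls in phase one; after propagation
# reaches a fix-point, the delayed bodies are expanded with final bounds.
def rewrite(calls, defs, delayed, ub):
    queue, output = list(calls), []
    while queue:
        name, args = queue.pop(0)
        if name in delayed:
            output.append((name, args))   # keep as an unevaluated call
        elif name in defs:
            queue.extend(defs[name](args, ub))
        else:
            output.append((name, args))   # solver-level primitive
    return output

# lq_zero_if_b(x, b) expands to a (schematic) big-M linearisation
# parameterised by the current upper bound of x.
defs = {"lq_zero_if_b":
        lambda args, ub: [("int_lin_le", (args[0], args[1], ub[args[0]]))]}

phase1 = rewrite([("lq_zero_if_b", ("x", "b"))], defs,
                 delayed={"lq_zero_if_b"}, ub={"x": 100})
# ...propagation tightens ub(x) from 100 to 7...
phase2 = rewrite(phase1, defs, delayed=set(), ub={"x": 7})
print(phase2)  # the linearisation now uses the tight bound 7
```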
\subsection{Common Sub-expression Elimination}\label{sec:4-cse}
The simplifications introduced above remove definitions from the model when we
can detect that they are no longer needed. In some cases, it is even possible to
not generate definitions in the first place through the use of
\glsaccesslong{cse}. This simplification is a well understood technique that
originates from compiler optimisation \autocite{cocke-1970-cse} and has proven
to be very effective in discrete optimisation
\autocite{marinov-2005-sat-optimisations, araya-2008-cse-numcsp}, including
during the evaluation of constraint modelling languages such as \minizinc\
\autocite{rendl-2009-enhanced-tailoring}.
For instance, in the constraint\\
\mzninline{(abs(x)*2 >= 20) \/ (abs(x)+5 >= 15)}\\
the expression \mzninline{abs(x)} is evaluated twice. Since functions in
\microzinc\ are guaranteed to be pure and total, any expressions that are
syntactically equal can be detected by a static compiler analysis before
evaluation even begins, and hoisted into a let expression:
\begin{mzn}
constraint let { var int: y = abs(x) } in
(y*2 >= 20) \/ (y+5 >= 15)
\end{mzn}
However, some expressions only become syntactically equal during evaluation, as
in the following example.
\begin{example}
Consider the fragment:
\begin{mzn}
function var float: pythagoras(var float: a, var float: b) =
let {
var float: x = pow(a, 2);
var float: y = pow(b, 2);
} in sqrt(x + y);
constraint pythagoras(i, i) >= 5;
\end{mzn}
Although the expressions \mzninline{pow(a, 2)} and \mzninline{pow(b, 2)} are
not syntactically equal, the function call \mzninline{pythagoras(i,i)} using
the same variable for \mzninline{a} and \mzninline{b} makes them equivalent. A
straightforward approach to ensure that the same instantiation of a function
is only evaluated once is through memoisation. After the evaluation of a
\microzinc\ call, the instantiated call and its result are stored in a lookup
table, the \gls{cse} table. Then, before any subsequent call is evaluated, the
\gls{cse} table is consulted. If the call is found, then the accompanying
result is used; otherwise, the evaluation proceeds as normal.
In our example, the evaluation of \mzninline{pythagoras(i, i)} would proceed
as normal to evaluate \mzninline{x = pow(i, 2)}. However, the expression
defining \mzninline{y}, \mzninline{pow(i, 2)}, will be found in the \gls{cse}
table and replaced by the earlier stored result: \mzninline{y = x}.
\end{example}
It is worth noting that the \gls{cse} approach based on memoisation also covers
the static example above, where the compiler can enforce sharing of common
sub-expressions before evaluation begins. However, since the static analysis is
cheap and can potentially avoid many dynamic table look-ups, it is valuable to
combine both approaches. The static approach can be further improved by
\emph{inlining} function calls, since that may expose more calls as being
syntactically equal.
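The memoisation scheme described above amounts to a small lookup-table discipline. The Python sketch below is illustrative only (the \texttt{evaluate} helper and fresh-variable naming are hypothetical); it shows how the second, identical call is answered from the \gls{cse} table instead of being evaluated again.

```python
# Illustrative sketch: a CSE table implemented by memoisation. After a
# call is evaluated, the instantiated call and its result are stored;
# subsequent identical calls reuse the stored result.
evaluations = []  # record which calls were actually evaluated

def evaluate(call, cse_table):
    if call in cse_table:              # seen before: reuse the result
        return cse_table[call]
    evaluations.append(call)           # otherwise evaluate as normal
    result = f"v{len(evaluations)}"    # fresh variable naming the result
    cse_table[call] = result
    return result

cse = {}
x = evaluate(("pow", "i", 2), cse)     # evaluated: introduces v1
y = evaluate(("pow", "i", 2), cse)     # CSE hit: y is simply x
print(x, y, evaluations)               # pow(i, 2) was evaluated only once
```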
\paragraph{Reification}
Many constraint modelling languages, including \minizinc{}, not only enable
\constraints{} to be globally enforced, but typically also support
\constraints{} to be \emph{reified} into a Boolean variable. Reification means
that a variable \mzninline{b} is constrained to be true if and only if a
corresponding constraint \mzninline{c(...)} holds. We have already seen
reification in \cref{ex:4-absreif}: the truth of constraint \mzninline{abs(x) >
y} was bound to a Boolean variable \mzninline{b1}, which was then used in a
disjunction. We say that the same constraint can be used in \emph{root context}
as well as in a \emph{reified context}. In \minizinc{}, almost all constraints
can be used in both contexts. However, reified constraints are often defined in
the library in terms of complicated decompositions into simpler constraints, or
require specialised algorithms in the target solvers. In either case, it can be
very beneficial for the efficiency of the generated \nanozinc\ program if we can
detect that a reified constraint is in fact not required.
If a constraint is present in the root context, it means that it must hold
globally. If the same constraint is used in a reified context, it can therefore
be replaced with the constant \mzninline{true}, avoiding the need for
reification (or in fact any evaluation). We can harness \gls{cse} to store the
evaluation context when a constraint is added, and detect when the same
constraint is used in both contexts. Whenever a lookup in the \gls{cse} table is
successful, action can be taken depending on both the current and stored
evaluation context. If the stored expression was in root context, then the
constant \mzninline{true} can be used, independent of the current context.
Otherwise, if the stored expression was in reified context and the current
context is reified, then the stored result variable can be used. Finally, if the
stored expression was in reified context and the current context is root
context, then the previous result can be replaced by the constant
\mzninline{true} (and all its dependencies removed) and the evaluation will
proceed as normal with the root context constraint.
\begin{example}
Consider the fragment:
\begin{mzn}
function var bool: p(var int: x, var int: y) = q(x) /\ r(y);
constraint b0 <-> q(x);
constraint b1 <-> t(x,y);
constraint b1 <-> p(x,y);
\end{mzn}
If we process the top-level constraints in order, we create a reified call to
\mzninline{q(x)} for the first constraint. Suppose that while processing the
second constraint we discover that \mzninline{t(x,y)} is \mzninline{true},
fixing \mzninline{b1}. When we then process \mzninline{q(x)} in the
instantiation of the call \mzninline{p(x,y)}, we find that it is in root
context. So \gls{cse} needs to set \mzninline{b0} to \mzninline{true}.
\end{example}
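The context-aware lookup in the example above can be sketched as a small decision procedure. The \texttt{cse\_lookup} helper below is a hypothetical illustration of the rules just described, not the prototype's code: a root entry always answers \mzninline{true}, a reified hit reuses the stored variable, and a root lookup of a previously reified constraint promotes the entry and fixes its old result variable.

```python
# Illustrative sketch: a CSE table that records the evaluation context
# alongside each stored constraint.
ROOT, REIF = "root", "reif"

def cse_lookup(table, call, context):
    if call not in table:
        result = "b" + str(len(table)) if context == REIF else True
        table[call] = (context, result)
        return None                    # not found: evaluate as normal
    stored_ctx, result = table[call]
    if stored_ctx == ROOT:
        return True                    # already enforced globally
    if context == REIF:
        return result                  # reuse the stored reified variable
    table[call] = (ROOT, True)         # promote the entry to root context...
    return ("fix", result)             # ...and fix its old result to true

t = {}
first = cse_lookup(t, ("q", "x"), REIF)    # first use: reified into b0
second = cse_lookup(t, ("q", "x"), ROOT)   # same constraint, root context
print(first, second)
```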
\minizinc{} distinguishes not only between root and reified contexts, but in
addition recognises constraints in \emph{positive} contexts, also known as
\emph{half-reified} \autocite{feydy-2011-half-reif}. A full explanation and
discussion of half-reification can be found in \cref{ch:half-reif}.
\subsection{Constraint Aggregation}%
\label{subsec:rew-aggregation}
Complex \minizinc\ expressions can sometimes result in the creation of many new
variables that represent intermediate results. This is particularly true for
linear and Boolean equations, which are generally written using \minizinc\
operators. For example, the evaluation of the linear constraint \mzninline{x +
2*y <= z} will result in the following \nanozinc:
\begin{nzn}
var int: x;
var int: y;
var int: z;
var int: i1;
↳ constraint int_times(y, 2, i1);
var int: i2;
↳ constraint int_plus(x, i1, i2);
constraint int_le(i2, z);
\end{nzn}
These \nanozinc\ definitions are correct, but, at least for \gls{mip} solvers,
the existence of the intermediate variables is likely to have a negative impact
on the \gls{solver}'s performance. These \glspl{solver} would likely perform
better had they received the linear constraint
\mzninline{int_lin_le([1,2,-1], [x,y,z], 0)} directly. Since many solvers
support linear constraints directly, the intermediate variables are an
unnecessary burden: they still have to be given a value in the solution.
This can be resolved using the \gls{aggregation} of constraints. When we
aggregate constraints we combine constraints connected through functional
definitions into one or multiple constraints eliminating the need for
intermediate variables. For example, the arithmetic definitions can be combined
into linear constraints, Boolean logic can be combined into clauses, and
counting constraints can be combined into global cardinality constraints.
In \nanozinc{}, we are able to aggregate constraints during partial evaluation.
To aggregate a certain kind of constraint, the target solver must support an
aggregated version of the constraint as a solver-level primitive. The
constraints to be aggregated are then kept as temporary functional definitions
in the \nanozinc\ program. Once a top-level (relational)
\gls{constraint} is posted that uses the temporary functional definitions as one of
its arguments, the interpreter will employ dedicated \gls{aggregation} logic to
visit the functional definitions and combine their constraints. The top-level
\gls{constraint} is then replaced by the combined \gls{constraint}. When the
intermediate variables become unused, they will be removed using the normal
mechanisms.
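The aggregation of the linear example above can be sketched as follows. The \texttt{aggregate\_le} helper and the toy representation of functional definitions are hypothetical, not the interpreter's actual aggregation logic; the sketch simply walks the definitions feeding into an \mzninline{int_le} and collects them into a single \mzninline{int_lin_le}.

```python
# Illustrative sketch: aggregating int_times/int_plus definitions that
# feed into an int_le into one int_lin_le, removing the intermediates.
def aggregate_le(lhs, rhs, defs):
    """Flatten lhs <= rhs into coefficient/variable lists by following
    the functional definitions of the intermediate variables."""
    terms = []
    def collect(var, coef):
        if var in defs:
            op, args = defs[var]
            if op == "int_times":        # args = (variable, constant)
                collect(args[0], coef * args[1])
            elif op == "int_plus":       # args = (variable, variable)
                collect(args[0], coef)
                collect(args[1], coef)
        else:
            terms.append((coef, var))    # a genuine decision variable
    collect(lhs, 1)
    collect(rhs, -1)                     # move rhs to the left-hand side
    coeffs, names = zip(*terms)
    return ("int_lin_le", list(coeffs), list(names), 0)

# x + 2*y <= z, with i1 = int_times(y, 2) and i2 = int_plus(x, i1):
defs = {"i1": ("int_times", ("y", 2)), "i2": ("int_plus", ("x", "i1"))}
agg = aggregate_le("i2", "z", defs)
print(agg)  # int_lin_le([1,2,-1], [x,y,z], 0), as in the text
```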
\section{Experiments}\label{sec:4-experiments}
We have created a prototype implementation of the architecture presented in the
preceding sections. It consists of a compiler from \minizinc\ to \microzinc{}, and
an incremental \microzinc\ interpreter producing \nanozinc{}. The system supports
a significant subset of the full \minizinc\ language; notable features that are
missing are support for set and float variables, option types, and compilation
of model output expressions and annotations. Our implementation will be
released under an open-source license.
The implementation is not optimised for performance yet, but was created as a
faithful implementation of the developed concepts, in order to evaluate their
suitability and provide a solid baseline for future improvements. In the
following we present experimental results on basic flattening performance as
well as incremental flattening and solving that demonstrate the efficiency
gains that are possible thanks to the new architecture.
We selected 20 models from the annual \minizinc\ challenge and compiled 5
instances of each model to \flatzinc{}, using the current \minizinc\ release
version 2.4.3 and the new prototype system. In both cases we use the standard
\minizinc\ library of global constraints (\ie\ we decompose those constraints
rather than using solver built-ins, in order to stress-test the flattening). We
measured pure flattening time, \ie\ without time required to parse and type check
in version 2.4.3, and without time required for compilation to \microzinc\ in
the new system (compilation is usually very fast). Times are averages of 10
runs.\footnote{All models obtained from
\url{https://github.com/minizinc/minizinc-benchmarks}:
\texttt{\justify{}accap, amaze, city-position, community-detection,
depot-placement, freepizza, groupsplitter, kidney-exchange, median-string,
multi-knapsack, nonogram, nside, problem, rcpsp-wet, road-cons, roster,
stack-cuttingstock, steelmillslab, train, triangular, zephyrus}.}
\Cref{sfig:4-compareruntime} compares the flattening time for each of the 100
instances. Points below the line indicate that the new system is faster. On
average, the new system achieves a speed-up of \(2.3\), with very few instances
not achieving any speedup. In terms of memory performance
(\cref{sfig:4-comparemem}), version 2.4.3 can sometimes still outperform the new
prototype. We have identified that the main memory bottlenecks are our currently
unoptimised implementations of \gls{cse} lookup tables and argument vectors.
These are very encouraging results, given that we are comparing a largely
unoptimised prototype to a mature piece of software.
\begin{figure}[t]
\centering
\begin{subfigure}[b]{.48\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_compare_runtime}
\caption{\label{sfig:4-compareruntime}Flattening run time (ms)}
\end{subfigure}%
\hspace{0.04\columnwidth}%
\begin{subfigure}[b]{.48\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_compare_memory}
\caption{\label{sfig:4-comparemem}Flattening memory (MB)}
\end{subfigure}
\caption{\label{fig:4-runtime}Performance on flattening 100 MiniZinc Challenge
instances. \minizinc\ 2.4.3 (x-axis) vs new architecture (y-axis), log-log
plot. Dots below the line indicate the new system is better.}
\end{figure}
\begin{figure}[ht]
\centering
\definecolor{cb1}{RGB}{230, 159, 0}
\definecolor{cb2}{RGB}{ 86, 180, 223}
\definecolor{cb3}{RGB}{ 0, 158, 115}
\definecolor{cb4}{RGB}{240, 228, 66}
\pgfplotstableread[
col sep=comma,
]{assets/table/rew_functional.csv}\rewFuncData{}
\begin{tikzpicture}
\begin{axis}[
xbar,
reverse legend,
legend style={at={(0.82,0.8)},
anchor=west,legend columns=1},
xlabel={Run-time (ms)},
symbolic y coords={{A(3,6)},{Fib(23)},{Tak(18,12,6)}},
ytick=data,
% cycle list name=exotic,
nodes near coords,
nodes near coords align={horizontal},
width = \columnwidth,
height = 8cm,
enlarge y limits={0.3},
enlarge x limits={0.03},
xmin = 0,
xmax = 130,
]
\addplot[fill=cb1, postaction={pattern=grid}] table [y={test}, x={MiniZinc}]{\rewFuncData};
\addplot[fill=cb2, postaction={pattern=north east lines}] table [y={test}, x={Python}]{\rewFuncData};
\addplot[fill=cb3, postaction={pattern=dots}] table [y={test}, x={Prototype}]{\rewFuncData};
\addplot[fill=cb4] table [y={test}, x={OCaml}]{\rewFuncData};
\legend{MiniZinc,Python,Prototype,OCaml}
\end{axis}
\end{tikzpicture}
\caption{\label{fig:rew-interpreter-comp} Run-time comparison of interpreters
on recursive functions: Takeuchi, Fibonacci, and Ackermann.}
\end{figure}
\section{Summary}%
\label{sec:rew-summary}
In this chapter, we introduced a systematic view of the execution of \minizinc{},
revisiting the rewriting from high-level \minizinc\ to solver-level \flatzinc{}.
We first introduced the intermediate languages \microzinc{} and
\nanozinc{} and explained how \minizinc\ can be transformed into a set of
\microzinc\ definitions and a \nanozinc\ program. We then, crucially, discussed
how to partially evaluate a \nanozinc\ program using \microzinc\ definitions
and presented formal definitions of the rewriting rules applied during partial
evaluation.
Continuously applying these rules would result in a correct solver-level model,
but it is unlikely to be the best model for the solver. We, therefore, discuss
multiple techniques to improve the solver-level constraint model during the
partial evaluation of \nanozinc{}. First, we introduce a novel technique to
eliminate all unused variables and constraints: we track all constraints that
are introduced only to define a variable. This means we can keep an accurate
count of when a variable becomes unused by counting only the references to a
variable in constraints that do not help define it. Then, we discuss the use of
\gls{propagation} during the partial evaluation of the \nanozinc\ program. This
technique can help shrink the domains of the decision variables or even combine
variables known to be equal. When the redefinition of a constraint requires
introspection into the current domain of a variable, it is often important to
have the tightest possible domain. Hence, we discuss how, in choosing the
next \nanozinc\ constraint to rewrite, the interpreter can sometimes delay the
rewriting of certain constraints to ensure that as much information as possible
is available. \Gls{cse}, our next optimisation technique, ensures that we do not
create or evaluate the same constraint or function twice and reuse variables
where possible. Finally, the last optimisation technique we discuss is the use
of constraint \gls{aggregation}. The use of \gls{aggregation} ensures that
individual functional constraints can be collected and combined into an
aggregated form. This allows us to avoid the existence of intermediate variables
in some cases. This optimisation is very important for \gls{mip} solvers.
Finally, we tested the described system using an experimental implementation.
We compared this experimental implementation against the current \minizinc\
interpreter, version 2.4.3, measuring both run time and memory usage. Although
there are instances where the experimental implementation uses more memory than
the current \minizinc\ interpreter, it clearly outperforms the current
\minizinc\ interpreter in terms of time.