Fixing things

Jip J. Dekker 2021-03-09 15:03:59 +11:00
parent 25c23ae378
commit 78b627a88b
No known key found for this signature in database
GPG Key ID: 517DF4A00618C9C3
6 changed files with 238 additions and 196 deletions

.chktexrc (new file)

@ -0,0 +1,4 @@
# Exclude these environments from syntax checking
VerbEnvir { pgfpicture tikzpicture mzn grammar proof }
WipeArg { \mzninline:{} \Sem:{} \texttt:{} }


@ -66,6 +66,9 @@ style=apa,
\newenvironment{example}[1][]{\refstepcounter{example}\par\medskip \noindent
\textbf{Example~\theexample. #1} \rmfamily}{\hfill \ensuremath{\square}}
% Algorithms
% \usepackage[ruled,vlined]{algorithm2e}
% Code formatting
\usepackage[
cachedir=listings,


@ -67,3 +67,61 @@ solution that fits the requirements of the model.
\section{What Makes a ``Good'' Model?}
\label{sec:2-model-quality}
\section{From the Abstract Machine Paper}
\minizinc\ \autocite{nethercote-2007-minizinc} is a high-level, solver- and
data-independent modelling language for discrete satisfiability and optimisation
problems. Its expressive language and extensive library of constraints allow
users to easily model complex problems.
Let us introduce the language by modelling the well-known \emph{Latin squares}
problem \autocite{wallis-2011-combinatorics}: Given an integer $n$, find an
$n \times n$ matrix, such that each row and column is a permutation of values
$1 \ldots n$. A \minizinc\ model encoding this problem could look as follows:
\begin{mzn}
int: n;
array [1..n, 1..n] of var 1..n: x;
constraint forall (r in 1..n) (
all_different([x[r, c] | c in 1..n])
);
constraint forall (c in 1..n) (
all_different([x[r, c] | r in 1..n])
);
\end{mzn}
The model introduces a \gls{parameter} \mzninline{n}, and a two-dimensional
array of \glspl{variable} (marked by the \mzninline{var} keyword) \mzninline{x}.
Each variable in \mzninline{x} is restricted to the set of integers
\mzninline{1..n}, which is called the variable's \gls{domain}. The constraints
specify the requirements of the problem: for each row \mzninline{r}, the
\mzninline{x} variables of all columns must take pairwise different values (and
the same for each column \mzninline{c}). This is modelled using the
\mzninline{all_different} function, one of hundreds of pre-defined constraints
in \minizinc's library.
Given ground assignments to input \glspl{parameter}, a \minizinc\ model is
translated (via a process called \emph{flattening}) into a set of variables and
primitive constraints. Here is the result of flattening for \mzninline{n=2}:
\begin{mzn}
var 1..2: x_1_1;
var 1..2: x_1_2;
var 1..2: x_2_1;
var 1..2: x_2_2;
constraint all_different([x_1_1, x_1_2]);
constraint all_different([x_2_1, x_2_2]);
constraint all_different([x_1_1, x_2_1]);
constraint all_different([x_1_2, x_2_2]);
\end{mzn}
This \emph{flat} problem will be passed to some \gls{solver}, which will attempt
to determine an assignment to each decision variable \verb|x_i_j| that satisfies
all constraints, or report that there is no such assignment.
This type of combinatorial problem is typically called a \gls{csp}. \minizinc\
also supports the modelling of \gls{cop}, where a \gls{csp} is augmented with an
\gls{objective} $z$. In this case the goal is to find an assignment that
satisfies all constraints while minimising (or maximising) $z$.
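The unrolling performed by flattening can be sketched in a few lines of Python (an illustration only; the function name `flatten` is ours, not part of the MiniZinc toolchain): for a given n, the two comprehensions expand into one all_different per row and per column over fresh variable names.

```python
# Illustrative sketch of flattening the Latin-squares model above for a
# given n: the two comprehensions unroll into one all_different constraint
# per row and per column, over fresh flat variable names x_r_c.

def flatten(n):
    rng = range(1, n + 1)
    decls = [f"var 1..{n}: x_{r}_{c};" for r in rng for c in rng]
    rows = ["all_different([" + ", ".join(f"x_{r}_{c}" for c in rng) + "])"
            for r in rng]
    cols = ["all_different([" + ", ".join(f"x_{r}_{c}" for r in rng) + "])"
            for c in rng]
    return decls, [f"constraint {c};" for c in rows + cols]

decls, cons = flatten(2)   # reproduces the flat program shown above
```

For n=2 this yields exactly the four declarations and four all_different constraints listed in the flat program.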


@ -2,66 +2,10 @@
\chapter{Rewriting Constraint Modelling Languages}\label{ch:rewriting}
%************************************************
High-level \cmls, like \minizinc, were originally designed as a convenient input
language for \gls{cp} solvers \autocite{marriott-1998-clp}. A user would write a
model consisting of a few loops or comprehensions; given a data file for the
parameters, this would be rewritten into a relatively small set of constraints
which would be fed whole into the solver. The existing process for translating
\minizinc\ into solver-specific constraints is a somewhat ad-hoc, (mostly)
single-pass, recursive unrolling procedure, and many aspects (such as call
@ -70,16 +14,16 @@ acceptable: models (both before and after translation) were small, translation
was fast, and the expected results of translation were obvious. The current
architecture is illustrated in \Cref{sfig:4-oldarch}.
But over time, the uses of high-level \cmls\ have expanded greatly from this
original vision. It is now used to generate input for wildly different solver
technologies: not just constraint programming, but also MIP
\autocite{wolsey-1988-mip}, SAT \autocite{davis-1962-dpll} and Constraint-based
Local Search \autocite{bjordal-2015-fzn2oscarcbls} solvers. Crucially, the same
constraint model can be used with any of these solvers, which is achieved
through the use of solver-specific libraries of constraint definitions. In
\minizinc, these solver libraries are written in the same language.
\minizinc\ turned out to be much more expressive than expected, so more
and more preprocessing and model transformation has been added to both the core
\minizinc\ library, and users' models. And in addition to one-shot solving,
\minizinc\ is frequently used as the basis of a larger meta-optimisation tool
@ -92,18 +36,16 @@ dashed feedback loop in \Cref{sfig:4-oldarch}.
\begin{subfigure}[b]{.45\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_old_process}
\caption{\label{sfig:4-oldarch}}
\end{subfigure}%
\hspace{0.05\columnwidth}%
\begin{subfigure}[b]{.50\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_new_process}
\caption{\label{sfig:4-newarch}}
\end{subfigure}
\caption{\label{fig:4-arch} (a) Current \minizinc\ architecture, and (b) new
\minizinc\ architecture}
\end{figure}
To a great extent, this is testament to the effectiveness of the language.
@ -113,7 +55,7 @@ weaknesses of the existing \minizinc\ tool chain. In particular:
\begin{itemize}
\item Flattening is inefficient. The flattener does a surprisingly large
amount of work for each expression (especially resolving sub-typing and
overloading), which may be repeated many times --- for example, inside
the body of a comprehension. And as models generated for other solver
technologies (particularly MIP) can be quite large, the resulting
flattening procedure can be intolerably slow. As the model
@ -139,10 +81,10 @@ weaknesses of the existing \minizinc\ tool chain. In particular:
to the next, closely related, problem.
\end{itemize}
In this chapter, we revisit the rewriting of high-level \cmls, like \minizinc,
into solver-level constraint models. We describe a new \textbf{systematic view
of the execution of \cmls}, and build on this to propose a new tool chain. We
show how this tool chain allows us to:
\begin{itemize}
\item easily integrate a range of \textbf{optimisation and simplification}
@ -153,28 +95,28 @@ allows us to:
integration with solvers providing native incremental features.
\end{itemize}
The new architecture is shown in \Cref{sfig:4-newarch}. The model is first
compiled to byte code (called \microzinc), independent of the data. The byte
code is interpreted with the data to produce \nanozinc\ code, an extension of
the existing \flatzinc\ format. The interpreter can even be made incremental: in
\cref{ch:incremental} we discuss how in meta optimisation, no recompilation is
required.
We have developed a prototype of this tool chain, and present experimental
validation of these advantages. The prototype is still very experimental, but
preliminary results suggest the new tool chain can perform flattening much
faster, and produce better models, than the current \minizinc\ compiler.
The rest of this chapter is organised as follows. \Cref{sec:4-micronano} introduces
the \microzinc\ and \nanozinc\ languages, the new intermediate representation we
propose that enables more efficient flattening. \Cref{sec:4-simplification}
describes how we can perform various processing and simplification steps on this
representation, and in \cref{sec:4-experiments} we report on the experimental
results of the prototype implementation. Finally, \Cref{sec:4-conclusion}
presents our conclusions.
\section{\glsentrytext{microzinc} and
\glsentrytext{nanozinc}}\label{sec:4-micronano}
This section introduces the two sub-languages of \minizinc\ at the core of the
new abstract machine model we have developed.
@ -241,7 +183,8 @@ you can assume it is empty).
}
\end{figure}
\subsection{\glsentrytext{minizinc} to
\glsentrytext{microzinc}/\glsentrytext{nanozinc}}
The translation from \minizinc\ to \microzinc\ involves the following steps:
@ -263,6 +206,50 @@ The translation from \minizinc\ to \microzinc\ involves the following steps:
arguments are the parameters of the model.
\end{enumerate}
% \begin{algorithm}[H]
% \KwData{A \minizinc\ model $M = \langle{}V, C, F\rangle{}$, where $V$ is a set of declaration
% items, $C$ is a set of constraint items, and $F$ are the definitions for all
% functions used in $M$.}
% \KwResult{A list of \microzinc\ defintions $\mu$ and a \nanozinc\ program $n$.}
% \While{not at end of this document}{
% read current\;
% \eIf{understand}{
% go to next section\;
% current section becomes this one\;
% }{
% go back to the beginning of current section\;
% }
% }
% \caption{MicrowaveModel}
% \end{algorithm}
% \begin{algorithm}[H]
% \KwData{A \minizinc\ expression $e$ and $rn$ a function to rename identifiers}
% \KwResult{A \microzinc\ expression $e'$.}
% \Switch{$e$} {
% \Case{\textup{\texttt{if \(c\) then \(e_{1}\) else \(e_{2}\) endif}}}{
% \eIf{\(type(c) \in \textup{\texttt{var}}\)}{
% \(e' \longleftarrow \texttt{MicroExpr}(\texttt{slv\_if\_then\_else(\(c\), \(e_{1}\), \(e_{2}\))}, rn)\)
% }{
% \(c' \longleftarrow \texttt{MicroExpr(\(c, rn\))}\)\;
% \(e_{1}' \longleftarrow \texttt{MicroExpr(\(e_{1}, rn\))}\)\;
% \(e_{2}' \longleftarrow \texttt{MicroExpr(\(e_{2}\), rn)}\)\;
% \(e' \longleftarrow \texttt{if \(c'\) then \(e_{1}'\) else \(e_{2}'\) endif}\)
% }
% }
% \Case{\textup{(call)} \(n(e_{1}, ..., e_{n})\)}{
% }
% \lCase{\textup{(ident)} $id$} {\(e' \longleftarrow rn(id)\)}
% \lOther{\(e' \longleftarrow e\)}
% }
% \Return{\(e'\)}
% \caption{MicroExpr}
% \end{algorithm}
\begin{example}
Consider the following \minizinc\ model, a simple unsatisfiable ``pigeonhole
problem'', trying to assign $n$ pigeons to $n-1$ holes:
@ -303,13 +290,12 @@ The translation from \minizinc\ to \microzinc\ involves the following steps:
} in true;
\end{mzn}
For a concrete instance of the model, we can then create a simple \nanozinc\
program that calls \mzninline{main}, for example for $n=10$:
\mzninline{true @$\mapsto$@ main(10) @$\sep$@ []}.
\end{example}
\subsection{Partial Evaluation of \glsentrytext{nanozinc}}\label{sub:4-partial}
Each call in a \nanozinc\ program is either a primitive, or it has a
corresponding \microzinc\ definition. The goal of partial evaluation is to
@ -362,20 +348,21 @@ introduced to define results of function calls. In order to track these
dependencies, \nanozinc\ attaches constraints that define a variable to the item
that introduces the variable.
\subsection{\glsentrytext{nanozinc}}\label{sec:4-nanozinc}
A \nanozinc\ model (\cref{fig:4-nzn-syntax}) consists of a topologically-ordered
list of definitions. Each definition binds a variable to the result of a call or
another variable, and it is associated with a list of identifiers of auxiliary
constraints. The \nanozinc\ model contains a special definition $\true$,
containing all ``root-context'' constraints of the model, i.e., those that have
to hold globally and are not just used to define an auxiliary variable. Only
root-context constraints (attached to $\true$) can effectively constrain the
overall problem. Constraints attached to definitions originate from function
calls, and since all functions are guaranteed to be total, attached constraints
can only define the function result.
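The structure just described can be captured by a small data type, shown here as a hypothetical Python sketch (the class and field names are ours, not part of \nanozinc): each item binds an identifier to a call or another identifier, together with the identifiers of its attached defining constraints, and the list ends with the special root item bound to \mzninline{true}.

```python
# Hypothetical Python model of a NanoZinc program: each item binds an
# identifier to a call or another identifier, plus the identifiers of its
# attached defining constraints; the root item `true` carries the
# root-context constraints that must hold globally.
from dataclasses import dataclass, field

@dataclass
class Definition:
    ident: str
    body: str                                # a call or another identifier
    attached: list = field(default_factory=list)

# The abs(x) > y example, as a topologically ordered list of definitions.
program = [
    Definition("x", "mkvar(-10..10)"),
    Definition("y", "mkvar(-10..10)"),
    Definition("z", "mkvar(-infinity..infinity)"),
    Definition("b0", "int_abs(x, z)"),
    Definition("t", "z", attached=["b0"]),   # result plus defining constraint
    Definition("b1", "int_gt(t, y)"),
    Definition("true", "true", attached=["b1"]),  # root context
]
```

Only constraints reachable from the `true` item constrain the overall problem; all others merely define function results.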
\begin{example} \label{ex:4-absnzn}
Consider the \minizinc\ fragment\\\mzninline{constraint abs(x) > y;}\\ where
\mzninline{x} and \mzninline{y} have domain $[-10, 10]$. After rewriting all
definitions, the resulting \nanozinc\ program could look like this:
@ -390,17 +377,16 @@ total, attached constraints can only define the function result.
true @$\mapsto$@ true @$\sep$@ [b1]
\end{mzn}
The variables \mzninline{x} and \mzninline{y} result in identifiers bound to
the special built-in \mzninline{mkvar}, which records the variable's domain in
its arguments. Evaluating the call \mzninline{abs(x)} introduces a new
variable \mzninline{z} (with unrestricted domain) and the \mzninline{int_abs}
constraint that is used to define \mzninline{z}, bound to a fresh identifier
\mzninline{b0}. A fresh identifier \mzninline{t} is introduced to capture both
the result \mzninline{z} as well as the list of defining constraints,
\mzninline{[b0]}. The top-level constraint \mzninline{abs(x) > y} is rewritten
into \mzninline{int_gt(t,y)} and bound to fresh identifier \mzninline{b1},
which is added to the root-context definition \mzninline{true}.
\end{example}
\subsection{Rewriting Rules}
@ -422,13 +408,13 @@ suitable alpha renaming.
\begin{prooftree}
\AxiomC{$\ptinline{F($p_1, \ldots, p_k$) = E;} \in \Prog$ where the $p_i$
are fresh}
\noLine\
\UnaryInfC{\Sem{$a_i$}{\Prog, \Env_{i-1}} $\Rightarrow~ \tuple{v_i,
\Env_i'}, \ \Env_0=\Env, \Env_i=\Env_i'\cup\{p_i\mapsto v_i\sep[]\}
\quad \forall
1 \leq
i \leq k$}
\noLine\
\UnaryInfC{\Sem{E}{\Prog, \Env_k}
$\Rightarrow ~ \tuple{v, \Env'}$}
\RightLabel{(Call)}
@ -506,14 +492,14 @@ simply the function applied to the evaluated parameters.
(\ptinline{true}, \Env$)}
\end{prooftree}
\begin{prooftree}
\AxiomC{\Sem{$\mathbf{I}$}{\Prog, \Env\ \cup\ \{x \mapsto\ \texttt{mkvar()} \sep\ []\}} $\Rightarrow (\Ctx, \Env')$}
\RightLabel{(ItemT)}
\UnaryInfC{\Sem{$t:x, \mathbf{I}$}{\Prog, \Env} $\Rightarrow
(\Ctx, \Env')$}
\end{prooftree}
\begin{prooftree}
\AxiomC{\Sem{$E$}{\Prog, \Env} $\Rightarrow \tuple{v, \Env'}$}
\AxiomC{\Sem{$\mathbf{I}$}{\Prog, \Env' \cup\ \{x \mapsto\ v \sep\ [] \}} $\Rightarrow (\Ctx, \Env'')$}
\RightLabel{(ItemTE)}
\BinaryInfC{\Sem{$t:x = E, \mathbf{I}$}{\Prog, \Env} $\Rightarrow
(\Ctx, \Env'')$}
@ -546,14 +532,14 @@ is the base case for a list of let items.
\RightLabel{(IdC)}
\AxiomC{$x \in \langle ident \rangle$}
\AxiomC{$v \in \langle literal \rangle$}
\BinaryInfC{\Sem{$x$}{\Prog, \{x \mapsto\ v \sep\ []\} \cup\ \Env} $\Rightarrow \tuple{v, \{x \mapsto v \sep [] \} \cup \Env}$}
\end{prooftree}
\begin{prooftree}
\RightLabel{(IdX)}
\AxiomC{$x \in \langle ident \rangle$}
\AxiomC{$v$}
\AxiomC{$\phi$ otherwise}
\TrinaryInfC{\Sem{$x$}{\Prog, \{x \mapsto\ v \sep\ \phi\ \} \cup\ \Env} $\Rightarrow \tuple{x, \{x \mapsto v \sep \phi \} \cup \Env}$}
\end{prooftree}
\begin{prooftree}
\RightLabel{(Const)}
@ -586,8 +572,8 @@ is the base case for a list of let items.
\begin{prooftree}
\AxiomC{\Sem{$\mathit{PE}$}{\Prog,\Env} $\Rightarrow \tuple{S, \_}$}
\noLine\
\UnaryInfC{\Sem{$[E ~|~ \mathit{GE} \mbox{~where~} G]$}{\Prog, \Env\ \cup\ \{ x\mapsto\
v \sep\ []\}}}
\noLine\
\UnaryInfC{\hspace{8em} $\Rightarrow \tuple{A_v, \Env \cup \{x \mapsto v \sep []\} \cup \Env_v}, \forall
v \in S $}
@ -628,7 +614,7 @@ new technology.
The \textbf{interpreter} evaluates \microzinc\ bytecode and \nanozinc\ programs
based on the rewrite system introduced in this section. It can read parameter
data from a file or via an API\@. The interpreter constructs the call to the
\mzninline{main} function, and then performs the rewriting into flat \nanozinc.
Memory management is implemented using reference counting, which lends itself
well to the optimisations discussed in the next section. The interpreter is
@ -647,13 +633,12 @@ scripting.
The source code for the complete system will be made available under an open
source license.
\section{NanoZinc Simplification}\label{sec:4-simplification}
The previous section introduced the basic evaluation model of \nanozinc: each
call that has a corresponding \microzinc\ definition can be rewritten into a set
of calls by evaluating the \microzinc\ code. This section looks at further steps
that the system can perform in order to produce better, more compact \nanozinc\
code.
\subsection{Removing dead variables and constraints}
@ -661,7 +646,7 @@ code.
The most basic form of simplification is to identify variables that are not
used by any call, and remove them. This is correct because the semantics of
\nanozinc\ requires us to find values for all variables such that all
constraints are satisfied --- but if a variable does not occur in any
constraint, we can pick any value for it. The model without the variable is
therefore equisatisfiable with the original model.
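Dead-variable elimination can be sketched as a fixpoint of use-counting (a hypothetical Python illustration; the actual interpreter uses the reference-counting machinery described earlier):

```python
# Hypothetical sketch of dead-variable elimination: repeatedly drop any
# definition whose identifier is referenced by no remaining definition.
# The root item `true` is always live.
import re

def remove_dead(defs):
    """defs maps identifier -> body string; returns the live definitions."""
    defs = dict(defs)
    changed = True
    while changed:
        changed = False
        for ident in list(defs):
            if ident == "true":
                continue
            used = any(re.search(rf"\b{re.escape(ident)}\b", body)
                       for other, body in defs.items() if other != ident)
            if not used:        # no call mentions it: any value satisfies it
                del defs[ident]
                changed = True
    return defs

live = remove_dead({
    "x": "mkvar(1..9)",
    "y": "mkvar(1..9)",   # never referenced: removed
    "b0": "int_le(x, 5)",
    "true": "[b0]",
})
```

The iteration matters: removing one definition can make another unused, so the process repeats until no more definitions can be dropped.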
@ -726,7 +711,7 @@ auxiliary definitions (the \mzninline{@$\sep\phi$@} part).
\end{example}
Note how \mzninline{t} acts as a unique name for the result of the let
expression (and thus the function call) --- we could not simply have used
\mzninline{z}, because then \mzninline{b0} would have kept \mzninline{z} alive.
It is worth noting the difference between the constraints \mzninline{b0} and
@ -735,7 +720,7 @@ constraint list of another variable. This means that it is a constraint that
defines the value of a variable, and it can be enforced globally (it must hold,
under all circumstances). On the other hand, \mzninline{b1} is referenced as an
argument to the \mzninline{bool_or} call. This means that the constraint does
not have to hold globally --- rather, we require the variable \mzninline{b1} to
reflect whether the constraint holds or not, so that the \mzninline{bool_or}
can implement the disjunction. These types of constraints are typically called
\emph{reified} in constraint modelling. So \nanozinc\ uses the same syntax for
@ -839,7 +824,7 @@ equality constraint by replacing all occurrences of \mzninline{x} with
\mzninline{y}, and remove the definition of \mzninline{x}.
The more complicated case for \mzninline{x=y} is when both \mzninline{x} and
\mzninline{y} are the result of function calls, \eg\ from a model fragment such
as \mzninline{constraint f(a)=g(b)}. Let us assume that part of the
corresponding \nanozinc\ code looks like this:
@ -882,7 +867,7 @@ equalities to the target constraint solver.
\paragraph{Implementation}
Rewriting a function call that has a \microzinc\ definition and rewriting a
constraint due to propagation are very similar. The interpreter therefore simply
interleaves both forms of rewriting. Efficient constraint propagation engines
``wake up'' a constraint only when one of its variables has received an update
@ -908,7 +893,7 @@ that the system becomes non-confluent, and some orders of execution may produce
This predicate expresses the constraint \mzninline{b -> x<=0}, using a
well-known method called ``big-M transformation''. The expression
\mzninline{ub(x)} returns a valid upper bound for \mzninline{x}, \ie\ a fixed
value known to be greater than or equal to \mzninline{x}. This could be the
initial upper bound \mzninline{x} was declared with, or the current value in
the corresponding \nanozinc\ \mzninline{mkvar} call. If \mzninline{b} takes
@ -929,7 +914,7 @@ evaluation is best.
In our \microzinc/\nanozinc\ system, we allow predicates and functions to be
\emph{annotated} as potential candidates for \emph{delayed rewriting}. Any
annotated constraint is handled by the (CallPrimitive) rule rather than the
(Call) rule, which means that it is simply added as a call into the \nanozinc\
code, without evaluating its body. When propagation reaches a fix-point, all
annotations are removed from the current \nanozinc\ program, and evaluation
resumes with the predicate bodies.
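Delayed rewriting can be pictured as a two-phase loop (a hypothetical Python sketch; the names `rewrite` and `reif_le` are ours): annotated calls are left as calls during the first phase, and their bodies are evaluated only after propagation reaches its fix-point.

```python
# Hypothetical sketch of delayed rewriting: calls whose definitions are
# annotated as delayed are kept as calls during the first phase, and only
# expanded once propagation has reached a fix-point.

def rewrite(calls, definitions, delayed):
    """Expand each call via `definitions`; keep `delayed` calls untouched."""
    expanded, pending = [], []
    for name, args in calls:
        if name in delayed:
            pending.append((name, args))   # stays a call in the NanoZinc code
        elif name in definitions:
            expanded.extend(definitions[name](args))
        else:
            expanded.append((name, args))  # primitive constraint
    return expanded, pending

# `reif_le` stands in for an annotated (e.g. reified) constraint definition.
defs = {"reif_le": lambda args: [("int_le_reif", args)]}
flat, pending = rewrite(
    [("reif_le", ("x", "0", "b")), ("int_lin_le", ("c",))],
    defs, delayed={"reif_le"})
# Phase two: annotations removed, the delayed bodies are now evaluated.
flat += rewrite(pending, defs, delayed=set())[0]
```

Keeping the annotated calls intact gives propagation a chance to simplify the model before the (typically much larger) low-level decompositions are produced.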
@ -939,10 +924,9 @@ example, a model targeted at a MIP solver is rewritten into Boolean and reified
constraints, whose definitions are annotated to be delayed. The \nanozinc\ model
can then be fully simplified by constraint propagation, before the Boolean and
reified constraints are rewritten into lower-level linear primitives suitable
for MIP\@.
\subsection{Common Subexpression Elimination}\label{sec:4-cse}
The simplifications introduced above remove definitions from the model when we
can detect that they are no longer needed. In some cases, it is even possible to
@ -1059,7 +1043,7 @@ addition recognises constraints in \emph{positive} contexts, also known as
\emph{half-reified} \autocite{feydy-2011-half-reif}. We omit the details here
for brevity, but the CSE approach generalises to this extension.
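The memoisation behind CSE can be sketched as a table keyed on a call and its arguments (hypothetical Python, not the prototype's implementation): the first occurrence creates a fresh result identifier, and every identical later call reuses it.

```python
# Hypothetical sketch of common subexpression elimination during
# flattening: a memo table keyed on (call name, arguments) returns the
# result identifier created the first time the call was flattened.

def make_cse_flattener():
    memo = {}
    counter = {"n": 0}
    def flatten_call(name, args):
        """Return (result identifier, whether a new definition was created)."""
        key = (name, args)
        if key in memo:              # identical call seen before: reuse it
            return memo[key], False
        counter["n"] += 1
        result = f"t{counter['n']}"  # fresh identifier defined by this call
        memo[key] = result
        return result, True
    return flatten_call

flatten_call = make_cse_flattener()
a, new_a = flatten_call("abs", ("x",))
b, new_b = flatten_call("abs", ("x",))  # common subexpression: nothing new
assert a == b and new_a and not new_b
```

Because functions are total, reusing the memoised result is always sound; contexts (such as the positive/half-reified case mentioned above) refine what the key must record.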
\section{Experiments}\label{sec:4-experiments}
We have created a prototype implementation of the architecture presented in the
preceding sections. It consists of a compiler from \minizinc\ to \microzinc, and
@ -1087,7 +1071,7 @@ those constraints rather than using solver built-ins, in order to stress-test
the flattening). We measured pure flattening time, i.e., without time required
to parse and typecheck in version 2.4.3, and without time required for
compilation to \microzinc\ in the new system (compilation is usually very fast).
Times are averages of 10 runs.\footnote{All models obtained from
\url{https://github.com/minizinc/minizinc-benchmarks}:
\texttt{\justify{}accap, amaze, city-position, community-detection,
depot-placement, freepizza, groupsplitter, kidney-exchange, median-string,
@ -1110,18 +1094,15 @@ unoptimised prototype to a mature piece of software.
\begin{subfigure}[b]{.48\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_compare_runtime}
\caption{\label{sfig:4-compareruntime}flattening run time (ms)}
\end{subfigure}%
\hspace{0.04\columnwidth}%
\begin{subfigure}[b]{.48\columnwidth}
\centering
\includegraphics[width=\columnwidth]{assets/img/4_compare_memory}
\caption{\label{sfig:4-comparemem}flattening memory (MB)}
\end{subfigure}
\caption{\label{fig:4-runtime}Performance on flattening 100 MiniZinc Challenge
instances. \minizinc\ 2.4.3 (x-axis) vs new architecture (y-axis), log-log
plot. Dots below the line indicate the new system is better.}
\end{figure}


@ -2243,7 +2243,7 @@ $b \half c$ run at the lowest priority. This means that Boolean variables are
still fixed at the same time, but there is less overhead.
\section{Implementation in the \glsentrytext{minizinc} Distribution}
Implementing half-reification within the \minizinc\ distribution consists of two
parts: the \minizinc\ compiler needs to be extended with an implementation of


@ -81,16 +81,14 @@ and remove constraints from an existing model while avoiding recompilation.
Finally, \Cref{sec:6-conclusion} presents the conclusions.
\section{Modelling of Restart-Based Meta-Search}\label{sec:6-modelling}
This section introduces a \minizinc\ extension that enables modellers to define
\gls{meta-search} algorithms in \cmls. This extension is based on the construct
introduced in \minisearch\ \autocite{rendl-2015-minisearch}, as summarised
below.
\subsection{Meta-Search in \glsentrytext{minisearch}}\label{sec:6-minisearch}
% Most \gls{lns} literature discusses neighbourhoods in terms of ``destroying'' part of
% a solution that is later repaired. However, from a declarative modelling point
@ -391,8 +389,7 @@ solution. We store the solution values in \mzninline{s1} and \mzninline{s2}
arrays. Before each restart we add constraints removing pareto dominated
solutions, based on each previous solution.
\section{Compilation of Meta-Search Specifications}\label{sec:6-solver-extension}
The neighbourhoods defined in the previous section can be executed with
\minisearch\ by adding support for the \mzninline{status} and
@@ -617,8 +614,7 @@ against being invoked before the first solution is found, since the
solution has not been recorded yet, but we use this simple example to illustrate how
these Boolean conditions are compiled and evaluated.
-\section{An Incremental Interface for Constraint Modelling Languages}
-\label{sec:6-incremental-compilation}
+\section{An Incremental Interface for Constraint Modelling Languages}\label{sec:6-incremental-compilation}
As an alternative approach to running \gls{meta-search} algorithms, we propose the
possibility of incremental flattening. The execution of any
@@ -642,7 +638,7 @@ After evaluating the first constraint, the domain of \mzninline{x} is changed to
be less than 10. Evaluating the second constraint causes the domain of
\mzninline{y} to be less than 9. If we now, however, try to remove the first
constraint, it is not just the direct inference on the domain of \mzninline{x}
-that has to be undone, but also any further effects of those changes -- in this
+that has to be undone, but also any further effects of those changes --- in this
case, the changes to the domain of \mzninline{y}.
\end{example}
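The situation described in the example corresponds to a model fragment like the following (a minimal sketch; the initial domains \mzninline{1..100} are assumed purely for illustration):

\begin{mzn}
var 1..100: x;
var 1..100: y;
constraint x < 10; % propagation: dom(x) becomes 1..9
constraint y < x;  % propagation: dom(y) becomes 1..8
\end{mzn}

Removing only the first constraint must therefore also undo the second, derived change to the domain of \mzninline{y}.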
@@ -712,8 +708,7 @@ therefore support solvers with different levels of an incremental interface:
\end{itemize}
-\section{Experiments}
-\label{sec:6-experiments}
+\section{Experiments}\label{sec:6-experiments}
We have created a prototype implementation of the architecture presented in the
preceding sections. It consists of a compiler from \minizinc\ to \microzinc, and
@@ -738,7 +733,8 @@ offers, we present a runtime evaluation of two meta-heuristics implemented using
our prototype interpreter. For both meta-heuristics, we evaluate the performance
of fully re-evaluating and solving the instance from scratch, compared to the
fully incremental evaluation and solving. The solving in both tests is performed
-by the Gecode solver, version 6.1.2, connected using the fully incremental API.
+by the Gecode solver, version 6.1.2, connected using the fully incremental
+API\@.
\paragraph{GBAC}
The Generalised Balanced Academic Curriculum (GBAC) problem
@@ -836,19 +832,19 @@ spent solving is reduced by 33\%.
\newcommand{\chuffedStd}{\textsf{chuffed}}
\newcommand{\chuffedMzn}{\textsf{chuffed-fzn}}
-We will now show that a solver that evaluates the compiled \flatzinc \gls{lns}
+We will now show that a solver that evaluates the compiled \flatzinc\ \gls{lns}
specifications can (a) be effective and (b) incur only a small overhead compared
to a dedicated implementation of the neighbourhoods.
To measure the overhead, we implemented our new approach in
-Gecode~\autocite{gecode-2021-gecode}. The resulting solver (\gecodeMzn in the tables
+Gecode~\autocite{gecode-2021-gecode}. The resulting solver (\gecodeMzn\ in the tables
below) has been instrumented to also output the domains of all model variables
after propagating the new special constraints. We implemented another extension
to Gecode (\gecodeReplay) that simply reads the stream of variable domains for
-each restart, essentially replaying the \gls{lns} of \gecodeMzn without incurring any
+each restart, essentially replaying the \gls{lns} of \gecodeMzn\ without incurring any
overhead for evaluating the neighbourhoods or handling the additional variables
and constraints. Note that this is a conservative estimate of the overhead:
-\gecodeReplay has to perform \emph{less} work than any real \gls{lns} implementation.
+\gecodeReplay\ has to perform \emph{less} work than any real \gls{lns} implementation.
In addition, we also present benchmark results for the standard release of
Gecode 6.0 without \gls{lns} (\gecodeStd); as well as \chuffedStd, the development
@@ -905,9 +901,9 @@ period and frees all courses that are assigned to it.
\end{table}
The results for \texttt{gbac} in \cref{tab:6-gbac} show that the overhead
-introduced by \gecodeMzn w.r.t.~\gecodeReplay is quite low, and both their
+introduced by \gecodeMzn\ w.r.t.~\gecodeReplay\ is quite low, and both their
results are much better than the baseline \gecodeStd. Since learning is not very
-effective for \texttt{gbac}, the performance of \chuffedStd is inferior to
+effective for \texttt{gbac}, the performance of \chuffedStd\ is inferior to
Gecode. However, \gls{lns} again significantly improves over standard Chuffed.
\subsubsection{\texttt{steelmillslab}}
@@ -934,13 +930,14 @@ slab.
\caption{\label{tab:6-steelmillslab}\texttt{steelmillslab} benchmarks}
\end{table}
-For this problem a solution with zero wastage is always optimal. The use of \gls{lns}
-makes these instances easy, as all the \gls{lns} approaches find optimal solutions. As
-\cref{tab:6-steelmillslab} shows, \gecodeMzn is again slightly slower than
-\gecodeReplay (the integral is slightly larger). While \chuffedStd significantly
-outperforms \gecodeStd on this problem, once we use \gls{lns}, the learning in
-\chuffedMzn is not advantageous compared to \gecodeMzn or \gecodeReplay. Still,
-\chuffedMzn outperforms \chuffedStd by always finding an optimal solution.
+For this problem a solution with zero wastage is always optimal. The use of
+\gls{lns} makes these instances easy, as all the \gls{lns} approaches find
+optimal solutions. As \cref{tab:6-steelmillslab} shows, \gecodeMzn\ is again
+slightly slower than \gecodeReplay\ (the integral is slightly larger). While
+\chuffedStd\ significantly outperforms \gecodeStd\ on this problem, once we use
+\gls{lns}, the learning in \chuffedMzn\ is not advantageous compared to
+\gecodeMzn\ or \gecodeReplay. Still, \chuffedMzn\ outperforms \chuffedStd\ by
+always finding an optimal solution.
% RCPSP/wet
\subsubsection{\texttt{rcpsp-wet}}
@@ -967,26 +964,25 @@ that time interval, which allows a reshuffling of these tasks.
\caption{\label{tab:6-rcpsp-wet}\texttt{rcpsp-wet} benchmarks}
\end{table}
-\cref{tab:6-rcpsp-wet} shows that \gecodeReplay and \gecodeMzn perform almost
-identically, and substantially better than baseline \gecodeStd for these
-instances. The baseline learning solver \chuffedStd is best overall on the easy
-examples, but \gls{lns} makes it much more robust. The poor performance of \chuffedMzn
-on the last instance is due to the fixed search, which limits the usefulness of
-nogood learning.
+\cref{tab:6-rcpsp-wet} shows that \gecodeReplay\ and \gecodeMzn\ perform almost
+identically, and substantially better than baseline \gecodeStd\ for these
+instances. The baseline learning solver \chuffedStd\ is best overall on the easy
+examples, but \gls{lns} makes it much more robust. The poor performance of
+\chuffedMzn\ on the last instance is due to the fixed search, which limits the
+usefulness of nogood learning.
\subsubsection{Summary}
The results show that \gls{lns} outperforms the baseline solvers, except for
benchmarks where we can quickly find and prove optimality.
However, the main result from these experiments is that the overhead introduced
-by our \flatzinc interface, when compared to an optimal \gls{lns} implementation, is
-relatively small. We have additionally calculated the rate of search nodes
-explored per second and, across all experiments, \gecodeMzn achieves around 3\%
-fewer nodes per second than \gecodeReplay. This overhead is caused by
-propagating the additional constraints in \gecodeMzn. Overall, the experiments
-demonstrate that the compilation approach is an effective and efficient way of
-adding \gls{lns} to a modelling language with minimal changes to the solver.
-\section{Conclusions}
-\label{sec:6-conclusion}
+by our \flatzinc\ interface, when compared to an optimal \gls{lns}
+implementation, is relatively small. We have additionally calculated the rate of
+search nodes explored per second and, across all experiments, \gecodeMzn\
+achieves around 3\% fewer nodes per second than \gecodeReplay. This overhead is
+caused by propagating the additional constraints in \gecodeMzn. Overall, the
+experiments demonstrate that the compilation approach is an effective and
+efficient way of adding \gls{lns} to a modelling language with minimal changes
+to the solver.
+\section{Conclusions}\label{sec:6-conclusion}