From 78b627a88bb7caeb1666bdf54f2caef6bea51721 Mon Sep 17 00:00:00 2001 From: "Jip J. Dekker" Date: Tue, 9 Mar 2021 15:03:59 +1100 Subject: [PATCH] Fixing things --- .chktexrc | 4 + assets/packages.tex | 3 + chapters/2_background.tex | 58 ++++++++ chapters/4_rewriting.tex | 289 +++++++++++++++++-------------------- chapters/5_half_reif.tex | 2 +- chapters/6_incremental.tex | 78 +++++----- 6 files changed, 238 insertions(+), 196 deletions(-) create mode 100644 .chktexrc diff --git a/.chktexrc b/.chktexrc new file mode 100644 index 0000000..7025788 --- /dev/null +++ b/.chktexrc @@ -0,0 +1,4 @@ +# Exclude these environments from syntax checking +VerbEnvir { pgfpicture tikzpicture mzn grammar proof } + +WipeArg { \mzninline:{} \Sem:{} \texttt:{} } diff --git a/assets/packages.tex b/assets/packages.tex index dc95f1d..401c3e1 100644 --- a/assets/packages.tex +++ b/assets/packages.tex @@ -66,6 +66,9 @@ style=apa, \newenvironment{example}[1][]{\refstepcounter{example}\par\medskip \noindent \textbf{Example~\theexample. #1} \rmfamily}{\hfill \ensuremath{\square}} +% Algorithms +% \usepackage[ruled,vlined]{algorithm2e} + % Code formatting \usepackage[ cachedir=listings, diff --git a/chapters/2_background.tex b/chapters/2_background.tex index c0c7a97..b086ce5 100644 --- a/chapters/2_background.tex +++ b/chapters/2_background.tex @@ -67,3 +67,61 @@ solution that fits the requirements of the model. \section{What Makes a ``Good'' Model?} \label{sec:2-model-quality} + + +\section{From the Abstract Machine Paper} +\minizinc\ \autocite{nethercote-2007-minizinc} is a high-level, solver- and +data-independent modelling language for discrete satisfiability and optimisation +problems. Its expressive language and extensive library of constraints allow +users to easily model complex problems. 
+ +Let us introduce the language by modelling the well-known \emph{Latin squares} +problem \autocite{wallis-2011-combinatorics}: Given an integer $n$, find an +$n \times n$ matrix, such that each row and column is a permutation of values +$1 \ldots n$. A \minizinc\ model encoding this problem could look as follows: + +\begin{mzn} +int: n; +array [1..n, 1..n] of var 1..n: x; + +constraint forall (r in 1..n) ( + all_different([x[r, c] | c in 1..n]) +); +constraint forall (c in 1..n) ( + all_different([x[r, c] | r in 1..n]) +); +\end{mzn} + +The model introduces a \gls{parameter} \mzninline{n}, and a two-dimensional +array of \glspl{variable} (marked by the \mzninline{var} keyword) \mzninline{x}. +Each variable in \mzninline{x} is restricted to the set of integers +\mzninline{1..n}, which is called the variable's \gls{domain}. The constraints +specify the requirements of the problem: for each row \mzninline{r}, the +\mzninline{x} variables of all columns must take pairwise different values (and +the same for each column \mzninline{c}). This is modelled using the +\mzninline{all_different} function, one of hundreds of pre-defined constraints +in \minizinc's library. + +Given ground assignments to input \glspl{parameter}, a \minizinc\ model is +translated (via a process called \emph{flattening}) into a set of variables and +primitive constraints. Here is the result of flattening for \mzninline{n=2}: + +\begin{mzn} +var 1..2: x_1_1; +var 1..2: x_1_2; +var 1..2: x_2_1; +var 1..2: x_2_2; +constraint all_different([x_1_1, x_1_2]); +constraint all_different([x_2_1, x_2_2]); +constraint all_different([x_1_1, x_2_1]); +constraint all_different([x_1_2, x_2_2]); +\end{mzn} + +This \emph{flat} problem will be passed to some \gls{solver}, which will attempt +to determine an assignment to each decision variable \verb|x_i_j| that satisfies +all constraints, or report that there is no such assignment. + +This type of combinatorial problem is typically called a \gls{csp}. 
\minizinc +also supports the modelling of \gls{cop}, where a \gls{csp} is augmented with an +\gls{objective} $z$. In this case the goal is to find an assignment that +satisfies all constraints while minimising (or maximising) $z$. diff --git a/chapters/4_rewriting.tex b/chapters/4_rewriting.tex index de3c789..0c2aed5 100644 --- a/chapters/4_rewriting.tex +++ b/chapters/4_rewriting.tex @@ -2,66 +2,10 @@ \chapter{Rewriting Constraint Modelling Languages}\label{ch:rewriting} %************************************************ -\minizinc\ \autocite{nethercote-2007-minizinc} is a high-level, solver- and -data-independent modelling language for discrete satisfiability and optimisation -problems. Its expressive language and extensive library of constraints allow -users to easily model complex problems. - -Let us introduce the language by modelling the well-known \emph{Latin squares} -problem \autocite{wallis-2011-combinatorics}: Given an integer $n$, find an -$n \times n$ matrix, such that each row and column is a permutation of values -$1 \ldots n$. A \minizinc\ model encoding this problem could look as follows: - -\begin{mzn} -int: n; -array [1..n, 1..n] of var 1..n: x; - -constraint forall (r in 1..n) ( - all_different([x[r, c] | c in 1..n]) -); -constraint forall (c in 1..n) ( - all_different([x[r, c] | r in 1..n]) -); -\end{mzn} - -The model introduces a \gls{parameter} \mzninline{n}, and a two-dimensional -array of \glspl{variable} (marked by the \mzninline{var} keyword) \mzninline{x}. -Each variable in \mzninline{x} is restricted to the set of integers -\mzninline{1..n}, which is called the variable's \gls{domain}. The constraints -specify the requirements of the problem: for each row \mzninline{r}, the -\mzninline{x} variables of all columns must take pairwise different values (and -the same for each column \mzninline{c}). This is modelled using the -\mzninline{all_different} function, one of hundreds of pre-defined constraints -in \minizinc's library. 
- -Given ground assignments to input \glspl{parameter}, a \minizinc\ model is -translated (via a process called \emph{flattening}) into a set of variables and -primitive constraints. Here is the result of flattening for \mzninline{n=2}: - -\begin{mzn} -var 1..2: x_1_1; -var 1..2: x_1_2; -var 1..2: x_2_1; -var 1..2: x_2_2; -constraint all_different([x_1_1, x_1_2]); -constraint all_different([x_2_1, x_2_2]); -constraint all_different([x_1_1, x_2_1]); -constraint all_different([x_1_2, x_2_2]); -\end{mzn} - -This \emph{flat} problem will be passed to some \gls{solver}, which will attempt -to determine an assignment to each decision variable \verb|x_i_j| that satisfies -all constraints, or report that there is no such assignment. - -This type of combinatorial problem is typically called a \gls{csp}. \minizinc -also supports the modelling of \gls{cop}, where a \gls{csp} is augmented with an -\gls{objective} $z$. In this case the goal is to find an assignment that -satisfies all constraints while minimising (or maximising) $z$. - -\minizinc\ was originally designed as a convenient input language for \gls{cp} -\autocite{marriott-1998-clp} solvers. A user would write a model consisting -(like the one above) of a few loops or comprehensions; given a data file for the -parameters, this would be translated into a relatively small set of constraints +High-level \cmls, like \minizinc, were originally designed as a convenient input +language for \gls{cp} solvers \autocite{marriott-1998-clp}. A user would write a +model consisting of a few loops or comprehensions; given a data file for the +parameters, this would be rewritten into a relatively small set of constraints which would be fed whole into the solver. 
The existing process for translating
 \minizinc\ into solver-specific constraints is a somewhat ad-hoc, (mostly)
 single-pass, recursive unrolling procedure, and many aspects (such as call
@@ -70,16 +14,16 @@ acceptable: models (both before and after translation) were small, translation
 was fast, and the expected results of translation were obvious. The current
 architecture is illustrated in \Cref{sfig:4-oldarch}.
 
-But over time, the uses of \minizinc\ have expanded greatly from this original
-vision. It is now used to generate input for wildly different solver
+But over time, the uses of high-level \cmls\ have expanded greatly from this
+original vision. They are now used to generate input for wildly different solver
 technologies: not just constraint programming, but also MIP
 \autocite{wolsey-1988-mip}, SAT \autocite{davis-1962-dpll} and Constraint-based
 Local Search \autocite{bjordal-2015-fzn2oscarcbls} solvers. Crucially, the same
-\minizinc\ model can be used with any of these solvers, which is achieved
-through the use of solver-specific libraries of constraint definitions (written
-in \minizinc).
+constraint model can be used with any of these solvers, which is achieved
+through the use of solver-specific libraries of constraint definitions. In
+\minizinc, these solver libraries are written in the same language.
 
-The language itself turned out to be much more expressive than expected, so more
+\minizinc\ turned out to be much more expressive than expected, so more
 and more preprocessing and model transformation has been added to both the core
 \minizinc\ library, and users' models. And in addition to one-shot solving,
 \minizinc\ is frequently used as the basis of a larger meta-optimisation tool
@@ -92,18 +36,16 @@ dashed feedback loop in \Cref{sfig:4-oldarch}.
\begin{subfigure}[b]{.45\columnwidth} \centering \includegraphics[width=\columnwidth]{assets/img/4_old_process} - \caption{} - \label{sfig:4-oldarch} + \caption{\label{sfig:4-oldarch}} \end{subfigure}% \hspace{0.05\columnwidth}% \begin{subfigure}[b]{.50\columnwidth} \centering \includegraphics[width=\columnwidth]{assets/img/4_new_process} - \caption{} - \label{sfig:4-newarch} + \caption{\label{sfig:4-newarch}} \end{subfigure} - \caption{(a) Current \minizinc\ architecture, and (b) new \minizinc\ architecture} - \label{fig:4-arch} + \caption{\label{fig:4-arch} (a) Current \minizinc\ architecture, and (b) new + \minizinc\ architecture} \end{figure} To a great extent, this is testament to the effectiveness of the language. @@ -113,7 +55,7 @@ weaknesses of the existing \minizinc\ tool chain. In particular: \begin{itemize} \item Flattening is inefficient. The flattener does a surprisingly large amount of work for each expression (especially resolving sub-typing and - overloading), which may be repeated many times -- for example, inside + overloading), which may be repeated many times --- for example, inside the body of a comprehension. And as models generated for other solver technologies (particularly MIP) can be quite large, the resulting flattening procedure can be intolerably slow. As the model @@ -139,10 +81,10 @@ weaknesses of the existing \minizinc\ tool chain. In particular: to the next, closely related, problem. \end{itemize} -In this paper, we revisit the translation of \minizinc\ into solver-level -\flatzinc. We describe a new \textbf{systematic view of \minizinc\ execution}, -and build on this to propose a new tool chain. We show how this tool chain -allows us to: +In this chapter, we revisit the rewriting of high-level \cmls, like \minizinc, +into solver-level constraint models. We describe a new \textbf{systematic view + of the execution of \cmls}, and build on this to propose a new tool chain. 
We
+show how this tool chain allows us to:
 
 \begin{itemize}
 \item easily integrate a range of \textbf{optimisation and simplification}
@@ -153,28 +95,28 @@ allows us to:
 integration with solvers providing native incremental features.
 \end{itemize}
 
-The new architecture is shown in \Cref{sfig:4-newarch}. The model is first
-compiled to byte code (called \emph{MicroZinc}), independent of the data. The
-byte code is interpreted with the data to produce \emph{NanoZinc} code, an
-extension of the current \flatzinc\ format. The interpreter is incremental: when
-used in meta optimization, no recompilation is required.
+The new architecture is shown in \Cref{sfig:4-newarch}. The model is first
+compiled to byte code (called \microzinc), independent of the data. The byte
+code is interpreted with the data to produce \nanozinc\ code, an extension of
+the existing \flatzinc\ format. The interpreter can even be made incremental: in
+\cref{ch:incremental} we discuss how, in meta optimisation, no recompilation is
+required.
 
 We have developed a prototype of this tool chain, and present experimental
 validation of these advantages. The prototype is still very experimental, but
 preliminary results suggest the new tool chain can perform flattening much
-faster, and produce better models, than the official \minizinc\ compiler.
+faster, and produce better models, than the current \minizinc\ compiler.
 
-The rest of the paper is organised as follows. \Cref{sec:4-background} gives some
-background on the MiniZinc language and its current tool chain.
-\Cref{sec:4-micronano} introduces the \microzinc\ and \nanozinc\ languages, the
-new intermediate representation we propose that enables more efficient
-flattening. \Cref{sec:4-simplification} describes how we can perform various
-processing and simplification steps on this representation, and in
-\Cref{sec:4-incremental} we describe how it can support incremental flattening
-and solving.
Finally, \Cref{sec:4-conclusion} presents our conclusions.
+The rest of this chapter is organised as follows. \Cref{sec:4-micronano}
+introduces the \microzinc\ and \nanozinc\ languages, the new intermediate
+representation we propose that enables more efficient flattening.
+\Cref{sec:4-simplification} describes how we can perform various processing and
+simplification steps on this representation, and in \cref{sec:4-experiments} we
+report on the experimental results of the prototype implementation. Finally,
+\Cref{sec:4-conclusion} presents our conclusions.
 
-\section{MicroZinc and NanoZinc}
-\label{sec:4-micronano}
+\section{\glsentrytext{microzinc} and
+  \glsentrytext{nanozinc}}\label{sec:4-micronano}
 
 This section introduces the two sub-languages of \minizinc\ at the core of the
 new abstract machine model we have developed.
@@ -241,7 +183,8 @@ you can assume it is empty).
 }
 \end{figure}
 
-\subsection{\minizinc\ to MicroZinc/NanoZinc}
+\subsection{\glsentrytext{minizinc} to
+  \glsentrytext{microzinc}/\glsentrytext{nanozinc}}
 
 The translation from \minizinc\ to \microzinc\ involves the following steps:
 
@@ -263,6 +206,50 @@ The translation from \minizinc\ to \microzinc\ involves the following steps:
 arguments are the parameters of the model.
\end{enumerate} +% \begin{algorithm}[H] +% \KwData{A \minizinc\ model $M = \langle{}V, C, F\rangle{}$, where $V$ is a set of declaration +% items, $C$ is a set of constraint items, and $F$ are the definitions for all +% functions used in $M$.} +% \KwResult{A list of \microzinc\ defintions $\mu$ and a \nanozinc\ program $n$.} + +% \While{not at end of this document}{ +% read current\; +% \eIf{understand}{ +% go to next section\; +% current section becomes this one\; +% }{ +% go back to the beginning of current section\; +% } +% } +% \caption{MicrowaveModel} +% \end{algorithm} + + +% \begin{algorithm}[H] +% \KwData{A \minizinc\ expression $e$ and $rn$ a function to rename identifiers} +% \KwResult{A \microzinc\ expression $e'$.} + +% \Switch{$e$} { +% \Case{\textup{\texttt{if \(c\) then \(e_{1}\) else \(e_{2}\) endif}}}{ +% \eIf{\(type(c) \in \textup{\texttt{var}}\)}{ +% \(e' \longleftarrow \texttt{MicroExpr}(\texttt{slv\_if\_then\_else(\(c\), \(e_{1}\), \(e_{2}\))}, rn)\) +% }{ +% \(c' \longleftarrow \texttt{MicroExpr(\(c, rn\))}\)\; +% \(e_{1}' \longleftarrow \texttt{MicroExpr(\(e_{1}, rn\))}\)\; +% \(e_{2}' \longleftarrow \texttt{MicroExpr(\(e_{2}\), rn)}\)\; +% \(e' \longleftarrow \texttt{if \(c'\) then \(e_{1}'\) else \(e_{2}'\) endif}\) +% } +% } +% \Case{\textup{(call)} \(n(e_{1}, ..., e_{n})\)}{ + +% } +% \lCase{\textup{(ident)} $id$} {\(e' \longleftarrow rn(id)\)} +% \lOther{\(e' \longleftarrow e\)} +% } +% \Return{\(e'\)} +% \caption{MicroExpr} +% \end{algorithm} + \begin{example} Consider the following \minizinc\ model, a simple unsatisfiable ``pigeonhole problem'', trying to assign $n$ pigeons to $n-1$ holes: @@ -303,13 +290,12 @@ The translation from \minizinc\ to \microzinc\ involves the following steps: } in true; \end{mzn} - For a concrete instance of the model, we can then create a simple \nanozinc - program that calls \mzninline{main}, for example for $n=10$: \mzninline{true - @$\mapsto$@ main(10) @$\sep$@ []}. 
+ For a concrete instance of the model, we can then create a simple \nanozinc\ + program that calls \mzninline{main}, for example for $n=10$: + \mzninline{true @$\mapsto$@ main(10) @$\sep$@ []}. \end{example} -\subsection{Partial Evaluation of NanoZinc} -\label{sub:4-partial} +\subsection{Partial Evaluation of \glsentrytext{nanozinc}}\label{sub:4-partial} Each call in a \nanozinc\ program is either a primitive, or it has a corresponding \microzinc\ definition. The goal of partial evaluation is to @@ -362,20 +348,21 @@ introduced to define results of function calls. In order to track these dependencies, \nanozinc\ attaches constraints that define a variable to the item that introduces the variable. -\subsection{NanoZinc}\label{sec:4-nanozinc} +\subsection{\glsentrytext{nanozinc}}\label{sec:4-nanozinc} -A \nanozinc\ model (\cref{fig:4-nzn-syntax}) consists of a -topologically-ordered list of definitions. Each definition binds a variable to -the result of a call or another variable, and it is associated with a list of -identifiers of auxiliary constraints. The \nanozinc\ model contains a special -definition $\true$, containing all ``root-context'' constraints of the model, -i.e., those that have to hold globally and are not just used to define an -auxiliary variable. Only root-context constraints (attached to $\true$) can -effectively constrain the overall problem. Constraints attached to definitions -originate from function calls, and since all functions are guaranteed to be -total, attached constraints can only define the function result. +A \nanozinc\ model (\cref{fig:4-nzn-syntax}) consists of a topologically-ordered +list of definitions. Each definition binds a variable to the result of a call or +another variable, and it is associated with a list of identifiers of auxiliary +constraints. 
The \nanozinc\ model contains a special definition $\true$, +containing all ``root-context'' constraints of the model, i.e., those that have +to hold globally and are not just used to define an auxiliary variable. Only +root-context constraints (attached to $\true$) can effectively constrain the +overall problem. Constraints attached to definitions originate from function +calls, and since all functions are guaranteed to be total, attached constraints +can only define the function result. + +\begin{example}\label{ex:4-absnzn} -\begin{example} \label{ex:4-absnzn} Consider the \minizinc\ fragment\\\mzninline{constraint abs(x) > y;}\\ where \mzninline{x} and \mzninline{y} have domain $[-10, 10]$. After rewriting all definitions, the resulting \nanozinc\ program could look like this: @@ -390,17 +377,16 @@ total, attached constraints can only define the function result. true @$\mapsto$@ true @$\sep$@ [b1] \end{mzn} -The variables \mzninline{x} and \mzninline{y} result in identifiers bound to -the special built-in \mzninline{mkvar}, which records the variable's domain -in its arguments. Evaluating the call \mzninline{abs(x)} introduces a new -variable \mzninline{z} (with unrestricted domain) and the \mzninline{int_abs} -constraint that is used to define \mzninline{z}, bound to a fresh identifier -\mzninline{b0}. A fresh identifier \mzninline{t} is introduced to capture -both the result \mzninline{z} as well as the list of defining constraints, -\mzninline{[b0]}. The top-level constraint \mzninline{abs(x) > y} is -rewritten into \mzninline{int_gt(t,y)} and bound to fresh identifier -\mzninline{b1}, which is added to the root-context definition -\mzninline{true}. + The variables \mzninline{x} and \mzninline{y} result in identifiers bound to + the special built-in \mzninline{mkvar}, which records the variable's domain in + its arguments. 
Evaluating the call \mzninline{abs(x)} introduces a new + variable \mzninline{z} (with unrestricted domain) and the \mzninline{int_abs} + constraint that is used to define \mzninline{z}, bound to a fresh identifier + \mzninline{b0}. A fresh identifier \mzninline{t} is introduced to capture both + the result \mzninline{z} as well as the list of defining constraints, + \mzninline{[b0]}. The top-level constraint \mzninline{abs(x) > y} is rewritten + into \mzninline{int_gt(t,y)} and bound to fresh identifier \mzninline{b1}, + which is added to the root-context definition \mzninline{true}. \end{example} \subsection{Rewriting Rules} @@ -422,13 +408,13 @@ suitable alpha renaming. \begin{prooftree} \AxiomC{$\ptinline{F($p_1, \ldots, p_k$) = E;} \in \Prog$ where the $p_i$ are fresh} - \noLine + \noLine\ \UnaryInfC{\Sem{$a_i$}{\Prog, \Env_{i-1}} $\Rightarrow~ \tuple{v_i, \Env_i'}, \ \Env_0=\Env, \Env_i=\Env_i'\cup\{p_i\mapsto v_i\sep[]\} \quad \forall 1 \leq i \leq k$} - \noLine + \noLine\ \UnaryInfC{\Sem{E}{\Prog, \Env_k} $\Rightarrow ~ \tuple{v, \Env'}$} \RightLabel{(Call)} @@ -506,14 +492,14 @@ simply the function applied to the evaluated parameters. 
(\ptinline{true}, \Env$)} \end{prooftree} \begin{prooftree} - \AxiomC{\Sem{$\mathbf{I}$}{\Prog, \Env \cup \{x \mapsto \texttt{mkvar()} \sep []\}} $\Rightarrow (\Ctx, \Env')$} + \AxiomC{\Sem{$\mathbf{I}$}{\Prog, \Env\ \cup\ \{x \mapsto\ \texttt{mkvar()} \sep\ []\}} $\Rightarrow (\Ctx, \Env')$} \RightLabel{(ItemT)} \UnaryInfC{\Sem{$t:x, \mathbf{I}$}{\Prog, \Env} $\Rightarrow (\Ctx, \Env')$} \end{prooftree} \begin{prooftree} \AxiomC{\Sem{$E$}{\Prog, \Env} $\Rightarrow \tuple{v, \Env'}$} - \AxiomC{\Sem{$\mathbf{I}$}{\Prog, \Env' \cup \{x \mapsto v \sep [] \}} $\Rightarrow (\Ctx, \Env'')$} + \AxiomC{\Sem{$\mathbf{I}$}{\Prog, \Env' \cup\ \{x \mapsto\ v \sep\ [] \}} $\Rightarrow (\Ctx, \Env'')$} \RightLabel{(ItemTE)} \BinaryInfC{\Sem{$t:x = E, \mathbf{I}$}{\Prog, \Env} $\Rightarrow (\Ctx, \Env'')$} @@ -546,14 +532,14 @@ is the base case for a list of let items. \RightLabel{(IdC)} \AxiomC{$x \in \langle ident \rangle$} \AxiomC{$v \in \langle literal \rangle$} - \BinaryInfC{\Sem{$x$}{\Prog, \{x \mapsto v \sep []\} \cup \Env} $\Rightarrow \tuple{v, \{x \mapsto v \sep [] \} \cup \Env}$} + \BinaryInfC{\Sem{$x$}{\Prog, \{x \mapsto\ v \sep\ []\} \cup\ \Env} $\Rightarrow \tuple{v, \{x \mapsto v \sep [] \} \cup \Env}$} \end{prooftree} \begin{prooftree} \RightLabel{(IdX)} \AxiomC{$x \in \langle ident \rangle$} \AxiomC{$v$} \AxiomC{$\phi$ otherwise} - \TrinaryInfC{\Sem{$x$}{\Prog, \{x \mapsto v \sep \phi \} \cup \Env} $\Rightarrow \tuple{x, \{x \mapsto v \sep \phi \} \cup \Env}$} + \TrinaryInfC{\Sem{$x$}{\Prog, \{x \mapsto\ v \sep\ \phi\ \} \cup\ \Env} $\Rightarrow \tuple{x, \{x \mapsto v \sep \phi \} \cup \Env}$} \end{prooftree} \begin{prooftree} \RightLabel{(Const)} @@ -586,8 +572,8 @@ is the base case for a list of let items. 
\begin{prooftree} \AxiomC{\Sem{$\mathit{PE}$}{\Prog,\Env} $\Rightarrow \tuple{S, \_}$} \noLine\ - \UnaryInfC{\Sem{$[E ~|~ \mathit{GE} \mbox{~where~} G]$}{\Prog, \Env \cup \{ x\mapsto - v \sep []\}}} + \UnaryInfC{\Sem{$[E ~|~ \mathit{GE} \mbox{~where~} G]$}{\Prog, \Env\ \cup\ \{ x\mapsto\ + v \sep\ []\}}} \noLine\ \UnaryInfC{\hspace{8em} $\Rightarrow \tuple{A_v, \Env \cup \{x \mapsto v \sep []\} \cup \Env_v}, \forall v \in S $} @@ -628,7 +614,7 @@ new technology. The \textbf{interpreter} evaluates \microzinc\ bytecode and \nanozinc\ programs based on the rewrite system introduced in this section. It can read parameter -data from a file or via an API. The interpreter constructs the call to the +data from a file or via an API\@. The interpreter constructs the call to the \mzninline{main} function, and then performs the rewriting into flat \nanozinc. Memory management is implemented using reference counting, which lends itself well to the optimisations discussed in the next section. The interpreter is @@ -647,13 +633,12 @@ scripting. The source code for the complete system will be made available under an open source license. -\section{NanoZinc Simplification} -\label{sec:4-simplification} +\section{NanoZinc Simplification}\label{sec:4-simplification} The previous section introduced the basic evaluation model of \nanozinc: each call that has a corresponding \microzinc\ definition can be rewritten into a set of calls by evaluating the \microzinc\ code. This section looks at further steps -that the system can perform in order to produce better, more compact \nanozinc +that the system can perform in order to produce better, more compact \nanozinc\ code. \subsection{Removing dead variables and constraints} @@ -661,7 +646,7 @@ code. The most basic form of simplification is to identify variables that are not used by any call, and remove them. 
This is correct because the semantics of \nanozinc\ requires us to find values for all variables such that all -constraints are satisfied -- but if a variable does not occur in any +constraints are satisfied --- but if a variable does not occur in any constraint, we can pick any value for it. The model without the variable is therefore equisatisfiable with the original model. @@ -726,7 +711,7 @@ auxiliary definitions (the \mzninline{@$\sep\phi$@} part). \end{example} Note how \mzninline{t} acts as a unique name for the result of the let -expression (and thus the function call) -- we could not simply have used +expression (and thus the function call) --- we could not simply have used \mzninline{z}, because then \mzninline{b0} would have kept \mzninline{z} alive. It is worth noting the difference between the constraints \mzninline{b0} and @@ -735,7 +720,7 @@ constraint list of another variable. This means that it is a constraint that defines the value of a variable, and it can be enforced globally (it must hold, under all circumstances). On the other hand, \mzninline{b1} is referenced as an argument to the \mzninline{bool_or} call. This means that the constraint does -not have to hold globally -- rather, we require the variable \mzninline{b1} to +not have to hold globally --- rather, we require the variable \mzninline{b1} to reflect whether the constraint holds or not, so that the \mzninline{bool_or} can implement the disjunction. These types of constraints are typically called \emph{reified} in constraint modelling. So \nanozinc\ uses the same syntax for @@ -839,7 +824,7 @@ equality constraint by replacing all occurrences of \mzninline{x} with \mzninline{y}, and remove the definition of \mzninline{x}. The more complicated case for \mzninline{x=y} is when both \mzninline{x} and -\mzninline{y} are the result of function calls, e.g. 
from a model fragment such +\mzninline{y} are the result of function calls, \eg\ from a model fragment such as \mzninline{constraint f(a)=g(b)}. Let us assume that part of the corresponding \nanozinc\ code looks like this: @@ -882,7 +867,7 @@ equalities to the target constraint solver. \paragraph{Implementation} -Rewriting a function call that has a \microzinc definition and rewriting a +Rewriting a function call that has a \microzinc\ definition and rewriting a constraint due to propagation are very similar. The interpreter therefore simply interleaves both forms of rewriting. Efficient constraint propagation engines ``wake up'' a constraint only when one of its variables has received an update @@ -908,7 +893,7 @@ that the system becomes non-confluent, and some orders of execution may produce This predicate expresses the constraint \mzninline{b -> x<=0}, using a well-known method called ``big-M transformation''. The expression - \mzninline{ub(x)} returns a valid upper bound for \mzninline{x}, i.e. a fixed + \mzninline{ub(x)} returns a valid upper bound for \mzninline{x}, \ie\ a fixed value known to be greater than or equal to \mzninline{x}. This could be the initial upper bound \mzninline{x} was declared with, or the current value in the corresponding \nanozinc\ \mzninline{mkvar} call. If \mzninline{b} takes @@ -929,7 +914,7 @@ evaluation is best. In our \microzinc/\nanozinc\ system, we allow predicates and functions to be \emph{annotated} as potential candidates for \emph{delayed rewriting}. Any annotated constraint is handled by the (CallPrimitive) rule rather than the -(Call) rule, which means that it is simply added as a call into the \nanozinc +(Call) rule, which means that it is simply added as a call into the \nanozinc\ code, without evaluating its body. When propagation reaches a fix-point, all annotations are removed from the current \nanozinc\ program, and evaluation resumes with the predicate bodies. 
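To make the delayed-rewriting mechanism concrete, a definition marked as a
candidate could look as follows. This is only a sketch: the annotation name
\mzninline{delayed_rewrite} and the predicate name are hypothetical, and the
body restates the big-M definition discussed above.

\begin{mzn}
% Sketch: ``delayed_rewrite'' is a hypothetical annotation name.
% The annotated call is kept as a call in the NanoZinc program by the
% (CallPrimitive) rule; its body is only expanded once constraint
% propagation has reached a fix-point.
predicate lq_zero_if_b(var int: x, var bool: b)
  :: delayed_rewrite
  = x <= ub(x) * (1 - b);
\end{mzn}

Delaying this definition means that if propagation later fixes \mzninline{b},
or removes the need for the constraint altogether, the linear big-M encoding is
never generated.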
@@ -939,10 +924,9 @@ example, a model targeted at a MIP solver is rewritten into Boolean and reified constraints, whose definitions are annotated to be delayed. The \nanozinc\ model can then be fully simplified by constraint propagation, before the Boolean and reified constraints are rewritten into lower-level linear primitives suitable -for MIP. +for MIP\@. -\subsection{Common Subexpression Elimination} -\label{sec:4-cse} +\subsection{Common Subexpression Elimination}\label{sec:4-cse} The simplifications introduced above remove definitions from the model when we can detect that they are no longer needed. In some cases, it is even possible to @@ -1059,7 +1043,7 @@ addition recognises constraints in \emph{positive} contexts, also known as \emph{half-reified} \autocite{feydy-2011-half-reif}. We omit the details here for brevity, but the CSE approach generalises to this extension. -\section{Experiments} +\section{Experiments}\label{sec:4-experiments} We have created a prototype implementation of the architecture presented in the preceding sections. It consists of a compiler from \minizinc\ to \microzinc, and @@ -1087,7 +1071,7 @@ those constraints rather than using solver built-ins, in order to stress-test the flattening). We measured pure flattening time, i.e., without time required to parse and typecheck in version 2.4.3, and without time required for compilation to \microzinc\ in the new system (compilation is usually very fast). -Times are averages of 10 runs. \footnote{All models obtained from +Times are averages of 10 runs.\footnote{All models obtained from \url{https://github.com/minizinc/minizinc-benchmarks}: \texttt{\justify{}accap, amaze, city-position, community-detection, depot-placement, freepizza, groupsplitter, kidney-exchange, median-string, @@ -1110,18 +1094,15 @@ unoptimised prototype to a mature piece of software. 
\begin{subfigure}[b]{.48\columnwidth} \centering \includegraphics[width=\columnwidth]{assets/img/4_compare_runtime} - \caption{flattening run time (ms)} - \label{sfig:4-compareruntime} + \caption{\label{sfig:4-compareruntime}flattening run time (ms)} \end{subfigure}% \hspace{0.04\columnwidth}% \begin{subfigure}[b]{.48\columnwidth} \centering \includegraphics[width=\columnwidth]{assets/img/4_compare_memory} - \caption{flattening memory (MB)} - \label{sfig:4-comparemem} + \caption{\label{sfig:4-comparemem}flattening memory (MB)} \end{subfigure} - \caption{Performance on flattening 100 MiniZinc Challenge instances. - \minizinc\ 2.4.3 (x-axis) vs new architecture (y-axis), log-log plot. Dots - below the line indicate the new system is better.} - \label{fig:4-runtime} + \caption{\label{fig:4-runtime}Performance on flattening 100 MiniZinc Challenge + instances. \minizinc\ 2.4.3 (x-axis) vs new architecture (y-axis), log-log + plot. Dots below the line indicate the new system is better.} \end{figure} diff --git a/chapters/5_half_reif.tex b/chapters/5_half_reif.tex index bc90852..95d4f4d 100644 --- a/chapters/5_half_reif.tex +++ b/chapters/5_half_reif.tex @@ -2243,7 +2243,7 @@ $b \half c$ run at the lowest priority. This means that Boolean variables are still fixed at the same time, but there is less overhead. -\section{Implementation in the \minizinc{} Distribution} +\section{Implementation in the \glsentrytext{minizinc} Distribution} Implementing half-reification within the \minizinc\ distribution consists of two parts: the \minizinc\ compiler needs to be extended with an implementation of diff --git a/chapters/6_incremental.tex b/chapters/6_incremental.tex index 9217e05..e53d1c6 100644 --- a/chapters/6_incremental.tex +++ b/chapters/6_incremental.tex @@ -81,16 +81,14 @@ and remove constraints from an existing model while avoiding recompilation. Finally, \Cref{sec:6-conclusion} presents the conclusions. 
-\section{Modelling of Restart-Based Meta-Search}
-\label{sec:6-modelling}
+\section{Modelling of Restart-Based Meta-Search}\label{sec:6-modelling}

This section introduces a \minizinc\ extension that enables modellers to define
\gls{meta-search} algorithms in \cmls. This extension is based on the construct
introduced in \minisearch\ \autocite{rendl-2015-minisearch}, as summarised
below.

-\subsection{Meta-Search in \glsentrytext{minisearch}}
-\label{sec:6-minisearch}
+\subsection{Meta-Search in \glsentrytext{minisearch}}\label{sec:6-minisearch}

%
Most \gls{lns} literature discusses neighbourhoods in terms of ``destroying'' part of
% a solution that is later repaired. However, from a declarative modelling point
@@ -391,8 +389,7 @@ solution. We store the solution values in \mzninline{s1} and \mzninline{s2}
arrays. Before each restart, we add constraints removing Pareto-dominated
solutions, based on each previous solution.

-\section{Compilation of Meta-Search Specifications}
-\label{sec:6-solver-extension}
+\section{Compilation of Meta-Search Specifications}\label{sec:6-solver-extension}

The neighbourhoods defined in the previous section can be executed with
\minisearch\ by adding support for the \mzninline{status} and
@@ -617,8 +614,7 @@ against being invoked before the first solution is found, since no solution
has been recorded yet, but we use this simple example to illustrate how these
Boolean conditions are compiled and evaluated.

-\section{An Incremental Interface for Constraint Modelling Languages}
-\label{sec:6-incremental-compilation}
+\section{An Incremental Interface for Constraint Modelling Languages}\label{sec:6-incremental-compilation}

As an alternative approach to running \gls{meta-search} algorithms, we propose
the possibility of incremental flattening. The execution of any
@@ -642,7 +638,7 @@ After evaluating the first constraint, the domain of
\mzninline{x} is changed to be less than 10.
Evaluating the second constraint causes the domain of
\mzninline{y} to be less than 9. If we now, however, try to remove the first
constraint, it is not just the direct inference on the domain of \mzninline{x}
-that has to be undone, but also any further effects of those changes -- in this
+that has to be undone, but also any further effects of those changes --- in this
case, the changes to the domain of \mzninline{y}.
\end{example}
@@ -712,8 +708,7 @@ therefore support solvers with different levels of an incremental interface:

\end{itemize}

-\section{Experiments}
-\label{sec:6-experiments}
+\section{Experiments}\label{sec:6-experiments}

We have created a prototype implementation of the architecture presented in the
preceding sections. It consists of a compiler from \minizinc\ to \microzinc, and
@@ -738,7 +733,8 @@ offers, we present a runtime evaluation of two meta-heuristics implemented
using our prototype interpreter. For both meta-heuristics, we compare the
performance of fully re-evaluating and solving the instance from scratch
against fully incremental evaluation and solving. The solving in both
tests is performed
-by the Gecode solver, version 6.1.2, connected using the fully incremental API.
+by the Gecode solver, version 6.1.2, connected using the fully incremental
+API\@.

\paragraph{GBAC}
The Generalised Balanced Academic Curriculum (GBAC) problem
@@ -836,19 +832,19 @@ spent solving is reduced by 33\%.

\newcommand{\chuffedStd}{\textsf{chuffed}}
\newcommand{\chuffedMzn}{\textsf{chuffed-fzn}}

-We will now show that a solver that evaluates the compiled \flatzinc \gls{lns}
+We will now show that a solver that evaluates the compiled \flatzinc\ \gls{lns}
specifications can (a) be effective and (b) incur only a small overhead
compared to a dedicated implementation of the neighbourhoods.

To measure the overhead, we implemented our new approach in
-Gecode~\autocite{gecode-2021-gecode}.
The resulting solver (\gecodeMzn in the tables
+Gecode~\autocite{gecode-2021-gecode}. The resulting solver (\gecodeMzn\ in the tables
below) has been instrumented to also output the domains of all model variables
after propagating the new special constraints. We implemented another extension
to Gecode (\gecodeReplay) that simply reads the stream of variable domains for
-each restart, essentially replaying the \gls{lns} of \gecodeMzn without incurring any
+each restart, essentially replaying the \gls{lns} of \gecodeMzn\ without incurring any
overhead for evaluating the neighbourhoods or handling the additional variables
and constraints. Note that this is a conservative estimate of the overhead:
-\gecodeReplay has to perform \emph{less} work than any real \gls{lns} implementation.
+\gecodeReplay\ has to perform \emph{less} work than any real \gls{lns} implementation.

In addition, we present benchmark results for the standard release of
Gecode 6.0 without \gls{lns} (\gecodeStd), as well as \chuffedStd, the development
@@ -905,9 +901,9 @@ period and frees all courses that are assigned to it.

\end{table}

The results for \texttt{gbac} in \cref{tab:6-gbac} show that the overhead
-introduced by \gecodeMzn w.r.t.~\gecodeReplay is quite low, and both their
+introduced by \gecodeMzn\ w.r.t.~\gecodeReplay\ is quite low, and both their
results are much better than the baseline \gecodeStd. Since learning is not very
-effective for \texttt{gbac}, the performance of \chuffedStd is inferior to
+effective for \texttt{gbac}, the performance of \chuffedStd\ is inferior to that of
Gecode. However, \gls{lns} again significantly improves over standard Chuffed.

\subsubsection{\texttt{steelmillslab}}
@@ -934,13 +930,14 @@ slab.

\caption{\label{tab:6-steelmillslab}\texttt{steelmillslab} benchmarks}
\end{table}

-For this problem a solution with zero wastage is always optimal. The use of \gls{lns}
-makes these instances easy, as all the \gls{lns} approaches find optimal solutions.
As
-\cref{tab:6-steelmillslab} shows, \gecodeMzn is again slightly slower than
-\gecodeReplay (the integral is slightly larger). While \chuffedStd significantly
-outperforms \gecodeStd on this problem, once we use \gls{lns}, the learning in
-\chuffedMzn is not advantageous compared to \gecodeMzn or \gecodeReplay. Still,
-\chuffedMzn outperforms \chuffedStd by always finding an optimal solution.
+For this problem, a solution with zero wastage is always optimal. The use of
+\gls{lns} makes these instances easy, as all the \gls{lns} approaches find
+optimal solutions. As \cref{tab:6-steelmillslab} shows, \gecodeMzn\ is again
+slightly slower than \gecodeReplay\ (the integral is slightly larger). While
+\chuffedStd\ significantly outperforms \gecodeStd\ on this problem, once we use
+\gls{lns}, the learning in \chuffedMzn\ is not advantageous compared to
+\gecodeMzn\ or \gecodeReplay. Still, \chuffedMzn\ outperforms \chuffedStd\ by
+always finding an optimal solution.

% RCPSP/wet
\subsubsection{\texttt{rcpsp-wet}}
@@ -967,26 +964,25 @@ that time interval, which allows a reshuffling of these tasks.

\caption{\label{tab:6-rcpsp-wet}\texttt{rcpsp-wet} benchmarks}
\end{table}

-\cref{tab:6-rcpsp-wet} shows that \gecodeReplay and \gecodeMzn perform almost
-identically, and substantially better than baseline \gecodeStd for these
-instances. The baseline learning solver \chuffedStd is best overall on the easy
-examples, but \gls{lns} makes it much more robust. The poor performance of \chuffedMzn
-on the last instance is due to the fixed search, which limits the usefulness of
-nogood learning.
+\Cref{tab:6-rcpsp-wet} shows that \gecodeReplay\ and \gecodeMzn\ perform almost
+identically, and substantially better than baseline \gecodeStd\ for these
+instances. The baseline learning solver \chuffedStd\ is best overall on the easy
+examples, but \gls{lns} makes it much more robust.
The poor performance of
+\chuffedMzn\ on the last instance is due to the fixed search, which limits the
+usefulness of nogood learning.

\subsubsection{Summary}

The results show that \gls{lns} outperforms the baseline solvers, except for
benchmarks where we can quickly find and prove optimality. However, the main
result from these experiments is that the overhead introduced
-by our \flatzinc interface, when compared to an optimal \gls{lns} implementation, is
-relatively small. We have additionally calculated the rate of search nodes
-explored per second and, across all experiments, \gecodeMzn achieves around 3\%
-fewer nodes per second than \gecodeReplay. This overhead is caused by
-propagating the additional constraints in \gecodeMzn. Overall, the experiments
-demonstrate that the compilation approach is an effective and efficient way of
-adding \gls{lns} to a modelling language with minimal changes to the solver.
-
-\section{Conclusions}
-\label{sec:6-conclusion}
+by our \flatzinc\ interface, when compared to an optimal \gls{lns}
+implementation, is relatively small. We have additionally calculated the rate of
+search nodes explored per second and, across all experiments, \gecodeMzn\
+achieves around 3\% fewer nodes per second than \gecodeReplay. This overhead is
+caused by propagating the additional constraints in \gecodeMzn. Overall, the
+experiments demonstrate that the compilation approach is an effective and
+efficient way of adding \gls{lns} to a modelling language with minimal changes
+to the solver.
+
+\section{Conclusions}\label{sec:6-conclusion}