This repository has been archived on 2025-03-06. You can view files and clone it, but cannot push or open issues or pull requests.
dekker-phd-thesis/chapters/1_introduction.tex
2021-07-29 14:45:28 +10:00

289 lines
22 KiB
TeX

%************************************************
\chapter{Introduction}\label{ch:introduction}
%************************************************
\noindent{}A \gls{dec-prb} is any problem that requires us to make decisions according to a set of rules.
Many important and difficult problems in the real world are \glspl{dec-prb}, for example, deciding on the country's rail timetable or the stand-by locations for ambulances in the region.
Formally, we define a \gls{dec-prb} as a set of \variables{} subject to a set of logical \constraints{}.
A \gls{sol} to such a problem is the \gls{assignment} of all \variables{} to values that abide by the logic of the \constraints{}.
These problems are also highly computationally complex: even with the fastest computers, finding a solution can take a long time.
They get even more complex when we consider \glspl{opt-prb}: if there are multiple solutions, then one may be preferred over the other.
Although these problems are hard to solve, finding a (good) solution for these problems is essential in many walks of life.
The field of \gls{or} uses advanced computational methods to help make (better) decisions.
Famous classes of decision and \glspl{opt-prb}, such as \gls{sat} \autocite{biere-2021-sat}, \gls{cp} \autocite{rossi-2006-cp} and \gls{mip} \autocite{wolsey-1988-mip}, have been studied extensively.
Over the years, highly specialized \solvers{} have been created to find \glspl{sol} for these classes of problems.
Nowadays, most decision and \glspl{opt-prb} are solved by encoding them into one of the above-mentioned classes of problems.
Encoding one problem in terms of another is, however, a difficult task.
The problem classes can be restrictive: the input to the \solver{}, the \gls{slv-mod}, can only contain \gls{native} \variables{} and \constraints{}, meaning those that are directly supported by the \solver{}.
Furthermore, \constraints{} can only refer to \variables{}, they cannot be (directly) dependent on other \constraints{}.
For example, \gls{sat} \solvers{} can only reason about Boolean \variables{}, deciding if something is true or false.
\Constraints{} for \gls{sat} \solvers{} have to be in the form of disjunctions of Boolean \variables{}, or their negations.
Not only does the encoding have to be correct, the encoding also has to be performant.
There are often many possible correct encodings of a problem, but there can be a significant difference in how quickly the \solver{} finds a \gls{sol} for them and the preferred encoding can differ between \solvers{}.
Two \solvers{} designed to solve the same problem class can perform very differently.
\Cmls{} have been designed to tackle these issues.
They serve as a level of abstraction between the modeller and the \solver{}.
The typical process of using a \cml{} is visualized in \cref{fig:intro-cml-flow}.
Instead of providing a flat list of \variables{} and \constraints{}, a modeller creates a \cmodel{} using the more natural syntax of the \cml{}.
A \cmodel{} can generally describe a class of problems through the use of \prbpars{}: values assigned by external input.
Once given a complete \gls{assignment} of the \prbpars{}, the \cmodel{} forms an \instance{}.
The \gls{rewriting} process of a \cml{} transforms an \instance{} into a \gls{slv-mod}: the encoding of the problem for the \solver{}.
\begin{figure}
\centering
\includegraphics[width=\columnwidth]{assets/img/intro_cml_flow}
\caption{\label{fig:intro-cml-flow}A flow diagram of the typical process of using a \cml{}.}
\end{figure}
Traditional \cmls{}, such as \emph{\glsxtrshort{ampl}} \autocite{fourer-2003-ampl}, \gls{essence} \autocite{frisch-2007-essence}, and \emph{\glsxtrshort{opl}} \autocite{van-hentenryck-1999-opl}, offer the modeller an expressive language that can be used to target a great variety of \solvers{}.
However, these languages lack a certain level of flexibility: the modeller is unable to capture common concepts used throughout a \cmodel{}, and the \solver{} cannot influence their way in which the problem is encoded.
This is a stark difference to \gls{clp} languages, such as \gls{sicstus} \autocite{carlsson-1997-sicstus}, which are said to be the precursor to \cmls{}.
In these languages, the modeller is encouraged to express high-level concepts by defining custom predicates and functions and declare the way in which they are rewritten into \gls{native} \constraints{}.
The downside of \gls{clp} languages is that, because of its search mechanism, the \solver{} is tightly integrated within the \gls{rewriting}.
These languages thus lack the \solver{}-independence of \cmls{}.
\minizinc{} \autocite{nethercote-2007-minizinc} is a functional \cml{} that operates on a level in between these two types of languages.
Like a traditional \cml{}, it is \solver{}-independent.
Like a \gls{clp} language, modellers can declare and reuse common concepts and control the encoding into the \gls{slv-mod} through the use of function definitions.
Fundamentally, in its \gls{rewriting} process \minizinc{} is a functional programming language.
It continuously evaluates the calls in a \minizinc{} \instance{} until it reaches \gls{native} \constraints{}.
Like other functional languages, \minizinc{} allows recursion; it can be used as a fully Turing complete programming language.
Although \minizinc{} is based on this powerful paradigm, its success has made certain problems apparent.
The language was originally designed to target \gls{cp} \solvers{}, where the \glspl{slv-mod} typically contain only few, highly complex \constraints{}.
Its use has however, over time, extended to rewrite for \gls{mip} and \gls{sat} \solvers{}, whose \glspl{slv-mod} often contain numerous \variables{} and simple \constraints{}, generated by a complex library of \minizinc{} functions.
For many applications, \minizinc{} now requires a significant, and sometimes prohibitive, amount of time to rewrite \instances{}.
Unlike \gls{clp} \gls{rewriting}, the \minizinc{} \gls{rewriting} process does not consider incremental changes to its \gls{slv-mod}.
This weakness has become particularly important, since new optimization methods require the solving of a sequence of closely related \instances{}.
The overhead of \gls{rewriting} all these \instances{} separately can be substantial.
In this thesis we revisit the \gls{rewriting} of functional \cmls{} into \glspl{slv-mod}.
We design and evaluate an architecture for \cmls{} that can accommodate the modern uses of these languages.
\section{Different Ways of Modelling with Constraints}
Let us now study the different ways of constraint modelling in more detail, starting with one of the most commonly used \cmls{}: \glsxtrshort{ampl}.
\glsxtrshort{ampl} was originally designed as a common interface to different \gls{mip} \solvers{}.
The language provides a natural way to define numeric \variables{} and express \constraints{} in the form of linear (in-)equations as described by the class of problem.
The same \glsxtrshort{ampl} model can then be used for different \solvers{}.
\glsxtrshort{ampl} was later extended to include other \solver{} targets, including \gls{cp} and quadratic \solvers{}.
As such, additional types of \constraints{} for these problem classes have been added to the language, removing the restriction that \constraints{} must be linear (in-)equations.
Let us introduce the \glsxtrshort{ampl} language by modelling the ``open shop'' problem.
In the open shop problem, the goal is to schedule jobs with multiple tasks.
Each task must be performed by an assigned machine.
A machine can only perform one task at any one time and only one task of the same job can be scheduled at the same time.
We assume that each job has a task for every machine.
As an \gls{opt-prb}, our goal is to find a schedule that minimizes the finishing time of the last task.
\Cref{lst:intro-open-shop} shows an \glsxtrshort{ampl} model for the open shop problem.
In order of occurrence, \lrefrange{line:intro:param:start}{line:intro:param:end} show the declarations of the \prbpars{}.
To create an \instance{} of the problem, the modeller provides the number of jobs and machines that are considered, and the duration of each task.
\Lrefrange{line:intro:var:start}{line:intro:var:end} show the \variable{} declarations: for each task we decide on its start time.
We also declare the end time of the last task as a \variable{}, to make it easier to model the \gls{objective}.
This \variable{} is made to be our optimization goal on \lref{line:intro:goal}.
Finally, \lrefrange{line:intro:con:start}{line:intro:con:end} express the \constraints{} of our problem in terms of equations bound by logic.
\begin{listing}
\amplfile[l]{assets/listing/intro_open_shop.mod}
\caption{\label{lst:intro-open-shop} An \glsxtrshort{ampl} model of the open shop problem.}
\end{listing}
The \glsxtrshort{ampl} model provides a clear definition of the problem class, but it can be argued that its meaning is hard to decipher.
\glsxtrshort{ampl} does not provide any way to capture common concepts, such as one task preceding another or that two tasks cannot overlap in our example.
Moreover, the process to encode an \instance{} into a \gls{slv-mod} all happens ``behind the scenes''.
The \solver{} cannot specify how, for example, an operator is best rewritten.
As such, \glsxtrshort{ampl} cannot rewrite all models for all its \solvers{}.
For example, since the model in \cref{lst:intro-open-shop} uses the \mzninline{or} operator (see lines \ref{line:intro:or:first} and \ref{line:intro:or:second}), it can only be encoded for \solvers{} that support \glsxtrshort{ampl}'s \gls{cp} interface.
Although they do support the \gls{rewriting} of models between different problem classes, other traditional \cmls{}, such as \gls{essence} and \glsxtrshort{opl}, exhibit the same problems.
They do not provide any way to capture common concepts, and, apart from changing the implementation of \glsxtrshort{opl} or \gls{essence}, the \solver{} is unable to influence their preferred encoding.
\gls{clp}, as used for example in the \gls{sicstus} language, offers a very different mechanism to create a \cmodel{}.
In \gls{clp} the main method to formulate a \cmodel{} and how to search for a solution it through the use of so-called \constraint{} predicates.
For example, the concepts of one task preceding another and non-overlapping of tasks could be defined in Prolog as follows.
\begin{minted}[
autogobble=true,
breaklines,
breakindent=4em,
numbers=none,
escapeinside=@@,
fontsize=\scriptsize,
tabsize=2,
]{prolog}
precedes(startA, durA, startB) :-
startA + durA < startB.
non_overlap(startA, durA, startB, durB) :-
precedes(startA, durA, startB) ; precedes(startB, durB, startA).
\end{minted}
The definition of \texttt{non\_overlap} requires that either task A precedes task B, or vice versa.
However, unlike the \glsxtrshort{ampl} model, Prolog would not provide this choice to the \solver{}.
Instead, it enforces one, tests if this works, and otherwise it enforces the other.
This is a powerful mechanism where choices are made in the \gls{rewriting} process, and not in the \solver{}.
The problem with the mechanism is that it requires a highly incremental interface to the \solver{} that can incrementally post and retract \constraints{}.
\Solvers{} are also not always able to verify if a certain set of \constraints{} is consistent.
This makes the behaviour of the \gls{clp} program dependent on the \solver{} that is used.
As such, a \gls{clp} program is not \solver{}-independent.
\minizinc{} is one of the most prominent \cmls{} and operates on a level in between these two types of languages.
It offers the modeller an expressive language that, in addition to user-defined functions, includes advanced features, such as many types of \variables{}, \glspl{annotation}, and an extensive standard library of \constraints{}.
All of these can be used with all \solver{} targets and, through its functional language, \solvers{} can control how they are encoded into a \gls{slv-mod}.
% As a result of the popularity and maturity of the language, there is a large suite of \minizinc{} models available that can be used as benchmarks.
% The language has also been used in multiple studies as a host for meta-optimization techniques \autocite{ingmar-2020-diverse,ek-2020-online}.
For example, a modeller could create the following \minizinc{} function to express the non-overlapping relation.
\begin{mzn}
function var bool: non_overlap(
var int: startA, int: durA,
var int: startB, int: durB
) =
startA + durA < startB \/ startB + durB < startA;
\end{mzn}
\noindent{}Since functions that return the truth value of an expression are common when constructing \constraints{}, \minizinc{} supports the ``\mzninline{predicate}'' keyword as equivalent to the ``\mzninline{function var bool:}'' syntax.
\minizinc{} also determines how an \instance{} is encoded for a \solver{} through function definitions.
All functionality in the \minizinc{} language is eventually expressed using function calls.
The \solver{}, then, has the decision to choose if this call is a \gls{native} \constraint{}, or how this call is rewritten.
Each \solver{} supplies its own library of \minizinc{} function definitions for common \constraints{}.
For example, the logical \mzninline{or}-operator, which was only supported for \gls{cp} \solvers{} in \glsxtrshort{ampl}, could be defined in the \minizinc{} library for a \gls{mip} \solver{} using the following definition.
\begin{mzn}
predicate bool_or(var bool: x, var bool: y) =
x + y >= 1;
\end{mzn}
\noindent{}Whereas, the \minizinc{} library of a \gls{cp} \solver{} could include the following definition to mark it as a \gls{native} \constraint{}, by not defining a body.
\begin{mzn}
predicate bool_or(var bool: x, var bool: y);
\end{mzn}
The functional paradigm employed by \minizinc{} has shown itself to be powerful and flexible for modeller and \solver{}.
However, the performance of the functional evaluation of the language can be a limiting factor.
\section{Why Rewriting MiniZinc is Difficult}
A \minizinc{} model generally consists of a few loops or \glspl{comprehension}; for a \gls{cp} solver, this would be rewritten into a relatively small set of \constraints{} which would be fed whole into the solver.
The existing process for translating \minizinc{} into a \gls{slv-mod} is a somewhat ad-hoc, (mostly) single-pass, recursive unrolling procedure, and many aspects (such as call overloading) are resolved dynamically.
In its original application, this was acceptable: models (both before and after \gls{rewriting}) were small.
Now, the language is also used to target low-level \solvers{}, such as \gls{sat} and \gls{mip}.
For them, the encoding of the same \minizinc{} \instance{} results in a much larger \gls{slv-mod}.
Additionally, \minizinc{} is used in progressively more \gls{meta-optimization} toolchains.
To a great extent, this is testament to the effectiveness of the language.
However, as they have become more common, these extended uses have revealed weaknesses of the existing \minizinc{} toolchain.
In particular:
\begin{itemize}
\item The existing implementation of \minizinc{} is inefficient.
It does a surprisingly large amount of work for each expression (especially resolving sub-typing and overloading), which may be repeated many times.
This means that, as models generated for low-level \solver{} technologies can be quite large, the resulting \gls{rewriting} procedure can be intolerably slow.
As the model transformations implemented in \minizinc{} become more sophisticated, these performance problems are simply magnified.
\item The generated models often contain unnecessary \constraints{}.
During \gls{rewriting}, functional expressions are replaced with \constraints{}.
But this breaks the functional dependencies: if the original expression later becomes redundant (due to model simplifications), \minizinc{} may fail to detect that the constraint can be removed.
\item The reasoning about \gls{reif} in \minizinc{} is limited.
An important decision that is made during \gls{rewriting} is if a \constraint{} can be enforced directly or if it is dependent on other \constraints{}.
In the second case, a \gls{reif} has to be introduced: an often costly form of the \constraint{} that determines whether a \constraint{} holds rather than enforcing the \constraint{} itself.
It is possible, however, that further \gls{rewriting} can reveal a \gls{reif} to be unnecessary.
The current \minizinc{} implementation cannot reverse any \gls{reif} decisions once they are made.
It also does not consider \gls{half-reif}, a cheaper alternative to \gls{reif} that is often applicable.
\item Monolithic \gls{rewriting} is wasteful.
When \minizinc{} is used as part of a \gls{meta-optimization} toolchain, there is typically a large base model, common to all sub-problems, and a small set of \constraints{}, which are added or removed in each iteration.
In the current toolchain, the whole model must be fully rewritten to a \gls{slv-mod} each time.
Not only does this repeat all the work done to rewrite the base model, this means a large (sometimes dominant) portion of runtime is simply spent \gls{rewriting} the core model over and over again.
This also prevents the \solver{} from carrying over anything it learnt from one problem to the next.
\end{itemize}
\section{Research Aims and Contributions}
Research into \gls{rewriting} of \cmls{}, as well as model transformation in general, has shown that it is difficult to achieve both high performance and generality.
We address these issues by reconsidering the rewriting process from the ground up to make the transformation efficient while accommodating the optimization techniques required to produce high-quality \gls{slv-mod}.
In order to adapt \cmls{} for modern day requirements, this thesis aims to \textbf{design, implement, and evaluate a modern architecture for functional \cmls{}}.
Crucially, this architecture will allow us to:
\begin{itemize}
\item easily integrate a range of well-known and new \textbf{optimization and simplification} techniques,
\item effectively manage the \gls{slv-mod} and \textbf{detect and eliminate} parts of the model that have become unused,
\item formally \textbf{reason about \gls{reif}} to avoid it or use \gls{half-reif} where possible, and
\item support \textbf{incremental usage} of the \cml{}.
\end{itemize}
In the design of this architecture, we analyse the \gls{rewriting} process from the ground up.
We first determine the foundation for the system: an execution model for the basic \gls{rewriting} system.
To ensure the quality of the produced \glspl{slv-mod}, we extend the system with many well-known simplification techniques.
In addition, we experiment with several new approaches to optimize the resulting \glspl{slv-mod}, including the use of \gls{half-reif}.
Crucially, we ensure that the resulting system is fully incremental, and we experiment with several \gls{meta-optimization} applications to evaluate this.
Overall, this thesis makes the following contributions.
\begin{itemize}
\item It presents a formal execution model of rewriting for the \minizinc{} language and extends this model with well-known optimization and simplification techniques.
This model includes a novel method of tracking of \constraints{} created as part of functional dependencies, ensuring the correct removal of unnecessary \constraints{}.
\item It presents an analysis technique to reason about in what (\gls{reified}) form a \constraint{} should be considered.
This analysis allows for the design and implementation of techniques to automatically introduce \gls{half-reified} \constraints{} in \minizinc{}, including a technique to simplify \glspl{slv-mod} by efficiently eliminating \glspl{implication-chain}.
\item It proposes two novel methods to reduce the overhead of using \cmls{} in incremental techniques: \gls{rbmo} and the \gls{incremental-rewriting} of changing \instances{} through an incremental interface.
\end{itemize}
\section{Organization of the Thesis}
This thesis is arranged into the following chapters.
\emph{\Cref{ch:background}} gives an overview of the area of \cmls{}.
It introduces the reader to \minizinc{}, how its models are formulated and how they are translated to \glspl{slv-mod}.
Then, we review different solving methods such as \gls{sat}, \gls{mip}, and \gls{cp}.
This is followed by a comparison of \minizinc{} with other \cmls{}.
This chapter also reviews techniques that are closely related to \cmls{}.
We conclude this chapter with a description of the current \minizinc{} implementation and the techniques it uses to simplify the \glspl{slv-mod} it produces.
\emph{\Cref{ch:rewriting}} presents a formal execution model for \minizinc{} and the core of our new architecture.
We construct a set of formal rewriting rules for a subset of \minizinc{} called \microzinc{}.
We show how any \minizinc{} model can be reduced to a \microzinc{} model and, as such, provide rewriting rules for the \minizinc{} language.
Applying the rewriting produces \nanozinc{}, an abstract \gls{slv-mod} language.
Crucially, we show how \nanozinc{} tracks \constraints{} that define a \variable{}, and can therefore correctly remove functional definitions.
This chapter also integrates well-known techniques used to simplify \glspl{slv-mod} into the architecture.
We compare the performance of an implementation of the presented architecture against the existing \minizinc{} infrastructure.
\emph{\Cref{ch:half-reif}} continues on the path of improving \glspl{slv-mod}.
In this chapter, we present a formal analysis technique to reason about \gls{reif}.
This analysis can help us decide whether a \constraint{} has to be reified.
In addition, the analysis allows us to determine whether to use \gls{half-reif}.
We thus present the first implementation of automatic \gls{half-reif}.
We conclude this chapter by analysing the impact of \gls{half-reif} for different types of \solvers{}.
\emph{\Cref{ch:incremental}} focuses on the development of incremental methods for \cmls{}.
We first present a novel technique that eliminates the need for incremental \gls{rewriting}.
Instead, it integrates a \gls{meta-optimization} specification, written in the \minizinc{} language, into the \gls{slv-mod}.
Then, we describe a technique to optimize the \gls{rewriting} process for incremental changes to an \instance{}.
This method prevents the repeated \gls{rewriting} of parts of the \instance{} that remain unchanged.
We conclude this chapter by testing the performance and computational overhead of these two techniques.
Finally, \emph{\Cref{ch:conclusions}} is the concluding chapter of the thesis.
It reiterates the discoveries and contributions of this research to theory and practice, comments on the scope and limitations of the presented architecture, and presents further avenues for research in this area.