This repository has been archived on 2025-03-06. You can view files and clone it, but cannot push or open issues or pull requests.
dekker-phd-thesis/chapters/1_introduction.tex

280 lines
21 KiB
TeX

%************************************************
\chapter{Introduction}\label{ch:introduction}
%************************************************
\noindent{}A \gls{dec-prb} is any problem that requires us to make decisions according to a set of rules.
Many important and difficult problems in the real world are \glspl{dec-prb}.
We can think, for instance, about the decision on the country's rail timetable or the stand-by locations for ambulances in the region.
Formally, we define a \gls{dec-prb} as a set of \variables{} subject to a set of logical \constraints{}.
A \gls{sol} to such a problem is the \gls{assignment} of all \variables{} to values that abide by the logic of the \constraints{}.
These problems are also highly computationally complex: even with the fastest computers there is no simple way to find a solution.
They get even more complex when we consider \glspl{opt-prb}: if there are multiple solutions, then one may be preferred over the other.
But, although these problems are hard to solve, finding a (good) solution for these problems is essential in many walks of life.
The field of \gls{or} uses advanced computational methods to help make (better) decisions.
Famous classes of decision and \glspl{opt-prb}, such as \gls{sat} \autocite{biere-2021-sat}, \gls{cp} \autocite{rossi-2006-cp} and \gls{mip} \autocite{wolsey-1988-mip}, have been studied extensively.
And, over the years, highly specialised \solvers{} have been created to find \glspl{sol} for these classes of problems.
Nowadays, most decision and \glspl{opt-prb} are solved by encoding them into one of the above-mentioned classes of problems.
Encoding one problem in terms of another is, however, a difficult problem.
The problem classes can be restrictive: the input to the \solver{}, the \gls{slv-mod}, can only contain \gls{native} \variables{} and \constraints{}, meaning those that are directly supported by the \solver{}.
Furthermore, \constraints{} can only refer to \variables{}, they cannot be (directly) dependent on other \constraints{}.
For example, \gls{sat} \solvers{} can only reason about Boolean \variables{}, deciding if something is true or false.
\Constraints{} for \gls{sat} \solvers{} have to be in the form of disjunctions of Boolean \variables{}, or their negations.
But, not only does the encoding have to be correct, the encoding also has to be performant.
There are often many possible correct encodings of a problem, but there can be a significant difference in how quickly the \solver{} finds a \gls{sol} for them.
The preferred encoding can however differ between \solvers{}.
Two \solvers{} designed to solve the same problem class can perform very differently.
\Cmls{} have been designed to tackle these issues.
They serve as a level of abstraction between the user and the \solver{}.
Instead of providing a flat list of \variables{} and \constraints{}, the user can create a \cmodel{} using the more natural syntax of the \cml{}.
A \cmodel{} can generally describe a class of problems through the use of \parameters{}, values assigned by external input.
Once given a complete \gls{assignment} of the \parameters{}, the \cmodel{} forms an \instance{}.
The \instance{} is transformed into a \gls{slv-mod} through a process class \gls{rewriting}.
A commonly used \cml{} is \glsxtrshort{ampl} \autocite{fourer-2003-ampl}.
\glsxtrshort{ampl} was originally designed as a common interface between different \gls{mip} \solvers{}.
The language provides a natural way to define numeric \variables{} and express \constraints{} in the form of linear (in-)equations as described by the class of problem.
Crucially, the same \glsxtrshort{ampl} model can be used between different \solvers{}.
\glsxtrshort{ampl} was later extended to include other \solver{} targets, including \gls{cp} and quadratic \solvers{}.
As such, additional types of \constraints{} for these problem classes have been added to the language, removing the restriction that \constraints{} must be linear (in-)equations.
Let us introduce the \glsxtrshort{ampl} language by modelling the ``open shop problem''.
In the open shop problem, the goal is to schedule jobs with multiple tasks..
Each task must be performed by an assigned machine.
A machine can only perform one task at the same time.
And, only one task of the same job can be scheduled at the same time.
We assume that each job has a task for every machine.
As an \gls{opt-prb}, our goal is to find a schedule that minimises the finishing time of the last task.
\begin{listing}
\pyfile{assets/listing/intro_open_shop.mod}
\caption{\label{lst:intro-open-shop} An \glsxtrshort{ampl} model of the open shop problem}
\end{listing}
\Cref{lst:intro-open-shop} shows an \glsxtrshort{ampl} model for the open shop problem.
In order of occurence, \lrefrange{line:intro:param:start}{line:intro:param:end} show the declarations of the \parameters{}.
To create an \instance{} of the problem, the user provides the number of jobs and machines that are considered, and the duration of each task.
\Lrefrange{line:intro:var:start}{line:intro:var:end} show the \variable{} declarations: for each task we decide on its start time.
Additionally, we declare the end time of the last task as a \variable{}, to ease the modelling of the problem.
This \variable{} is made to be our optimisation goal on \lref{line:intro:goal}.
Finally, \lrefrange{line:intro:con:start}{line:intro:con:end} express the \constraints{} of our problem in terms of equations bound by logic.
The \glsxtrshort{ampl} model provides a clear definition of the problem class, but it can be argued that its meaning is hard to decipher.
\glsxtrshort{ampl} does not provide any way to capture common concepts, such as one task preceding another or that two tasks cannot overlap in our example.
Additionally, the process to encode an \instance{} into a \gls{slv-mod} all happens ``behind the scenes''.
There is no way for a \solver{} to specify how, for example, an operator is best rewritten.
As such, \glsxtrshort{ampl} cannot rewrite all models for all its \solvers{}.
For example, since the model in \cref{lst:intro-open-shop} uses the \mzninline{or} operator, it can only be encoded for \solvers{} that support \glsxtrshort{ampl}'s \gls{cp} interface.
Although they do support the rewriting of models between different problem classes, other \cmls{}, such as \gls{essence} \autocite{frisch-2007-essence} and \glsxtrshort{opl} \autocite{van-hentenryck-1999-opl}, exhibit the same problems.
They do not provide any way to capture common concepts.
And, apart from changing the implementation of \glsxtrshort{opl} or \gls{essence}, there is no way for a \solver{} to influence their preferred encoding.
\gls{clp}, as used in the Prolog language, offers a very different mechanism to create a \cmodel{}.
In these languages, the modeller is encouraged to create high-level concepts and declare the way in which they are rewritten into \gls{native} \constraints{}.
For example, the concepts of one task preceding another and non-overlapping of tasks could be defined in Prolog as:
\begin{minted}[
autogobble=true,
breaklines,
breakindent=4em,
numbers=none,
escapeinside=@@,
fontsize=\scriptsize,
tabsize=2,
]{prolog}
precedes(startA, durA, startB) :-
startA + durA < startB.
non_overlap(startA, durA, startB, durB) :-
precedes(startA, durA, startB) ; precedes(startB, durB, startA).
\end{minted}
The definition of \mzninline{nonoverlap} requires that either task A precedes task B, or vice versa.
However, unlike the \glsxtrshort{ampl} model, Prolog would not provide this choice to the \solver{}.
Instead, it enforces one, test if this works, and it otherwise enforces the other.
This is a powerful mechanism where any choices are made in the \gls{rewriting} process, and not in the \solver{}.
The problem with the mechanism is that it requires a highly incremental interface with the \solver{} that can incrementally post and retract \constraints{}.
Additionally, \solvers{} are not always able to verify if a certain set of \constraints{} is consistent.
This makes the behaviour of the \gls{clp} program dependent on the \solver{} that is used.
As such, a \gls{clp} program is not \solver{}-independent.
\minizinc{} \autocite{nethercote-2007-minizinc} is a functional \cml{} that operates on a level in between these two types of languages.
Like \glsxtrshort{ampl}, it is a \solver{}-independent \cml{}.
And like \gls{clp} languages, modellers can declare common concepts and control the encoding into the \gls{slv-mod}.
The latter is accomplished through the use of function definitions.
For example, a user could create the following \minizinc{} function to express the non-overlapping relationship.
\begin{mzn}
function var bool: non_overlap(
var int: startA, int: durA,
var int: startB, int: durB
) =
startA + durA < startB \/ startB + durB < startA;
\end{mzn}
\noindent{}Because functions that return the truth value of an expression are common when constructing \constraints{}, \minizinc{} supports the ``\mzninline{predicate}'' keyword as equivalent to the ``\mzninline{function var bool:}'' syntax.
Fundamentally, in its \gls{rewriting} process \minizinc{} is a functional language.
It continuously evaluates the calls in a \minizinc{} \instance{} until it reaches \gls{native} \constraints{}.
Like other functional languages, \minizinc{} allows recursion.
It can be used as a fully Turing complete programming language.
Using the same mechanism, \minizinc{} defines how an \instance{} is encoded for a \solver{}.
All functionality in the \minizinc{} language is eventually expressed using function calls.
The \solver{}, then, has the decision to choose if this call is a \gls{native} \constraint{}, or how this call is rewritten.
For example, the logical or-operator, which was only supported for \gls{cp} \solvers{} in \glsxtrshort{ampl}, could be defined for \gls{mip} solvers in \minizinc{} using the following definition.
\begin{mzn}
predicate bool_or(var bool: x, var bool: y) =
x + y >= 1;
\end{mzn}
\noindent{}Whereas the following definition would mark it as a \gls{native} \constraint{} for a \gls{cp} \solver{}, by not defining a body.
\begin{mzn}
predicate bool_or(var bool: x, var bool: y);
\end{mzn}
Although \minizinc{} is based on this powerful paradigm, its success has surfaced certain problems.
The language was originally designed to target \gls{cp} \solvers{}, where the \glspl{slv-mod} typically contain a small number of highly complex \constraints{}.
Its use has extended to rewrite for \gls{mip} and \gls{sat} \solvers{}, whose \glspl{slv-mod} often contain large numbers \variables{} and simple \constraints{}, generated by a complex library of \minizinc{} functions.
For many applications, \minizinc{} now requires a significant, and sometimes prohibitive, amount of time to rewrite \instances{}.
Unlike \gls{clp} \gls{rewriting}, the \minizinc{} \gls{rewriting} process does not consider incremental changes to its \gls{slv-mod}.
This is another weakness that has become particularly important, since new optimisation methods require the solving of a sequence of closely related \instances{}.
The overhead of \gls{rewriting} all these \instances{} separately can be substantial.
In this thesis we revisit the \gls{rewriting} of functional \cmls{} into \glspl{slv-mod}.
We design and evaluate an architecture for \cmls{} that can accommodate the modern uses of these languages.
\section{The Problems of Rewriting MiniZinc}
\minizinc{} is one of the most prominent \cmls{}.
It is an expressive language that, in addition to user-defined functions, gives modellers access to advanced features, such as many types of \variables{}, annotations, and an extensive standard library of \constraints{}.
All of these can be used with all \solver{} targets.
Because of the popularity and maturity of the language, there is a large suite of \minizinc{} models available that can be used as benchmarks.
The language has also been used in multiple studies as a host for meta-optimisation techniques \autocite{ingmar-2020-diverse,ek-2020-online}.
A \minizinc{} model generally consists of a few loops or \glspl{comprehension}; for a \gls{cp} solver, this would be rewritten into a relatively small set of \constraints{} which would be fed whole into the solver.
The existing process for translating \minizinc{} into a \gls{slv-mod} is a somewhat ad-hoc, (mostly) single-pass, recursive unrolling procedure, and many aspects (such as call overloading) are resolved dynamically.
In its original application, this was acceptable: models (both before and after \gls{rewriting}) were small.
Now, the language is also used to target low-level \solvers{}, such as \gls{sat} and \gls{mip}.
For them, the encoding of the same \minizinc{} \instance{} results in a much larger \glspl{slv-mod}.
Additionally, \minizinc{} is used more and more as part of various \gls{meta-optimisation} toolchains.
To a great extent, this is testament to the effectiveness of the language.
However, as they have become more common, these extended uses have revealed weaknesses of the existing \minizinc{} tool chain.
In particular:
\begin{itemize}
\item The \minizinc{} \compiler{} is inefficient.
It does a surprisingly large amount of work for each expression (especially resolving sub-typing and overloading), which may be repeated many times.
And as models generated for low-level \solver{} technologies can be quite large, the resulting \gls{rewriting} procedure can be intolerably slow.
As the model transformations implemented in \minizinc{} become more sophisticated, these performance problems are simply magnified.
\item The generated models often contain unnecessary \constraints{}.
During \gls{rewriting}, functional expressions are replaced with \constraints{}.
But this breaks the functional dependencies: if the original expression later becomes redundant (due to model simplifications), \minizinc{} may fail to detect that the constraint can be removed.
\item The reasoning about \gls{reif} in the \minizinc{} \compiler{} is limited.
An important decision that is made during \gls{rewriting} is if a \constraint{} can be enforced directly or if it is dependent on other \constraints{}.
In the second case, a \gls{reif} has to be introduced, a often costly form of the \constraint{} that determines whether a \constraint{} holds rather than enforcing the \constraint{} itself.
It is possible, however, that further \gls{rewriting} can reveal a \gls{reif} to be unnecessary.
Currently, the \compiler{} cannot reverse any \gls{reif} decisions once they are made.
And, it also does not consider \gls{half-reif}, a cheaper alternative to \gls{reif} that is often applicable.
\item Monolithic \gls{rewriting} is wasteful.
When \minizinc{} is used as part of a meta-optimisation toolchain, there is typically a large base model common to all sub-problems, and a small set of \constraints{} which are added or removed in each iteration.
But with the existing \minizinc{} architecture, the whole model must be rewritten each time.
Many use cases involve generating a base model, then repeatedly adding or removing a few \constraints{} before re-solving.
In the current tool chain, the whole model must be fully rewritten to a \gls{slv-mod} each time.
Not only does this repeat all the work done to rewrite the base model, this means a large (sometimes dominant) portion of runtime is simply spent \gls{rewriting} the core model over and over again.
But it also prevents the \solver{} from carrying over anything it learnt from one problem to the next, closely related, problem.
\end{itemize}
\section{Research Objective and Contributions}
Research into \gls{rewriting} of \cmls{}, as well as model transformation more generally, has shown that it is difficult to achieve both high performance and generality.
We address these issues by reconsidering the rewriting process from the ground up to make the transformation efficient while accommodating the optimisation techniques to achieve the expected result.
In order to adapt \cmls{} for modern day requirements, this thesis aims to \textbf{design, implement, and evaluate a modern architecture for functional \cmls{}}.
Crucially, this architecture should allow us to:
\begin{itemize}
\item easily integrate a range of well-known and new \textbf{optimisation and simplification} techniques,
\item effectively manage the \gls{slv-mod} and \textbf{detect and eliminate} parts of the model that have become unused,
\item formally \textbf{reason about \gls{reif}} to avoid it or use \gls{half-reif} where possible, and
\item support \textbf{incremental usage} of the \cml{}.
\end{itemize}
In the design of this architecture, we analyse the \gls{rewriting} process from the ground up.
We first determine the foundation for the system: an execution model for the basic \gls{rewriting} system.
To ensure the quality of the produced \glspl{slv-mod}, we extend the system with many well-known simplification techniques.
In addition, we experiment with several new approaches to optimise the resulting \glspl{slv-mod}, including the use of \gls{half-reif}.
Crucially, we ensure that the resulting system is fully incremental, and we experiment with several \gls{meta-optimisation} applications to evaluate this.
Overall, this thesis makes the following contributions:
\begin{enumerate}
\item It presents a formal execution model of rewriting of the \minizinc\ language and extends this model with well-known optimisation and simplification techniques.
\item It provides a novel method of tracking of \constraints{} created as part of functional dependencies, ensuring the correct removal of dependencies no longer required.
\item It presents an analysis technique to reason about in what (\gls{reified}) form a \constraints{} should be considered.
\item It presents a design and implementation of techniques to automatically introduce \gls{half-reif} of \constraints{} in \minizinc{}.
\item It develops a technique to simplify problem specifications by efficiently eliminating implication chains.
\item It proposes two novel methods to reduce the overhead of using \cmls{} in incremental techniques: \gls{rbmo} and the \gls{incremental-rewriting} of changing \instances{}.
\end{enumerate}
\section{Organisation of the Thesis}
This thesis is arranged into the following chapters.
Following this introductory chapter, \emph{\cref{ch:background}} gives an overview of the area of \cmls{}.
First, it introduces the reader to \minizinc{}, how its models are formulated, and how they are translated to \solver{} specifications.
Then, we review different solving methods such as \gls{sat}, \gls{mip}, and \gls{cp}.
This is followed by a comparison of \minizinc{} with other \cmls{}.
This chapter also reviews techniques that are closely related to \cmls{}.
We conclude this chapter with a description of the current \minizinc{} compiler and the techniques it uses to simplify the \solver{} specifications it produces.
\emph{\Cref{ch:rewriting}} presents a formal execution model for \minizinc{} and the core of our new architecture.
We construct a set of formal rewriting rules for a subset of \minizinc{} called \microzinc{}.
We show how any \minizinc{} model can be reduced to a \microzinc{} model and, as such, provide rewriting rules for the \minizinc{} language.
Applying the rewriting produces \nanozinc{}, an abstract \gls{slv-mod} language.
Crucially, we show how \nanozinc{} tracks \constraints{} that define a \variable{}, and can therefore correctly remove functional definitions.
This chapter also integrates well-known techniques used to simplify \glspl{slv-mod} into the architecture.
We compare the performance of an implementation of the presented architecture against the existing \minizinc{} infrastructure.
\emph{\Cref{ch:half-reif}} continues on the path of improving \glspl{slv-mod}.
In this chapter, we present a formal analysis technique to reason about \gls{reif}.
This analysis can help us decide whether a \constraint{} has to be reified.
In addition, the analysis allows us to determine whether we use \gls{half-reif}.
We thus present the first implementation of the usage of automatic \gls{half-reif}.
We conclude this chapter by analysing the impact of the usage of \gls{half-reif} for different types of \solvers{}.
\emph{\Cref{ch:incremental}} focuses on the development of incremental methods for \cmls{}.
We first present a novel technique that eliminates the need for the incremental \gls{rewriting}.
Instead, it integrates a meta-optimisation specification, written in the \minizinc{} language, into the \gls{slv-mod}.
Then, we describe a technique to optimise the \gls{rewriting} process for incremental changes to an \instance{}.
This method ensures that no work is done for parts of the \instance{} that remain unchanged.
We conclude this chapter by testing the performance and computational overhead of these two techniques.
Finally, \emph{\Cref{ch:conclusions}} is the concluding chapter of the thesis.
It reiterates the discoveries and contributions of this research to theory and practice, comments on the scope and limitations of the presented architecture, and presents further avenues for research in this area.