
%************************************************
\chapter{Introduction}\label{ch:introduction}
%************************************************

\noindent{}A \gls{dec-prb} is any problem that requires us to make decisions according to a set of rules.
Many important and difficult problems in the real world are \glspl{dec-prb}.
We can think, for instance, about the decisions involved in running a country's train system or in choosing the stand-by locations for ambulances in a region.
Formally, we define a \gls{dec-prb} as a set of \variables{} subject to a set of logical \constraints{}.
A \gls{sol} to such a problem is the \gls{assignment} of all \variables{} to values that abide by the logic of the \constraints{}.
These problems are also highly computationally complex: even with the fastest computers there is no simple way to find a solution.
They get even more complex when we consider \glspl{opt-prb}: if there are multiple \glspl{sol}, one may be preferred over another.
But, although these problems are hard to solve, finding a (good) solution for these problems is essential in many walks of life.
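
As a small illustration of these definitions, consider the following toy \gls{dec-prb}.
The \variables{} $x$ and $y$, their possible values, and the two \constraints{} are chosen purely for this sketch.
\[
	x, y \in \{1, 2, 3\}, \qquad x < y, \qquad x + y \geq 4.
\]
Of the nine possible combinations of values, only $(x, y) = (1, 3)$ and $(x, y) = (2, 3)$ satisfy both \constraints{}, so these are the two \glspl{sol}.
If we turn this into an \gls{opt-prb} by, say, preferring the smallest value of $x$, then $(1, 3)$ is the only optimal \gls{sol}.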

The field of \gls{or} uses advanced computational methods to help make (better) decisions.
Famous classes of decision and \glspl{opt-prb}, such as \gls{sat} \autocite{biere-2021-sat}, \gls{cp} \autocite{rossi-2006-cp} and \gls{mip} \autocite{wolsey-1988-mip}, have been studied extensively.
And, over the years, highly specialised \solvers{} have been created to find \glspl{sol} for these classes of problems.
Nowadays, most decision and \glspl{opt-prb} are solved by encoding them into a well-known class of problems.

Encoding one problem in terms of another is, however, a difficult problem.
The problem classes can be restrictive: the input to the \solver{}, the \gls{slv-mod}, can only contain \gls{native} \variables{} and \constraints{}.
This means that the types of \variables{} and the relationships in \constraints{} have to be directly supported by the \solver{}.
Furthermore, \constraints{} can only refer to \variables{}; they cannot be (directly) dependent on other \constraints{}.
For example, \gls{sat} \solvers{} can only reason about Boolean \variables{}, deciding if something is true or false.
Their \constraints{} have to be in the form of disjunctions of Boolean \variables{} or negated Boolean \variables{}.
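
To make this restriction concrete, the following sketch shows two requirements written in this clausal form.
The Boolean \variables{} $a$, $b$, and $c$ are purely illustrative.
The requirement ``if $a$ holds, then $b$ and $c$ cannot both hold'' is not itself a disjunction, but it is equivalent to a single clause:
\[
	a \rightarrow \lnot (b \land c) \quad\equiv\quad \lnot a \lor \lnot b \lor \lnot c.
\]
A requirement such as $a \leftrightarrow b$, in contrast, has to be split into two clauses that must both hold:
\[
	(\lnot a \lor b) \land (a \lor \lnot b).
\]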

The encoding does not only have to be correct, it also has to be performant.
There are often many possible correct encodings of a problem, but there can be a significant difference in how quickly the \solver{} finds a \gls{sol} for them.
The preferred encoding can, however, differ between \solvers{}.
Two \solvers{} designed to solve the same problem class can perform very differently.

\Cmls{} have been designed to tackle these issues.
They serve as a level of abstraction between the user and the \solver{}.
Instead of providing a flat list of \variables{} and \constraints{}, the user can create a \cmodel{} using the more natural syntax of the \cml{}.
A \cmodel{} can generally describe a class of problems through the use of \parameters{}, values assigned by external input.
Once given a complete \gls{assignment} of the \parameters{}, the \cmodel{} forms an \instance{}.
The language then creates a \gls{slv-mod}, through a process called \gls{rewriting}, and interfaces with the \solver{} to find an appropriate \gls{sol}.

A commonly used \cml{} is \glsxtrshort{ampl} \autocite{fourer-2003-ampl}.
\glsxtrshort{ampl} was originally designed as a common interface between different \gls{mip} \solvers{}.
The language provides a natural way to define numeric \variables{} and express \constraints{} in the form of linear equations as described by the class of problem.
Crucially, the same \glsxtrshort{ampl} model can be used between different \solvers{}.

\glsxtrshort{ampl} was later extended to include other \solver{} targets, including \gls{cp} and quadratic \solvers{}.
As such, additional types of \constraints{} for these problem classes have been added to the language, removing the restriction that \constraints{} must be linear equations.

Let us introduce the \glsxtrshort{ampl} language by modelling the ``open shop problem''.
In the open shop problem we are tasked with scheduling jobs with multiple tasks.
Each task must be performed by an assigned machine.
A machine can only perform one task at a time.
Likewise, only one task of the same job can be performed at a time.
We assume that each job has a task for every machine.
As an \gls{opt-prb}, our goal is to find a schedule that minimises the finishing time of the last task.
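
The description above can be sketched as a mathematical formulation.
The symbols are introduced only for this illustration and need not match the identifiers of any particular model: $J$ and $M$ are the sets of jobs and machines, $d_{j,m}$ is the given duration of the task of job $j$ on machine $m$, $s_{j,m}$ is its start time, and $e$ is the finishing time of the last task.
The strict inequalities are an assumption that mirrors the precedence definitions used later in this chapter.
\[
	\begin{array}{lll}
		\mbox{minimise}   & e & \\
		\mbox{subject to} & s_{j,m} \geq 0, \quad s_{j,m} + d_{j,m} \leq e & \forall j \in J,\ m \in M, \\
		                  & s_{j,m} + d_{j,m} < s_{k,m} \ \lor\ s_{k,m} + d_{k,m} < s_{j,m} & \forall j, k \in J,\ j \neq k,\ m \in M, \\
		                  & s_{j,m} + d_{j,m} < s_{j,l} \ \lor\ s_{j,l} + d_{j,l} < s_{j,m} & \forall j \in J,\ m, l \in M,\ m \neq l.
	\end{array}
\]
The first disjunction states that two tasks on the same machine cannot overlap, and the second states the same for two tasks of the same job.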

\begin{listing}
	\pyfile{assets/listing/intro_open_shop.mod}
	\caption{\label{lst:intro-open-shop} An \glsxtrshort{ampl} model of the open shop problem}
\end{listing}

\Cref{lst:intro-open-shop} shows an \glsxtrshort{ampl} model for the open shop problem.
In order of occurrence, \lrefrange{line:intro:param:start}{line:intro:param:end} show the declarations of the parameters.
To create an \instance{} of the problem, the user provides the number of jobs and machines that are considered, and the duration of each task.
\Lrefrange{line:intro:var:start}{line:intro:var:end} show the \variable{} declarations: for each task we decide on its start time.
Additionally, we declare the end time of the last task as a \variable{}, to ease the modelling of the problem.
This \variable{} is made to be our optimisation goal on \lref{line:intro:goal}.
Finally, \lrefrange{line:intro:con:start}{line:intro:con:end} express the \constraints{} of our problem in terms of equations bound by logic.

The \glsxtrshort{ampl} model provides a clear definition of the problem class, but it can be argued that its meaning is hard to decipher.
\glsxtrshort{ampl} does not provide any way to capture common concepts, such as, in our example, one task preceding another or two tasks not overlapping.
Additionally, the process to encode an \instance{} into a \gls{slv-mod} all happens ``behind the scenes''.
There is no way for a \solver{} to specify how, for example, an operator is best rewritten.
As such, \glsxtrshort{ampl} cannot rewrite all models for all its \solvers{}.
For example, since the model in \cref{lst:intro-open-shop} uses an \mzninline{or} operator, it can only be encoded for \solvers{} that support the \gls{cp} interface of \glsxtrshort{ampl}.

Although they do support the rewriting of models between different problem classes, other \cmls{}, such as \gls{essence} \autocite{frisch-2007-essence} and \glsxtrshort{opl} \autocite{van-hentenryck-1999-opl}, exhibit the same problems.
They do not provide any way to capture common concepts.
And, apart from adapting the rewriting mechanism itself, there is no way for a \solver{} to specify its preferred encoding.

\gls{clp}, as used in the Prolog language, offers a very different mechanism to create a \cmodel{}.
In these languages, the modeller is encouraged to create high-level concepts and provide the way in which they are rewritten into \gls{native} \constraints{}.
For example, the concepts of one task preceding another and non-overlapping of tasks could be defined in Prolog as:

\begin{minted}[
  autogobble=true,
  breaklines,
  breakindent=4em,
  numbers=none,
  escapeinside=@@,
  fontsize=\scriptsize,
	tabsize=2,
]{prolog}
	precedes(StartA, DurA, StartB) :-
		StartA + DurA < StartB.

	nonoverlap(StartA, DurA, StartB, DurB) :-
		precedes(StartA, DurA, StartB) ; precedes(StartB, DurB, StartA).
\end{minted}

The definition of \mzninline{nonoverlap} requires that either task A precedes task B, or vice versa.
However, unlike the \gls{ampl} model, Prolog would not provide this choice to the \solver{}.
Instead, it enforces one alternative, tests whether this works, and otherwise enforces the other.

This is a powerful mechanism where any choices are made in the \gls{rewriting} process, and not in the \solver{}.
The problem with the mechanism is that it requires a highly incremental interface with the \solver{}, through which \constraints{} can be posted and retracted.
Additionally, \solvers{} are not always able to verify if a certain set of \constraints{} is consistent.
This makes the behaviour of the \gls{clp} program dependent on the \solver{} that is used.
As such, a \gls{clp} program is not \solver{}-independent.

\minizinc{} \autocite{nethercote-2007-minizinc} is a functional \cml{} that operates on a level in between these two types of languages.
Like \glsxtrshort{ampl}, it is a \solver{}-independent \cml{}.
And, like in \gls{clp} languages, modellers can define common concepts and control the encoding into the \gls{slv-mod}.
The latter is accomplished through the use of function definitions.
For example, a user could create a \minizinc{} function to express the non-overlapping relationship:

\begin{mzn}
	predicate nonoverlap(var int: startA, int: durA, var int: startB, int: durB) =
		startA + durA < startB \/ startB + durB < startA;
\end{mzn}
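
As a sketch of how such a predicate might be used, the following self-contained model applies it to a small open shop \instance{}.
All identifiers other than \mzninline{nonoverlap}, as well as the data, are assumptions made for this illustration; they are not taken from the \glsxtrshort{ampl} model in \cref{lst:intro-open-shop}.

\begin{mzn}
% Hypothetical instance data: 2 jobs, each with one task on each of 2 machines.
set of int: JOB = 1..2;
set of int: MACHINE = 1..2;
array[JOB, MACHINE] of int: duration = [| 3, 2 | 1, 4 |];

% A safe upper bound on all times: the strict inequalities below leave a gap
% of at least one time unit between tasks, hence the extra term.
int: horizon = sum(j in JOB, m in MACHINE)(duration[j, m]) + card(JOB) * card(MACHINE);

array[JOB, MACHINE] of var 0..horizon: start;
var 0..horizon: makespan;

% Repeated from the text so that this sketch is self-contained.
predicate nonoverlap(var int: startA, int: durA, var int: startB, int: durB) =
  startA + durA < startB \/ startB + durB < startA;

% Two tasks on the same machine cannot overlap.
constraint forall(m in MACHINE, j1, j2 in JOB where j1 < j2)(
  nonoverlap(start[j1, m], duration[j1, m], start[j2, m], duration[j2, m])
);

% Two tasks of the same job cannot overlap.
constraint forall(j in JOB, m1, m2 in MACHINE where m1 < m2)(
  nonoverlap(start[j, m1], duration[j, m1], start[j, m2], duration[j, m2])
);

% The makespan is at least the end time of every task.
constraint forall(j in JOB, m in MACHINE)(
  start[j, m] + duration[j, m] <= makespan
);

solve minimize makespan;
\end{mzn}

\noindent{}Both non-overlap requirements are stated through the same call, so a change to the definition of \mzninline{nonoverlap} would affect every place where the concept is used.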

Fundamentally, in its \gls{rewriting} process \minizinc{} is a functional language.
It continuously evaluates the calls in a \minizinc{} \instance{} until it reaches \gls{native} \constraints{}.
Like other functional languages, \minizinc{} allows recursion.
It can be used as a Turing-complete computational language.
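
As a minimal sketch of such a recursive definition, the hypothetical predicate \mzninline{chain} below (its name, arguments, and data are assumptions made for this illustration) requires the tasks in an array to be performed in index order.
Each evaluation step produces one precedence \constraint{} and a further call, until the last index is reached.

\begin{mzn}
% Hypothetical recursive predicate: task i must end before task i + 1 starts.
predicate chain(array[int] of var int: start, array[int] of int: dur, int: i) =
  if i >= max(index_set(start)) then
    true  % no successor left: nothing more to enforce
  else
    start[i] + dur[i] < start[i + 1] /\ chain(start, dur, i + 1)
  endif;

array[1..3] of var 0..20: s;
constraint chain(s, [2, 3, 1], 1);
solve satisfy;
\end{mzn}

\noindent{}For the three tasks in this sketch, evaluating the call yields the two \constraints{} \mzninline{s[1] + 2 < s[2]} and \mzninline{s[2] + 3 < s[3]}.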

Using the same mechanism, \minizinc{} defines how an \instance{} is encoded for a \solver{}.
All functionality in the \minizinc{} language is eventually expressed using function calls.
The \solver{} can then choose whether such a call is a \gls{native} \constraint{}, or how the call is rewritten.
For example, the logical or-operator, which was only supported for \gls{cp} \solvers{} in \glsxtrshort{ampl}, could be defined for \gls{mip} \solvers{} in \minizinc{} as:

\begin{mzn}
predicate bool_or(var bool: x, var bool: y) =
	x + y >= 1;
\end{mzn}

\noindent{}Whereas it could be marked as a \gls{native} \constraint{} for a \gls{cp} \solver{} by not defining the body:

\begin{mzn}
predicate bool_or(var bool: x, var bool: y);
\end{mzn}

Although \minizinc{} is based on this powerful paradigm, its success has surfaced certain problems.
The language was originally designed to target \gls{cp} \solvers{}, where the result contains a small number of highly complex \constraints{}.
Its use has since been extended to \gls{mip} and \gls{sat} \solvers{}.
For these \solvers{}, the result is a \gls{slv-mod} with a large number of very simple \constraints{}, generated by a complex library of \minizinc{} functions.
For many applications, \minizinc{} now requires a significant, and sometimes prohibitive, amount of time to rewrite \instances{}.

Unlike \gls{clp} \gls{rewriting}, the \minizinc{} \gls{rewriting} process does not consider incremental changes to its \gls{slv-mod}.
This is another weakness that has become particularly important, since new optimisation methods require the solving of a sequence of closely related \instances{}.
The overhead of \gls{rewriting} all these \instances{} separately can be substantial.

In this thesis we revisit the \gls{rewriting} of functional \cmls{} into \glspl{slv-mod}.
It is our aim to design and evaluate an architecture for \cmls{} that can accommodate the modern uses of these languages.

\section{The Problems of Rewriting MiniZinc}

\minizinc{} is one of the most prominent \cmls{}.
It is an expressive language that, in addition to user-defined functions, gives modellers access to advanced features, such as many types of \variables{}, annotations, and an extensive standard library of \constraints{}.
All of which can be used with all \solver{} targets.
Because of the popularity and maturity of the language, there is a large suite of \minizinc{} models available that can be used as benchmarks.
The language has also been used in multiple studies as a host for meta-optimisation techniques \autocite{ingmar-2020-diverse,ek-2020-online}.

A \minizinc{} model generally consists of a few loops or \glspl{comprehension}; for a \gls{cp} solver, this would be rewritten into a relatively small set of \constraints{} which would be fed whole into the solver.
The existing process for translating \minizinc{} into a \gls{slv-mod} is a somewhat ad-hoc, (mostly) single-pass, recursive unrolling procedure, and many aspects (such as call overloading) are resolved dynamically.
In its original application, this was acceptable: models (both before and after \gls{rewriting}) were small.
Now, the language is also used to target low-level \solvers{}, such as \gls{sat} and \gls{mip}.
For them, the encoding of the same \minizinc{} \instance{} results in a much larger \gls{slv-mod}.
Additionally, \minizinc{} is increasingly used as part of various meta-optimisation toolchains.
To a great extent, this is testament to the effectiveness of the language.
However, as they have become more common, these extended uses have revealed weaknesses of the existing \minizinc{} toolchain.
In particular:

\begin{itemize}

	\item The \minizinc{} \compiler{} is inefficient.
		It does a surprisingly large amount of work for each expression (especially resolving sub-typing and overloading), which may be repeated many times.
		And as models generated for low-level \solver{} technologies can be quite large, the resulting \gls{rewriting} procedure can be intolerably slow.
		As the model transformations implemented in \minizinc{} become more sophisticated, these performance problems are simply magnified.

	\item The generated models often contain unnecessary \constraints{}.
		During \gls{rewriting}, functional expressions are replaced with \constraints{}.
		But this breaks the functional dependencies: if the original expression later becomes redundant (due to model simplifications), \minizinc{} may fail to detect that the constraint can be removed.

	\item The reasoning about \gls{reif} in the \minizinc{} \compiler{} is limited.
		An important decision that is made during \gls{rewriting} is whether a \constraint{} can be enforced directly or whether it is dependent on other \constraints{}.
		In the second case, a \gls{reif} has to be introduced, an often costly form of the \constraint{} that determines whether a \constraint{} holds rather than enforcing the \constraint{} itself.
		It is possible, however, that further \gls{rewriting} can reveal a \gls{reif} to be unnecessary.
		Currently, the \compiler{} cannot reverse any \gls{reif} decisions once they are made.
		It also does not consider \gls{half-reif}, a cheaper alternative to \gls{reif} that is often applicable.

	\item Monolithic \gls{rewriting} is wasteful.
		When \minizinc{} is used as part of a meta-optimisation toolchain, there is typically a large base model common to all sub-problems, and a small set of \constraints{} which are added or removed in each iteration before re-solving.
		With the existing \minizinc{} architecture, however, the whole model must be fully rewritten to a \gls{slv-mod} each time.
		As a result, a large (sometimes dominant) portion of the runtime is spent \gls{rewriting} the base model over and over again.
		It also prevents the \solver{} from carrying over anything it learnt from one problem to the next, closely related, problem.

\end{itemize}
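
To illustrate the third point, consider the following sketch.
It is written in \minizinc{} itself and is only meant to convey the idea: it is not the literal output of the \compiler{}, and the names \mzninline{b1} and \mzninline{b2} are assumptions made for this illustration.
In the disjunction neither comparison has to hold on its own, so neither can simply be enforced directly.

\begin{mzn}
var 0..10: x;
var 0..10: y;
var 0..10: z;

% Original, high-level formulation:
%   constraint x < y \/ z > 3;

% Sketch of a rewritten form using (full) reification: each Boolean
% variable is true if and only if its comparison holds.
var bool: b1;
var bool: b2;
constraint b1 <-> (x < y);
constraint b2 <-> (z > 3);
constraint b1 \/ b2;

% With half-reification the equivalences are weakened to implications,
% which only enforce a comparison once its Boolean variable is true:
%   constraint b1 -> (x < y);
%   constraint b2 -> (z > 3);

solve satisfy;
\end{mzn}

\noindent{}In this sketch the weaker implications would suffice, since the disjunction only ever requires one of the two comparisons to be enforced.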

\section{Research Objective and Contributions}

Research into the \gls{rewriting} of \cmls{}, as well as other research into model transformation, has shown that it is difficult to achieve both high performance and generality.
We address these issues by reconsidering the \gls{rewriting} process from the ground up, making the transformation efficient while accommodating the optimisation techniques needed to achieve the expected result.
In order to adapt \cmls{} for modern-day requirements, this thesis aims to \textbf{design, implement, and evaluate a modern architecture for functional \cmls{}}.
Crucially, this architecture should allow us to:

\begin{itemize}

	\item easily integrate a range of well-known and new \textbf{optimisation and simplification} techniques,

	\item effectively manage the \gls{slv-mod} and \textbf{detect and eliminate} parts of the model that have become unused,

	\item formally \textbf{reason about \gls{reif}} to minimise its overhead, and

	\item support \textbf{incremental usage} of the \cml{} architecture.

\end{itemize}

In the design of this architecture, we analyse the \gls{rewriting} process from the ground up.
We first determine the foundation for the system: an execution model for the basic \gls{rewriting} system.
To ensure the quality of the produced \glspl{slv-mod}, we extend the system with many well-known optimisation techniques.
In addition, we experiment with several new approaches to optimise the resulting \glspl{slv-mod}, including the use of \gls{half-reif}.
Crucially, we ensure the full incrementalism of the system and experiment with several meta-optimisation applications to evaluate its effects.

Overall, this thesis makes the following contributions:

\begin{enumerate}

	\item It presents a formal execution model for the \gls{rewriting} of the \minizinc{} language and extends this model with well-known optimisation and simplification techniques.

	\item It provides a novel method of tracking of \constraints{} created as part of functional dependencies, ensuring the correct removal of dependencies no longer required.

	\item It presents an analysis technique to reason about the (reified) form in which a \constraint{} should be considered.

	\item It presents a design and implementation of techniques to automatically introduce \gls{half-reif} of \constraints{} in \minizinc{}.

	\item It develops a technique to simplify problem specifications by efficiently eliminating implication chains.

	\item It proposes two novel methods to reduce the overhead of using \cmls{} in incremental techniques: \emph{restart-based} meta-search and the \emph{incremental} rewriting of changing models.

\end{enumerate}

\section{Organisation of the Thesis}

This thesis is arranged into the following chapters.

Following this introductory chapter, \emph{\cref{ch:background}} gives an overview of the area of \cmls{}.
First, it introduces the reader to \minizinc{}, how its models are formulated, and how they are translated to \solver{} specifications.
Then, we review different solving methods such as \gls{sat}, \gls{mip}, and \gls{cp}.
This is followed by a comparison of \minizinc{} with other \cmls{}.
This chapter also reviews techniques that are closely related to \cmls{}.
We conclude this chapter with a description of the current \minizinc{} compiler and the techniques it uses to simplify the \solver{} specifications it produces.

\emph{\Cref{ch:rewriting}} presents a formal execution model for \minizinc{} and the core of our new architecture.
We construct a set of formal rewriting rules for a subset of \minizinc{} called \microzinc{}.
We show how any \minizinc{} model can be reduced to a \microzinc{} model and, as such, provide rewriting rules for the \minizinc{} language.
Applying the rewriting produces \nanozinc{}, an abstract \solver{} specification language.
Crucially, we show how \nanozinc{} tracks \constraints{} that define a \variable{}, and can therefore correctly remove functional definitions.
This chapter also integrates well-known techniques used to simplify \glspl{slv-mod} into the architecture.
We compare the performance of an implementation of the presented architecture against the existing \minizinc{} infrastructure.

\emph{\Cref{ch:half-reif}} continues on the path of improving \glspl{slv-mod}.
In this chapter, we present a formal analysis technique to reason about \gls{reif}.
This analysis can help us decide whether a \constraint{} has to be reified.
In addition, the analysis allows us to determine whether we can use \gls{half-reif}.
We thus present the first implementation of the usage of automatic \gls{half-reif}.
We conclude this chapter by analysing the impact of the usage of \gls{half-reif} for different types of \solvers{}.

\emph{\Cref{ch:incremental}} focuses on the development of incremental methods for \cmls{}.
We introduce two new techniques that allow for the incremental usage of \cmls{}.
We first present a novel technique that eliminates the need for incremental \gls{rewriting}.
Instead, it integrates a meta-optimisation specification, written in the \minizinc{} language, into the \gls{slv-mod}.
Then, we describe a technique to optimise the \gls{rewriting} process for incremental changes to an \instance{}.
This method ensures that no work is done for parts of the \instance{} that remain unchanged.
We conclude this chapter by testing the performance and computational overhead of these two techniques.

Finally, \emph{\cref{ch:conclusions}} is the concluding chapter of the thesis.
It reiterates the discoveries and contributions of this research to theory and practice, comments on the scope and limitations of the presented architecture, and presents further avenues for research in this area.