%************************************************
\chapter{Introduction}\label{ch:introduction}
%************************************************
\noindent{} Many problems can be described as making a number of decisions according to a set of rules, or \constraints{}.
These \emph{decision problems} are some of the hardest computational problems.
Even with the fastest computer there is no simple way to, for example, decide on the schedule for a country's train system or the stand-by locations for the ambulances in the region.
These kinds of problems become even more complex when we consider \emph{optimisation problems}: if there are multiple solutions, one might be preferred over the others.
But, although these problems might be hard to solve, finding a (good) solution for them is essential in many walks of life.
The field of \gls{or} uses advanced computational methods to help make (better) decisions.
And, although some problems require the creation of specialised algorithms, more often than not problems are solved by reducing them to another problem.
There are famous problems, like \gls{sat} \autocite{biere-2021-sat} and \gls{mip} \autocite{wolsey-1988-mip}, for which over the years highly specialised \solvers{} have been created to find solutions.
Because of the universality of these problems and the effectiveness of their \solvers{}, formulating a new problem in terms of one of these famous problems is often the best way to find a solution.
This reformulation process presents various challenges:
\begin{itemize}
\item The problem is likely described at a different level of abstraction than the input of the targeted \solver{}.
For example, expressing the rules of a train schedule in terms of Boolean logic (for a \gls{sat} \solver{}) is a complicated process.
\item There are many possible formulations of the problem, but they can differ significantly in how quickly they can be solved.
\item All \solvers{} are different.
Even when two \solvers{} are designed to solve the same problem set, each might perform substantially better on specific formulations or support slight extensions of the original problem set.
\end{itemize}
\Cmls{}, like \gls{ampl} \autocite{fourer-2003-ampl}, \gls{opl} \autocite{van-hentenryck-1999-opl}, \minizinc{} \autocite{nethercote-2007-minizinc}, and \gls{essence} \autocite{frisch-2007-essence}, have been designed to tackle these problems.
They define a common language that can be used between different \solvers{}.
So instead of defining the problem for a specific \solver{}, the user formalises the problem in the \cml{}, creating a \emph{model} of the problem.
The language is then tasked with creating an optimised formulation for the targeted \solver{}.
Some of these languages even allow the modeller to use common patterns, \glspl{global}, that are not directly supported by the solver.
These \emph{high-level} languages then \emph{decompose} the \glspl{global} into an efficient structure for the targeted solver.
The usage of \cmls{}, especially high-level languages, can simplify the process and reduce the amount of expert knowledge that is required about the \solver{}.
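For example, the classic $n$-queens problem can be modelled almost directly with the \texttt{alldifferent} \gls{global} (a minimal sketch, not drawn from any particular benchmark):

\begin{verbatim}
include "alldifferent.mzn";

int: n = 8;
% q[i] is the row of the queen placed in column i
array [1..n] of var 1..n: q;

constraint alldifferent(q);                       % distinct rows
constraint alldifferent([q[i] + i | i in 1..n]);  % distinct diagonals
constraint alldifferent([q[i] - i | i in 1..n]);  % distinct anti-diagonals

solve satisfy;
\end{verbatim}

A \gls{cp} \solver{} can often enforce each \texttt{alldifferent} directly, whereas for other \solvers{} the same calls must first be decomposed.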
The problem, however, is that high-level \cmls{} were originally designed to target \gls{cp} problems.
Notably, instances of \gls{cp} problems can directly include many of the \glspl{global} available in the language.
This meant that little effort was required to rewrite a model for the targeted solver.
But over time these languages have expanded greatly from this original vision.
They are now used to generate input for wildly different solver technologies: not just \gls{cp}, but also \gls{mip} and \gls{sat}.
Crucially, for these \solver{} types the same \glspl{global} have to be replaced by much bigger structures in the problem specification.
This means more and more processing and model transformation has been required to rewrite models into a specification for the targeted \solver{}.
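For instance, a \gls{cp} \solver{} may accept \texttt{alldifferent} natively, while a \solver{} without such support receives a decomposition; one standard decomposition (sketched below) posts a disequality for every pair of variables:

\begin{verbatim}
predicate alldifferent_dec(array [int] of var int: x) =
  forall (i, j in index_set(x) where i < j) (x[i] != x[j]);
\end{verbatim}

For $n$ variables this already introduces $n(n-1)/2$ \constraints{}, and for a \gls{sat} \solver{} each disequality must in turn be encoded into clauses.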
Furthermore, new meta-optimisation techniques are becoming increasingly prominent.
Instead of creating a single model that describes the full problem, these techniques programmatically generate a model, solve it, and then, using the results, generate another, slightly different model.
This results in the rewriting of an abundance of related models in quick succession.
As such, the rewriting process has become an obstacle both for sizable models and for models that require incremental changes.
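As a hypothetical sketch of this pattern (the names \texttt{q}, \texttt{sol}, \texttt{n}, and \texttt{d} are illustrative), each iteration might re-solve the base model with a single added \constraint{}, for example one that forces the next solution to differ from the previous one:

\begin{verbatim}
% Added in iteration k: the next solution must differ from the
% previous solution sol in at least d of the n positions.
constraint sum (i in 1..n) (q[i] != sol[i]) >= d;
\end{verbatim}

With the current toolchain, even this one-line change triggers a complete rewrite of the base model.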
In this thesis we revisit the rewriting of high-level \cmls\ into \solver{}-level problem specifications.
It is our aim to design and test an architecture for \cmls{} that can accommodate the modern uses of these languages.
\section{MiniZinc}
One of the most prominent high-level \cmls{} is \minizinc{}.
In this thesis, we choose \minizinc{} as the primary \cml{} for several reasons.
First, the \minizinc{} language is one of the most extensive \cmls{}.
It contains features, such as annotations and user-defined functions, that are not found in other \cmls{}.
This means that, although we use \minizinc{}, the methods explored in this thesis should still be applicable to other \cmls{}.
Second, because of the popularity and maturity of the language, there is a large suite of models available that can be used as benchmarks.
Third, the language has been used in multiple studies as a host for meta-optimisation techniques \autocite{ingmar-2020-diverse,ek-2020-online}.
Finally, many of the researchers involved in the design and implementation of \minizinc{} have settled at Monash University.
People with the expertise to answer complex questions about the language were therefore close at hand.
Originally designed as a standard for different \gls{cp} solvers, \minizinc{} exhibits many of the problems described earlier.
A \minizinc{} model generally consists of a few loops or comprehensions; for a \gls{cp} solver, this would be rewritten into a relatively small set of constraints which would be fed whole into the solver.
The existing process for translating \minizinc{} into solver-specific constraints is a somewhat ad-hoc, (mostly) single-pass, recursive unrolling procedure, and many aspects (such as call overloading) are resolved dynamically.
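As an illustrative sketch of this unrolling (the flat constraints shown are simplified), a single comprehension-based \constraint{} such as

\begin{verbatim}
array [1..4] of var 1..10: x;
constraint forall (i in 1..3) (x[i] < x[i+1]);
\end{verbatim}

is unrolled into one solver-level constraint per instantiation of the body, along the lines of \texttt{int\_lt(x[1], x[2])}, \texttt{int\_lt(x[2], x[3])}, and \texttt{int\_lt(x[3], x[4])}, with the evaluation machinery re-run for every instance.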
In its original application, this was acceptable: models (both before and after translation) were small.
Now, the language is also used to target \solvers{} for low-level problems, such as \gls{sat} and \gls{mip}, and is used as part of various meta-optimisation toolchains.
To a great extent, this is testament to the effectiveness of the language.
However, as they have become more common, these extended uses have revealed weaknesses of the existing \minizinc\ tool chain.
In particular:
\begin{itemize}
\item The \minizinc\ compiler is inefficient.
It does a surprisingly large amount of work for each expression (especially resolving sub-typing and overloading), which may be repeated many times --- for example, inside the body of a comprehension.
And as models generated for other solver technologies can be quite large, the resulting flattening procedure can be intolerably slow.
As the model transformations implemented in \minizinc\ become more sophisticated, these performance problems are simply magnified.
\item The generated models often contain unnecessary constraints.
During the transformation, functional expressions are replaced with constraints.
But this breaks the functional dependencies: if the original expression later becomes redundant (due to model simplifications), \minizinc\ may fail to detect that the constraint can be removed (a sketch of this effect follows this list).
\item Monolithic flattening is wasteful.
When \minizinc\ is used as part of a meta-optimisation toolchain, there is typically a large base model common to all sub-problems, and a small set of constraints which are added or removed in each iteration.
But with the existing \minizinc\ architecture, the whole model must be rewritten to a \solver{} specification each time.
Not only does this mean that a large (sometimes dominant) portion of runtime is spent rewriting the base model over and over again, it also prevents the \solver{} from carrying over anything it learnt from one problem to the next, closely related, problem.
\end{itemize}
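To make the second point concrete, consider the following sketch (the flat form is simplified):

\begin{verbatim}
var 1..10: x;
var 1..10: y;
constraint (x * y > 6) \/ (x > y);

% During rewriting, the product is replaced by an introduced variable
% and a constraint that functionally defines it:
%   var int: z;  constraint int_times(x, y, z);
% If later simplification satisfies the disjunction by other means,
% the int_times constraint serves no purpose, but it can only be
% removed if the system remembers that z exists solely to define x*y.
\end{verbatim}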
In addition, Feydy et al.\ once argued for the use of \gls{half-reif} in \gls{cp} solvers and \cmls{} \autocite*{feydy-2011-half-reif}.
Whereas a \gls{reification} introduces a variable that reflects the truth-value of a \constraint{} in both directions, a \gls{half-reif} enforces only one direction of the implication: if the variable is true, the \constraint{} must hold.
The authors even show how, for a subset of \minizinc{}, \gls{half-reif} could be introduced automatically as an alternative to \gls{reification}.
Although \gls{half-reif} is shown to lead to significant improvements, it was never automatically introduced when rewriting \minizinc{} models, in part because the method presented for the subset of \minizinc{} did not extend to the full language.
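To illustrate the difference between the two forms (a minimal sketch):

\begin{verbatim}
var 1..10: x;
var 1..10: y;
var bool: b;

% Full reification: b holds if and only if the constraint holds.
constraint b <-> (x + y <= 5);

% Half reification: b implies the constraint, but b is not forced to
% be true when the constraint happens to hold. This weaker form
% suffices whenever b occurs only in positive positions of the model.
constraint b -> (x + y <= 5);
\end{verbatim}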
\section{Research Objective and Contributions}
Research on the rewriting of \cmls{} has long focused on the improvement of the \solver{} specification.
Each of these improvements can be shown to be highly effective when it applies.
It is well-known that the optimisation of the \solver{} specification can lead to profound reductions in solving time.
But while each individual optimisation might only trigger for a fraction of models, eagerly searching for applicable optimisations during rewriting can take a significant amount of time.
We address these issues by reconsidering the rewriting process as a whole, rather than considering each optimisation individually.
In order to adapt high-level \cmls{} to their modern-day requirements, this thesis aims to \textbf{design, implement, and evaluate a modern architecture for high-level \cmls{}}.
Crucially, this architecture should allow us to:
\begin{itemize}
\item easily integrate a range of well-known and new \textbf{optimisation and simplification} techniques,
\item effectively manage the \solver{} specification and \textbf{detect and eliminate} parts of models that have become unused, and
\item support \textbf{incremental usage} of the \cml{} infrastructure.
\end{itemize}
In the design of this architecture, we start by analysing the rewriting process from the ground up.
We first determine the foundation for the system: an execution model for the basic rewriting system.
To ensure the quality of the produced \solver{} specifications, we extend the system with many well-known optimisation techniques.
In addition, we experiment with several new approaches to optimise the resulting \solver{} specification, including the use of \gls{half-reif}.
Crucially, we ensure that the system integrates well within meta-optimisation toolchains and experiment with several meta-optimisation applications.
Overall, this thesis makes the following contributions:
\begin{enumerate}
\item It presents a formal execution model for the rewriting of the \minizinc\ language and extends this model with well-known optimisation and simplification techniques.
\item It provides a novel method of tracking \constraints{} created as part of functional dependencies, ensuring that they can be correctly removed when they are no longer required.
\item It presents the design and implementation of techniques to automatically introduce the \gls{half-reif} of constraints in \minizinc{}.
\item It develops a technique to simplify problem specifications by efficiently eliminating implication chains.
\item It proposes two novel methods to reduce the overhead of using \cmls{} in meta-optimisation techniques: \emph{restart-based} meta-search and the \emph{incremental} rewriting of changing models.
\end{enumerate}
\section{Organisation of the Thesis}
This thesis is partitioned into the following chapters.
Following this introductory chapter, \emph{\cref{ch:background}} reviews relevant information in the area of \cmls{}.
First, it introduces the reader to \minizinc{}, how its models are formulated, and how they are translated to \solver{} specifications.
Then, we review different solving methods, such as \gls{sat}, \gls{mip}, and \gls{cp}.
This is followed by a comparison of \minizinc{} with other \cmls{}.
This chapter also reviews techniques that are closely related to \cmls{}.
We conclude this chapter with a description of the current \minizinc{} compiler and the techniques it uses to simplify the \solver{} specifications it produces.
\emph{\Cref{ch:rewriting}} presents a formal execution model for \minizinc{} and the core of our new architecture.
We introduce \microzinc{}, a minimal language to which \minizinc{} can be reduced.
We use this language to construct formal rewriting rules.
Applying these rules produces a specification in \nanozinc{}, an abstract \solver{} specification language.
Crucially, we show how \nanozinc{} tracks the \constraints{} that define a variable, such that functional definitions can be correctly removed.
This chapter also integrates well-known techniques used to simplify the \solver{} specification into the architecture.
We compare the performance of an implementation of the presented architecture against the existing \minizinc{} infrastructure.
\emph{\Cref{ch:half-reif}} continues on the path of creating better \solver{} specifications.
In this chapter, we present the first implementation of automatic \gls{half-reif}.
We consider the effects of using \gls{half-reif} both inside a \gls{cp} solver and as part of a decomposition.
We then describe a new analysis to determine when \gls{half-reif} can be used.
This generalises an earlier approach that did not extend to the full \minizinc{} language.
We also comment on the influence that \gls{half-reif} has on other techniques used in the rewriting process.
We conclude this chapter by analysing the performance changes incurred by the use of this technique at both small and large scale.
\emph{\Cref{ch:incremental}} focuses on the use of meta-optimisation methods.
We introduce two methods to describe and employ these techniques.
We first present a novel technique that takes a meta-search specification in \minizinc{} and compiles it into the \solver{} specification.
Then, we describe a method to optimise the rewriting process for incremental changes to a model.
This method ensures that no work is done for parts of the model that remain unchanged.
We conclude this chapter by testing the performance and computational overhead of these two techniques.
Finally, \emph{\Cref{ch:conclusions}} is the concluding chapter of the thesis.
It reiterates the discoveries and contributions of this research to theory and practice, comments on the scope and limitations of the presented system, and presents further avenues for research in this area.