From 83c77ab5687fedf9850677d878f5496d9e00bc91 Mon Sep 17 00:00:00 2001 From: "Jip J. Dekker" Date: Thu, 27 May 2021 17:13:54 +1000 Subject: [PATCH] Half Reification --- chapters/4_half_reif.tex | 341 +++++++++++++++++++++++++++++++++------ 1 file changed, 289 insertions(+), 52 deletions(-) diff --git a/chapters/4_half_reif.tex b/chapters/4_half_reif.tex index 1c93db5..9a8e48d 100644 --- a/chapters/4_half_reif.tex +++ b/chapters/4_half_reif.tex @@ -1,34 +1,125 @@ %************************************************ -\chapter{Context Analysis and Half Reification}\label{ch:half-reif} +\chapter{Half Reification}\label{ch:half-reif} %************************************************ -In this chapter we study the usage of \gls{half-reif}. When a constraint \mzninline{pred(...)} is reified a +The complex expressions language used in \cmls{}, such as \minizinc{}, often +require the use of \gls{reification} in the flattening process to reach a +solver level constraint model. If the Boolean expression \mzninline{pred(...)} +is seen in a non-root context, then a new Boolean \variable{} \mzninline{b} is +introduced to replace the expression. The flattener then enforces a +\constraint{} \mzninline{pred_reif(...,b)}, which binds the \variable{} to be +the \emph{truth-value} of the expression (\ie\ \mzninline{b <-> pred(...)}). +A weakness of reification is that each reified version of a constraint requires +further implementation to create, and indeed most solvers do not provide any +reified versions of their \gls{global} \constraints{}. + +\begin{example}\label{ex:hr-alldiff} + Consider the complex constraint + + \begin{mzn} + constraint i <= 4 -> all_different([i,x-i,x]); + \end{mzn} + + The usual flattened form would be + + \begin{mzn} + constraint int_le_reif(i, 4, b1); % b1 holds iff i <= 4 + constraint int_minus(x, i, t1); % t1 = x - i + constraint all_different_reif([i,t1,x], b2); + constraint bool_clause([b2], [b1]) % b1 implies b2 + \end{mzn} + + but no solver we are aware of implements the third primitive + constraint.\footnote{Although there are versions of soft + \mzninline{all_different}, they do not define this form.} +\end{example} + +Reified \gls{global} \constraints{} are not implemented because a reified constraint +\mzninline{b <-> pred(...)} must also implement a propagator for \mzninline{not +pred(...)} (in the case that \mzninline{b = false}). While for some global +\constraints{}, \eg\ \mzninline{all_different}, this may be reasonable to +implement, for most, such as \texttt{cumulative}, the task seems to be very +difficult. + +Another weakness of the reification is that it may keep track of more +information than is required. In a typical solver, the first reified constraint +\mzninline{b1 <-> i <= 4} will wake up whenever the upper bound of \texttt{i} +changes in order to check whether it should set \texttt{b1} to +\mzninline{true}. But setting \mzninline{b1} to \mzninline{true} will +\emph{never} cause any further propagation. There is no reason to check this. + +This is particularly important when the target solver is a mixed integer +programming solver. In order to linearise a reified linear constraint we need +to create two linear \constraints{}, but if we are only interested in half of the +behaviour we can manage this with one linear constraint. + +\begin{example} + Consider the constraint \mzninline{b1 <-> i <= 4}, where \texttt{i} can take + values in the domain \mzninline{0..10} then its linearisation is + + \begin{mzn} + constraint i <= 10 - 6 * b1; % b1 -> i <= 4 + constraint i >= 5 - 5 * b1; % not b1 -> i >= 5 + \end{mzn} + + But in the system of \constraints{} where this constraint occurs knowing that + \texttt{b1} is 0 will never cause the system to fail, hence we do not need to + keep track of it. We can simply use the second constraint in the + linearisation, which always allows that \texttt{b1} takes the value 0. +\end{example} + +The simple flattening used above treats partial functions in the following +manner. Application of a partial function to a value for which it is not defined +gives value \undefined, and this \undefined\ function percolates up through +every expression to the top level conjunction, making the model unsatisfiable. +For the example + +In this chapter we study the usage of \gls{half-reif}. \gls{half-reif} follows +from the notion that in many cases it might be sufficient to use the logical +implication of an expression, \mzninline{b -> pred(...)}, instead of the +logical equivalence, \mzninline{b <-> pred(...)}. Flattening with +half-reification is an approach that improves upon all these weaknesses of +flattening with \emph{full} reification. \begin{itemize} - \item Flattening with half reification can naturally produce the relational - semantics when flattening partial functions in positive contexts. - \item Half reified constraints add no burden to the solver writer; if they - have a propagator for constraint \mzninline{pred(....)} then they can - straightforwardly construct a half reified propagator for \mzninline{b - -> pred(...)}. - \item Half reified constraints \mzninline{b -> pred(...)} can implement fully - reified constraints without any loss of propagation strength (assuming - reified constraints are negatable). - \item Flattening with half reification can produce more efficient propagation - when flattening complex constraints. - \item Flattening with half reification can produce smaller linear models when - used with a mixed integer programming solver. + \item Half reified \constraints{} add no burden to the solver writer; if they + have a propagator for constraint \mzninline{pred(....)} then they can + straightforwardly construct a half reified propagator for \mzninline{b -> + pred(...)}. + + \item Flattening with \gls{half-reif} can produce smaller linear models when + used with a mixed integer programming solver. + + \item Half reified \constraints{} \mzninline{b -> pred(...)} can implement fully + reified \constraints{} without any loss of propagation strength (assuming + reified \constraints{} are negatable). \jip{TODO:\ should this still be here?} + + \item Flattening with half reification can naturally produce the relational + semantics when flattening partial functions in positive contexts. + \jip{TODO:\ should this still be here?} + + \item Flattening with half reification can produce more efficient propagation + when flattening complex \constraints{}. \jip{TODO:\ should this still be + here?} \end{itemize} +The remainder of the chapter is organised as follows. +\Cref{sec:half-propagation} discusses the propagation of half-reified +\constraints{}. \Cref{sec:half-decomposition} discusses the decomposition of +half-reified constraint. \Cref{sec:half-context} introduces the notion of +context analysis: a way to determine if half-reification can be used for a +certain expression. Finally, \cref{sec:half-flattening} explains how this +information can be used during the flattening process. + \section{Propagation and Half Reification}% \label{sec:half-propagation} \begin{itemize} - \item custom implemented half-reified propagations. - \item benefits - \item downside - \item experimental results + \item custom implemented half-reified propagations. + \item benefits + \item downside + \item experimental results \end{itemize} A propagation engine gains certain advantages from half-reification, but also @@ -41,13 +132,13 @@ the propagator, and hence make its propagation faster. When full reification is applicable (where we are not using half reified predicates) an alternative to half reification is to implement full reification \mzninline{x <-> pred(...)} by two half reified propagators \mzninline{x -> - pred(...)}, \mzninline{y \half \neg pred(...)}, \mzninline{x <-> not y}. This -does not lose propagation strength. For Booleans appearing in a positive context -we can make the propagator \mzninline{y -> not pred(...)} run at the lowest -priority, since it will never cause failure. Similarly in negative contexts we -can make the propagator \mzninline{b -> pred(...)} run at the lowest priority. -This means that Boolean variables are still fixed at the same time, but there is -less overhead. +pred(...)}, \mzninline{y \half \neg pred(...)}, \mzninline{x <-> not y}. This +does not lose propagation strength. For Booleans appearing in a positive +context we can make the propagator \mzninline{y -> not pred(...)} run at the +lowest priority, since it will never cause failure. Similarly in negative +contexts we can make the propagator \mzninline{b -> pred(...)} run at the +lowest priority. This means that Boolean variables are still fixed at the same +time, but there is less overhead. \section{Decomposition and Half Reification}% \label{sec:half-decomposition} @@ -55,39 +146,185 @@ less overhead. \section{Context Analysis}% \label{sec:half-context} +\Gls{half-reif} can be used instead of full \gls{reification} when the +\gls{reification} can never be forced to be false. We see this in, for example, +a disjunction \(a \lor b\). No matter the value of \(a\), setting the value of +\(b\) to be true can never make the overall expression false. At any \(b\) is thus +never forced to be false. This requirement follows from the difference between +implication and logical equivalences. Setting the left hand side of a +implication to false, does not influence the value on the right hand side. So +if we know that this is never required in the overall expression, then using an +implication instead of a logical equivalence, \ie a \gls{half-reif} instead of +a full \gls{reification}, does not change the meaning of the constraint. + +This property can be extended to include non-Boolean expressions. Since Boolean +expressions in \minizinc{} can be used in, for example, integer expressions, we +can apply similar reasoning to these types of expressions. For example the left +hand side of the constraint +% +\begin{mzn} + constraint count(x in arr)(x = 5) > 5; +\end{mzn} +% +is an integer expression that contains the Boolean expression \mzninline{x = +5}. Since the increasing left hand side of the constraint will only ever help +satisfy the constraint, the expression \mzninline{x = 5} will never forced to +be false. This means that we can half-reify the expression. + +To systematically analyse whether Booelean expressions can be half-reified, we +introduce extra distinctions in the context of expressions. Before, we would +merely distinguish between \rootc{} context and \emph{non-root} context. Now, +we will categorise the latter into: + +\begin{description} + + \item[\posc{} context] when an expression must reach \emph{at least} a + certain value to satisfy its enclosing constraint. The expression is never + forced to take a lower value. + + \item[\negc{} context] when an expression can reach \emph{at most} a certain + value to satisfy its enclosing constraint. The expression is never forced + to take a higher value. + + \item[\mixc{} context] when an expression must take an \emph{exact value}, be + within a \emph{specified range} or when during flattening it cannot be + determined whether the expression must be increased or decreased to satisfy + the enclosing constraint. + +\end{description} + +As previously explained, \gls{half-reif} can be used for expressions in \posc{} +context. Although expressions in a \negc{} context cannot be directly +half-reified, the negation of a expression in a \negc{} context can be +half-reified. Consider, for example, the constraint +% +\begin{mzn} + constraint b \/ not (x = 5); +\end{mzn} +% +The expression \mzninline{x = 5} is in a \negc{} context. Although a +\gls{half-reif} cannot be used directly, in some cases the solver can negate +the expression which are then placed in a \posc{} context. Our example can be +transformed into: +% +\begin{mzn} + constraint b \/ x != 5; +\end{mzn} +% +The transformed expression, \mzninline{x != 5}, is now in a \posc{} context. We +can also speak of this process as ``pushing the negation inwards''. + +Expressions in a \mixc{} context are in a position where \gls{half-reif} is +impossible. Only full \gls{reification} can be used for expressions in that are +in this context. This occurs, for example, when using an exclusive or +expression in a constraint. The value that one side must take directly depends +on the value that the other side takes. Each side can thus be forced to be true +or false. The \mixc{} context can also be used as a ``fall back'' context; if +it cannot be determined if an expression is in a \posc{} or \negc{} context, +then it is always safe to say the expression is in a \mixc{} context. + +When taking into account the possible undefinedness of an expression, every +expression in a \minizinc{} model has two different contexts: the context in +which the expression itself occurs, its \emph{value context}, and the context +in which the partiality of the expression is captured, its \emph{partiality +context}. As described in \cref{subsec:back-mzn-partial}, \minizinc{} uses +relational semantics of partial values. This means that if a function does not +have a result, then its nearest enclosing Boolean expression is set to false. +In practice, this means that a condition that tests if the function will return +a value is added to the nearest enclosing Boolean expression. The +\emph{partiality} context is the context in which this condition is placed. + +We now specify two context transformations that will be used in further +algorithms to transition between different contexts: \changepos{} and +\changeneg{}. The transformations have the following behaviour: + \begin{tabular}{ccc} - \( - \begin{array}{lcl} - \changepos \rootc & = & \posc \\ - \changepos \posc & = & \posc \\ - \changepos \negc & = & \negc \\ - \changepos \mixc & = & \mixc - \end{array} - \) - & ~ & - \( - \begin{array}{lcl} - \changeneg \rootc & = & \negc \\ - \changeneg \posc & = & \negc \\ - \changeneg \negc & = & \posc \\ - \changeneg \mixc & = & \mixc - \end{array} - \) + \( + \begin{array}{lcl} + \changepos \rootc & = & \posc \\ + \changepos \posc & = & \posc \\ + \changepos \negc & = & \negc \\ + \changepos \mixc & = & \mixc + \end{array} + \) + & ~ & + \( + \begin{array}{lcl} + \changeneg \rootc & = & \negc \\ + \changeneg \posc & = & \negc \\ + \changeneg \negc & = & \posc \\ + \changeneg \mixc & = & \mixc + \end{array} + \) \end{tabular} -\begin{itemize} - \item What (again) are the contexts? - \item Why are they important - \item When can we use half-reification - \item How does the analysis work? -\end{itemize} - +\jip{TODO:\ Insert algorithm that attaches the context to the different expressions} \section{Flattening and Half Reification}% \label{sec:half-flattening} +During the flattening process the contexts assigned to the different +expressions can be used directly to determine if and how a expression has to be +reified. The flattening with \gls{half-reif} does, however, interact with some +of the optimisations used during the flattening process. Most importantly, +\gls{half-reif} has to be considered when using \gls{cse}. + +When using full \gls{reification}, all \glspl{reification} are stored in the +\gls{cse} table. This ensure that if we see the same expression is reified +twice, then the resulting \variable{} would be reusing. This avoids that the +solver has to enforce the same functional relationship twice. + +If the flattener uses \gls{half-reif}, in addition to full \gls{reification}, +then \gls{cse} needs to ensure not just that the expressions are equivalent, +but also that the context of the two expressions are compatible. For example, +if an expression was first found in a \posc{} context and later found in a +\mixc{} context, then the resulting \gls{half-reification} from the first +cannot be used for the second expression. In general: + \begin{itemize} - \item cse - \item chain compression + + \item The flattening result of a \posc{} context, a \gls{half-reif}, can only + be reused if the same expression is again found in \posc{} context. + + \item The flattening result of a \negc{} context, a \gls{half-reif} with a + negation pushed inwards, can only be reused if the same expression is again + found in \negc{} context. + + \item The flattening result of a \mixc{} context, a \gls{reification}, can be + reused in \posc{}, \negc{}, and \mixc{} context. Since we assume that the + result of a flattening an expression in \negc{} context pushes the negation + inwards, the \gls{reification} does, however, need to be negated. + + \item If the expression was already flattened in \rootc{} context, then any + repeated usage of the expression can be assumed to take the value + \mzninline{true} (or \mzninline{false} in \negc{} context). + \end{itemize} +When considering these compatibility rules, the result of flattening would be +highly dependent on the order in which expressions are seen by the flattener. +It would always be better to encounter the expression in a context that results +in a reusable expression, \eg{} \mixc{}, before seeing the same expression in +another context, \eg{} \posc{}. This avoids creating both a full +\gls{reification} and a \gls{half-reif} of the same expression. + +In the \microzinc{} interpreter, this problem is resolved by only keeping the +result of the \emph{most compatible} context. If an expression is found another +time in another context that is compatible with more contexts, then only the +result of evaluating this context is kept in the \gls{cse} table. Every usage +of the less compatible, is replaced by the newly created version. Because of +dependency tracking of the constraints that define variables, we can be sure +that all \variables{} and \constraints{} created in defining the earlier +version are correctly removed. + +In addition, if the same expression is found in both \posc{} and \negc{} +context, then we would create both the \gls{half-reif} of the expression and +its negation. The propagation of these two \glspl{half-reif} would be +equivalent to propagating the full \gls{reification} of the same expression. It +is therefore better to actually create the full \gls{reification} as it would +be able to be reused during flattening. + +This problem is solved by introducing a canonical form for expressions where +negations can be pushed inwards. In this form an expression and its negation +should map to the same value in the \gls{cse} table, although in different +contexts. As we discussed before, \negc{}