From 3294400a15a14332ec593f8f5873f5e5e76a45ad Mon Sep 17 00:00:00 2001 From: "Jip J. Dekker" Date: Fri, 28 May 2021 11:44:25 +1000 Subject: [PATCH] More half-reification work and split on sentence --- assets/packages.tex | 6 +- chapters/1_introduction.tex | 100 ++++------ chapters/4_half_reif.tex | 386 +++++++++++++++++++----------------- 3 files changed, 249 insertions(+), 243 deletions(-) diff --git a/assets/packages.tex b/assets/packages.tex index df4ac9a..aaa1e83 100644 --- a/assets/packages.tex +++ b/assets/packages.tex @@ -47,6 +47,9 @@ \usepackage{amssymb} \usepackage{unicode-math} +% Algorithms +\usepackage[ruled,vlined]{algorithm2e} + % References \usepackage[ style=apa, @@ -79,9 +82,6 @@ \setchessboard{showmover=false} -% Algorithms -% \usepackage[ruled,vlined]{algorithm2e} - % % TODO: What am I doing with this? \newcommand*\justify{% \fontdimen2\font=0.4em% interword space diff --git a/chapters/1_introduction.tex b/chapters/1_introduction.tex index f06391d..4053c37 100644 --- a/chapters/1_introduction.tex +++ b/chapters/1_introduction.tex @@ -2,76 +2,52 @@ \chapter{Introduction}\label{ch:introduction} %************************************************ -High-level \cmls{}, like \minizinc{}, were originally designed as a convenient input -language for \gls{cp} solvers \autocite{marriott-1998-clp}. A user would write a -model consisting of a few loops or comprehensions; given a data file for the -parameters, this would be rewritten into a relatively small set of constraints -which would be fed whole into the solver. The existing process for translating -\minizinc\ into solver-specific constraints is a somewhat ad-hoc, (mostly) -single-pass, recursive unrolling procedure, and many aspects (such as call -overloading) are resolved dynamically. In its original application, this was -acceptable: models (both before and after translation) were small, translation -was fast, and the expected results of translation were obvious. The current -architecture is illustrated in \Cref{sfig:4-oldarch}. +High-level \cmls{}, like \minizinc{}, were originally designed as a convenient input language for \gls{cp} solvers \autocite{marriott-1998-clp}. +A user would write a model consisting of a few loops or comprehensions; given a data file for the parameters, this would be rewritten into a relatively small set of constraints which would be fed whole into the solver. +The existing process for translating \minizinc\ into solver-specific constraints is a somewhat ad-hoc, (mostly) single-pass, recursive unrolling procedure, and many aspects (such as call overloading) are resolved dynamically. +In its original application, this was acceptable: models (both before and after translation) were small, translation was fast, and the expected results of translation were obvious. +The current architecture is illustrated in \Cref{sfig:4-oldarch}. -But over time, the uses of high-level \cmls{} have expanded greatly from this -original vision. It is now used to generate input for wildly different solver -technologies: not just \gls{cp}, but also \gls{mip} \autocite{wolsey-1988-mip}, -\gls{sat} \autocite{davis-1962-dpll} and \gls{cbls} -\autocite{bjordal-2015-fzn2oscarcbls} solvers. Crucially, the same constraint -model can be used with any of these solvers, which is achieved through the use -of solver-specific libraries of constraint definitions. In \minizinc{}, these -solver libraries are written in the same language. +But over time, the uses of high-level \cmls{} have expanded greatly from this original vision. +It is now used to generate input for wildly different solver technologies: not just \gls{cp}, but also \gls{mip} \autocite{wolsey-1988-mip}, \gls{sat} \autocite{davis-1962-dpll} and \gls{cbls} \autocite{bjordal-2015-fzn2oscarcbls} solvers. +Crucially, the same constraint model can be used with any of these solvers, which is achieved through the use of solver-specific libraries of constraint definitions. +In \minizinc{}, these solver libraries are written in the same language. -As such, \minizinc\ turned out to be much more expressive than expected, so more -and more preprocessing and model transformation has been added to both the core -\minizinc\ library, and users' models. And in addition to one-shot solving, -\minizinc\ is frequently used as the basis of a larger meta-optimisation tool -chain: programmatically generating a model, solving it, then using the results -to generate another slightly different model. +As such, \minizinc\ turned out to be much more expressive than expected, so more and more preprocessing and model transformation has been added to both the core \minizinc\ library, and users' models. +And in addition to one-shot solving, \minizinc\ is frequently used as the basis of a larger meta-optimisation tool chain: programmatically generating a model, solving it, then using the results to generate another slightly different model. To a great extent, this is testament to the effectiveness of the language. -However, as they have become more common, these extended uses have revealed -weaknesses of the existing \minizinc\ tool chain. In particular: +However, as they have become more common, these extended uses have revealed weaknesses of the existing \minizinc\ tool chain. +In particular: \begin{itemize} - \item The \minizinc\ compiler is inefficient. It does a surprisingly large - amount of work for each expression (especially resolving sub-typing - and overloading), which may be repeated many times --- for example, - inside the body of a comprehension. And as models generated for - other solver technologies (particularly \gls{mip}) can be quite - large, the resulting flattening procedure can be intolerably slow. - As the model transformations implemented in \minizinc\ become more - sophisticated, these performance problems are simply magnified. - \item The generated models often contain unnecessary constraints. During - the transformation, functional expressions are replaced with - constraints. But this breaks the functional dependencies: if the - original expression later becomes redundant (due to model - simplifications), \minizinc\ may fail to detect that the constraint - can be removed. - \item Monolithic flattening is wasteful. When \minizinc\ is used for - multi-shot solving, there is typically a large base model common to - all sub-problems, and a small set of constraints which are added or - removed in each iteration. But with the existing \minizinc\ - architecture, the whole model must be re-flattened each time. Many - use cases involve generating a base model, then repeatedly adding or - removing a few constraints before re-solving. In the current tool - chain, the whole model must be fully re-flattened each time. Not - only does this repeat all the work done to flatten the base model, - This means a large (sometimes dominant) portion of runtime is simply - flattening the core model over and over again. But it also prevents - \emph{the solver} from carrying over anything it learnt from one - problem to the next, closely related, problem. + + \item The \minizinc\ compiler is inefficient. + It does a surprisingly large amount of work for each expression (especially resolving sub-typing and overloading), which may be repeated many times --- for example, inside the body of a comprehension. + And as models generated for other solver technologies (particularly \gls{mip}) can be quite large, the resulting flattening procedure can be intolerably slow. + As the model transformations implemented in \minizinc\ become more sophisticated, these performance problems are simply magnified. + + \item The generated models often contain unnecessary constraints. + During the transformation, functional expressions are replaced with constraints. + But this breaks the functional dependencies: if the original expression later becomes redundant (due to model simplifications), \minizinc\ may fail to detect that the constraint can be removed. + + \item Monolithic flattening is wasteful. + When \minizinc\ is used for multi-shot solving, there is typically a large base model common to all sub-problems, and a small set of constraints which are added or removed in each iteration. + But with the existing \minizinc\ architecture, the whole model must be re-flattened each time. + Many use cases involve generating a base model, then repeatedly adding or removing a few constraints before re-solving. + In the current tool chain, the whole model must be fully re-flattened each time. + Not only does this repeat all the work done to flatten the base model, This means a large (sometimes dominant) portion of runtime is simply flattening the core model over and over again. + But it also prevents \emph{the solver} from carrying over anything it learnt from one problem to the next, closely related, problem. \end{itemize} -In this thesis, we revisit the rewriting of high-level \cmls\ into solver-level -constraint models and describe an architecture that allows us to: +In this thesis, we revisit the rewriting of high-level \cmls\ into solver-level constraint models and describe an architecture that allows us to: \begin{itemize} - \item easily integrate a range of \textbf{optimisation and simplification} - techniques, - \item effectively \textbf{detect and eliminate dead code} introduced by - functional definitions, and - \item support \textbf{incremental flattening and solving}, and better - integration with solvers providing native incremental features. + + \item easily integrate a range of \textbf{optimisation and simplification} techniques, + + \item effectively \textbf{detect and eliminate dead code} introduced by functional definitions, and + + \item support \textbf{incremental flattening and solving}, and better integration with solvers providing native incremental features. + \end{itemize} diff --git a/chapters/4_half_reif.tex b/chapters/4_half_reif.tex index 9a8e48d..5385714 100644 --- a/chapters/4_half_reif.tex +++ b/chapters/4_half_reif.tex @@ -2,17 +2,13 @@ \chapter{Half Reification}\label{ch:half-reif} %************************************************ -The complex expressions language used in \cmls{}, such as \minizinc{}, often -require the use of \gls{reification} in the flattening process to reach a -solver level constraint model. If the Boolean expression \mzninline{pred(...)} -is seen in a non-root context, then a new Boolean \variable{} \mzninline{b} is -introduced to replace the expression. The flattener then enforces a -\constraint{} \mzninline{pred_reif(...,b)}, which binds the \variable{} to be -the \emph{truth-value} of the expression (\ie\ \mzninline{b <-> pred(...)}). +\section{Introduction to Half Reification} -A weakness of reification is that each reified version of a constraint requires -further implementation to create, and indeed most solvers do not provide any -reified versions of their \gls{global} \constraints{}. +The complex expressions language used in \cmls{}, such as \minizinc{}, often require the use of \gls{reification} in the flattening process to reach a solver level constraint model. +If the Boolean expression \mzninline{pred(...)} is seen in a non-root context, then a new Boolean \variable{} \mzninline{b} is introduced to replace the expression. +The flattener then enforces a \constraint{} \mzninline{pred_reif(...,b)}, which binds the \variable{} to be the \emph{truth-value} of the expression (\ie\ \mzninline{b <-> pred(...)}). + +A weakness of reification is that each reified version of a constraint requires further implementation to create, and indeed most solvers do not provide any reified versions of their \gls{global} \constraints{}. \begin{example}\label{ex:hr-alldiff} Consider the complex constraint @@ -30,29 +26,22 @@ reified versions of their \gls{global} \constraints{}. constraint bool_clause([b2], [b1]) % b1 implies b2 \end{mzn} - but no solver we are aware of implements the third primitive - constraint.\footnote{Although there are versions of soft - \mzninline{all_different}, they do not define this form.} + but no solver we are aware of implements the third primitive constraint. + % + \footnote{Although there are versions of soft \mzninline{all_different}, they + do not define this form.} \end{example} -Reified \gls{global} \constraints{} are not implemented because a reified constraint -\mzninline{b <-> pred(...)} must also implement a propagator for \mzninline{not -pred(...)} (in the case that \mzninline{b = false}). While for some global -\constraints{}, \eg\ \mzninline{all_different}, this may be reasonable to -implement, for most, such as \texttt{cumulative}, the task seems to be very -difficult. +Reified \gls{global} \constraints{} are not implemented because a reified constraint \mzninline{b <-> pred(...)} must also implement a propagator for \mzninline{not pred(...)} (in the case that \mzninline{b = false}). +While for some global \constraints{}, \eg\ \mzninline{all_different}, this may be reasonable to implement, for most, such as \texttt{cumulative}, the task seems to be very difficult. -Another weakness of the reification is that it may keep track of more -information than is required. In a typical solver, the first reified constraint -\mzninline{b1 <-> i <= 4} will wake up whenever the upper bound of \texttt{i} -changes in order to check whether it should set \texttt{b1} to -\mzninline{true}. But setting \mzninline{b1} to \mzninline{true} will -\emph{never} cause any further propagation. There is no reason to check this. +Another weakness of the reification is that it may keep track of more information than is required. +In a typical solver, the first reified constraint \mzninline{b1 <-> i <= 4} will wake up whenever the upper bound of \texttt{i} changes in order to check whether it should set \texttt{b1} to \mzninline{true}. +But setting \mzninline{b1} to \mzninline{true} will \emph{never} cause any further propagation. +There is no reason to check this. -This is particularly important when the target solver is a mixed integer -programming solver. In order to linearise a reified linear constraint we need -to create two linear \constraints{}, but if we are only interested in half of the -behaviour we can manage this with one linear constraint. +This is particularly important when the target solver is a mixed integer programming solver. +In order to linearise a reified linear constraint we need to create two linear \constraints{}, but if we are only interested in half of the behaviour we can manage this with one linear constraint. \begin{example} Consider the constraint \mzninline{b1 <-> i <= 4}, where \texttt{i} can take @@ -64,22 +53,29 @@ behaviour we can manage this with one linear constraint. \end{mzn} But in the system of \constraints{} where this constraint occurs knowing that - \texttt{b1} is 0 will never cause the system to fail, hence we do not need to - keep track of it. We can simply use the second constraint in the - linearisation, which always allows that \texttt{b1} takes the value 0. + \texttt{b1} is 0 will never cause the system to fail, hence we do not need to + keep track of it. +% + We can simply use the second constraint in the linearisation, which always + allows that \texttt{b1} takes the value 0. \end{example} The simple flattening used above treats partial functions in the following -manner. Application of a partial function to a value for which it is not defined -gives value \undefined, and this \undefined\ function percolates up through -every expression to the top level conjunction, making the model unsatisfiable. +manner. +% +Application of a partial function to a value for which it is not defined gives +value \undefined, and this \undefined\ function percolates up through every +expression to the top level conjunction, making the model unsatisfiable. +% For the example +% +\jip{TODO:\ What goes here???} In this chapter we study the usage of \gls{half-reif}. \gls{half-reif} follows from the notion that in many cases it might be sufficient to use the logical implication of an expression, \mzninline{b -> pred(...)}, instead of the logical equivalence, \mzninline{b <-> pred(...)}. Flattening with -half-reification is an approach that improves upon all these weaknesses of +\gls{half-reif} is an approach that improves upon all these weaknesses of flattening with \emph{full} reification. \begin{itemize} @@ -105,12 +101,10 @@ flattening with \emph{full} reification. \end{itemize} The remainder of the chapter is organised as follows. -\Cref{sec:half-propagation} discusses the propagation of half-reified -\constraints{}. \Cref{sec:half-decomposition} discusses the decomposition of -half-reified constraint. \Cref{sec:half-context} introduces the notion of -context analysis: a way to determine if half-reification can be used for a -certain expression. Finally, \cref{sec:half-flattening} explains how this -information can be used during the flattening process. +\Cref{sec:half-propagation} discusses the propagation of half-reified \constraints{}. +\Cref{sec:half-decomposition} discusses the decomposition of half-reified constraint. +\Cref{sec:half-context} introduces the notion of context analysis: a way to determine if \gls{half-reif} can be used for a certain expression. +Finally, \cref{sec:half-flattening} explains how this information can be used during the flattening process. \section{Propagation and Half Reification}% \label{sec:half-propagation} @@ -122,23 +116,15 @@ information can be used during the flattening process. \item experimental results \end{itemize} -A propagation engine gains certain advantages from half-reification, but also -may suffer certain penalties. Half reification can cause propagators to wake up -less frequently, since variables that are fixed to true by full reification will -never be fixed by half reification. This is advantageous, but a corresponding -disadvantage is that variables that are fixed can allow the simplification of -the propagator, and hence make its propagation faster. +A propagation engine gains certain advantages from \gls{half-reif}, but also may suffer certain penalties. +Half reification can cause propagators to wake up less frequently, since variables that are fixed to true by full reification will never be fixed by half reification. +This is advantageous, but a corresponding disadvantage is that variables that are fixed can allow the simplification of the propagator, and hence make its propagation faster. -When full reification is applicable (where we are not using half reified -predicates) an alternative to half reification is to implement full reification -\mzninline{x <-> pred(...)} by two half reified propagators \mzninline{x -> -pred(...)}, \mzninline{y \half \neg pred(...)}, \mzninline{x <-> not y}. This -does not lose propagation strength. For Booleans appearing in a positive -context we can make the propagator \mzninline{y -> not pred(...)} run at the -lowest priority, since it will never cause failure. Similarly in negative -contexts we can make the propagator \mzninline{b -> pred(...)} run at the -lowest priority. This means that Boolean variables are still fixed at the same -time, but there is less overhead. +When full reification is applicable (where we are not using half reified predicates) an alternative to half reification is to implement full reification \mzninline{x <-> pred(...)} by two half reified propagators \mzninline{x -> pred(...)}, \mzninline{y \half \neg pred(...)}, \mzninline{x <-> not y}. +This does not lose propagation strength. +For Booleans appearing in a positive context we can make the propagator \mzninline{y -> not pred(...)} run at the lowest priority, since it will never cause failure. +Similarly in negative contexts we can make the propagator \mzninline{b -> pred(...)} run at the lowest priority. +This means that Boolean variables are still fixed at the same time, but there is less overhead. \section{Decomposition and Half Reification}% \label{sec:half-decomposition} @@ -146,97 +132,76 @@ time, but there is less overhead. \section{Context Analysis}% \label{sec:half-context} -\Gls{half-reif} can be used instead of full \gls{reification} when the -\gls{reification} can never be forced to be false. We see this in, for example, -a disjunction \(a \lor b\). No matter the value of \(a\), setting the value of -\(b\) to be true can never make the overall expression false. At any \(b\) is thus -never forced to be false. This requirement follows from the difference between -implication and logical equivalences. Setting the left hand side of a -implication to false, does not influence the value on the right hand side. So -if we know that this is never required in the overall expression, then using an -implication instead of a logical equivalence, \ie a \gls{half-reif} instead of -a full \gls{reification}, does not change the meaning of the constraint. +\Gls{half-reif} can be used instead of full \gls{reification} when the \gls{reification} can never be forced to be false. +We see this in, for example, a disjunction \(a \lor b\). +No matter the value of \(a\), setting the value of \(b\) to be true can never make the overall expression false. +At any \(b\) is thus never forced to be false. +This requirement follows from the difference between implication and logical equivalences. +Setting the left hand side of a implication to false, does not influence the value on the right hand side. +So if we know that this is never required in the overall expression, then using an implication instead of a logical equivalence, \ie a \gls{half-reif} instead of a full \gls{reification}, does not change the meaning of the constraint. + +This property can be extended to include non-Boolean expressions. +Since Boolean expressions in \minizinc{} can be used in, for example, integer expressions, we can apply similar reasoning to these types of expressions. +For example the left hand side of the constraint -This property can be extended to include non-Boolean expressions. Since Boolean -expressions in \minizinc{} can be used in, for example, integer expressions, we -can apply similar reasoning to these types of expressions. For example the left -hand side of the constraint -% \begin{mzn} constraint count(x in arr)(x = 5) > 5; \end{mzn} -% -is an integer expression that contains the Boolean expression \mzninline{x = -5}. Since the increasing left hand side of the constraint will only ever help -satisfy the constraint, the expression \mzninline{x = 5} will never forced to -be false. This means that we can half-reify the expression. -To systematically analyse whether Booelean expressions can be half-reified, we -introduce extra distinctions in the context of expressions. Before, we would -merely distinguish between \rootc{} context and \emph{non-root} context. Now, -we will categorise the latter into: +is an integer expression that contains the Boolean expression \mzninline{x = 5}. +Since the increasing left hand side of the constraint will only ever help satisfy the constraint, the expression \mzninline{x = 5} will never forced to be false. +This means that we can half-reify the expression. + +To systematically analyse whether Booelean expressions can be half-reified, we introduce extra distinctions in the context of expressions. +Before, we would merely distinguish between \rootc{} context and \emph{non-root} context. +Now, we will categorise the latter into: \begin{description} - \item[\posc{} context] when an expression must reach \emph{at least} a - certain value to satisfy its enclosing constraint. The expression is never - forced to take a lower value. + \item[\posc{} context] when an expression must reach \emph{at least} a certain value to satisfy its enclosing constraint. + The expression is never forced to take a lower value. - \item[\negc{} context] when an expression can reach \emph{at most} a certain - value to satisfy its enclosing constraint. The expression is never forced - to take a higher value. + \item[\negc{} context] when an expression can reach \emph{at most} a certain value to satisfy its enclosing constraint. + The expression is never forced to take a higher value. - \item[\mixc{} context] when an expression must take an \emph{exact value}, be - within a \emph{specified range} or when during flattening it cannot be - determined whether the expression must be increased or decreased to satisfy - the enclosing constraint. + \item[\mixc{} context] when an expression must take an \emph{exact value}, be within a \emph{specified range} or when during flattening it cannot be determined whether the expression must be increased or decreased to satisfy the enclosing constraint. \end{description} -As previously explained, \gls{half-reif} can be used for expressions in \posc{} -context. Although expressions in a \negc{} context cannot be directly -half-reified, the negation of a expression in a \negc{} context can be -half-reified. Consider, for example, the constraint -% +As previously explained, \gls{half-reif} can be used for expressions in \posc{} context. +Although expressions in a \negc{} context cannot be directly half-reified, the negation of a expression in a \negc{} context can be half-reified. +Consider, for example, the constraint + \begin{mzn} constraint b \/ not (x = 5); \end{mzn} -% -The expression \mzninline{x = 5} is in a \negc{} context. Although a -\gls{half-reif} cannot be used directly, in some cases the solver can negate -the expression which are then placed in a \posc{} context. Our example can be -transformed into: -% + +The expression \mzninline{x = 5} is in a \negc{} context. +Although a \gls{half-reif} cannot be used directly, in some cases the solver can negate the expression which are then placed in a \posc{} context. +Our example can be transformed into: + \begin{mzn} constraint b \/ x != 5; \end{mzn} -% -The transformed expression, \mzninline{x != 5}, is now in a \posc{} context. We -can also speak of this process as ``pushing the negation inwards''. -Expressions in a \mixc{} context are in a position where \gls{half-reif} is -impossible. Only full \gls{reification} can be used for expressions in that are -in this context. This occurs, for example, when using an exclusive or -expression in a constraint. The value that one side must take directly depends -on the value that the other side takes. Each side can thus be forced to be true -or false. The \mixc{} context can also be used as a ``fall back'' context; if -it cannot be determined if an expression is in a \posc{} or \negc{} context, -then it is always safe to say the expression is in a \mixc{} context. +The transformed expression, \mzninline{x != 5}, is now in a \posc{} context. +We can also speak of this process as ``pushing the negation inwards''. -When taking into account the possible undefinedness of an expression, every -expression in a \minizinc{} model has two different contexts: the context in -which the expression itself occurs, its \emph{value context}, and the context -in which the partiality of the expression is captured, its \emph{partiality -context}. As described in \cref{subsec:back-mzn-partial}, \minizinc{} uses -relational semantics of partial values. This means that if a function does not -have a result, then its nearest enclosing Boolean expression is set to false. -In practice, this means that a condition that tests if the function will return -a value is added to the nearest enclosing Boolean expression. The -\emph{partiality} context is the context in which this condition is placed. +Expressions in a \mixc{} context are in a position where \gls{half-reif} is impossible. +Only full \gls{reification} can be used for expressions in that are in this context. +This occurs, for example, when using an exclusive or expression in a constraint. +The value that one side must take directly depends on the value that the other side takes. +Each side can thus be forced to be true or false. +The \mixc{} context can also be used as a ``fall back'' context; if it cannot be determined if an expression is in a \posc{} or \negc{} context, then it is always safe to say the expression is in a \mixc{} context. -We now specify two context transformations that will be used in further -algorithms to transition between different contexts: \changepos{} and -\changeneg{}. The transformations have the following behaviour: +When taking into account the possible undefinedness of an expression, every expression in a \minizinc{} model has two different contexts: the context in which the expression itself occurs, its \emph{value context}, and the context in which the partiality of the expression is captured, its \emph{partiality context}. +As described in \cref{subsec:back-mzn-partial}, \minizinc{} uses relational semantics of partial values. +This means that if a function does not have a result, then its nearest enclosing Boolean expression is set to false. +In practice, this means that a condition that tests if the function will return a value is added to the nearest enclosing Boolean expression. +The \emph{partiality} context is the context in which this condition is placed. + +We now specify two context transformations that will be used in further algorithms to transition between different contexts: \changepos{} and \changeneg{}. +The transformations have the following behaviour: \begin{tabular}{ccc} \( @@ -263,68 +228,133 @@ algorithms to transition between different contexts: \changepos{} and \section{Flattening and Half Reification}% \label{sec:half-flattening} -During the flattening process the contexts assigned to the different -expressions can be used directly to determine if and how a expression has to be -reified. The flattening with \gls{half-reif} does, however, interact with some -of the optimisations used during the flattening process. Most importantly, -\gls{half-reif} has to be considered when using \gls{cse}. +During the flattening process the contexts assigned to the different expressions can be used directly to determine if and how a expression has to be reified. -When using full \gls{reification}, all \glspl{reification} are stored in the -\gls{cse} table. This ensure that if we see the same expression is reified -twice, then the resulting \variable{} would be reusing. This avoids that the -solver has to enforce the same functional relationship twice. +\jip{TODO: Add example of flattening with \gls{half-reif}} -If the flattener uses \gls{half-reif}, in addition to full \gls{reification}, -then \gls{cse} needs to ensure not just that the expressions are equivalent, -but also that the context of the two expressions are compatible. For example, -if an expression was first found in a \posc{} context and later found in a -\mixc{} context, then the resulting \gls{half-reification} from the first -cannot be used for the second expression. In general: +The flattening with \gls{half-reif} does, however, interact with some of the optimisations used during the flattening process. +Most importantly, \gls{half-reif} has to be considered when using \gls{cse}. +In \cref{subsec:half-cse} we will discuss how \gls{cse} can be adjusted to handle \gls{half-reif}. + +As shown in \jip{insert reference}, a consequence of the use of \gls{half-reif} is that it might form so called \emph{implication chains}. +This happens when the right hand side of an implication is half reifed and a new Boolean variable is created to represent the variable. +Instead, we could have directly posted the half-reified constraint using the left hand side of the implication as its control variable. +In \cref{subsec:half-compress} we present a new post-processing method, \emph{chain compression}, that can be used to eliminate these implication chains. + +\subsection{Common Sub-expression Elimination}% +\label{subsec:half-cse} + +When using full \gls{reification}, all \glspl{reification} are stored in the \gls{cse} table. +This ensure that if we see the same expression is reified twice, then the resulting \variable{} would be reusing. +This avoids that the solver has to enforce the same functional relationship twice. + +If the flattener uses \gls{half-reif}, in addition to full \gls{reification}, then \gls{cse} needs to ensure not just that the expressions are equivalent, but also that the context of the two expressions are compatible. +For example, if an expression was first found in a \posc{} context and later found in a \mixc{} context, then the resulting \gls{half-reif} from the first cannot be used for the second expression. +In general: \begin{itemize} - \item The flattening result of a \posc{} context, a \gls{half-reif}, can only - be reused if the same expression is again found in \posc{} context. + \item The flattening result of a \posc{} context, a \gls{half-reif}, can only be reused if the same expression is again found in \posc{} context. - \item The flattening result of a \negc{} context, a \gls{half-reif} with a - negation pushed inwards, can only be reused if the same expression is again - found in \negc{} context. + \item The flattening result of a \negc{} context, a \gls{half-reif} with its negation pushed inwards, can only be reused if the same expression is again found in \negc{} context. - \item The flattening result of a \mixc{} context, a \gls{reification}, can be - reused in \posc{}, \negc{}, and \mixc{} context. Since we assume that the - result of a flattening an expression in \negc{} context pushes the negation - inwards, the \gls{reification} does, however, need to be negated. + \item The flattening result of a \mixc{} context, a \gls{reification}, can be reused in \posc{}, \negc{}, and \mixc{} context. + Since we assume that the result of a flattening an expression in \negc{} context pushes the negation inwards, the \gls{reification} does, however, need to be negated. - \item If the expression was already flattened in \rootc{} context, then any - repeated usage of the expression can be assumed to take the value - \mzninline{true} (or \mzninline{false} in \negc{} context). + \item If the expression was already flattened in \rootc{} context, then any repeated usage of the expression can be assumed to take the value \mzninline{true} (or \mzninline{false} in \negc{} context). \end{itemize} -When considering these compatibility rules, the result of flattening would be -highly dependent on the order in which expressions are seen by the flattener. -It would always be better to encounter the expression in a context that results -in a reusable expression, \eg{} \mixc{}, before seeing the same expression in -another context, \eg{} \posc{}. This avoids creating both a full -\gls{reification} and a \gls{half-reif} of the same expression. +When considering these compatibility rules, the result of flattening would be highly dependent on the order in which expressions are seen by the flattener. +It would always be better to encounter the expression in a context that results in a reusable expression, \eg{} \mixc{}, before seeing the same expression in another context, \eg{} \posc{}. +This avoids creating both a full \gls{reification} and a \gls{half-reif} of the same expression. -In the \microzinc{} interpreter, this problem is resolved by only keeping the -result of the \emph{most compatible} context. If an expression is found another -time in another context that is compatible with more contexts, then only the -result of evaluating this context is kept in the \gls{cse} table. Every usage -of the less compatible, is replaced by the newly created version. Because of -dependency tracking of the constraints that define variables, we can be sure -that all \variables{} and \constraints{} created in defining the earlier -version are correctly removed. +In the \microzinc{} interpreter, this problem is resolved by only keeping the result of the \emph{most compatible} context. +If an expression is found another time in another context that is compatible with more contexts, then only the result of evaluating this context is kept in the \gls{cse} table. +Every usage of the less compatible, is replaced by the newly created version. +Because of dependency tracking of the constraints that define variables, we can be sure that all \variables{} and \constraints{} created in defining the earlier version are correctly removed. -In addition, if the same expression is found in both \posc{} and \negc{} -context, then we would create both the \gls{half-reif} of the expression and -its negation. The propagation of these two \glspl{half-reif} would be -equivalent to propagating the full \gls{reification} of the same expression. It -is therefore better to actually create the full \gls{reification} as it would -be able to be reused during flattening. +In addition, if the same expression is found in both \posc{} and \negc{} context, then we would create both the \gls{half-reif} of the expression and its negation. +The propagation of these two \glspl{half-reif} would be equivalent to propagating the full \gls{reification} of the same expression. +It is therefore better to actually create the full \gls{reification} as it would be able to be reused during flattening. + +This problem is solved by introducing a canonical form for expressions where negations can be pushed inwards. +In this form the result of flattening an expression and its negation are collected in the same place within the \gls{cse} table. +If it is found that for an expression that is about to be half reified there already exists an \gls{half-reif} for its negation, then we instead evaluate the expression in mixed context, reifying the expression and replacing the existing half reified expression. + +This canonical form for expressions and their negations can also be used for the expressions in other contexts. +Using the canonical form we can now also be sure that we never create a full \gls{reification} for both an expression and its negation. +Instead, when one is created, the negation of the resulting \variable{} releasises its negation. +Moreover, this mechanism also allows us to detect when an expression and its negation occur in \rootc{} context. +This is simple way to detect conflicts between \constraints{} and, by extend, prove that the constraint model is unsatisfiable. +Clearly, a \constraint{} and its negation cannot both hold at the same time. + +\subsection{Chain compression}% +\label{subsec:half-compress} + +As shown in \cref{ex:hr-half}, flattening with half reification will in many cases result in implication chains: \mzninline{b1 -> b2 /\ b2 -> c}, where \texttt{b2} has no other occurrences. +In this case the conjunction can be replaced by \mzninline{b1 -> c} and \texttt{b2} can be removed from the model. +The case shown in the example can be generalised to + +\begin{mzn} + b1 -> b2 /\ forall(i in N)(b2 -> c[i]) +\end{mzn} + +\noindent{}which, if \texttt{b2} has no other usage in the instance, can be resolved to + +\begin{mzn} + forall(i in N)(b1 -> c[i]) +\end{mzn} + +\noindent{}after which \texttt{b1} can be removed from the model. + + +An algorithm to remove these chains of implications is best visualised through the use of an implication graph. +An implication graph \(\tuple{V,E}\) is a directed graph where the vertices \(V\) represent the variables in the instance and an edge \(\tuple{x,y} \in E\) represents the presence of an implication \mzninline{x -> y} in the instance. +Additionally, for the benefit of the algorithm, a vertex is marked when it is used in other constraints in the constraint model. +The goal of the algorithm is now to identify and remove vertices that are not marked and have only one incoming edge. +\Cref{alg:half-compression} provides a formal specification of the chain compression method in pseudo code. + +\begin{algorithm} + \KwData{An implication constraint graph \(G=\tuple{V, E}\) and a set \(M + \subseteq{} V\) of vertices used in other constraints.} + + \KwResult{An equisatisfiable graph \(G'=\tuple{V', E'}\) where chained + implications have been removed.} + + \(V' \longleftarrow V\)\; + \(E' \longleftarrow E\)\; + \For{\(x \in V\)} { + \If{\(x \not\in M\) \textbf{ and }\(\left|\left\{\tuple{a,x}| \tuple{a,x} \in E\right\}\right| = 1\)}{ + \For{\(\tuple{x, b} \in E\)}{ + \(E' \longleftarrow E' \cup \{ \tuple{a,b} \} \)\; + \(E' \longleftarrow E' \backslash \{ \tuple{x,b} \} \)\; + } + \(E' \longleftarrow E' \backslash \{ \tuple{a,x} \} \)\; + \(V' \longleftarrow V' \backslash \{ x \} \)\; + } + } + \(G' \longleftarrow \tuple{V', E'}\)\; + \caption{\label{alg:half-compression} Implication chain compression algorithm} +\end{algorithm} + +The algorithm can be further improved by considering implied conjunctions. +These can be split up into multiple implications: + +\begin{mzn} + b -> forall(x in N)(x) +\end{mzn} + +\noindent{}is equivalent to + +\begin{mzn} + forall(x in N)(b -> x) +\end{mzn} + +Adopting this transformation both simplifies a complicated constraint and possibly allows for the further compression of implication chains. +It should however be noted that although this transformation can result in an increase in the number of constraints, it generally increases the propagation efficiency. + +To adjust the algorithm to simplify implied conjunctions more introspection from the \minizinc{} compiler is required. +The compiler must be able to tell if a variable is (only) a control variable of a reified conjunction and what the elements of that conjunction are. +In the case where a variable has one incoming edge, but it is marked as used in other constraint, we can now check if it is only a control variable for a reified conjunction and perform the transformation in this case. -This problem is solved by introducing a canonical form for expressions where -negations can be pushed inwards. In this form an expression and its negation -should map to the same value in the \gls{cse} table, although in different -contexts. As we discussed before, \negc{}