2.7. Effective Modelling Practices in MiniZinc
There are almost always multiple ways to model the same problem, some of which result in models that are efficient to solve, and some of which do not. In general it is very hard to tell a priori which models are the most efficient for solving a particular problem; indeed it may critically depend on the underlying solver used, and the search strategy. In this chapter we concentrate on modelling practices that avoid inefficiency both in generating models and in the models generated. For solver-specific recommendations see User Manual/Solver Backends.
2.7.1. Variable Bounds
Finite domain propagation engines, which are the principal type of solver targeted by MiniZinc, are more effective the tighter the bounds on the variables involved. They can also behave badly with problems which have subexpressions that take large integer values, since they may implicitly limit the size of integer variables.
var int: item1;
var int: item2;
var int: item3;
var int: item4;
constraint item1 + item2 + item3 + item4 == 711;
constraint item1 * item2 * item3 * item4 == 711 * 100 * 100 * 100;
constraint 0 < item1 /\ item1 <= item2
/\ item2 <= item3 /\ item3 <= item4;
solve satisfy;
output ["{", show(item1), ",", show(item2), ",", show(item3), ",",
show(item4),"}\n"];
The grocery problem shown in Listing 2.7.1 finds four items whose prices in dollars add up to 7.11 and whose product is also 7.11. The variables are declared unbounded. Running
$ minizinc --solver chuffed grocery.mzn
does not yield a solution because the intermediate expressions in the multiplication are also var int and are given default bounds in the solver of \(-1,000,000 \dots 1,000,000\), and these ranges are too small to hold the values that the intermediate expressions may need to take.
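To see the scale of the problem, here is a quick back-of-envelope check in Python (an illustration, not part of the model):

```python
# Prices are modelled in cents: $7.11 is 711 cents. Multiplying four
# prices each scaled by 100 scales the product by 100^4, so the product
# target 7.11 becomes 7.11 * 100^4 = 711 * 100^3.
target = 711 * 100 ** 3
print(target)  # 711000000

# Default bounds the solver gives to unbounded intermediate var ints:
default_bound = 1_000_000

# The full product (and hence some intermediate partial products) must
# reach values far beyond the default bounds, so propagation fails.
assert target > default_bound
```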
Modifying the model so that the items are declared with tight bounds
var 1..711: item1;
var 1..711: item2;
var 1..711: item3;
var 1..711: item4;
results in a better model, since now MiniZinc can infer bounds on the intermediate expressions and use these rather than the default bounds. With this modification, executing the model gives
{120,125,150,316}
----------
Note however that even the improved model may be too difficult for some solvers.
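We can double-check the reported solution with ordinary integer arithmetic (a plain Python sanity check, outside MiniZinc):

```python
from math import prod

items = [120, 125, 150, 316]          # prices in cents

assert sum(items) == 711              # adds up to $7.11
assert prod(items) == 711 * 100 ** 3  # product is 7.11, scaled by 100^3
assert items == sorted(items)         # respects the ordering constraints
print("solution checks out")
```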
Bounding variables
Always try to use bounded variables in models.
When using let
declarations to introduce new variables, always try to define them
with correct and tight bounds. This will make your model more efficient,
and avoid the possibility of unexpected overflows.
One exception is when you introduce a new variable which is
immediately defined as equal to an expression. Usually MiniZinc will be
able to infer effective bounds from the expression.
2.7.2. Effective Generators
Imagine we want to count the number of triangles (\(K_3\) subgraphs) appearing in a graph. Suppose the graph is defined by an adjacency matrix: adj[i,j] is true if nodes i and j are adjacent. We might write
int: count = sum ([ 1 | i,j,k in NODES where i < j /\ j < k
/\ adj[i,j] /\ adj[i,k] /\ adj[j,k]]);
which is certainly correct, but it examines all triples of nodes. If the graph is sparse we can do better by realising that some tests can be applied as soon as we select i and j.
int: count = sum ([ 1 | i,j in NODES where i < j /\ adj[i,j],
k in NODES where j < k /\ adj[i,k] /\ adj[j,k]]);
You can use the builtin trace function to help determine what is happening inside generators.
Tracing
The function trace(s,e) prints the string s before evaluating the expression e and returning its value. It can be used in any context.
For example, we can see how many times the test is performed in the inner loop for both versions of the calculation.
int:count=sum([ 1 | i,j,k in NODES where
trace("+", i<j /\j<k /\ adj[i,j] /\ adj[i,k] /\ adj[j,k]) ]);
adj = [| false, true, true, false
| true, false, true, false
| true, true, false, true
| false, false, true, false |];
constraint trace("\n",true);
solve satisfy;
Produces the output:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
----------
indicating the inner loop is evaluated 64 times while
int: count = sum ([ 1 | i,j in NODES where i < j /\ adj[i,j],
k in NODES where trace("+", j < k /\ adj[i,k] /\ adj[j,k])]);
Produces the output:
++++++++++++++++
----------
indicating the inner loop is evaluated 16 times.
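The effect of moving tests earlier is easy to reproduce outside MiniZinc. The following Python sketch mirrors the two comprehensions on the 4-node adjacency matrix above, counting how often the innermost test is evaluated:

```python
# Adjacency relation from the example matrix (nodes 1..4).
adj = {
    (1, 2): True, (1, 3): True, (1, 4): False,
    (2, 3): True, (2, 4): False, (3, 4): True,
}
def adjacent(i, j):
    return adj[(min(i, j), max(i, j))] if i != j else False

NODES = range(1, 5)

# Version 1: one generator, all tests in a single where clause.
tests1 = count1 = 0
for i in NODES:
    for j in NODES:
        for k in NODES:
            tests1 += 1  # the trace("+") fires once per triple
            if i < j < k and adjacent(i, j) and adjacent(i, k) and adjacent(j, k):
                count1 += 1

# Version 2: the tests on i and j applied before k is generated.
tests2 = count2 = 0
for i in NODES:
    for j in NODES:
        if i < j and adjacent(i, j):
            for k in NODES:
                tests2 += 1  # the trace("+") now fires only here
                if j < k and adjacent(i, k) and adjacent(j, k):
                    count2 += 1

print(tests1, tests2, count1, count2)  # 64 16 1 1
```

Both versions find the single triangle (1,2,3), but the second evaluates its inner test only 16 times rather than 64.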
Note that you can use strings that depend on the generator variables in trace to understand what is happening during model creation.
int: count = sum( i,j in NODES where i < j /\ adj[i,j])(
sum([trace("("++show(i)++","++show(j)++","++show(k)++")",1) |
k in NODES where j < k /\ adj[i,k] /\ adj[j,k]]));
will print out each of the triangles found in the calculation. It produces the output
(1,2,3)
----------
We have to admit that we cheated a bit here: In certain circumstances, the MiniZinc compiler is in fact able to re-order the arguments in a where clause automatically, so that they are evaluated as early as possible. In this case, adding the trace function in fact prevented this optimisation. In general, it is however a good idea to help the compiler get it right, by splitting the where clauses and placing them as close to the generators as possible.
2.7.3. Redundant Constraints
The form of a model will affect how well the constraint solver can solve it. In many cases adding constraints which are redundant, i.e. are logically implied by the existing model, may improve the search for solutions by making more information available to the solver earlier.
Consider the magic series problem from Complex Constraints.
Running this for n = 16
as follows:
$ minizinc --solver gecode --all-solutions --statistics magic-series.mzn -D "n=16;"
might result in output
s = [12, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0];
----------
==========
and the statistics showing that 117 failures were required.
We can add redundant constraints to the model to improve the performance.
Since each number in the sequence counts the number of occurrences of a number, we know that the entries sum up to n. Similarly, we know that the sum of s[i] * i must also add up to n, because the sequence is magic.
These constraints do not actually constrain the model further: any solution of the original model is still a solution of the new model that includes the redundant constraints. It is good practice to make the MiniZinc compiler aware that a constraint is redundant. That way, the compiler can decide to ignore the constraint for particular solvers, since not all solvers benefit from redundant constraints. We therefore wrap redundant constraints in a call to the library predicate redundant_constraint.
The magic sequence model with these redundant constraints is shown in Listing 2.7.2.
int: n;
array[0..n-1] of var 0..n: s;
constraint forall(i in 0..n-1) (
s[i] = (sum(j in 0..n-1)(bool2int(s[j]=i))));
constraint redundant_constraint(sum(i in 0..n-1)(s[i]) = n);
constraint redundant_constraint(sum(i in 0..n-1)(s[i] * i) = n);
solve satisfy;
output [ "s = ", show(s), ";\n" ] ;
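It is easy to convince ourselves that the redundant constraints really are implied. Checking the solution shown earlier for n = 16, with plain Python (outside MiniZinc):

```python
n = 16
s = [12, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]

# The defining (magic) property: s[i] counts the occurrences of i in s.
assert all(s[i] == s.count(i) for i in range(n))

# Redundant constraint 1: the counts sum to the length of the sequence.
assert sum(s) == n

# Redundant constraint 2: each occurrence of value i contributes i to
# the sum of the entries, so sum(i * s[i]) equals sum(s), which is n.
assert sum(i * s[i] for i in range(n)) == n
print("redundant constraints hold")
```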
Running the same problem as before
$ minizinc --solver gecode --all-solutions --statistics magic-series2.mzn -D "n=16;"
results in the same output, but with statistics showing just 14 failures explored. The redundant constraints have allowed the solver to prune the search much earlier.
Because we wrapped the redundant constraints in the special predicate, we can now switch them off easily on the command line by setting the flag mzn_ignore_redundant_constraints=true
:
$ minizinc --solver gecode --all-solutions --statistics magic-series2.mzn -D "n=16;mzn_ignore_redundant_constraints=true;"
2.7.4. Modelling Choices
There are many ways to model the same problem in MiniZinc, although some may be more natural than others. Different models may have very different efficiency of solving, and worse yet, different models may be better or worse for different solving backends. There are however some guidelines for usually producing better models:
Choosing between models
The better model is likely to have some of the following features:
smaller number of variables, or at least those that are not functionally defined by other variables
smaller domain sizes of variables
more succinct, or direct, definition of the constraints of the model
uses global constraints as much as possible
In reality all this has to be tempered by how effective the search is for the model. Usually the effectiveness of search is hard to judge except by experimentation.
Consider the problem of finding permutations of \(n\) numbers from 1 to \(n\) such that the differences between adjacent numbers also form a permutation of the numbers 1 to \(n-1\). The obvious way to model this problem is shown in Listing 2.7.3. Note that the u variables are functionally defined by the x variables, so the raw search space is \(n^n\).
include "alldifferent.mzn";
int: n;
array[1..n] of var 1..n: x; % sequence of numbers
array[1..n-1] of var 1..n-1: u; % sequence of differences
constraint alldifferent(x);
constraint alldifferent(u);
constraint forall(i in 1..n-1)(u[i] = abs(x[i+1] - x[i]));
solve :: int_search(x, first_fail, indomain_min)
satisfy;
output ["x = \(x);\n"];
In this model the array x represents the permutation of the n numbers, and the constraints are naturally represented using alldifferent.
Running the model
$ minizinc --solver gecode --all-solutions --statistics allinterval.mzn -D "n=10;"
finds all solutions in 16077 nodes and 71ms.
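The problem is small enough to check by brute force for tiny n. This Python sketch (an illustration, not part of the model) enumerates all permutations for n = 4 and keeps those whose adjacent differences form a permutation of 1..n-1:

```python
from itertools import permutations

def is_all_interval(x):
    """Check that adjacent differences form a permutation of 1..n-1."""
    n = len(x)
    diffs = [abs(x[i + 1] - x[i]) for i in range(n - 1)]
    return sorted(diffs) == list(range(1, n))

n = 4
solutions = [x for x in permutations(range(1, n + 1)) if is_all_interval(x)]
print(solutions)
# [(1, 4, 2, 3), (2, 3, 1, 4), (3, 2, 4, 1), (4, 1, 3, 2)]
```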
An alternate model uses an array y where y[i] gives the position of the number i in the sequence. We also model the positions of the differences using variables v: v[i] is the position in the sequence where the absolute difference i occurs. If the values of y[i] and y[j] differ by one, where j > i, then the numbers i and j occupy adjacent positions, so v[j-i] is constrained to be the earlier of these two positions. We can add two redundant constraints to this model: since we know that a difference of n-1 must occur, the positions of 1 and n must be adjacent (abs(y[1] - y[n]) = 1), which also tells us that the position of difference n-1 is the earlier of y[1] and y[n], i.e. v[n-1] = min(y[1], y[n]). With this we can model the problem as shown in Listing 2.7.4. The output statement recreates the original sequence x from the array of positions y.
include "alldifferent.mzn";
int: n;
array[1..n] of var 1..n: y; % position of each number
array[1..n-1] of var 1..n-1: v; % position of difference i
constraint alldifferent(y);
constraint alldifferent(v);
constraint forall(i,j in 1..n where i < j)(
(y[i] - y[j] = 1 -> v[j-i] = y[j]) /\
(y[j] - y[i] = 1 -> v[j-i] = y[i])
);
constraint abs(y[1] - y[n]) = 1 /\ v[n-1] = min(y[1], y[n]);
solve :: int_search(y, first_fail, indomain_min)
satisfy;
array[1..n] of 1..n: x :: output_only
= [ sum(i in 1..n)(i*(fix(y[j]) = i)) | j in 1..n ];
output [ "x = \(x);\n" ];
The inverse model has the same size as the original model, in terms of number of variables and domain sizes. But the inverse model expresses the relationship between the y and v variables in a much more indirect way than the original expresses the relationship between the x and u variables. Hence we might expect the original model to be better.
The command
$ minizinc --solver gecode --all-solutions --statistics allinterval2.mzn -D "n=10;"
finds all the solutions in 98343 nodes and 640 ms. So the more direct modelling of the constraints is clearly paying off.
Note that the inverse model prints out the answers using the same x view of the solution. This is managed using output_only annotations. The array x is defined as a fixed array and annotated as output_only. This means it will only be evaluated during output processing, and can only be used in output statements. Once a solution for y is discovered, the value of x is calculated from it, and hence can be displayed in the output.
Output_only annotation
The output_only annotation can be applied to variable definitions. The variable defined must not be a var type; it can only be par. The variable must also have a right-hand side definition giving its value. This right-hand side definition can make use of fix functions to access the values of decision variables, since it is evaluated at solution processing time.
2.7.5. Multiple Modelling and Channels
When we have two models for the same problem it may be useful to use both models together by tying the variables in the two models together, since each can give different information to the solver.
include "inverse.mzn";
int: n;
array[1..n] of var 1..n: x; % sequence of numbers
array[1..n-1] of var 1..n-1: u; % sequence of differences
constraint forall(i in 1..n-1)(u[i] = abs(x[i+1] - x[i]));
array[1..n] of var 1..n: y; % position of each number
array[1..n-1] of var 1..n-1: v; % position of difference i
constraint inverse(x,y);
constraint inverse(u,v);
constraint abs(y[1] - y[n]) = 1 /\ v[n-1] = min(y[1], y[n]);
solve :: int_search(y, first_fail, indomain_min)
satisfy;
output ["x = ",show(x),"\n"];
Listing 2.7.5 gives a dual model combining features of allinterval.mzn and allinterval2.mzn. The beginning of the model is taken from allinterval.mzn. We then introduce the y and v variables from allinterval2.mzn. We tie the variables together using the global inverse constraint: inverse(x,y) holds if y is the inverse function of x (and vice versa), that is, x[i] = j <-> y[j] = i. A definition is shown in Listing 2.7.6.
The model does not include the constraints relating the y and v variables; they are redundant (and indeed propagation redundant), so they do not add information for a propagation solver. The alldifferent constraints are also missing, since they are made redundant (and propagation redundant) by the inverse constraints. The only constraints are the relationships of the x and u variables and the redundant constraints on y and v.
predicate inverse(array[int] of var int: f,
array[int] of var int: invf) =
forall(j in index_set(invf))(invf[j] in index_set(f)) /\
forall(i in index_set(f))(
f[i] in index_set(invf) /\
forall(j in index_set(invf))(j == f[i] <-> i == invf[j])
);
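The inverse relationship is simply the standard notion of an inverse permutation. A small Python analogue (using 0-based lists rather than MiniZinc's 1-based arrays):

```python
def inverse(f):
    """Return invf such that f[i] == j  <->  invf[j] == i (0-based)."""
    invf = [0] * len(f)
    for i, j in enumerate(f):
        invf[j] = i
    return invf

x = [2, 0, 3, 1]          # x[i] = j: the number at position i is j
y = inverse(x)            # y[j] = i: the position of number j
print(y)                  # [1, 3, 0, 2]
assert inverse(y) == x    # the relationship is symmetric
assert all(y[x[i]] == i for i in range(len(x)))
```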
One of the benefits of the dual model is that there is more scope for defining different search strategies. Running the dual model,
$ minizinc --solver gecode --all-solutions --statistics allinterval3.mzn -D "n=10;"
which uses the search strategy of the inverse model, labelling the y variables, finds all solutions in 126031 nodes and 250ms. Note that running the same model with labelling on the x variables requires 35245 nodes and 100ms.
2.7.6. Symmetry
Symmetry is very common in constraint satisfaction and optimisation problems. To illustrate this, let us look again at the n-queens problem from Listing 2.6.1. The top left chess board in Fig. 2.7.1 shows a solution to the 8-queens problem (labeled “original”). The remaining chess boards show seven symmetric variants of the same solution: rotated by 90, 180 and 270 degrees, and each of these four boards flipped vertically.
If we wanted to enumerate all solutions to the 8-queens problem, we could obviously save the solver some work by only enumerating non-symmetric solutions, and then generating the symmetric variants ourselves. This is one reason why we want to get rid of symmetry in constraint models. The other, much more important reason, is that the solver may also explore symmetric variants of non-solution states!
For example, a typical constraint solver may try to place the queen in column 1 into row 1 (which is fine), and then try to put the column 2 queen into row 3, which, at first sight, does not violate any of the constraints. However, this configuration cannot be completed to a full solution (which the solver finds out after a little search). Fig. 2.7.2 shows this configuration on the top left chess board. Now nothing prevents the solver from trying, e.g., the second configuration from the left in the bottom row of Fig. 2.7.2, where the queen in column 1 is still in row 1, and the queen in column 3 is placed in row 2. Therefore, even when only searching for a single solution, the solver may explore many symmetric states that it has already seen and proven unsatisfiable before!
2.7.6.1. Static Symmetry Breaking
The modelling technique for dealing with symmetry is called symmetry breaking, and in its simplest form, involves adding constraints to the model that rule out all symmetric variants of a (partial) assignment to the variables except one. These constraints are called static symmetry breaking constraints.
The basic idea behind symmetry breaking is to impose an order. For example, to rule out any vertical flips of the chess board, we could simply add the constraint that the queen in the first column must be in the top half of the board:
constraint q[1] <= n div 2;
Convince yourself that this would remove exactly half of the symmetric variants in Fig. 2.7.1. You can tell the compiler that a particular constraint is a symmetry breaking constraint by wrapping it in a special call:
constraint symmetry_breaking_constraint(q[1] <= n div 2);
This has the advantage that solvers can ignore the constraint or treat it differently. It also means that you can easily switch off symmetry breaking by setting mzn_ignore_symmetry_breaking_constraints=true
, either in your model, or a data file, or as extra data using the -D command line option.
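The effect is easy to see on a tiny instance. This Python sketch (an illustration, outside MiniZinc) brute-forces 4-queens and then applies the same flip-breaking condition q[1] <= n div 2:

```python
from itertools import permutations

n = 4

def is_queens(q):
    # q[i] = row of the queen in column i; permutations already rule out
    # equal rows, so only the diagonals need checking.
    return all(abs(q[i] - q[j]) != j - i
               for i in range(n) for j in range(i + 1, n))

solutions = [q for q in permutations(range(1, n + 1)) if is_queens(q)]
print(solutions)                       # [(2, 4, 1, 3), (3, 1, 4, 2)]

# Symmetry breaking: the queen in the first column must be in the top
# half of the board. This keeps exactly one of the two flip-symmetric
# solutions.
kept = [q for q in solutions if q[0] <= n // 2]
print(kept)                            # [(2, 4, 1, 3)]
```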
In order to remove all symmetry in the n-queens problem, we need to work a bit harder. Whenever we can express all symmetries as permutations of the array of variables, a set of lexicographic ordering constraints can be used to break all symmetry. For example, if the array of variables is called x
, and reversing the array is a symmetry of the problem, then the following constraint will break that symmetry:
constraint symmetry_breaking_constraint(lex_lesseq(x, reverse(x)));
How about two-dimensional arrays? Lexicographic ordering works just the same, we only have to coerce the arrays into one dimension. For example, the following breaks the symmetry of flipping the array along one of the diagonals (note the swapped indices in the second comprehension):
array[1..n,1..n] of var int: x;
constraint symmetry_breaking_constraint(
lex_lesseq([ x[i,j] | i,j in 1..n ], [ x[j,i] | i,j in 1..n ])
);
The great thing about using lexicographic ordering constraints is that we can add multiple ones (to break several symmetries simultaneously), without them interfering with each other, as long as we keep the order in the first argument the same.
For the n-queens problem, unfortunately this technique does not immediately apply, because some of its symmetries cannot be described as permutations of the q
array. The trick to overcome this is to express the n-queens problem in terms of Boolean variables that model, for each field of the board, whether it contains a queen or not. Now all the symmetries can be modeled as permutations of this array. Since the main constraints of the n-queens problem are much easier to express with the integer q
array, we simply use both models together and add channeling constraints between them, as explained in Multiple Modelling and Channels.
The full model, with added Boolean variables, channeling constraints and symmetry breaking constraints is shown in Listing 2.7.7. We can conduct a little experiment to check whether it successfully breaks all the symmetry. Try running the model with increasing values for n
, e.g. from 1 to 10, counting the number of solutions (e.g., by using the -s flag with the Gecode solver, or selecting “Print all solutions” as well as “Statistics for solving” in the IDE). You should get the following sequence of numbers of solutions: 1, 0, 0, 1, 2, 1, 6, 12, 46, 92. To verify the sequence, you can search for it in the On-Line Encyclopedia of Integer Sequences (http://oeis.org).
constraint symmetry_breaking_constraint(
let {
% Alternative Boolean model:
% Map each position i,j to a Boolean telling us whether there is a queen at i,j
array[1..n,1..n] of var bool: qb;
} in
% Channeling constraint
forall (i,j in 1..n) ( qb[i,j] <-> (q[i]=j) )
% Lexicographic symmetry breaking constraints
/\ lex_lesseq(array1d(qb), [ qb[j,i] | i,j in 1..n ])
/\ lex_lesseq(array1d(qb), [ qb[i,j] | i in reverse(1..n), j in 1..n ])
/\ lex_lesseq(array1d(qb), [ qb[j,i] | i in 1..n, j in reverse(1..n) ])
/\ lex_lesseq(array1d(qb), [ qb[i,j] | i in 1..n, j in reverse(1..n) ])
/\ lex_lesseq(array1d(qb), [ qb[j,i] | i in reverse(1..n), j in 1..n ])
/\ lex_lesseq(array1d(qb), [ qb[i,j] | i,j in reverse(1..n) ])
/\ lex_lesseq(array1d(qb), [ qb[j,i] | i,j in reverse(1..n) ])
);
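The claimed counts 1, 0, 0, 1, 2, 1, … can be verified independently for small n by brute force: generate all n-queens solutions, group them under the eight board symmetries, and count the equivalence classes. A self-contained Python check (not part of the MiniZinc model):

```python
from itertools import permutations

def queens(n):
    """All n-queens solutions, each as a frozenset of (col, row), 0-based."""
    sols = []
    for q in permutations(range(n)):
        if all(abs(q[i] - q[j]) != j - i
               for i in range(n) for j in range(i + 1, n)):
            sols.append(frozenset((c, q[c]) for c in range(n)))
    return sols

def symmetries(cells, n):
    """The 8 dihedral transforms of a set of board cells."""
    out = []
    for fc in (False, True):            # flip columns
        for fr in (False, True):        # flip rows
            for swap in (False, True):  # transpose
                t = set()
                for (c, r) in cells:
                    cc = n - 1 - c if fc else c
                    rr = n - 1 - r if fr else r
                    t.add((rr, cc) if swap else (cc, rr))
                out.append(frozenset(t))
    return out

def non_symmetric_count(n):
    # Canonical representative of each orbit: the lexicographically
    # smallest cell list among all 8 symmetric variants.
    canon = {min(tuple(sorted(s)) for s in symmetries(cells, n))
             for cells in queens(n)}
    return len(canon)

print([non_symmetric_count(n) for n in range(1, 7)])  # [1, 0, 0, 1, 2, 1]
```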
2.7.6.2. Other Examples of Symmetry
Many other problems have inherent symmetries, and breaking these can often make a significant difference in solving performance. Here is a list of some common cases:
Bin packing: when trying to pack items into bins, any two bins that have the same capacity are symmetric.
Graph colouring: When trying to assign colours to nodes in a graph such that adjacent nodes must have different colours, we typically model colours as integer numbers. However, any permutation of colours is again a valid graph colouring.
Vehicle routing: if the task is to assign customers to certain vehicles, any two vehicles with the same capacity may be symmetric (this is similar to the bin packing example).
Rostering/time tabling: two staff members with the same skill set may be interchangeable, just like two rooms with the same capacity or technical equipment.
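The graph colouring case is a typical value symmetry: relabelling the colours of any proper colouring gives another proper colouring. A small Python illustration (the names and instance here are hypothetical):

```python
def is_proper(colouring, edges):
    """A colouring is proper if no edge joins two equal colours."""
    return all(colouring[u] != colouring[v] for (u, v) in edges)

# A 4-cycle 0-1-2-3-0, coloured with two colours.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
colouring = {0: 1, 1: 2, 2: 1, 3: 2}
assert is_proper(colouring, edges)

# Swapping colours 1 and 2 (a permutation of colour values) is again
# a proper colouring, so the two assignments are symmetric.
swap = {1: 2, 2: 1}
permuted = {v: swap[c] for v, c in colouring.items()}
assert is_proper(permuted, edges)
print("both colourings are proper")
```

A common static break for such value symmetries is to require that colour values appear in order of first use, e.g. via the value_precede_chain constraint from the MiniZinc globals library.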