3.6. Automatic Solution Checking, Model Validation, and Benchmarking

MiniZinc supports checking the correctness of solutions of a model in two ways. First, it is the autonomous checking where compiler substitutes solution values into the model and tries to find inconsistencies, possibly invoking a solver to a much smaller instance. Second, it is by using a checker model. This latter method can also be seen as model validation.

3.6.1. Autonomous Automatic Solution Checking and Benchmarking

For this method, compiler needs the values of as many variables as possible, ideally all variables without the right-hand side. For all top-level variables (not those inside let constructs) the values can be output using --output-mode dzn compiler option. For optimization problems, when no explicit objective variable exists, a variable named _objective is added by --output-objective. Then, each produced solution can be copied into a .dzn and compiled together with the original instance, adding --allow-multiple-assignments option if --output-objective was used. In the case that some variables inside let constructs need to be assigned, a solver has to be invoked for the smaller instance remaining after fixing the supplied values. For example, for the following model:

var int: a;
var float: b = a;
solve
  maximize a-b;
output
  [ "a+b is \(a+b)\n" ];

running it with minizinc --solver gecode test_output_mode.mzn --output-mode dzn --output-objective ignores the provided output annotation and prints

a = -2147483646;
_objective = 0.0;
----------
==========

which can be added as an extra .dzn file. The process of compilation for autonomous checking and re-solving with the output variables fixed, is automated by a Python script tests/benchmarking/mzn-test.py. To solve an instance with autonomous checking by variable value substitution, run, e.g.,

mzn-test.py --solver gecode model.mzn data.dzn

Moreover, mzn-test.py provides facility to run a list of instances and compare results from various test runs and different solvers.

3.6.2. Model Validation: Automatic Solution Checking with a Checker Model

This approach has two main applications:

  • Instructors can provide checker models for student assignments. This provides students with immediate, detailed feedback on their modelling attempts.

  • A simplified checker model can be used to verify a complex model used for solving. This can be useful when experimenting with new decompositions of constraints, or for post-processing solutions if some constraints cannot be added to the original model (e.g. in case the solver does not support certain constructs).

Running MiniZinc with solution checking is easy. On the command line, simply pass the name of the checker model in addition to the problem model:

minizinc model.mzn model.mzc.mzn data.dzn

Checker models can be pre-compiled in order to obfuscate their contents (e.g. if they contain clues to students how to model the problem):

minizinc --compile-solution-checker model.mzc.mzn

This will create the compiled checker model model.mzc, which can be used instead of the clear text version for checking:

minizinc model.mzn model.mzc data.dzn

The MiniZinc IDE has built-in support for solution checkers. Whenever the current project contains a file with the same file name as the current model but file extension .mzc or .mzc.mzn, the “Run” button turns into a “Run+Check” button.

The rest of this section describes how to implement checker models.

3.6.2.1. Basic checker models

At its core, a checker model takes each solution that a solver produces as input, and outputs whether the solution is correct or not. Let’s use the simple map colouring model from the tutorial as an example. Here is the model again:

列表 3.6.1 A MiniZinc model aust.mzn for colouring the states and territories in Australia
% Colouring Australia using nc colours
int: nc = 3;

var 1..nc: wa;   var 1..nc: nt;  var 1..nc: sa;   var 1..nc: q;
var 1..nc: nsw;  var 1..nc: v;   var 1..nc: t;

constraint wa != nt;
constraint wa != sa;
constraint nt != sa;
constraint nt != q;
constraint sa != q;
constraint sa != nsw;
constraint sa != v;
constraint q != nsw;
constraint nsw != v;
solve satisfy;

output ["wa=\(wa)\t nt=\(nt)\t sa=\(sa)\n",
        "q=\(q)\t nsw=\(nsw)\t v=\(v)\n",
         "t=", show(t),  "\n"];

A checker model for this model requires the values of the variables wa, nt, and so on, for each solution, and then has to test whether all constraints hold. The output of the checker model should contain a line starting with CORRECT if the solution passes the test, or INCORRECT if it doesn’t.

Since these values will be fixed in any solution, checker models simply declare parameters with the same name as the model variables:

列表 3.6.2 A MiniZinc checker model aust.mzc.mzn for the map colouring problem
int: wa;
int: nt;
int: sa;
int: q;
int: nsw;
int: v;
int: t;

output [
  if wa!=nt /\ wa!=sa /\ nt!=sa /\ nt!=q /\ sa!=q /\
     sa!=nsw /\ sa!=v /\ q!=nsw /\ nsw!=v
  then "CORRECT\n"
  else "INCORRECT\n"
  endif
];

Running the model and the checker will produce output like this:

wa=3   nt=2    sa=1
q=3    nsw=2   v=3
t=3
% Solution checker report:
% CORRECT
----------

The solution checker report is embedded as comments in the original output.

3.6.2.2. Detailed feedback

The basic checker model above only reports whether the solutions satisfy the constraints, but it doesn’t provide any insights into the nature of the error if a constraint is violated.

We can use standard MiniZinc functionality to provide much more detailed feedback. The following checker model introduces a helper function check which outputs a detailed error message if a constraint doesn’t hold. The results of all the checks are combined, and if any of the constraints was violated, the output is INCORRECT, otherwise it is CORRECT.

列表 3.6.3 A checker model aust-2.mzc.mzn for the map colouring problem with more detailed error messages
int: wa;
int: nt;
int: sa;
int: q;
int: nsw;
int: v;
int: t;

test check(bool: b,string: s) =
  if b then true else trace_stdout("ERROR: "++s++"\n",false) endif;

output [
  if check(wa!=nt, "wa and nt have the same colour")
  /\ check(wa!=sa, "wa and sa have the same colour")
  /\ check(nt!=sa, "nt and sa have the same colour")
  /\ check(nt!=q, "nt and q have the same colour")
  /\ check(sa!=q, "sa and q have the same colour")
  /\ check(sa!=nsw, "sa and nsw have the same colour")
  /\ check(sa!=v, "sa and v have the same colour")
  /\ check(q!=nsw, "q and nsw have the same colour")
  /\ check(nsw!=v, "nsw and v have the same colour")
  then "CORRECT: all constraints hold.\n"
  else "INCORRECT\n"
  endif
];

However, the checker model will only report the first violated constraint, since the conjunction operator short-circuits the evaluation when one of its arguments is false. For example, if we remove all constraints from the original model, the output would be:

wa=3   nt=3    sa=3
q=3    nsw=3   v=3
t=3
% Solution checker report:
% ERROR: wa and nt have the same colour
% INCORRECT
----------

In order to get all error messages, we can force the evaluation of all checkers by creating an auxiliary array of check results:

列表 3.6.4 A checker model aust-3.mzc.mzn for the map colouring problem without short-circuit evaluation
int: wa;
int: nt;
int: sa;
int: q;
int: nsw;
int: v;
int: t;

test check(bool: b,string: s) =
  if b then true else trace_stdout("ERROR: "++s++"\n",false) endif;

array[int] of bool: checks = [
  check(wa!=nt, "wa and nt have the same colour"),
  check(wa!=sa, "wa and sa have the same colour"),
  check(nt!=sa, "nt and sa have the same colour"),
  check(nt!=q, "nt and q have the same colour"),
  check(sa!=q, "sa and q have the same colour"),
  check(sa!=nsw, "sa and nsw have the same colour"),
  check(sa!=v, "sa and v have the same colour"),
  check(q!=nsw, "q and nsw have the same colour"),
  check(nsw!=v, "nsw and v have the same colour")
];

output [
  if forall(checks)
  then "CORRECT: all constraints hold.\n"
  else "INCORRECT\n"
  endif
];

Now the output contains all error messages (for the case where the model has no constraints):

wa=3   nt=3    sa=3
q=3    nsw=3   v=3
t=3
% Solution checker report:
% ERROR: nsw and v have the same colour
% ERROR: q and nsw have the same colour
% ERROR: sa and v have the same colour
% ERROR: sa and nsw have the same colour
% ERROR: sa and q have the same colour
% ERROR: nt and q have the same colour
% ERROR: nt and sa have the same colour
% ERROR: wa and sa have the same colour
% ERROR: wa and nt have the same colour
% INCORRECT
----------

3.6.2.3. Instance data in checker models

The map colouring example was quite simple because the model did not contain any parameter declarations. For a model that is parameterised, the checker model simply contains the same parameter declarations. MiniZinc will then pass the actual parameters of the instance being solved to the checker model.

For example, the following checker model could be used for the n-Queens problem.

列表 3.6.5 A checker model nqueens.mzc.mzn for the n-Queens problem
int: n; % number of queens
set of int: ROW = 1..n;
set of int: COL = 1..n;
array[int] of int: q; % col of queen in each row

test check(bool: b,string: s) =
  if b then true else trace_stdout("ERROR: "++s++"\n",false) endif;

output [
  if check(index_set(q)=1..n, "ERROR: array q should have index set 1..\(n)")
  /\ forall(i in 1..n)(check(q[i] in 1..n, "ERROR: q[\(i)] should have a value in 1..\(n)"))
  /\ forall(r1, r2 in 1..n where r1 < r2)
           (check(q[r1] != q[r2],
            "queens in rows \(r1) and \(r2) are on same column\n")
            /\
            check(q[r1]+r1 != q[r2]+r2,
            "queens in rows \(r1) and \(r2) are on same up diagonal\n")
            /\
            check(q[r1]-r1 != q[r2]-r2,
            "queens in rows \(r1) and \(r2) are on same down diagonal\n")
           )
  then "CORRECT: All constraints hold"    
  else "INCORRECT" endif ];

The checker model first makes sure that the solution has the right dimensions (correct array index set, and each variable is assigned a value in the correct domain), and then uses standard MiniZinc constructs to check each constraint.

3.6.2.4. Checking optimisation problems

Optimisation problems pose a difficulty for automatic checking. When a solver claims to prove optimality, we cannot easily verify this claim without solving the problem again (using a different model that is known to be correct). At the moment, solution checking for optimisation problems is restricted to checking that the objective has been computed correctly.

To check that the objective value is calculated correctly, define a parameter called _objective of the appropriate type (int or float) in the checker. The example in the next section illustrates this.

3.6.2.5. Hidden variables

For many models, the decision variables which describe the solution may not be the variables that are natural to describe the constraints. There may be other internal variables which are functionally defined by the decision variables, and which are likely to be used in the model for building constraints and/or are much more natural for describing the correctness of a candidate solution.

Consider the problem of lining up n people numbered from 1 to n for a photo in a single line. We want to make sure that there are no more than two people of the same gender (male, female or other) in sequence in the line. We want to minimize the total distance between the pairs of people who are numbered consecutively. The decisions are an array pos which for each person gives their position in the line. A correct model for this is given in 列表 3.6.6.

列表 3.6.6 A model photo.mzn for the photo lineup problem
int: n;                          % number of people
set of int: PERSON = 1..n;       % set of people
enum GENDER = { M, F, O };       % set of genders
array[PERSON] of GENDER: g;      % the gender of each person
set of int: POSN = 1..n;         % set of positions

array[PERSON] of var POSN: pos;  % decisions: a position for each person

array[POSN] of var PERSON: who;  % view: a person for each position
include "inverse.mzn";
constraint inverse(pos,who);     % channel from decisions to view
constraint forall(i in 1..n-2)
                 (g[who[i]] != g[who[i+1]] \/ g[who[i+1]] != g[who[i+2]]);

solve minimize sum(i in 1..n-1)(abs(pos[i] - pos[i+1]));

A critical part of learning how to model this problem is to realise that it is worth introducing the inverse of the pos variables, called who in this model. The who array makes it much easier to specify that no more than two people of the same gender can be adjacent. The checker model should not include the who variables, because that would effectively give away the key to the solution.

The use of hidden variables (such as who) makes checking harder in two ways. First, we may need to define these in the checker model as decision variables (rather than parameters), and MiniZinc will then need to invoke a solver to generate the values of these variables in order to check a given solution. Second, given an incorrect candidate solution, there may be no possible value for these hidden variables, hence we have to guard the checking to ensure we do not assume that there is some value for the hidden variables.

Consider this data file for the photo problem:

列表 3.6.7 Example data for the photo.mzn photo lineup problem (photo.dzn)
n = 9;
g = [M,M,M,M,F,F,F,M,M];
n = 9;
g = [M,M,M,M,F,F,F,M,M];

and assume that an incorrect model returns the erroneous candidate solution pos = [1,2,3,4,5,6,7,2,9]; _objective=25;. If we simply assert inverse(pos,who) in the checker model, then this constraint will fail since there is no inverse of this position array, because persons 2 and 8 are both at position 2. Hence the checker must guard the inverse constraint, and only when that succeeds use the computed values of the who variables for checking the solution. A checking model for the photo problem might look like this:

列表 3.6.8 A checker model photo.mzc.mzn for the photo lineup problem
int: n;
set of int: PERSON = 1..n;
enum GENDER = { M, F, O };
array[PERSON] of GENDER: g;
set of int: POSN = 1..n;
array[int] of int: pos;

array[POSN] of var PERSON: who;
int: _objective;

test alldifferent(array[int] of int: x) =
                 forall(i,j in index_set(x) where i < j)(x[i] != x[j]);

include "inverse.mzn";
constraint if alldifferent(pos) then inverse(pos,who) 
           else forall(i in 1..n)(who[i] = 1) endif;

output [if 
  check_array(pos, n, POSN, "pos") /\
  check_alldifferent(pos,"pos") /\
  check_array(fix(pos), n, PERSON, "who") /\
  forall(i in 1..n-2)
        (check(g[fix(who[i])] != g[fix(who[i+1])] \/
               g[fix(who[i+1])] != g[fix(who[i+2])],
               "three people of the same gender \(g[fix(who[i])]) in positions \(i)..\(i+2)\n")) /\
  let { int: obj = sum(i in 1..n-1)(abs(pos[i] - pos[i+1])); } in
    check(obj = _objective, "calculated objective \(obj) does not agree with objective from the model (\(_objective))\n")
  then
    "CORRECT: All constraints hold"
  else       
    "INCORRECT"
  endif];
              
test check(bool: b,string: s) =
  if b then true else trace_stdout("ERROR: "++s++"\n",false) endif;

test check_alldifferent(array[int] of int: x, string: name) =
  forall(i, j in index_set(x) where i < j)
        (check(x[i] != x[j], name ++ "[\(i)] = \(x[i]) = " ++ 
         name ++ "[\(j)] " ++
         "when they should be different\n"));              
 
test check_int(int: x, set of int: vals, string: name) =
  check(x in vals, "integer \(name) is not in values \(vals)\n");

function bool: check_array(array[int] of int: x, int: length, set of int: vals, string: name) =
  check(length(x) = length, "array \(name) is not of length \(length)\n") /\
  forall(i in index_set(x)) 
        (check(x[i] in vals, "element \(i) of array \(name), \(x[i]) is not in values \(vals)\n"));  

The checker model first tests whether the given pos array satisfies the alldifferent property (using a custom test for alldifferent on a par array). If it passes the test, the inverse constraint is applied. Otherwise, the who array is simply fixed to a list of ones.

The check for the alldifferent constraint is a good example for a checker that tries to give a detailed description of the error.