FICO Xpress Optimization Examples Repository: Solving different types of Sudokus with the FICO Xpress Optimizer

FICO Xpress Optimization Examples Repository
FICO Optimization Community
FICO Xpress Optimization Home
Solving different types of Sudokus with the FICO Xpress Optimizer

Description
We formulate and solve Sudoku puzzles as binary optimization problems.
Further explanation of this example: Xpress R Reference Manual
sudoku_r.zip
[download all files]
Source Files
By clicking on a file name, a preview is opened at the bottom of this page.
sudoku.R
[download]
sudoku.R
#####################################
# This file is part of the          #
# Xpress-R interface examples       #
#                                   #
#   (c) 2022-2024 Fair Isaac Corporation #
#####################################
#' ---
#' title: "Solving Sudoku Puzzles"
#' ---
#'
## ----setup, include=FALSE-----------------------------------------------------
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(results = "hold")

#'
#' We formulate and solve Sudoku puzzles as binary optimization problems. This
#' example is inspired by the [Sudoku
#' example](http://roi.r-forge.r-project.org/use_case_sudoku.html) of the ROI
#' package in R, but also by the Sudoku enthusiasm of the author.
#'
#' First, we will create a binary formulation inside a function,
#' `load_sudoku_model`, where we first load the general model. We fix the initial
#' assignment by modifying the lower bounds of variables for which we are given the
#' solution.
#'
#' Secondly, we will extend the classical Sudoku by further constraints
#' that forbid the same element along diagonals.
#'
#' Thirdly, we will extend the model further to incorporate a magic square and
#' knight's move constraints to solve [this
#' puzzle]((https://www.youtube.com/watch?v=hAyZ9K2EBF0)) with the FICO Xpress
#' Optimizer.
#'
#' For all this, we use the `sudokuAlt` package.
#'
## ----Load packages------------------------------------------------------------
# load the xpress library
suppressMessages(library(xpress))

# load the Sudoku package
library(sudokuAlt)

# for data transformations
library(dplyr)
set.seed(123)
game <- makeGame()
plot(game)

#'
#' # Mathematical Formulation
#'
#' A classical Sudoku puzzle is formulated in a 9x9 grid of cells,
#' where the cells are subdivided into 3x3 subcells, the so-called "squares".
#' The goal is to fill all cells with the numbers 1 through 9, such
#' that each number appears only once in
#'
#' - each cell
#' - each row
#' - each square
#'
#' Let $R = \{1,\dots, 9\}$,
#' $C = \{1,\dots, 9\}$, and $S = \{S_{11}, S_{12}, \dots, S_{33}\}$,
#' where each $S_{ij}$ denotes a 9-element subset of $R\times C$,
#' e.g., $S_{13} = \{1,2,3\} \times \{7,8,9\}$ is the top right square of the Sudoku.
#'
#' In general
#' $$
#'   S_{ij} := \{3i - 2, 3i - 1, 3i\} \times \{3j - 2, 3j - 1, 3j\}  \quad \forall i,j \in \{1,2,3\}
#' $$
#' Let $L = \{1,\dots, 9\}$ denote the allowed symbols.
#' Now we are ready to formulate the binary program that encodes
#' Sudoku using 9 binary variables per cell,
#' 1 for each symbol. If $x_{rcl} = 1$, the symbol $l$ should
#' be placed in cell $(r,c)$ of the Sudoku board.
#'
#' Let $\emptyset \neq V_0 \subseteq R\times C \times L$
#' denote the given set of nonconflicting assignments.
#'
#' $$
#' \begin{align}
#'   &\min & 0\\
#'   & & \sum\limits_{l = 1}^9 x_{rcl} & =  1 & \forall r\in R, c \in C\\
#'   & & \sum\limits_{r = 1}^9 x_{rcl} & =  1 & \forall c \in C, l \in L\\
#'   & & \sum\limits_{c = 1}^9 x_{rcl} & =  1 & \forall r \in R, l \in L\\
#'   & & \sum\limits_{(r,c) \in S_{ij}} x_{rcl} & =  1 & \forall i,j\in \{1,2,3\}, l \in L\\
#'   & &                       x_{rcl} & = 1 & \forall (r,c,l) \in V_0\\
#'   & &                       x_{rcl} & \in \{0,1\} & \forall r \in R, c \in C, l \in L\\
#' \end{align}
#' $$
#'
#' In this case, we have no objective function. There are five sets of constraints. The first constraint group ensures that each cell contains exactly 1 symbol. The next three model the restriction that each symbol should appear exactly once
#' in each column, row, and square, respectively.
#' The last set of constraints fixes the initial assignment. Note that with an additional assignment, there are further variables that could be fixed to 0.
#' We do not add those restrictions to 0 to the model itself, because these additional fixings can be inferred immediately from the
#' initial assignments during presolving by the Optimizer.
#'
#'
#' # The Load Sudoku Function
#'
#' We declare a function `load_sudoku_problem` that takes as input an
#' XPRSprob object (which can be NULL, in which case we create a new one),
#' and a Sudoku game.
#'
## -----------------------------------------------------------------------------
#' loads a sudoku problem into the FICO Xpress Optimizer
#'
#' @param prob an existing XPRSprob or NULL, in which case we create a new problem
#' @param sudoku_game  a Sudoku Game created via sudokuAlt::makeGame()
#'
#' @return list object into which the Sudoku Game formulation was loaded.
#'        This list has two named attributes:
#'          -  \code{prob} is the same \code{prob} if \code{prob} was non-NULL,
#'            or a newly created XPRSprob object
#'          - \code{index.df} The index data frame that contains the initial assignment
#'            and variable indices for each row/column/label triplet r,c,l.
#'
load_sudoku_problem <- function(prob=NULL, sudoku_game) {

  # create a new prob object if none was specified
  if (is.null(prob)) {prob <- createprob()}

  problemdata <- list()

  # create some sets to match the notation above
  R <- 1:9
  C <- 1:9
  L <- 1:9

  # create a convenient data frame that contains all combinations
  # of R,C,and L to simplify indexing
  index.df <- expand.grid(R,C,L)
  names(index.df) <- c("R", "C", "L")


  # infer the subcell for each row/column combination.
  index.df$Si <- (index.df$R - 1) %/% 3 + 1
  index.df$Sj <- (index.df$C - 1) %/% 3 + 1

  # we introduce var index to label each of the 9 * 9 * 9 = 729 binary variables
  n <- nrow(index.df)
  index.df$VarIndex <- 1:n

  # We use the designGame function to query a data frame representation of the Sudoku.
  game.df <- sudokuAlt::designGame(sudoku_game)
  # head of typical game.df. Unassigned symbols show as LNA.
  #     Row Col Square Symbol
  # 1  R3  C3    S11     L9
  # 2  R1  C6    S12     L6
  # 3  R2  C4    S12     L4
  # 4  R1  C7    S13     L5
  # 5  R1  C8    S13     L7
  # 6  R3  C8    S13     L2
  # We convert the game.df and merge the symbols
  game.df <- game.df[game.df$Symbol != "LNA",]
  game.df$R <- as.integer(gsub("R", "", game.df$Row))
  game.df$C <- as.integer(gsub("C", "", game.df$Col))
  game.df$Si <- as.integer(gsub("S(.).", "\\1", game.df$Square))
  game.df$Sj <- as.integer(gsub("S.(.)", "\\1", game.df$Square))
  index.df <- merge(index.df, game.df, all.x = T)
  index.df <- index.df[order(index.df$VarIndex),]

  #the number of constraints: 9 * 9 * 4 (324)
  nconss <- 9 * 9 * 4

  problemdata$objcoef <- rep(0, n)

  # We formulate the constraints as 4 sparse column matrices
  # Each constraint group arranges the variables differently,
  # namely across a major and a minor index.
  to_sparse_matrix <- function(major, minor) {
    considx <- 9 * (major - 1) + minor
    sparse_matrix <- Matrix::Matrix(0, 9*9, n, sparse = T)
    idxmatrix <- matrix(c(considx,1:n), ncol = 2)
    sparse_matrix[idxmatrix] <- 1
    sparse_matrix
  }

  # 1 symbol in each cell: Constraint index determined by row and column
  cell.cons.mat <- to_sparse_matrix(index.df$R, index.df$C)

  # each symbol only once per column: Constraint index determined by column and Label
  col.cons.mat <- to_sparse_matrix(index.df$C, index.df$L)

  # each symbol only once per row: Constraint index determined by row and Label
  row.cons.mat <- to_sparse_matrix(index.df$R, index.df$L)
  #
  # each symbol only once in each square:
  # Constraint index determined by label and index of square
  # use the square indices Si and Sj to compute the minor indices ranging from 1:9
  square.cons.mat <- to_sparse_matrix(index.df$L, 3*(index.df$Si - 1) + index.df$Sj)

  problemdata$A <- rbind(
    cell.cons.mat,
    col.cons.mat,
    row.cons.mat,
    square.cons.mat
  )

  # right hand sides. we only treat the cardinality constraint here.
  # Initial assignment is realized later via bounds on variables.
  problemdata$rhs <- rep(1, nconss)
  # declare all constraints as equations
  problemdata$rowtype <- rep("E", nconss)

  # all upper bounds are 1
  problemdata$ub <- rep(1, n)

  # fix the initial assignment by setting variable lower bounds
  initial.assignment <- which(sprintf("L%d", index.df$L) == index.df$Symbol)

  problemdata$lb <- rep(0, n)
  problemdata$lb[initial.assignment] <- 1

  # all columns are binary
  problemdata$columntypes <- rep("B", n)

  # assign column names to identify the variables when writing the problem to a file
  problemdata$colname <- sprintf("x_R%dC%dL%d", index.df$R, index.df$C, index.df$L)

  problemdata$probname <- "Sudoku"
  # this call also returns prob from the function
  list(prob=xpress::xprs_loadproblemdata(prob, problemdata=problemdata),
       index.df=index.df
  )
}

#'
#'
#' # Solving a Sudoku with Xpress
#'
#' We call `load_sudoku_problem` on the game that we created at the beginning
#' of this tutorial.
#'
#' The problem is ready to be solved. We enable console output and
#' solve the Sudoku using `mipoptimize`.
#'
## ----solve--------------------------------------------------------------------
problist <- load_sudoku_problem(sudoku_game = game)
prob <- problist$prob

print(prob)
setoutput(prob)
mipoptimize(prob)

#'
#' # Plot the Result
#'
#' We want to convert the binary solution into a 9-by-9 matrix containing the
#' entries 1--9.
#'
#' Therefore, we use the index data frame. Recall that each row $i$ of this data
#' frame corresponds to the $i$'th column of the Sudoku model.
#' We filter those index rows for which the corresponding binary variable
#' has a solution value of 1.
#'
#' We use the dplyr function arrange to sort the filtered entries by row and column.
#' The corresponding symbols are then used to fill a matrix representation
#' of the binary solution.
#'
#' Finally, we use the plot functionality of the `sudokuAlt` package
#' to display the solution.
#'
## ----plottheresult------------------------------------------------------------
# translate the MIP solution into a solution of the game:
solution <- getmipsol(prob)$x

# nice trick to enable Sudoku style plotting from ROI example:
to_sudoku_solution <- function(solution, index.df) {
  rowbyrowsol <- index.df %>%
    dplyr::filter(near(solution,1)==TRUE) %>%
    dplyr::arrange(R, C) %>%
    dplyr::select(L)
  matrix_solution <- matrix(rowbyrowsol$L, 9,9, byrow = T)
  sudoku_solution <- structure(matrix_solution, class = c("sudoku", "matrix"))
  sudoku_solution
}

sudoku_solution <- to_sudoku_solution(solution, problist$index.df)
par(mfrow = c(1,2))
plot(game)
plot(sudoku_solution)

#'
#' # Add Diagonals
#'
#' A Sudoku with diagonals constrains the symbols along the two main diagonals.
#' The main diagonal $D^+$ is characterized as all cells
#' $(r,c)\in R \times C$ where $r = c$.
#' The antidiagonal $D^-$ is characterized as $(r,c) \in R \times C$ such that $r+c = 10$.
#'
#'
#' ## Mathematical notation
#'
#' The $18$ additional constraints for the diagonal and anti diagonal, using the same variable notation as before, then read
#'
#' $$
#' \begin{align}
#'   & & \sum\limits_{r=c} x_{rcl} & =  1 & \forall l \in L\tag{D+}\\
#'   & & \sum\limits_{r+c=10} x_{rcl} & =  1 & \forall l \in L\tag{D-}\\
#' \end{align}
#' $$
#'
#'
#' ## Game Creation
#'
#' In order to use these diagonal constraints,
#' we first create a new game with fewer initial assignments.
#' The default has about 20 initial assignments.
#' Together with diagonal constraints, this may result in an infeasible MIP.
#'
## ----diagonals_create_game----------------------------------------------------

# create a game with slightly less initial assignments than before.
sudoku_game <- sudokuAlt::makeGame(gaps = 68)

# load it using our function.
problist <- load_sudoku_problem(sudoku_game = sudoku_game)
probdiag <- problist$prob
index.df <- problist$index.df

print(probdiag)
setoutput(probdiag)

#'
#' ## Main Diagonal
#'
#' We add the constraints for the main diagonal via `addrows`.
#' The main diagonal consists of all cells whose
#' row and column index are equal.
#' The sorting via `arrange` ensures the correct grouping of
#' all variables that belong to 1 symbol
#' required by `addrows`.
#'
#' Remember that the arguments `start` and `colind` to addrows must be zero-based.
#' In addition, `start` has length 10 (`nrows + 1`), where the last entry contains the total
#' number of elements.
#'
## ----adding_main_diagonal-----------------------------------------------------
add_main_diagonal <- function(prob, index.df) {
  diagonal.df <- index.df %>%
    dplyr::filter(R == C) %>%
    dplyr::arrange(L)

  prob <- addrows(
    prob = prob,
    rowtype = rep("E", 9),
    start = c((9 * (0:8)), 81),
    rhs = rep(1, 9),
    colind = (diagonal.df$VarIndex) - 1,
    rowcoef = rep(1,81)
  )

  prob
}

#'
#' ## Anti Diagonal
#'
#' In contrast to the diagonal, only the filter condition in the below selection
#' is modified for the anti diagonal.
#'
## ----adding_antidiagonal------------------------------------------------------

add_anti_diagonal <- function(prob, index.df) {
  antidiagonal.df <- index.df %>%
    filter(R + C == 10) %>%
    dplyr::arrange(L)

  # note that only the data frame that provides column indices changes between the two calls to addrows().
  prob <- addrows(
    prob = prob,
    rowtype = rep("E", 9),
    start = c((9L * (0:8)), 81),
    rhs = rep(1, 9),
    colind = (antidiagonal.df$VarIndex) - 1,
    rowcoef = rep(1,81)
  )

  prob
}

#'
#'
#' ## Solve with Diagonal Constraints
#'
#' We invoke the solution process and display the solution as before. Note that this time, the solution process is no longer trivial,
#' as the Branch-and-Bound search actually has to explore a small search tree for finding a solution.
#'
#' As for the simple Sudoku without diagonal constraints,we use the convenience
#' function to `to_sudoku_solution` to convert the binary representation
#' obtained from the Optimizer into a 9-by-9 matrix representation
#' for plotting.
#'
## ----solve_with_diagonals-----------------------------------------------------
probdiag <- add_main_diagonal(probdiag, index.df)
probdiag <- add_anti_diagonal(probdiag, index.df)
mipoptimize(probdiag)
summary(probdiag)
solution <- getmipsol(probdiag)$x

sudoku_solution <- to_sudoku_solution(solution, index.df)
par(mfrow = c(1,2))
plot(sudoku_game)
plot(sudoku_solution)

#'
#'
#' # Solving a Sudoku with Only 4 Digits
#'
#' Finally, we turn our attention towards the following masterpiece
#' of Sudoku puzzles. In his Youtube channel
#' [Cracking the Cryptic](https://www.youtube.com/watch?v=hAyZ9K2EBF0),
#' Simon Anthony solves a Sudoku with only 4 given digits. We recommend
#' watching this video.
#'
#' The rules that apply to this Sudoku are as follows.
#'
#' * Classical rules (each digit once per row, column, and square)
#' * Diagonals, see Section above
#' * The central square must also form a *magic square*, meaning that
#'   the digits in each row and column of the central square must sum up
#'   to the same number.
#' * Cells that are a knight's move apart (in chess) must not contain the same digit.
#'
#' We would like to solve this puzzle using Xpress. We have already established
#' the building blocks for the classical rules (via `load_sudoku_problem`)
#' and for the diagonals.
#'
#' How do we model a magic square? The contribution of a single digit in cell
#' $r,c$ can be expressed as $\sum_{l=1}^{9} l \cdot x_{rcl}$ because
#' we know that exactly one of the 9 binary variables is set to 1.
#'
#' The digit sum in the top row of the central square (row 4) can then be expressed
#' as $\sum\limits_{c=4}^{6}\sum\limits_{l=1}^{9} l \cdot x_{4cl}$.
#' Since the total sum of all 3 rows (columns) in the central square must be equal to 45,
#' the digits in each row (column) must sum up to 15.
#'
#' We arrive at the following additional magic square constraints:
#'
#' $$
#' \begin{align}
#'   & & \sum\limits_{c=4}^{6}\sum\limits_{l=1}^{9} l \cdot x_{rcl} & =  15 & \forall r \in \{4,5,6\}\\
#'   & & \sum\limits_{r=4}^{6}\sum\limits_{l=1}^{9} l \cdot x_{rcl} & =  15 & \forall c \in \{4,5,6\}\\
#' \end{align}
#' $$
#' We declare a function that receives a prob and an index data frame
#' and adds the 6 magic square constraints.
#'
## ----Adding a Magic Square----------------------------------------------------
add_magic_square <- function(prob, index.df) {
  # consider only the central square S_22
  central.square.df <- index.df %>% filter(Si == 2) %>% filter(Sj == 2)

  # sort by row
  central.square.df.by.row <- central.square.df %>% arrange(R)
  # add the constraints for each row
  prob <- addrows(prob, rowtype = c("E", "E", "E"), rhs = c(15,15,15), start = c(0,27,54,81),
          colind = central.square.df.by.row$VarIndex - 1,
          rowcoef = central.square.df.by.row$L)

  # sort by column
  central.square.df.by.col <- central.square.df %>% arrange(C)
  # add the constraints for each column
  prob <- addrows(prob, rowtype = c("E", "E", "E"), rhs = c(15,15,15), start = c(0,27,54,81),
          colind = central.square.df.by.col$VarIndex - 1,
          rowcoef = central.square.df.by.col$L)

  # return prob with the added constraints
  prob
}

#'
#'
#' Consider the cell $r,c$ where $r \in R$ and $c \in C$. A knight in
#' chess may move by either 2 horizontal and 1 vertical
#' or 1 horizontal and 2 vertical steps. Denote by $K(r,c)\subset R\times C$ the neighbors
#' of a cell $r,c$ with respect to the knight's move.
#'
#' We have $K(r,c)$ consisting of up to eight neighboring cells of the form
#'
#' $$
#' K(r,c) = (\{(r \pm 2, c \pm 1)\} \cup \{(r \pm 1, c \pm 2)\}) \cap R\times C
#' $$
#'
#' With this definition, we formulate the knight's move constraints compactly as follows:
#'
#' $$
#' \begin{align}
#'   & & x_{rcl} + x_{r'c'l} & \leq  1 & \forall r,c,l \in R\times C\times L, (r',c') \in K(r,c)\\
#' \end{align}
#' $$
## ----Adding Knight\'s Move Constraints----------------------------------------
get_knights_move_neighbors <- function(r,c) {
  # create all possible moves, possibly outside the feasible range

  # note that this traversal will encounter each pair of conflicting variables x,y
  # twice. The first time when all neighboring cells of x are traversed,
  # and a second time when those around y are determined.
  # The resulting duplicate rows are detected by Xpress in presolving
  neighbors <- matrix(c(
           r - 2, c - 1,
           r + 2, c - 1,
           r - 2, c + 1,
           r + 2, c + 1,
           r - 1, c - 2,
           r + 1, c - 2,
           r - 1, c + 2,
           r + 1, c + 2
           ), byrow = T, ncol=2)

  # filter elements outside the feasible range and return
  neighbors[(neighbors[,1]>=1) & (neighbors[,1]<=9)  & (neighbors[,2]<=9)   & (neighbors[,2]>=1),]
}

# new we can create the knight's move constraints
add_knights_move_constraints <- function(prob, index.df) {
  # one additional constraint for each r,c,l

  # we establish the R,C,L order in index.df.sorted.
  # An element r,c,l has the index 81 * (r - 1) + 9 * (c - 1) + l
  index.df.sorted <- index.df %>% arrange(R,C,L)

  for (row in 1:9) {
    for (col in 1:9) {
      neighbors <- get_knights_move_neighbors(row, col)

      # add constraints that forbid the same symbol in the neighbors and this cell
      # one for each symbol
      for (symbol in 1:9) {
        cellvaridx <- index.df.sorted$VarIndex[81 * (row - 1) + 9 * (col - 1) + symbol]
        neighborvarindex <- index.df.sorted$VarIndex[81 * (neighbors[,1] - 1) + 9 * (neighbors[,2] - 1) + symbol]

        # add one row per neighbor
        for (n in neighborvarindex) {
          prob <- addrows(prob, rowtype = "L", rhs = 1, start = c(0,2),
                          colind = c(cellvaridx, n) - 1,
                          rowcoef = c(1,1))
        }
      }
    }
  }

  prob
}

#'
#' We are ready to create the game from Anthony's video. First, we have to create the game by hand.
#'
## -----------------------------------------------------------------------------
# Make a fresh game and override the given symbols by the four digits from the video.
game4digits <- makeGame()
for (i in 1:9) for (j in 1:9) game4digits[i,j] <- NA
game4digits[4,1] <- 3; game4digits[4,2] <- 8; game4digits[4,3] <- 4; game4digits[9,9] <- 2
plot(game4digits)

#'
#' Now we load all the constraints into a fresh `XPRSprob` object.
#'
## ----Load all constraints-----------------------------------------------------
problist <- load_sudoku_problem(prob = createprob(), sudoku_game = game4digits)
prob4digits <- problist$prob

# add diagonals
prob4digits <- add_main_diagonal(prob4digits, problist$index.df) %>%
               add_anti_diagonal(problist$index.df) %>%
# add magic square
               add_magic_square(problist$index.df) %>%
# add knight's move
              add_knights_move_constraints(problist$index.df)

setoutput(prob4digits)
summary(mipoptimize(prob4digits))

#'
#' Here is the solution together with the original game.
#'
## ----Plot the solution alongside the game-------------------------------------
solution <- getmipsol(prob4digits)$x

solution4digits <- to_sudoku_solution(solution, problist$index.df)
par(mfrow = c(1,2))
plot(game4digits)
plot(solution4digits)

#'
#'

#'