Sudoku is a kind of puzzle that is played on a square grid of nine by nine cells, each containing a number between 1 and 9. Each row, each column and each of the nine three-by-three boxes must contain each of these nine digits once and only once.

Computers are superior to humans when it comes to carrying out calculations. Humans are superior in terms of intuition, ingenuity and thinking. Both must play to their strengths.

The simplest method to solve the puzzles with a computer is to test all possibilities until the correct one is found. In the first empty cell we put a 1 and check that the result is valid. If it is, we move to the next cell and put a 1, etc. If the grid is not valid, we try using a 2, then a 3, and so on.

The figure to the right shows that the algorithm tries a 1 in the first empty cell and then a 1 in the second, which is obviously impossible so it immediately moves on to a 2, etc. (The clues are in red and the digits tried by the program in blue).

If having a 1 in the first cell does not create an inconsistency, but no number can validly fit in the second empty cell then the 1 was not actually possible. So we remove it and put a 2 instead. The figure shows that the algorithm sometimes backtracks in this fashion.

This method tries all combinations in turn. If a solution exists, it will eventually find it — even if the sudoku puzzle is deemed "difficult", "hard" or "diabolical". On the other hand it even tries inane solutions (the very first combination tried will have two 1 in the first row).

This method does not treat differently puzzles players consider easy and difficult, and thus cannot ascertain the difficulty level of a puzzle.

If the computer tries the digits in increasing order, a puzzle whose first empty cell must have a 9 will take a lot of time. This method therefore does not respect the so-called invariants: swapping all 1 with all 9 does not really change the puzzle, so the solver should not have more trouble with one variant than with the other. To a mathematician, having no difficulty with a certain problem but much more with another, completely equivalent, one is very fishy.

A variant uses a stochastic method, which by definition involves randomness. First, a random value is chosen for each empty cell. Then, we repeatedly modify the numbers randomly to reduce the number of inconsistencies (e.g. the same number twice on the same row), for example with a simulated annealing, a genetic algorithm or a mixture of the two.

It is faster (but more complicated to program) than brute force. On the other hand, it is just as frustrating to humans because there does not seem to be any logic to the method and because the algorithm delivers a solution without providing intermediate steps.

Sudoku grids were created for human players, so the first tactics were invented by humans with a human way of thinking. That is, being logical and relying on deductions.

The simplest tactic is to find a row, column or box that already contains eight digits. In this case, the value of the ninth is obvious.

Unfortunately, this is not so common. The next tactic is to proceed by elimination: which values are impossible for a certain cell: there is already a 1 in this row and a 2 in this column, and so on. By this process of elimination, a list of possible values can be kept for each cell.

If we want to formalize this procedure (for example to implement it in a computer program), we start by saying that the cell can *a priori* contain any of the nine digits. Then, we browse all the cells (with a for loop) and for each cell we look at all the other cells in its row, column and box (new for loops). When we go through each of the eight cells of its row, for example, we will say "this cell contains a 4 so our cell cannot have a 4" and we rule out 4 from the list of possibilities.

Once we have done this with all the empty cells, we will find that some cells have only one possible value (say 6). In this case, we start again (we iterate) taking this information into account: striking 6 from the list of possibilities for all the cells in its row, column and box.

This method is sufficient to solve the simpler grids. For more complicated puzzles, we have to be a bit more cunning.

We can look for pairs. If two cells of the same row, column or box have as only possible values 5 and 7 then we know that one of the two will be 5 and the other 7 (even though we do not know which is which). We can eliminate 5 and 7 as possibilities for the other seven cells. Computationally it means reviewing all rows, all columns and all boxes (more for loops) and for each comparing all pairs of cells (nested for loop).

Another tactic is to look for a number that can only be in a single cell of a certain row, column or box. If in a column there are three empty cells, one with possibilities 1, 3 or 7 and the other two with 1 or 3, it is clear that the 7 must be in the first one. So we go through all rows, all columns and all boxes, and for each we take every number from 1 to 9 in turn (yes it means more nested for loops) and count the number of cells containing it: if there is only one cell then we know where to put the number.

With the above tactics, we can rule out many possibilities for each cell and we will even discover the values of some of them. We will reapply the tactics with this new information to make further deductions. This is enough to solve easy and medium grids but not necessarily for the most difficult ones (the "fiendish" or "diabolical" puzzles).

After weeding a lot out by deduction, we can use brute force to finish the job. We thus have a hybrid method: first we reduce the number of possibilities by deduction and then (if necessary) we test one by one all the remaining combinations.

Brute force has the drawbacks of trying many absurd combinations and of being slow. The initial deductive phase eliminates the obviously impossible combinations and allows the brute force method to try only fairly few, plausible combinations — and therefore to be very efficient.

A player would only attempt brute force as a (very) last resort, because humans are not efficient at computations. He would first try to find other subtle tactics.

Even on a computer, some of these would probably be worth implementing. But others would not efficiently use the strength of computers: computing. We must exploit the tools at our disposal according to their strengths. And the best strategy for an automated solver is not necessarily the best strategy for solving by hand.

Instead of a square grid of nine cells by nine cells containing the numbers 1 to 9, a sudoku puzzle can be construed as a cube containing only the values 0 and 1. On each of the 81 cells of the square grid stands a nine-story tower, each floor representing one of the numbers between 1 and 9.

Each of the 9 × 9 × 9 cells indicates whether a certain number is (we call it 1) or is not (0) in a certain cell of the square grid. For example, "cell (1, 3, 8) is 1" means "the cell in the first row and the third column contains an 8". The figure to the right shows how a row can become a square.

The square grid and the cube are perfectly equivalent provided that the rules are translated as follows.

- There is a single number in each cell means that each tower has one 1 and eight 0. For example, "cell (1, 3) contains an 8" becomes "the cell (1, 3, 8) is equal to 1 and the others (1, 3,
*n*) are 0", so the sum of the nine (1, 3,*n*) cells is 1. - A single occurrence of each number per row. For example, "row 4 contains a single 7" becomes "the cell (4,
*c*, 7) is 1 for one and only one column*c*". This means that the sum of the nine (4,*c*, 7) cells is 1. - Only one occurrence of each number per column. For example, "column 3 contains a single 6" turns into "cell (
*ℓ*, 3, 6) is 1 for a single row*ℓ*". - Likewise for the three-by-three boxes.

There are well established techniques (linear programming) to solve this kind of problem.

It is quite common to solve a problem by reinterpreting it or turning it into another, easier to solve problem. In the present case, it is possible to use well-established techniques instead of reinventing the wheel. (Once the above constraints have been entered by hand, we can delegate the solving to algorithm libraries instead of creating the program ourselves.)

One consequence is that the solver will not use any of the tactics typically used by players (as was already the case with the brute force method, but here we do not try all the possible combinations until we stumble on the right one). If we use linear optimization, the solving will also be done at once, without intermediate steps. A disadvantage is that one can only use such a model to find the solution, not as a tutorial to learn tactics or to unstick hard or "diabolical" puzzles.

When starting a sudoku puzzle, one assumes that there is one and only one solution. This is generally a good assumption. But what if this is not the case?

If the first row contains 1, 2 and 3, the first column 4, 5, and 6, and the top left box has 7, 8 and 9 then the top left cell cannot contain any of the nine numbers and the puzzle has no solution.

This is not a problem for the brute-force method: it tests all possible combinations and if it does not find a solution, this proves that there is none.

If there is no solution, the deductive methods will probably result in an incompatibility: a certain cell cannot contain any of nine digits. But this is not certain: it may be difficult to establish whether a puzzle with few clues is difficult or impossible. (With a hybrid method we can rely on the exhaustiveness of the brute force to prove that there is no solution.)

On the other hand, the clues may allow several valid grids (the extreme case being an entirely blank grid).

The purely deductive methods are based on certainties: this cell must contain a 2, this other one cannot have the numbers 3, 6 and 7, etc. If there are several solutions, these methods will find at best what these solutions have in common but will not be able to choose between them (much like Buridan's ass).

Brute force will probably find only one solution, even if there are several (because it is programmed to stop as soon as it finds a solution). But this method can very easily be modified to test all combinations exhaustively, without stopping at the first solution. On the other hand, this would mean spending a lot of time on every puzzle for the sake of the very unlikely possibility of multiple solutions.

The hybrid method testing all possibilities after eliminating as many invalid combinations as possible will behave like brute force. But since it starts from a much smaller initial set, finding all the solutions will take considerably less time.

When creating a mathematical model, assumptions must be made: we do not take account of what cannot happen or which has little chance of happening. If there is a small chance that the solution of the sudoku grid be not unique, is it worth having a method that can find all the solutions or shall we stop at the first? In general the probability is almost zero, so this possibility will probably not be taken into account. Especially if the computing time could be greatly increased or if the chosen method works only if one is sure that there is a unique solution.

But in a different context, the opposite decision could be made. A sudoku puzzle designer for example may want to check that his puzzles all have a unique solution. In this case it is crucial to test for multiple solutions.

It is naive to try to account for everything. When designing a model, we must consciously ignore some possibilities: not taking into account what is of little import or what has little chance of occurring. If such a trade-off is not sought, the result will be impossible to solve or will take much longer than necessary.