# Writing a good program to create sudoku

One of the reasons why sudoku has been able to become so popular all over the world is because supply has been able to keep up with demand.And the reason for that is because firstly the vast majority of sudoku seen in mainstream publications are computer generated so they can be created at speed once a generator is written, and secondly it is not massively hard to write a sudoku generator. Although it is a lot trickier to write a good program to create sudoku. This article looks at a few considerations when writing a sudoku generator.

The first thing to point out is that one of the easiest types of program to write in general is one that tries every combination blindly in turn. So if presented with a grid containing 27 numbers it will simply try every possible combination of values for the empty cells until it finds a solution, and if it finds more than one solution, discards the puzzle.

However, the problem here is that this is not how humans solve sudoku: we use logic and strategies in order to make progress rather than guessing in the vast majority of cases.

Therefore the computer should look to emulate the various pieces of logic that humans use to solve puzzles in increasing level of difficulty and use these to create a puzzle. That way you will know that it is: fair, has a unique solution, is human-solvable without guessing, and can start to assess the difficulty of the puzzle using a range of factors such as the number of times each solving rule needs to be used, the number of possible points on the puzzle where progress can be made at any given time, at what stage of the solve harder rules have to be used to make progress (generally the earlier on the harder they are to spot) and just how hard those rules are, together with some other smaller factors.

So the good puzzle generator should use what are sometimes termed human-solve algorithms to create a puzzle: that is, emulate how humans think. As a simple example, the majority of progress in sudoku is done through people looking at the values in a region and seeing if there is only one place a value can go, for instance if there is only one cell in a region that can contain a 3 it must go there. And equally many people look at the values of cells themselves, so if the only possible option for a cell is a 4, then it must be a 4.

Infact many sudoku generators use nothing more than these two rules to create puzzles. This means that they will never make particularly hard puzzles, but infact these may be adequate for many publications depending on the client needs.

Some make the mistake of associating the number of givens with the difficulty level of the puzzle, but there is not a huge amount of correlation between these two in fact.

There are a range of other solve rules you can use. You can look at varying difficulty levels of sets, such as simple ones like 23, 23 or harder ones to spot like 35, 3578, 78, 358.

Some sets enable you to eliminate values from the cells in the set whilst others enable you to remove pencilmarks (possible values) from other cells in the region that are not members of the set, as in the example above.

Another commonly used rule would be to look at groups of three cells that are shared between regions, at the intersection of a region, in a 9x9 puzzle. So the three cells shared between row 1 and the first 3x3 box (e.g. cells 1,2,3 in the puzzle starting at top left). If from the values in the rest of row one you know that the 6 has to occur somewhere in the first three cells, then because those three cells are also *all* in the first 3x3 box, the 6 can be eliminated as a candidate from the other six cells in that box (e.g. cells 10,11,12,19,20,21).

Still other solve rules can be written and included for increasingly difficult puzzles (the above rules in combination are already more than hard enough for any mainstream audience such as a newspaper, magazine or sudoku book).

For instance X-wing is a rule that very few sudoku solvers will look for but is not unreasonable and that a good sudoku player may reasonably be expected to spot before resorting to guessing, and at the very least if they do guess when the logic is explained it seems fair.

When it comes to more complicated rules again, then very few places implement these (there are lots of interesting sounding rules like rectangles, XY-wings, forcing chains, swordfish, nishio etc) as you soon hit a barrier whereby you spend longer looking for increasingly obscure patterns that may occur than you would guessing.

Anyhow, back on topic, a good sudoku program should look to implement as many of the rules mentioned as they are able to (up to X-wing) and record how many times each rule is necessary to solve a puzzle, and in that way be able to make a range of different difficulty level puzzles, assessed fairly accurately on a range of factors including as one of those factors the number of times each solve rule is required.

The next article will look at sudoku and symmetry.

Have you written a sudoku generator? Which language did you use? Did you find it easy, or difficult, or have any particular insights? Feel free to comment below.

*Date written: 30 Nov 2010*

Category: programming,sudoku

Keywords: sudoku **Comment on this post**

If you would like to comment on this post, please enter your comments below; all fields are mandatory. Posts are moderated before display.

Back to World of Puzzles