Sudoku solver

One good friend of mine, fond of Sudoku, was arguing that sometimes you need to infer several times to achieve solving. His assertion makes me feel unsure.

On one hand, I remembered solving really tough sudokus by hand and I had to guess several times before finally succeeding in resolution. On the other end, I am wondering if we infer because we do miss some logical deduction(s) in a prior step of resolution.

As I do not like being in doubt, and eager to get an answer, I asked him for some samples of sudokus where he had to infer several times on each. He provided me two of them. He told me the first one was taken from a French newspaper, and the second from the on-line site websudoku, section 'evil'.

Here are the starting grids for each of them.
sudoku #1 sudoku #2
As I was not willing to spend much time on this, I used R language and RStudio to elaborate a solver that will mimic the human way of solving sudokus. I had to program three tactics.
  1. a positive deduction tactic: direct deduction that a figure goes into a cell from present figures in column, line and square3x3 box
  2. a negative deduction tactic: indirect deduction that a figure goes into a cell from constraints on other cells in column, line, and square
  3. an inference tactic: choose cleverly figure to infer, and select skilfully the square to use for inference
After a few erring ways, mainly due to R shallow copy behavior, I ended up with a reliable solver able to resolve any sudoku puzzle I guess, while still applying human approach. I applied it to the 2 sudokus my friend provided. Here are the results.

Sudoku from newspaper

Solver identified 2 variants on first inference and this has led to sudoku resolution.
Variant 2 is to be rejected because on line 9 column 2 no more legal value can be elected.

Evil sudoku

Solver identified 3 variants on first inference and this has led also to sudoku resolution.
Variant 1 is to be rejected because on line 3 column 2 no more legal value can be elected. Same for variant 3 on line 6 column 1.

Solver validation

Running many samples through my solver, brings me opportunity to fine tune it, and to increase its robustness. It appears, that provided sudokus are not so difficult. I proceeded to many hundreds sudoku tests to validate my solver, and found none not being solved.

During the validation phase, I experimented some sudokus requiring several inferences to get solved. For example, The Weekly Extreme 'Unsolveable' Sudoku Puzzle #154 requires more than 5 level-depth inferences to be solved. see here

Is it the end of the game ?

So, we can conclude
  1. if my friend inferred several times on each of the sudokus he provided, it is because he missed some direct or indirect deductions. On the two samples, they're were no need to infer more than once.
  2. 3 tactics appear to be necessary and sufficient to solve any sudoku puzzle.
  3. all the sudokus are not solvable using only one inference. Some require more inferences. I know some that require several hundreds inferences to cover the full tree of possible solutions.
And finally, some questions remains
  1. Is there a reliable way to rank a sudoku in order to compare their difficulty: two distinct branches have to be considered, without resolution information, and with resolution information
  2. How to determine from scratch the optimal resolution path the one that requires the least inferences and passes?
  3. How to determine from scratch if a sudoku is solvable ? is regular will have one and only one solution, reachable on a single branch of inference?
  4. what is the most optimal strategy for resolution, according to initial sudoku conditions ?

As question remains, I believe the sudoku game is still open.
Note

Beware of R shallow copy, as Iyou'll probably never guess right how it works.
Indeed, very special thanks to Hadley WICKHAM for pryr package which helped a lot for resolution.