Hey guys, I have made a PHP sudoku solver which uses Backtracking algorithm. Not quite good on performance but decent enough to solve in 3 - 4 seconds. I bet implementing dancing links can hep in this. Any thoughts on how it is and what more can be done? https://github.com/shubhamjain/php-sudoku-solver
Some simple common sense optimizations could help -- for example removing values you don't use from being passed to functions/methods. If you aren't using both the row and column offset, don't send them. Your checkRow and checkColumn methods both suffer from this. I'm trying to figure out why you are even messing with strings, or in checkblock why you are screwing around finding the root of each value so much. All you should be needing there is a simple 2 dimensional array. Precalculating values instead of checking on the fly wouldn't hurt either. Creating a multi-access object for each sudoku table might help speed things up... have a setter for plugging in values to a normal 9x9 grid, but also each 3x3 sub-grid as a single array (or even an array of references!)... then your checkBlock method could simply check the sub-grid with in_array. Precalced references via aliases would speed up all your check loops.
Noted didn't realize that. I could'nt think of any logic other than that to check for the given number in to 3 x 3 Block and I am not sure what would precalculating would mean in this context. Point noted for the checkBlocks but I am not sure what you mean by multi-access objects. Thanks for your advises.
It's a term from modula and object pascal where a object has one set of values, but different 'prototype' user types to access it in different ways. In PHP terms, it means building multiple arrays of references. An example would be if you had your 9x9 grid array, indexed $this->grid[$x][$y]... to check if a value is in a column you could use: in_array($value, $this->grid[$x]) But what if you want to check for a row? You'd have to do that manually, right? Well, what if you made a copy by reference thus: // create 90 degree rotated reference for ($x = 0; $x < 9; $x++) { $this->grid90[$x] = []; for ($y = 0; $y < 9; $y++) { $this->grid90[$x][$y] = &$this->grid[$y][$x]; } } Code (markup): then you could check in_array($value,$this->grid90[$y]) to get a list of the values along the x axis for that row! You could do that with the boxes too. Something like... // create 'blocks' for ($blockX = 0; $blockX < 3; $blockX++) { $this->blocks[$blockX] = []; for ($blockY = 0; $blockY < 3; $blockY++) { $this->blocks[$blockX][$blockY] = []; $gridX = $blockX * 3; for ($cellX = 0; $cellX < 3; $cellX++) { $gridY = $blockY * 3; for ($cellY = 0; $cellY <3; $cellY++) { $this->blocks[$blockX][$blockY][] = &$this->grid[$gridX][$gridY++]; } $gridX++; } } } Code (markup): then to check a block, you could do: in_array($value,$this->blocks[$x][$y]) Keep in mind that said check would be $x = 0..2 and $y = 0..2 Hmm... you could even use array_diff to pull a compare on those indexes off a base array of possible values... which could be used for the evaluation logic.. something like: public function missingColumn($x) { return array_diff($this->compare,$this->grid[$x]); } public function missingRow($y) { return array_diff($this->compare,$this->grid90[$y]); } public function missingBlock($x,$y) { return array_diff($this->compare,$this->blocks[$x][$y]); } public function possibles($x,$y) { return array_intersect( $this->missingBlock((int)$x / 3, (int)$y / 3), $this->missingColumn($x), $this->missingRow($y) ); } Code (markup): That last method would give you all possible values for a square since it would be values in $compare that do not exist in a elements row, column or 3x3 box. If there's only one possibility, you could plug it in... do a few passes of that and in six or seven loops you'd have a properly solved sudoku... assuming it has a proper logic based answer and not a 'just keep plugging in at random' approach. I'm gonna take a few minutes and toss together a test for that... Almost seems too simple, as if PHP's array_diff was BUILT for this sort of thing. The basic idea is to do as much 'reference' building before you dig into the logic -- it's a bit like index files for a database, they add overhead during writes and optimization, but they make the hard part that is done most often take so much less time it's worth it. -- edit -- note, I'm using PHP 5.4 arrays above. I'm dropping php 5.3 and lower support from my code.
Ok, this works really well... Seems pretty quick too. <?php class playfield { private $grid, $lastShown, $compare, $grid90 = [], $blocks = []; public function __construct($gridData = null) { if (isset($gridData)) { $this->grid = $gridData; // if we don't typecast $gridData, it gets passed by reference $this->lastShown = (array)$this->grid; } else { die('No Data!'); } $this->compare = range(1,9); // create 90 degree rotated reference for ($x = 0; $x < 9; $x++) { $this->grid90[$x] = []; for ($y = 0; $y < 9; $y++) { $this->grid90[$x][$y] = &$this->grid[$y][$x]; } } // create 'blocks' for ($blockX = 0; $blockX < 3; $blockX++) { $this->blocks[$blockX] = []; for ($blockY = 0; $blockY < 3; $blockY++) { $this->blocks[$blockX][$blockY] = []; $gridX = $blockX * 3; for ($cellX = 0; $cellX < 3; $cellX++) { $gridY = $blockY * 3; for ($cellY = 0; $cellY <3; $cellY++) { $this->blocks[$blockX][$blockY][] = &$this->grid[$gridX][$gridY++]; } $gridX++; } } } } // playfield::__construct public function missingColumn($x) { return array_diff($this->compare,$this->grid[$x]); } public function missingRow($y) { return array_diff($this->compare,$this->grid90[$y]); } public function missingBlock($x,$y) { return array_diff($this->compare,$this->blocks[$x][$y]); } public function possibles($x,$y) { return array_intersect( $this->missingBlock((int)$x / 3, (int)$y / 3), $this->missingColumn($x), $this->missingRow($y) ); } public function valid() { for ($x=0; $x<9; $x++) { if ( (count($this->missingColumn($x)) != 0) || (count($this->missingRow($x)) != 0) ) return false; } for ($x=0; $x<3; $x++) { for ($y=0; $y<3; $y++) { if (count($this->missingBlock($x,$y)) != 0) return false; } } return true; } private function solveGrid() { $result = 0; for ($x = 0; $x < 9; $x++) { for ($y = 0; $y < 9; $y++) { if ( ($this->grid[$x][$y] == 0) && (count($possibles = $this->possibles($x,$y)) == 1) ) { $this->grid[$x][$y] = reset($possibles); $result++; } } } return $result; } public function solve() { echo '<div class="solver">'; $step = 0; $this->show('<b>Starting Values</b>'); do { if (++$step > 80) { echo ' <p class="failed"> Too many steps, may be impossible to solve </p>'; return false; } if (($count = $this->solveGrid()) > 0) { $this->show('<b>Step: ' . $step . '</b> - filled in ' . $count . ' values'); } } while ($count > 0); if ($this->valid()) { echo ' <p class="success"> Solved Successfully </p> <!-- .solver --></div>'; return true; } else { echo ' <p class="failed"> Failed to Solve </p> <!-- .solver --></div>'; return false; } } public function show($caption='') { echo ' <table> ',( empty($caption) ? '' : '<caption>'.$caption.'</caption>' ); for ($x = 0; $x < 9; $x++) { echo ' <tr>'; for ($y = 0; $y < 9; $y++) { if ($changed = ($this->lastShown[$x][$y] != $this->grid[$x][$y])) { $this->lastShown[$x][$y] = $this->grid[$x][$y]; } echo '<td',( ($empty = ($this->grid[$x][$y] == 0)) ? ' class="empty"' : ( $changed ? ' class="changed"' : '' ) ),'>',( $empty ? '' : $this->grid[$x][$y] ),'</td>'; } echo '</tr>'; } echo ' </table>'; } // playfied::show } // class playfield ?> Code (markup): Not sure how good it is at solving them, ran it past five or six different ones and it did the job. I put a copy live on my server here: http://www.cutcodedown.com/for_others/shubhamjain/test.php Which as with all my examples the directory: http://www.cutcodedown.com/for_others/shubhamjain/ ... is open for easy access to the gooey bits and pieces. I've put some .phps files in there so you can view source, and tossed a .rar of it in there as well. I set up the test with three fairly common examples, and so far it seems to solve them well. It has a check in place to see if it is trying more than should be possible (literally any valid sudoku can't run more than 81 times since there's only 81 locations on the grid)... and likewise when it runs out of values it can plug in, it checks to make sure there are no empty ones or invalid/conflicting values plugged in. Really array_diff seems to be the key to doing this reasonably fast in PHP in as few steps as possible. Might be able to squeeze even more out of it by maintianing a list of just the empty values and only checking those during the 'solveGrid' step.
I just uploaded a version that SHOULD have been faster, but it's barely measurable. The big change is maintaining a list of the empties to foreach, that unsets them from the array when a value is plugged in. <?php class playfield { private $grid, $lastShown, $compare, $grid90 = [], $blocks = [], $empties = []; public function __construct($gridData = null) { if (isset($gridData)) { $this->grid = $gridData; // if we don't typecast $gridData, it gets passed by reference $this->lastShown = (array)$this->grid; } else { die('No Data!'); } $this->compare = range(1,9); // create 90 degree rotated reference // and list of 'empties' for ($x = 0; $x < 9; $x++) { $this->grid90[$x] = []; for ($y = 0; $y < 9; $y++) { $this->grid90[$x][$y] = &$this->grid[$y][$x]; if ($this->grid[$x][$y] == 0) { $this->empties[] = [ 'x' => $x, 'y' => $y, 'value' => &$this->grid[$x][$y] ]; } } } // create 'blocks' for ($blockX = 0; $blockX < 3; $blockX++) { $this->blocks[$blockX] = []; for ($blockY = 0; $blockY < 3; $blockY++) { $this->blocks[$blockX][$blockY] = []; $gridX = $blockX * 3; for ($cellX = 0; $cellX < 3; $cellX++) { $gridY = $blockY * 3; for ($cellY = 0; $cellY <3; $cellY++) { $this->blocks[$blockX][$blockY][] = &$this->grid[$gridX][$gridY++]; } $gridX++; } } } } // playfield::__construct public function missingColumn($x) { return array_diff($this->compare,$this->grid[$x]); } public function missingRow($y) { return array_diff($this->compare,$this->grid90[$y]); } public function missingBlock($x,$y) { return array_diff($this->compare,$this->blocks[$x][$y]); } public function possibles($x,$y) { return array_intersect( $this->missingBlock((int)$x / 3, (int)$y / 3), $this->missingColumn($x), $this->missingRow($y) ); } public function valid() { for ($x=0; $x<9; $x++) { if ( (count($this->missingColumn($x)) != 0) || (count($this->missingRow($x)) != 0) ) return false; } for ($x=0; $x<3; $x++) { for ($y=0; $y<3; $y++) { if (count($this->missingBlock($x,$y)) != 0) return false; } } return true; } public function solve() { echo '<div class="solver">'; $step = 0; $this->show('<b>Starting Values</b>'); do { if (++$step > 80) { echo ' <p class="failed"> Too many steps, may be impossible to solve </p>'; return false; } $count = 0; foreach ($this->empties as $key => $data) { if (count($possibles = $this->possibles($data['x'],$data['y'])) == 1) { $data['value'] = reset($possibles); unset($this->empties[$key]); $count++; } } if ($count > 0) { $this->show('<b>Step: ' . $step . '</b> - filled in ' . $count . ' values'); } } while ($count > 0); if ($this->valid()) { echo ' <p class="success"> Solved Successfully </p> <!-- .solver --></div>'; return true; } else { echo ' <p class="failed"> Failed to Solve </p> <!-- .solver --></div>'; return false; } } public function show($caption='') { echo ' <table> ',( empty($caption) ? '' : '<caption>'.$caption.'</caption>' ); for ($x = 0; $x < 9; $x++) { echo ' <tr>'; for ($y = 0; $y < 9; $y++) { if ($changed = ($this->lastShown[$x][$y] != $this->grid[$x][$y])) { $this->lastShown[$x][$y] = $this->grid[$x][$y]; } echo '<td',( ($empty = ($this->grid[$x][$y] == 0)) ? ' class="empty"' : ( $changed ? ' class="changed"' : '' ) ),'>',( $empty ? '' : $this->grid[$x][$y] ),'</td>'; } echo '</tr>'; } echo ' </table>'; } // playfied::show } // class playfield ?> Code (markup): A little surprised it's not really faster than just checking all values -- though I suspect that's related to PHP arrays (and the unset command) having some really odd performance issues unlike the compiled languages I know far, far more about.
You are amazing. Perfect use of built-in functions for optimizations. How many hours you spent on it?
Integrated your changes in the repo. Amazing improvement, time to solve 1000 Sudokus was just 4.9 seconds. Will integrate the template as well.
If I have more than an hour into that I'd be shocked -- though a lot of it was remembering doing the same type of thing in Turbo Pascal 3 some 30 years ago. Checking for the intersections of row, column and block is the key to sudoku solving -- most of the 'work' in making that version came from that post above where I was talking about in_array... I thought of array_diff and my brain went "AHA!!!"... From there it was just dropping it into some stock code. I spent more time typing in some sudoku from a puzzle book that was in my bathroom than I did on the code I use that stock template.php on a LOT of my test code. It's easier to just have a include with two functions I can call during testing -- or even production, so I don't have to waste time on that. Likewise the first bit of the screen.css is stock too. 99% of my websites start from this base markup: http://www.cutcodedown.com/for_others/template.html using this CSS: http://www.cutcodedown.com/for_others/screen.css Gives me a nice reliable baseline to work from.
I just uploaded ANOTHER update -- I was reading more about array_diff, a function I've not used a whole lot, and never realized it can take MULTIPLE arrays as values... so if we did an array_diff($this-compare, $row,$column,$block) in ONE command we'd get all the possible values for a single square! foreach ($this->empties as $key => $data) { if (count($possibles = array_diff( $this->compare, $this->grid[$data['x']], $this->grid90[$data['y']], $this->blocks[(int)$data['x']/3][(int)$data['y']/3] )) == 1) { $data['value'] = reset($possibles); unset($this->empties[$key]); $count++; } } Code (markup): That should shave about 80% off that part of the evaluation, working out to a 15 to 20% overall speed boost! Same file locations as before: http://www.cutcodedown.com/for_others/shubhamjain/ That's REALLY slick... and seems to further emphasize that array_diff was designed for tasks like this. Of course, the code would be even cleaner using PHP 5.5, since list in foreach is one of the 'new' features: foreach ($this->empties as $key => list($x,$y)) { if (count($possibles = array_diff( $this->compare, $this->grid[$x], $this->grid90[$y], $this->blocks[(int)$x/3][(int)$y/3] )) == 1) { $data['value'] = reset($possibles); unset($this->empties[$key]); $count++; } } Code (markup): But that's not 'real world deployable' yet.
Actually I was talking about the table you prepared for sudoku. Thanks for more optimizations you listed. Man I can't help thinking how little I know when I see guys like you.