Solving Minesweeper – Part 6: Super Fast

The end of the last post was very satisfying for me, I was very proud of myself. The problem is, we could have done a better job. If you check out the last video you can clearly see that between rounds, our algorithm takes multiple seconds to start making decisions. This is because the Imaging Analysis part has some inefficiencies. After thinking about it I came up with a following issues.

  1. We analyse the whole image even though the game grid is only a sub-part of it.
  2. We keep scanning for cells and we don’t take advantage of the fact that once a cell was discovered, it’s value cannot change.
  3. We calculate the centers each time even though cells are always in the same place.
  4. We are scanning over areas that previously were detected as other values

Lucky all of these things can be improve by rethinking our strategy a bit. There are some values we need to extract only once and never again, and there are some things that we need to do every time. Let me show you how does the beginning of our application look like.

GameManager gm;
if (!gm.Initialize())
	return 1;

GameImageAnalyzer imageAnalyzer(gm.Width, gm.Height);
gm.GetImageInto(imageAnalyzer.img.data);
imageAnalyzer.Initialize();

You are probably asking, WTF is GameManager and GameImageAnalyzer. I did some refactorings. I’m planning to make the code publicly available. GameManager took over all the responsibilities that involved the game window: finding it, getting it’s size, copying the image from it, making clicks. GameImageAnalyzer took over the responsibilities of converting the image into a game state. So what the above code does is the following. Initialize the GameManager so it has the information about the window, construct the analyzer with the correct size of input image, read the window image into the input image data, initialize the image analysis. In the initialize method of the analyzer we do all the the things we need to do only once for the analysis part. Let’s look at it shall we?

void GameImageAnalyzer::Initialize()
{
	Mat tmp = GetBmp();
	Mat temp = imread("../img/14.bmp");
	Mat result;
	matchTemplate(tmp, temp, result, TM_CCOEFF_NORMED);
	threshold(result, result, 0.85, 1, THRESH_BINARY);
	Point temp_points[SIZE];
	Rect temp_rects[SIZE];
	int index = 0;
	Scalar b(0, 0, 0);

	for (int i = 0; i < result.cols; i++)
	{
		for (int j = 0; j < result.rows; j++)
		{
			auto z = (int)result.at<float>(j, i);
			if (z > 0)
			{
				Rect r(i, j, temp.cols, temp.rows);
				Point p(i + temp.cols / 2, j + temp.rows / 2);
				circle(result, r.tl(), 16, b, FILLED);
				temp_points[index] = p;
				temp_rects[index] = r;
				index++;
			}
		}
	}
	ExtractBoardInfo(temp_points, temp_rects);
}

First we convert the image into a 3 channel image, then we do the template matching ONLY for the undiscovered image. We can do this since we know that in this phase we have the board in it’s initial state. I’ve got rid of the normalize function in part 5 since it produced a lot of false positives, and i saw the analysis worked better without it. We continue doing the thresholding like before. What follows is another improvement i already did in part 5. Instead of doing the minMaxLoc function over and over again, we can process the image like a grid and detecting all occurrences of a given template in one swoop. ExtractBoardInfo does what our Extract method did in the past, without changing the game state, and not only extract the center coordinates, we also record all the rectangles that define the cells, which we can use later.

In our big while loop which constitutes playing the game, we do the same thing, copy the window contents into the buffer, but after that we call another method on the image analyzer called Analyze.

void GameImageAnalyzer::Analyze()
{
	int index = 0;
	Point* points = new Point[SIZE];
	int* values = new int[SIZE];
	Mat bmp = GetBmp();
	
	for (int i = 0; i < SIZE; i++)
	{
		if (game[i] <= BOMB) // we already processed this cell
			continue;

		for (int j = 0; j < TEMPLATE_COUNT; j++)
		{
			int t = templates[j];
			Mat temp = temps[j];
			Mat result;
			Mat in = bmp(rects[i]);
			
			matchTemplate(in, temp, result, TM_CCOEFF_NORMED);
			threshold(result, result, 0.85, 1, THRESH_BINARY);
			int c = countNonZero(result);
			if (c > 0)
			{
				game[i] = t;
				break;
			}
		}
	}
}

Now this is radically different from what we have done before. First we go though each cell, if we already discovered it we skip it. If not the cell can be in two states: UNDISCOVERED and NOTBOMB. The later is set when we mark a cell as safe but we don’t know it’s value yet. Then we proceed matching the templates. The biggest optimization happens here. Instead of analyzing the whole image, we can cut out the rectangle for this particular cell and analyze only that (Mat in = bmp(rects[i])). This way template matching finishes faster, so does thresholding, and we can detect if we have a match by checking if there is any non black pixel. Neat ha?

There is one more hidden gem here. Notice that we can do early exiting once we found a match. We know we won’t find any other. Additionally to this we start with the UNDISCOVERED image first, since these cells tend not go through drastic changes.

So you are wondering how much faster this is? Our original method took about 2-3 seconds to analyze the image and make the decisions. This new one does it in 20-30 milliseconds. Even faster in some cases which means more than 100x performance. So fast that we needed to add 200ms delay so the animation finishes before we proceed analyzing the next frame. Let’s see it in action.

Yes, 14 seconds ladies and gentlemen. Next time we will improve our decision making. Take care!