The Importance of Analyzing and Optimizing your SQL Queries

I’m sure many of you have heard the importance of optimizing your SQL queries for the optimizer. You may have doubted this, maybe you thought your query was too simple. Well, this post is going to show you exactly why query optimization is so important.

I’ll start with the graph:

For statistical analysis my solver periodically gathers statistics about the state of the database. The graph above shows the number of boards by status. “New” boards have not been processed yet, while “Done” boards have.

As you can see over the initial few days it generated about 2.8 million boards. After I optimized only one of my queries and in the past 18 hours have generated over 18 million boards.

I’m sure you remember the query we fixed from last week, that’s the query I optimized. Let’s look at it again:

SELECT id, tiles
FROM boards
WHERE status = 'NEW' AND id <= ?
ORDER BY id DESC LIMIT 1

Looking at the status of the server showed the database engine was pegging the CPU. Let’s figure out why that is!

MariaDB has a handy tool, SHOW PROCESSLIST to view what the server is processing at the moment. While running with the old query, that query showed up every time I tried it, so I decided to investigate the query. The ANALYZE tool was made to help investigate query performance:

ANALYZE
SELECT id, tiles
FROM boards
WHERE status = 'NEW' AND id <= 14537478
ORDER BY id DESC LIMIT 1

FieldValue
id1
select_typeSIMPLE
tableboards
typeref
possible_keysPRIMARY, status, id_status
keystatus
key_len1
refconst
rows11,369,300
r_rows8,014,010.00
filtered100.00
r_filtered0.00
ExtraUsing where

The analyze output indicates even though we’re using the id_status key values in our where clause, it still uses the key created on status. This is problematic, because it means that even though we should be able to use the key built exactly to optimize this query, we ended up reading over 8 million rows to get the table result. That’s a huge cost to grab a single row.

So why are we doing this? Turns out, MariaDB doesn’t support DESC order keys. So that ORDER BY id DESC is causing us to not use the Primary Key, and we end up going to the status key, and then with those results we search for the id closest to the target.

That means to get the benefit of the Primary Key we need to invert the query:

SELECT id, tiles
FROM boards
WHERE status = 'NEW' AND id >= ?
ORDER BY id ASC LIMIT 1

Here’s the analyze result for that:

FieldValue
id1
select_typeSIMPLE
tableboards
typeref
possible_keysPRIMARY, status, id_status
keystatus
key_len1
refconst
rows11,590,310
r_rows1
filtered100.00
r_filtered100.00
ExtraUsing index condition; Using where

So we’re reading only 1 row, which is good, but we’re still using the status key. And looking at the behavior of the server, we’re still pegging the CPU. Something is still wrong, the database engine isn’t using the right index. Well, MariaDB gives us a way to give the database engine a hint. The result is this query:

SELECT id, tiles
FROM boards
USE INDEX (id_status)
WHERE status = 'NEW' AND id >= 14537478
ORDER BY id ASC LIMIT 1

With this analyze:

FieldValue
id1
select_typeSIMPLE
tableboards
typerange
possible_keysid_status
keyid_status
key_len9
ref(NULL)
rows11,626,588
r_rows1
filtered100.00
r_filtered100.00
ExtraUsing index condition

Finally we’re using the right index, and that means some other interesting things happen. For instance, we stop using the WHERE clause and are really just using the index. The query is much faster, even the ANALYZE finished immediately while the prior ANALYZE instances took over a minute. Now after running that query, the server’s CPU load has fallen to below 10%. This means we should be able to scale out the runner a bit better, at least, until we start hitting our next bottleneck.

Tile Set Re-completed, Error Found

Good news everyone!

Yesterday I finished rebuilding the Eternity II tile set, and today I determined the source of the anomalies in the growth of generated boards.

Rebuilding the tile set just took a bunch of time to sit down and go through the physical tile pieces and convert them to a CSV file. I won’t bore you with those details.

More interesting is the programming error. Let’s see what was corrupting my data! Let’s start with this handy SQL statement:

SELECT max(id) AS id, tiles
FROM boards
WHERE status = 'NEW' AND id <= ?
LIMIT 1

What do you think this query accomplishes? You can probably determine the intent. I was trying to get the maximum tile id below some limit, and some data stored with it. My runner randomly generates the limit value, hopefully giving a good mix of rows from different parts of the computation.

However, there is an interesting thing that happens with this query. It returns the correct id value, but it returns the data portion from an arbitrary row. It returned whatever the database engine happens to read first that meets the WHERE clause. This means I was processing the same board configuration multiple times, but skipping the ones identified by the id. This would cause both a lot of wasted calculations, and a lot of missed configurations.

Oof, no fun at all. A restart was really required.

After some digging and experimentation, the corrected query looks like this:

SELECT id, tiles
FROM boards
WHERE status = 'NEW' AND id <= ?
ORDER BY id DESC LIMIT 1

Moving the “max” calculation from the fields section of the query into the ORDER BY clause means the query will return the paired row id and data together.

Now that I have corrected the error, on to processing board configurations!

Data Corruption – Resetting Eternity II Progress

Disaster has struck!

I was investigating anomalies in the growth of generated boards and counted the occurrences of each side in my tile configuration. This allowed me to determine at what scale the growth should be at each level.

Unfortunately, doing so revealed my tile set is corrupt! Two of the sides occur an odd number of times in my data set!

Because this is a tile matching problem, the number of occurrences of each side must be even. This is apparent because each side must match with another side of the same type, requiring two same sides for each edge. Long story short, if any side appears an odd number of times, I can never complete the puzzle.

This corrupts my entire generated board state so far and requires me to reset and with a regenerated tile set. Boo.

I think something else is wrong with the generation algorithm and need to investigate that. For now, I’m just going to go through the process of regenerating my tile set.

Efficiently Representing Board State

“Our calculations are that if you used the world’s most powerful computer and let it run from now until the projected end of the universe, it might not stumble across one of the solutions.”
– Lord Monckton, One of the Designers of the Eternity II puzzle

An algorithm for generating partial boards for Eternity II is reasonably straightforward. As is an algorithm for checking if the board you have is a solution to the puzzle. However, the scale of the problem means it’s not feasible to set up an algorithm that generates boards and checks correctness until you find one. It would simply take too long and if the computer you’re running it on crashes, you’d have to restart. Not a good place to be.

This means, we need to store state as we go, nearly continuously. This way we lose as little progress as possible if something breaks along the way.

The algorithm I’m using takes the current board state and determines the “optimal” place to set the next batch of tiles. Once that location has been determined it generates the board states representing placing all possible tiles in that location and stores them. Then we grab a new board state and try again. So for the sake of this design, each “step” is placing one new tile on the board. I’ll go over this algorithm in more detail in a later post. For now, let’s talk about storage.

To represent board state, we need to represent the placement and orientation of all tiles on the 16×16 grid. For simplicity we’ll represent the tiles with numbers and the coordinate system of the grid can be the 1st quadrant of a xy-plane. So that means to store the board state itself we’ll need to store a set of tuples of tile ids, x, y coordinates, and orientation. So a set of tiles would be stored like this:

(37, 5, 6, 0)
(48, 5, 7, 180)

I used the number of degrees we rotate the tile around the unit circle to represent the orientation. If we have a full solution to the puzzle, it would involve a set of 256 of these tuples.

Now, with this being all we need let’s get this as small as we can.

The tile id is a number between 1 and 256. All the tiles came pre-numbered, so the numbering system is already given. And fortunately 256 is 28, so we know we can use just 8 bits to represent it. Unfortunately storing 256 itself actually requires the 9th bit, due to how unsigned numbers work in binary. It’d be great to not waste that extra bit, so we need a solution. One solution is to subtract one from each number, but an even better solution a friend told me a long time ago is to just convert 256 itself to 0. This means most of our tiles are exactly as they are represented in the physical world, except for 256, which is 0.

So now we can neatly get the tile id into an 8-bit field, or 1 byte.

The x and y coordinates are on a 16×16 grid. 16 is 24, so we can use the same approach we used for tile ids, and get each dimension into 4 bits. Given two dimensions that’s another 8 bits.

So now we’re at a 16-bit field, or 2 bytes.

That leaves orientation. Above I represented orientation as the number of degrees rotated around the unit circle. There are 360 degrees around a circle which, while not a power of two, will fit in 9 bits. So we could continue to store this and use another 9 bits, or 3 bytes and 1 bit. However, we can’t really use all 360 degrees, these tiles are all squares, so we can only use increments of 90 degrees. That means we only need to consider 4 orientations, 0°, 90°, 180°, and 270°. This can be represented in 2 bits.

So the total storage needed to store a single tile position is 2 bytes and 2 bits.

But then we hit reality, computer systems rarely like to store things that are less than a byte, so really we end up with 3 bytes, and 6 bits of space (or 25%!) wasted:

That’s a lot of wasted storage! Time for some bit packing! It’s possible to pack in data from the next tile into that wasted space to avoid reclaim it. We start by putting the first six bits of the tile id into the wasted space, and then filling the next byte with the remaining two bits, and the x coordinate, etc. Finally, repeat until we reach a byte boundary, ending up with a structure that looks like this:

So we can store 4 tiles every 11 bytes with zero wasted storage. Great!

If you want to try to do this, be very careful with your bit-shift operators! Most of the bit-shift operators will shift the sign bit, and certain languages (like Java) make assumptions around the unit type when you are reading a byte field. Those caused some headaches for me while implementing this, but now I have a completed, and thoroughly tested, solution. On to generating new board states!

Going after Eternity II

The Eternity II Puzzle is a 256 piece edge-matching puzzle. To solve the puzzle, you must lie out all 256 tiles onto a 16×16 grid with all the edges matching. A single seed tile is positioned in the middle of the board. Tomy UK Ltd. released Eternity II in 2007 with a $2,000,000 prize for solving it. The designers tried to generate the hardest to solve puzzle with this specification. I picked up a copy in 2008 and did some work trying to solve it back then, but dropped it as other priorities came up.

Back when I was working on this initially there were a variety of people trying to solve it with stateless DFS, or randomly positioning all tiles and trying to shuffle the board to maximize the number of edges that matched. I tried a variety of these solutions and some of my own invention. One method I tried was a construction solution where I build all possible sets of two tiles, combined those to all possible sets of four, etc, resulting in storing all combinations of these sizes: 2×1, 2×2, 4×2, 4×4, 8×4, 8×8, 16×8, 16×16. The idea was after only eight steps I could solve the whole board. That rapidly proved a bad idea as the number of valid combinations at just the lower stages were so numerous they dwarfed my storage capacity. I had little hope of things improving in the higher stages. I ran a simulation running a DFS to generate possible 8×8 squares and counting (without storage) and got to the hundreds of millions before I stopped. There was simply too much data generated for me to store. To make matters worse, I knew a vast majority of these solutions weren’t conducive to the actual solution, as this approach ignored the seed tile. So I didn’t want to store a bunch of data that wasn’t useful in generating a solution. That wouldn’t get us anywhere.

A number of things have changed since 2009. In 2010 they ended the prize portion of the puzzle, and they’ve shut everything about it down. However, there are a number of positive developments for this puzzle. For instance, the price of storage has fallen dramatically. I could conceivably store all those 8×8 squares if I wanted, I would have the capacity for it (not that I want to waste space that way). Computing power is also easier to come by. In 2009 I had a desktop and a laptop, now I own a small cluster of compute power at my disposal, and can easily expand that compute power with the cloud. I’ve also grown as a developer and have more experience with distributed solutions. Given all of this, I will tackle this problem again. I don’t know if I’ll actually solve it, the solution space is enormous, and my solution may never stumble upon the answer. But I still think it will be a fun way to learn more about distributed systems, and there are several interesting problems that I can tackle within this context.

It should be fun.