Meaning of "Select Tables Optimized Away" in MySQL Explain Plan

In this case it means you have run a query that does nothing more than count the rows in a table, and that table is a MyISAM table. MyISAM stores an exact row count alongside each table, so MySQL doesn't need to look at any of the row data to answer this query. Instead it immediately returns the pre-calculated row count. Hence the table access is ‘optimized away’ and the query is lightning-fast.

The same won't happen on other MySQL storage engines such as InnoDB. But in most cases you really want to be using InnoDB rather than MyISAM, for a variety of other reasons. (And even without the row-count optimisation this kind of query is very, very fast.)
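A minimal sketch for comparing the two engines yourself. The table names `posts_myisam` and `posts_innodb` are hypothetical and differ only in storage engine; the exact EXPLAIN output depends on the server version, so treat the comments as what to look for rather than guaranteed results.

```sql
-- Hypothetical tables for illustration; only the storage engine differs.
CREATE TABLE posts_myisam (
    id INT NOT NULL PRIMARY KEY,
    comment_count INT NOT NULL
) ENGINE=MyISAM;

CREATE TABLE posts_innodb (
    id INT NOT NULL PRIMARY KEY,
    comment_count INT NOT NULL
) ENGINE=InnoDB;

-- MyISAM stores the row count, so look for
-- "Select tables optimized away" in the Extra column here.
EXPLAIN SELECT COUNT(*) FROM posts_myisam;

-- Compare the type, key and Extra columns with the plan above.
EXPLAIN SELECT COUNT(*) FROM posts_innodb;
```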

select count(comment_count) from wp_posts;

Is that what you really meant to do? That's the same as just SELECT COUNT(*)... (assuming comment_count can't be NULL, which it can't be, or you wouldn't have got the optimisation). If you want the total of the comment_count values you should be using SUM(comment_count), and then you won't get the ‘optimized away’ behaviour.
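To make the COUNT versus SUM difference concrete, here is a small sketch on a hypothetical three-row stand-in for wp_posts (the table name `wp_posts_demo` and its data are invented for illustration):

```sql
-- Hypothetical miniature of wp_posts, for illustration only.
CREATE TABLE wp_posts_demo (
    id INT NOT NULL PRIMARY KEY,
    comment_count INT NOT NULL
);

INSERT INTO wp_posts_demo (id, comment_count) VALUES (1, 0), (2, 3), (3, 5);

-- Counts the non-NULL values, i.e. the number of rows here: 3
SELECT COUNT(comment_count) FROM wp_posts_demo;

-- Adds up the values themselves: 0 + 3 + 5 = 8
SELECT SUM(comment_count) FROM wp_posts_demo;
```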

Which Query is Optimized

Both will have the same performance. MySQL transforms the first query into the second.

Select tables optimized away means that MySQL can "take a shortcut" and answer the query without reading the actual table (for example, SELECT MIN(indexed_field), which can be read straight from the index).

In this plan there is a full scan on employees, probably because it only has 5 rows. You should add some more rows (and different salaries) to see what will actually happen.
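The MIN(indexed_field) shortcut mentioned above can be sketched as follows. The table `employees_demo`, its columns and its rows are assumptions made up for this example, not the asker's schema.

```sql
-- Hypothetical table with an index on the aggregated column.
CREATE TABLE employees_demo (
    id INT NOT NULL PRIMARY KEY,
    salary INT NOT NULL,
    INDEX salary_idx (salary)
);

INSERT INTO employees_demo (id, salary) VALUES (1, 1000), (2, 2500), (3, 1800);

-- The smallest salary is the first entry in salary_idx, so the optimizer
-- can look it up while planning. Extra is expected to show
-- "Select tables optimized away".
EXPLAIN SELECT MIN(salary) FROM employees_demo;

-- Without a usable index on the column, the table has to be read instead.
EXPLAIN SELECT MIN(salary) FROM employees_demo IGNORE INDEX (salary_idx);
```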

Confused about mysql looking at half a million rows when using an index

I tried this:

SELECT MAX(id) id 
FROM location_data l
WHERE l.foo_id = 2;

The result is still MAX(id) for the given l.foo_id. There is no need to retrieve foo_id, as you already know it before running the query.

As soon as I removed GROUP BY, EXPLAIN started giving this:

mysql> EXPLAIN SELECT MAX(id) id
-> FROM location_data l
-> WHERE l.foo_id = 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: Select tables optimized away
1 row in set (0.00 sec)

As the answer to Meaning of "Select tables optimized away" in MySQL Explain plan puts it, the table access is ‘optimized away’: MySQL produces the result without looking at any of the table row data. That answer describes the MyISAM row-count case; here the same shortcut applies to MAX(id).

So, I think, getting rid of GROUP BY will speed up your query.

Mysql bad execution plan

You know that filtering the geolocation is smarter to do before than after, because you know something about your data and your query that MySQL doesn't.

Specifically, MySQL guesses that it has to look at 502897*1 rows in the first query, and 52785*13=686205 rows for the second query, and decides to use the first one. There are other factors that go into the decision about which execution plan to use, but it gives you a rough idea of what MySQL thinks your data looks like. It's far away from reality (188 rows), and it's not too surprising that basing the decision upon such incorrect assumptions led to a bad strategy.

In fact, even I only know that because you told me, and now can assume, based on column names, that gauche is always smaller than droite, so your condition on g probably describes a very narrow window. But MySQL does not know that, as you did not tell MySQL that, so it cannot take that into consideration. And it also of course doesn't have the ability to base decisions on the meaning of column names.

Since you have an index on gauche, for a high value (e.g. g.gauche >= your_max_value_in_that_column), MySQL should actually be able to find out that there are only a handful of rows and should use a better execution plan. Otherwise, MySQL is basically clueless. Try varying the window size over a very wide range (e.g. g.gauche >= 100000 AND g.droite <= 200000); MySQL will not show a significantly different number in rows, unless you get close to the limits of your columns (and have an index on them). For some ranges, the first query actually should get faster, as it gets closer to the data distribution MySQL assumes.

So how can you tell MySQL about your data distribution?

It might be possible to encode your information as spatial data (a point) and an index on it. Then you can look for points that lie in a 2d rectangle, and MySQL can now understand that this is actually a very small rectangle containing a limited amount of data. It's not required that your data is actually geometric data, just that you can encode it in 2 dimensions.
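A rough sketch of that idea, under several assumptions: the table `geolocalisation_demo` and the column `pt` are hypothetical, the `SRID 0` column attribute is MySQL 8.0 syntax (drop it on 5.7), and the search rectangle is padded by 0.5 so that matching points lie strictly inside it. Whether the optimizer actually picks the spatial index depends on your data and version.

```sql
-- Hypothetical table: each (gauche, droite) pair is also stored as a 2D point.
CREATE TABLE geolocalisation_demo (
    id INT NOT NULL PRIMARY KEY,
    gauche INT NOT NULL,
    droite INT NOT NULL,
    pt POINT NOT NULL SRID 0,       -- MySQL 8.0 syntax; omit "SRID 0" on 5.7
    SPATIAL INDEX pt_idx (pt)
);

INSERT INTO geolocalisation_demo (id, gauche, droite, pt)
VALUES (1, 151579, 151580, Point(151579, 151580));

-- Equivalent of "gauche >= 151579 AND droite <= 151580" when gauche <= droite:
-- both coordinates must fall between 151579 and 151580.
EXPLAIN
SELECT id
FROM geolocalisation_demo
WHERE MBRContains(
    ST_GeomFromText('POLYGON((151578.5 151578.5, 151580.5 151578.5, 151580.5 151580.5, 151578.5 151580.5, 151578.5 151578.5))'),
    pt
);
```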

If my assumption is correct, you may also be able to use (g.gauche = 151579 or g.gauche = 151580), and MySQL should also be able to understand that this is only a limited amount of data.

And you can of course just force the index (or use FROM geolocalisation g STRAIGHT_JOIN annonce a). You know something MySQL doesn't, and oftentimes you cannot tell MySQL otherwise. The disadvantage is that this cannot adapt to other situations, e.g. if you (occasionally) use larger windows in your query, or gauche <= droite isn't true anymore.
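Both options might look roughly like this. The original query and schema are not shown, so the tables `geo_demo` and `annonce_demo`, the join column `geo_id` and the index names are all hypothetical stand-ins.

```sql
-- Hypothetical stand-ins for geolocalisation and annonce.
CREATE TABLE geo_demo (
    id INT NOT NULL PRIMARY KEY,
    gauche INT NOT NULL,
    droite INT NOT NULL,
    INDEX gauche_idx (gauche)
);

CREATE TABLE annonce_demo (
    id INT NOT NULL PRIMARY KEY,
    geo_id INT NOT NULL,            -- assumed join column
    INDEX geo_idx (geo_id)
);

-- Option 1: STRAIGHT_JOIN makes MySQL read geo_demo first,
-- then look up the matching annonce_demo rows.
EXPLAIN
SELECT a.id
FROM geo_demo g
STRAIGHT_JOIN annonce_demo a ON a.geo_id = g.id
WHERE g.gauche >= 151579 AND g.droite <= 151580;

-- Option 2: keep a normal join but force the index on gauche.
EXPLAIN
SELECT a.id
FROM geo_demo g FORCE INDEX (gauche_idx)
JOIN annonce_demo a ON a.geo_id = g.id
WHERE g.gauche >= 151579 AND g.droite <= 151580;
```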

What are the derived tables in my explain statement

Holy nested subqueries and parentheticals. AH!

Derived tables are temporary tables that are created to make your query work. They can be stated explicitly, as in:

SELECT
    foo.horse
FROM
    (SELECT horse FROM bar) AS foo;

Where foo is a derived table. These often turn into temp tables in the query's execution on the server. In your case they are not so explicit. This is probably due to the fact that you are querying against views with views in them, and lord only knows how deep it goes.

Derived tables are nice because they allow you to SELECT data from a table (or a view) before joining it to another table, view, or derived table. They have a downside though: they are not indexed. Joins on derived tables are more expensive since you lose control over indexing. If your data is small, or you are careful in your nested(nested(nested())) design, then everything will be fine.
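As a sketch of "select first, join afterwards", assuming two hypothetical tables `bar_demo` and `stable_demo`: the LIMIT inside the subquery is there on the assumption that it stops the optimizer from merging the derived table into the outer query, so a DERIVED row is expected in the plan. Without it, newer MySQL versions may merge the subquery and show no derived table at all.

```sql
-- Hypothetical tables for illustration.
CREATE TABLE stable_demo (
    id INT NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

CREATE TABLE bar_demo (
    id INT NOT NULL PRIMARY KEY,
    horse VARCHAR(50) NOT NULL,
    stable_id INT NOT NULL
);

-- foo is the derived table: it is filtered first, then joined to stable_demo.
-- Look for select_type = DERIVED and a <derivedN> table name in the output.
EXPLAIN
SELECT foo.horse, s.name
FROM (
    SELECT horse, stable_id
    FROM bar_demo
    WHERE horse LIKE 'A%'
    LIMIT 100
) AS foo
JOIN stable_demo s ON s.id = foo.stable_id;
```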

Lastly, and unrelated, I believe your parentheticals are superfluous. I believe your query would be much more readable if you did away with them.

What the meaning of the `id` column of the mysql query explain?

Database systems like MySQL have very elaborate query planning / optimizing modules built in to them. EXPLAIN reveals just a bit of the logic of optimization; in particular which indexes are relevant. EXPLAIN doesn't necessarily reveal how the server orders its operations.

Also, SQL is a declarative language. You use it to describe what you want, not how to get it. This makes it different from most other programming languages, which are procedural. Your question about subquery execution order is a procedural question, not a declarative question.

The execution order of subqueries, and their concurrency of execution, is an implementation detail, and may well change from release to release of the database software.

Is there a way to select the maximum row id in MySQL without scanning all rows with MAX()?

The ugly fix is to add an index on (groupId, id):

ALTER TABLE `table` ADD INDEX groupId_with_id_idx (groupId, id);
DESC SELECT MAX(id) FROM `table` USE INDEX (groupId_with_id_idx) WHERE groupId = 12345;
/* the execution plan should return "Select tables optimized away" */
