How do composite indexes work?
Composite indexes work just like regular indexes, except they have multi-values keys.
If you define an index on the fields (a,b,c) , the records are sorted first on a, then b, then c.
Example:
| A | B | C |
-------------
| 1 | 2 | 3 |
| 1 | 4 | 2 |
| 1 | 4 | 4 |
| 2 | 3 | 5 |
| 2 | 4 | 4 |
| 2 | 4 | 5 |
Does a composite INDEX work for JOINING (MANY to MANY)?
(I disagree with the indexes recommended in another Answer.)
SELECT * FROM tt
JOIN table1 t1 ON tt.table_1 = t1.id
JOIN table2 t2 ON tt.table_2 = t2.id
JOIN table3 t3 ON tt.table_3 = t3.id
WHERE t2.value = 'test'
When the Optimizer picks how to execute a JOIN
, it usually works like this:
- Start with the table with the best
WHERE
. This would bet2
. So there needs to be anINDEX
starting withvalue
. - Then move on to each other table. The only next choice is
tt
, due to theON
clause this time. - After that
t1
andt3
, in either order.
Now for the indexes, listed in the order above:
t2: INDEX(value)
tt: INDEX(table_2)
t1
and t3
are accessed via their id
. So, assuming you follow the convention of id
being the PK, then PRIMARY KEY(id)
is already there.
Now let's switch to the new version of the query:
SELECT t1.foo1, t2.foo2, t3.foo3 FROM tt ...
With that, we can make better indexes. A "covering" index is an INDEX
that includes all the columns needed anywhere in the query. So, let's tack on any such columns:
t2: INDEX(value, id, foo2)
tt: INDEX(table_2, table_3, table_1) -- table_2 must be first
Two things to note when thinking about a "covering" index:
- When reaching into a table via the
PRIMARY KEY
, there is no advantage in making a "covering" index. The PK is "clustered" with the data, hence effectively 'covering'. - It is not 'wise' to have too many columns in an index. You are asking for all (
*
) columns.
More on creating optimal indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
Alas, EXPLAIN
shows what is done with what is available; it does not should what indexes should be added, nor other tips.
Your tables do not look like traditional many:many tables. See this for specific tips on that type of table: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
FOREIGN KEYs
: A FK provides (1) a constraint (for data integrity) and (2) an INDEX
to make checking that constraint efficient. When you create both an FK and an INDEX
, MySQL may be smart enough to avoid having 2 indexes when only suffices. INDEX(table_2)
is all that the FK needs, but INDEX(table_2, table_3, table_1)
will "work" for that FK. Having both indexes is a waste; drop the shorter one if it is unnecessarily added.
How do composite indexes work and under what circumstances will they fail?
MySQL will only use the "leftmost prefix" means that it will only use the dept_id
portion of your index as it cannot skip to create_time
. This is illustrated by the fact that key_len
is 9 in the EXPLAIN output. I am a little confused as I would expect it to be 8 (the byte length of your BIGINT). There's a hole in my understanding here. Hopefully someone will explain this in the comments.
If you run your EXPLAIN
with and without the create_time
condition you should see the same result -
EXPLAIN SELECT `a`.`alarm_id`, `a`.`dept_id`, `a`.`device_code`, `a`.`type`, `a`.`attr_name`, `a`.`attr_value`, `a`.`content`, `a`.`create_time`
FROM `sys_alarm` `a`
WHERE `a`.`dept_id` = 214;
EXPLAIN SELECT `a`.`alarm_id`, `a`.`dept_id`, `a`.`device_code`, `a`.`type`, `a`.`attr_name`, `a`.`attr_value`, `a`.`content`, `a`.`create_time`
FROM `sys_alarm` `a`
WHERE `a`.`dept_id` = 214
AND `a`.`create_time` > '2021-11-06 15:00:17';
# id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | a | ref | search | search | 9 | const | 402 | 100.00 |
Could you explain aspect of using Composite MySQL INDEX for multiple INNER JOIN?
That is much too normalized.
It is good to normalize product
with product_id
if the Products
table has lots of info -- name, size, etc. But all I see is name
. So get rid of product_id
and Products
and just have a column product_name
.
After you have done that, identify the many:many mapping tables. They need no id
, but just two columns, each being either an id or a value. See this for advice on the optimal composite indexes for such tables: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
But I thought only ix_aov_p is enough.
That provides an efficient way for "given an aov, find the p values". In other situations, you need the opposite order. Composite indexes are ordered. In the case of a mapping table like that, the "given" must come first.
In some other situations the ordering is important:
WHERE b > 4 AND a = 8
needs INDEX(a, b)
in that order. It can quickly get to all the entries for a=8 and b=4, them scan forward until a>8. Think about searching for lastname='James' and firstname LIKE 'R%'.
Meanwhile, either order of INDEX(a,b)
is fine for
WHERE b = 4 AND a = 8
More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
Do composite indexes in MySql work both ways round?
Your queries are the same from the perspective of the compiler. Either index will work. The order of the clauses in the where
statement is immaterial to qualifying a query for an index.
However, if you have inequalities or only one clause, then the ordering in the index makes a difference.
So, the index index(user_id, following_user_id)
would be useful for these situations:
- any comparison directly on user_id (except <>)
- user_id = XXX and folowing_user_id = YYY
- user_id = XXX and folowing_user_id /IN values
It would not be used for:
- folowing_user_id < YYY
Mysql composite index, best practice
In a composite index, list all columns tested with =
first. The order of those in the INDEX(...)
does not matter.
Then list one 'range' if necessary, or GROUP BY
in order, or ORDER BY
in order. The details get messy; see
http://mysql.rjweb.org/doc.php/index_cookbook_mysql
Those guidelines may lead to some or all of these:
(cid, sid, a, c)
(cid, sid, c, a)
Note that the first of those works well even if c
is not mentioned anywhere in the SELECT
, but the second one would be less good.
But it is useless to have
(a) and (c)
Unless you need something like
SELECT shop_id FROM t WHERE a = 123
In that case, you need a single-column or composite index starting with a
.
(As Guillaume points out, company_id
should probably not exist in more than a single table, namely Shops
.)
How does a multi-column index work in MySQL?
Think of a MySQL "composite" index this way.
Concatenate all the columns in the index together, then build an BTree index on that 'single' string.
Some related comments:
Once you understand that analogy, you can see why the cardinality of the individual columns does not matter. That is, the order of the columns in a composite index does not matter for performance. The order does matter, depending on what the query asks for.
INDEX(a,b)
is likely to be useful for these:
WHERE a = 123
WHERE b = 5 AND a = 678
But cannot be used for
WHERE b = 5
(Note on my use of "concatenate". Since it is not really practical to concatenate floats, dates, signed integers, strings with odd collations, etc as if they were just bytes, I do not mean that InnoDB literally concatenates the bytes together.)
Related Topics
How to Compare 2 Rows from the Same Table (SQL Server)
How to Convert Unix Epoch Time in SQLite
How to Automatically Generate Unique Id in SQL Like Uid12345678
How to Check If a Column Exists Before Adding It to an Existing Table in Pl/Sql
How to Create an Oracle Sequence Starting with Max Value from a Table
When Using Getdate() in Many Places, Is It Better to Use a Variable
Mysql: Count Records from One Table and Then Update Another
Why Doesn't Oracle SQL Allow Us to Use Column Aliases in Group by Clauses
How to Backup and Restore a Database as a Copy on the Same Server
How to Execute SQL from Within a Bash Script
When Would You Use a Table-Valued Function
Prevent Duplicate Values in Left Join
Designing SQL Database to Represent Oo Class Hierarchy
Select a Column If Other Column Is Null
SQL Constraint Minvalue/Maxvalue
Max VS Top 1 - Which Is Better