How Do Composite Indexes Work

How do composite indexes work?

Composite indexes work just like regular indexes, except they have multi-values keys.

If you define an index on the fields (a,b,c) , the records are sorted first on a, then b, then c.

Example:

| A | B | C |
-------------
| 1 | 2 | 3 |
| 1 | 4 | 2 |
| 1 | 4 | 4 |
| 2 | 3 | 5 |
| 2 | 4 | 4 |
| 2 | 4 | 5 |

Does a composite INDEX work for JOINING (MANY to MANY)?

(I disagree with the indexes recommended in another Answer.)

SELECT * FROM tt 
JOIN table1 t1 ON tt.table_1 = t1.id
JOIN table2 t2 ON tt.table_2 = t2.id
JOIN table3 t3 ON tt.table_3 = t3.id
WHERE t2.value = 'test'

When the Optimizer picks how to execute a JOIN, it usually works like this:

  1. Start with the table with the best WHERE. This would be t2. So there needs to be an INDEX starting with value.
  2. Then move on to each other table. The only next choice is tt, due to the ON clause this time.
  3. After that t1 and t3, in either order.

Now for the indexes, listed in the order above:

t2:  INDEX(value)
tt: INDEX(table_2)

t1 and t3 are accessed via their id. So, assuming you follow the convention of id being the PK, then PRIMARY KEY(id) is already there.

Now let's switch to the new version of the query:

SELECT t1.foo1, t2.foo2, t3.foo3 FROM tt  ...

With that, we can make better indexes. A "covering" index is an INDEX that includes all the columns needed anywhere in the query. So, let's tack on any such columns:

t2:  INDEX(value, id, foo2)
tt: INDEX(table_2, table_3, table_1) -- table_2 must be first

Two things to note when thinking about a "covering" index:

  • When reaching into a table via the PRIMARY KEY, there is no advantage in making a "covering" index. The PK is "clustered" with the data, hence effectively 'covering'.
  • It is not 'wise' to have too many columns in an index. You are asking for all (*) columns.

More on creating optimal indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

Alas, EXPLAIN shows what is done with what is available; it does not should what indexes should be added, nor other tips.

Your tables do not look like traditional many:many tables. See this for specific tips on that type of table: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table

FOREIGN KEYs: A FK provides (1) a constraint (for data integrity) and (2) an INDEX to make checking that constraint efficient. When you create both an FK and an INDEX, MySQL may be smart enough to avoid having 2 indexes when only suffices. INDEX(table_2) is all that the FK needs, but INDEX(table_2, table_3, table_1) will "work" for that FK. Having both indexes is a waste; drop the shorter one if it is unnecessarily added.

How do composite indexes work and under what circumstances will they fail?

MySQL will only use the "leftmost prefix" means that it will only use the dept_id portion of your index as it cannot skip to create_time. This is illustrated by the fact that key_len is 9 in the EXPLAIN output. I am a little confused as I would expect it to be 8 (the byte length of your BIGINT). There's a hole in my understanding here. Hopefully someone will explain this in the comments.

If you run your EXPLAIN with and without the create_time condition you should see the same result -

EXPLAIN SELECT `a`.`alarm_id`, `a`.`dept_id`, `a`.`device_code`, `a`.`type`, `a`.`attr_name`, `a`.`attr_value`, `a`.`content`, `a`.`create_time`
FROM `sys_alarm` `a`
WHERE `a`.`dept_id` = 214;

EXPLAIN SELECT `a`.`alarm_id`, `a`.`dept_id`, `a`.`device_code`, `a`.`type`, `a`.`attr_name`, `a`.`attr_value`, `a`.`content`, `a`.`create_time`
FROM `sys_alarm` `a`
WHERE `a`.`dept_id` = 214
AND `a`.`create_time` > '2021-11-06 15:00:17';


































# idselect_typetablepartitionstypepossible_keyskeykey_lenrefrowsfilteredExtra
1SIMPLEarefsearchsearch9const402100.00

Could you explain aspect of using Composite MySQL INDEX for multiple INNER JOIN?

That is much too normalized.

It is good to normalize product with product_id if the Products table has lots of info -- name, size, etc. But all I see is name. So get rid of product_id and Products and just have a column product_name.

After you have done that, identify the many:many mapping tables. They need no id, but just two columns, each being either an id or a value. See this for advice on the optimal composite indexes for such tables: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table

But I thought only ix_aov_p is enough.

That provides an efficient way for "given an aov, find the p values". In other situations, you need the opposite order. Composite indexes are ordered. In the case of a mapping table like that, the "given" must come first.

In some other situations the ordering is important:

WHERE b > 4 AND a = 8

needs INDEX(a, b) in that order. It can quickly get to all the entries for a=8 and b=4, them scan forward until a>8. Think about searching for lastname='James' and firstname LIKE 'R%'.

Meanwhile, either order of INDEX(a,b) is fine for

WHERE b = 4 AND a = 8

More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

Do composite indexes in MySql work both ways round?

Your queries are the same from the perspective of the compiler. Either index will work. The order of the clauses in the where statement is immaterial to qualifying a query for an index.

However, if you have inequalities or only one clause, then the ordering in the index makes a difference.

So, the index index(user_id, following_user_id) would be useful for these situations:

  • any comparison directly on user_id (except <>)
  • user_id = XXX and folowing_user_id = YYY
  • user_id = XXX and folowing_user_id /IN values

It would not be used for:

  • folowing_user_id < YYY

Mysql composite index, best practice

In a composite index, list all columns tested with = first. The order of those in the INDEX(...) does not matter.

Then list one 'range' if necessary, or GROUP BY in order, or ORDER BY in order. The details get messy; see
http://mysql.rjweb.org/doc.php/index_cookbook_mysql

Those guidelines may lead to some or all of these:

(cid, sid, a, c)
(cid, sid, c, a)

Note that the first of those works well even if c is not mentioned anywhere in the SELECT, but the second one would be less good.

But it is useless to have

(a) and (c)

Unless you need something like

SELECT shop_id FROM t WHERE a = 123

In that case, you need a single-column or composite index starting with a.

(As Guillaume points out, company_id should probably not exist in more than a single table, namely Shops.)

How does a multi-column index work in MySQL?

Think of a MySQL "composite" index this way.

Concatenate all the columns in the index together, then build an BTree index on that 'single' string.

Some related comments:

Once you understand that analogy, you can see why the cardinality of the individual columns does not matter. That is, the order of the columns in a composite index does not matter for performance. The order does matter, depending on what the query asks for.

INDEX(a,b) is likely to be useful for these:

 WHERE a = 123
WHERE b = 5 AND a = 678

But cannot be used for

 WHERE b = 5

(Note on my use of "concatenate". Since it is not really practical to concatenate floats, dates, signed integers, strings with odd collations, etc as if they were just bytes, I do not mean that InnoDB literally concatenates the bytes together.)



Related Topics



Leave a reply



Submit