What Is a Covered Index

What are Covering Indexes and Covered Queries in SQL Server?

A covering index is one which can satisfy all requested columns in a query without performing a further lookup into the clustered index.

There is no such thing as a covering query.

Have a look at this Simple-Talk article: Using Covering Indexes to Improve Query Performance.

What is a Covered Index?

A covering index is an index that contains all of, and possibly more, the columns you need for your query.

For instance, this:

SELECT *
FROM tablename
WHERE criteria

will typically use indexes to speed up the resolution of which rows to retrieve using criteria, but then it will go to the full table to retrieve the rows.

However, if the index contained the columns column1, column2 and column3, then this sql:

SELECT column1, column2
FROM tablename
WHERE criteria

and, provided that particular index could be used to speed up the resolution of which rows to retrieve, the index already contains the values of the columns you're interested in, so it won't have to go to the table to retrieve the rows, but can produce the results directly from the index.

This can also be used if you see that a typical query uses 1-2 columns to resolve which rows, and then typically adds another 1-2 columns, it could be beneficial to append those extra columns (if they're the same all over) to the index, so that the query processor can get everything from the index itself.

Here's an article: Index Covering Boosts SQL Server Query Performance on the subject.

What is a MySQL covering index?

Let's say that "covering" is "an attribute of an INDEX relative to a particular SELECT.

Some examples:

select id, name from table where id = 1

    INDEX(id, name)       -- covering; best index
    INDEX(id, name, age)  -- covering, but overkill
    INDEX(age, name, id)  -- covering, but inefficient (might not be used)

select id, name, age from table where id = 1

    INDEX(id, name, age) -- Having `id` first is optimal, but any order is "covering"

As already pointed out, if this is InnoDB and the table has PRIMARY KEY(id), then none of these secondary indexes are worth having.

SELECT a FROM tbl GROUP BY b ORDER BY c

    No index is very useful since the GROUP BY and ORDER BY are not the same.
    INDEX(a,b,c)   -- in any order, is "covering"
    INDEX(b,c,a)   -- "covering", and perhaps optimal.
    INDEX(b,c,a,d) -- "covering", but 'bigger'

Bigger matters in small ways. When doing SELECT COUNT(*) FROM ..., InnoDB will (usually) pick the 'smallest' index to do the counting.

Another 'rule' is to avoid redundant indexes.

    INDEX(a,b)  -- Let's say you 'need' this one.
    INDEX(a)    -- Then this one is redundant and should be dropped.

How does SQL covering index work?

Any index, at all levels, stores the values for the column(s) defining that index's key. In addition, at the leaf level of the non-clustered index, the leaves store the values for any additional column(s) which are part of the clustered index key¹ and were not part of the non-clustered index key, because that is how the clustered index lookup is then performed.

If the only column(s) that the query needs to retrieve are either part of the non-clustered index key or are part of the clustered index key then we've already obtained all of those column values by navigating the non-clustered index.

Queries, in general, are not trying to retrieve rows, only row values from particular columns.

As an analogy, consider that you're running a census for an entire town, and are storing all of the data and physical cards. These cards contain the person's name, address, date of birth, current occupation, etc. Assume further that every individual has a unique address and so you decide to store all of these cards in address order, in a big box file. This is your clustered index.

You frequently want to locate people based on their names. So you create another set of index cards that tell you, for any particular combination of surname and firstname all of the addresses at which someone with that name resides. You put these cards in a second box file and sort them by surname, firstname values. This is your non-clusered index.

Finally, suppose your task is to identify the street on which all people with the surname Radish live. You can obviously use your non-clustered index to identify all of the people with the surname Radish. But remember, the cards in this secondary index gives you the addresses for these people. If our only task is to identify their street, we already have that information at hand. There's no need for us to go and look up all of the original census cards, containing all kinds of information that we've not been asked for, just to complete this query.

¹And since 2012, any additional columns identified in an INCLUDE clause for the index definition.

Usefulness of Covering indexes

Its very much a case of understanding your usage. It would be really easy to add indexes for everything you could possibly query against but as with everything its a trade-off. Every index costs you time and storage - so it will likely slow down your inserts/updates, and the more you index the higher this cost.

If your usage highly favours reads over writes, then all is well and all you need to do is pay for some storage. If you need decent performance for writes too, then all you can really do is understand your application and index your most important stuff.

I highly recommend the "inside sql server" series of books (Kalen Delaney et al) - lots of reading to get through but I guarantee they'll help you understand the trade-offs you're making.

When to use composite index and covering index in SQL Server?

The composite index (where all of the columns are “key” columns) will carry the data in all levels of the index; the INCLUDE index will only carry the “non key” columns in the leaf node. Take away: The INCLUDE index will be smaller than the equivalent composite index.
The INCLUDE columns(non-key) will not count towards the max index size (900 bytes).
Updates to non-key columns in INCLUDE indexes will not contribute to index fragmentation; updates to unused “key” columns in the composite index will result in fragmentation… IMO, this is a biggie.
Assuming low index fragmentation, both composite and INCLUDE indexes will perform comparably. All columns in the composite index need not be in the query predicate.