MySQL Too Many Indexes

mysql too many indexes?

What will indexes speed up?

Data retrieval -- SELECT statements.

What will indexes slow down?

Data manipulation -- INSERT, UPDATE, DELETE statements.

When is it a good idea to add an index?

When you need better data retrieval performance on columns that queries filter, join, or sort on.

When is it a bad idea to add an index?

On tables that will see heavy data manipulation -- insertion, updating...

Pros and cons of multiple indexes vs multi-column indexes?

Queries must respect the column order of a composite index (an index on more than one column), read from left to right in the index definition. The order of the conditions in the statement doesn't matter, only which of columns 1, 2 and 3 appear: a statement needs a reference to column 1 before the index can be used for lookups. If it references only column 2 or 3, the composite index on 1/2/3 cannot be used that way.
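A minimal sketch of the left-to-right rule described above. The table `t_prefix`, its columns, and the index name `idx_123` are hypothetical; the actual plan depends on your data and MySQL version, so check the `key` column of the EXPLAIN output rather than taking the comments as guaranteed.

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE t_prefix (
  id      INT PRIMARY KEY,
  col1    INT,
  col2    INT,
  col3    INT,
  payload VARCHAR(100),
  INDEX idx_123 (col1, col2, col3)
);

-- Eligible for idx_123: the leftmost column col1 is referenced.
EXPLAIN SELECT * FROM t_prefix WHERE col1 = 1;
-- Also eligible: the order of conditions in WHERE is irrelevant.
EXPLAIN SELECT * FROM t_prefix WHERE col2 = 2 AND col1 = 1;

-- Not expected to use idx_123 for a lookup: col1 is missing.
EXPLAIN SELECT * FROM t_prefix WHERE col2 = 2;
EXPLAIN SELECT * FROM t_prefix WHERE col3 = 3;
```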

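In MySQL, the optimizer usually picks one index per table in a SELECT (subqueries and the like are treated as separate statements), although the Index Merge optimization can sometimes combine several. There are also storage-engine limits on how large a table can grow and on how many indexes it can have. Additionally, wrapping an indexed column in a function prevents a regular index on that column from being used, for example: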

WHERE DATE(datetime_column) = ...
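One common workaround, sketched here under assumptions, is to rewrite the condition as a range on the bare column. The table `orders_demo`, the column `created_at`, and the date values are made up for the example.

```sql
-- Hypothetical table with an index on a DATETIME column.
CREATE TABLE orders_demo (
  id         INT PRIMARY KEY,
  created_at DATETIME NOT NULL,
  INDEX idx_created (created_at)
);

-- The function call hides the column from a regular index:
EXPLAIN SELECT * FROM orders_demo
WHERE DATE(created_at) = '2024-01-15';

-- Same rows expressed as a half-open range on the bare column,
-- which idx_created is able to serve:
EXPLAIN SELECT * FROM orders_demo
WHERE created_at >= '2024-01-15'
  AND created_at <  '2024-01-16';
```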

Disadvantages of many indexes in MySQL

And is there any disadvantage of having this amount (or more than
this) of indexes in DB ?

I don't think this number of indexes will affect your performance.

However, note that indexes speed up SELECT statements but slow down INSERT.

An article on the disadvantages of indexes (link removed) says:

When an index is created on the column(s), MySQL also creates a
separate file that is sorted, and contains only the field(s) you're
interested in sorting on.

Firstly, the indexes take up disk space. Usually the space usage isn’t
significant, but because of creating index on every column in every
possible combination, the index file would grow much more quickly than
the data file. In the case when a table is of large table size, the
index file could reach the operating system’s maximum file size.

Secondly, the indexes slow down write queries such as INSERT, UPDATE
and DELETE. MySQL has to internally maintain the “pointers” to the
inserted rows in the actual data file, so there is a performance
price to pay for such write queries: every time a record is changed,
the indexes must be updated. However, you may be able to write your
queries in a way that does not cause very noticeable performance
degradation.
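To see how much space indexes actually take compared with the data, a query along these lines can help. It is only a sketch: it assumes you are connected to the schema you want to inspect, and for InnoDB the reported sizes are estimates (`data_length` includes the clustered primary key, `index_length` covers secondary indexes).

```sql
-- Data vs. index size per table in the current schema, largest indexes first.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024, 1) AS data_mb,
       ROUND(index_length / 1024 / 1024, 1) AS index_mb
FROM information_schema.tables
WHERE table_schema = DATABASE()
ORDER BY index_length DESC;
```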


MySQL indexes - how many are enough?

How much indexing is appropriate, and where it tips into too much, depends on a lot of factors. On small tables like your "categories" table you usually don't want or need an index, and it can actually hurt performance. The reason is that it takes I/O (i.e. time) to read the index and then more I/O and time to retrieve the rows it points to. The exception is when you query only columns contained in the index.

In your example you are retrieving all the columns, and with only 22 rows it may be faster to just do a table scan and sort the result instead of using the index. The optimizer may/should be doing this and ignoring the index. If that is the case, then the index is just taking up space with no benefit. If your "categories" table is accessed often, you may want to consider pinning it into memory so the db server keeps it accessible without having to go to the disk all the time.

When adding indexes you need to balance disk space, query performance, and the performance of updating and inserting into the tables. You can get away with more indexes on tables that are static and don't change much, as opposed to tables with millions of updates a day; on those you'll start feeling the effects of index maintenance. What is acceptable in your environment, though, can only be determined by you and your organization.

When doing your analysis, be sure to generate/update your table and index statistics so that you can be assured of accurate calculations.
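In MySQL, refreshing statistics and then checking what the optimizer does could look roughly like this. The `categories` table comes from the question; the `name` column is an assumption made for the example.

```sql
-- Recalculate index statistics so the optimizer works from current numbers.
ANALYZE TABLE categories;

-- Inspect the estimated cardinality of each index on the table.
SHOW INDEX FROM categories;

-- See whether the optimizer picks an index or a full scan
-- (the `name` column is hypothetical).
EXPLAIN SELECT * FROM categories ORDER BY name;
```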

mysql: multiple indexes advice

If you want to satisfy all the combinations with an index, you need the following:

(a, b, c, d)
(a, b, d)
(a, c, d)
(a, d)
(b, c, d)
(b, d)
(c, d)
d

You don't need other combinations because any prefix of an index is also an index. The first index will be used for queries that test just a, a&b, a&b&c, so you don't need indexes for those combinations.

Whether you really need all these indexes depends on how much data you have. It's possible that just having indexes on each column will narrow down the search sufficiently that you don't need indexes on the combinations. The only real way to tell is by benchmarking the performance of your applications. The indexes take up disk space and memory, so trying to create all possible indexes can cause problems of its own; you need to determine if the need is strong enough.
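For reference, the eight indexes listed above could be declared in one statement, as in this sketch. The table `t_abcd` and the index names are hypothetical; the comments note which column combinations each index serves through its prefixes.

```sql
-- Hypothetical table with the four searchable columns.
CREATE TABLE t_abcd (
  id INT PRIMARY KEY,
  a INT, b INT, c INT, d INT
);

ALTER TABLE t_abcd
  ADD INDEX idx_abcd (a, b, c, d),  -- a; a,b; a,b,c; a,b,c,d
  ADD INDEX idx_abd  (a, b, d),     -- a,b,d
  ADD INDEX idx_acd  (a, c, d),     -- a,c; a,c,d
  ADD INDEX idx_ad   (a, d),        -- a,d
  ADD INDEX idx_bcd  (b, c, d),     -- b; b,c; b,c,d
  ADD INDEX idx_bd   (b, d),        -- b,d
  ADD INDEX idx_cd   (c, d),        -- c; c,d
  ADD INDEX idx_d    (d);           -- d
```

Every one of these has to be maintained on each write, which is why benchmarking before creating the full set is worthwhile.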

How many indexes should be created for faster queries

You should switch the order of the columns in your index:

(organization, year, isSystem, userType, status, createdBy)

This allows it to better serve these two queries:

select * from user where organization=1 and year=2010 and isSystem=false and userType='Manager'
select * from user where organization=1 and year=2010 and isSystem=false and userType='Employee'
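A sketch of creating the reordered index and checking how much of it a query uses. The column types and the index name are assumptions, since the original table definition is not shown. In the EXPLAIN output, `key_len` grows with the number of leading index columns actually used.

```sql
-- Assumed schema; the real column types are not given in the question.
CREATE TABLE `user` (
  id           INT PRIMARY KEY,
  organization INT,
  `year`       SMALLINT,
  isSystem     BOOLEAN,
  userType     VARCHAR(20),
  `status`     VARCHAR(20),
  createdBy    INT
);

CREATE INDEX idx_user_lookup
  ON `user` (organization, `year`, isSystem, userType, `status`, createdBy);

-- All four filtered columns are leading columns of idx_user_lookup.
EXPLAIN SELECT * FROM `user`
WHERE organization = 1 AND `year` = 2010
  AND isSystem = FALSE AND userType = 'Manager';
```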

Does [6] need a different multi column index, consisting of the above 3 columns?

It doesn't need a new index - it can use the existing one but in a less efficient way - only the first two columns will be used. Adding a new index for this query looks like a good idea though.

can I safely remove the individual indexes

Yes. You should remove unused indexes otherwise they will just take up disk space and slow down table modifications without providing any benefit.
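On MySQL 5.7 and later, the `sys` schema offers one way to find candidates for removal. This is a sketch: the view only reflects activity since the server last started, and the index name in the commented-out statements is hypothetical.

```sql
-- Indexes with no recorded use since server start, in the current schema.
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes
WHERE object_schema = DATABASE();

-- On MySQL 8.0 an index can first be hidden from the optimizer
-- to confirm nothing depends on it (hypothetical index name):
-- ALTER TABLE `user` ALTER INDEX idx_organization INVISIBLE;

-- Then drop it once you are confident:
-- ALTER TABLE `user` DROP INDEX idx_organization;
```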

What's the best strategy when dealing with multiple indexes in Mysql

I would generally index anything you are likely to be running a query on, especially since this is a comment board: if you hope for any traffic at all, not indexing those columns could seriously slow down the rendering of the comments.

Edit:
The decision really depends on how you expect the human element to respond. The site will load faster with more indexes, but updates might be a little slower. As multiple people are unlikely to be updating information at the same time, and people generally want to read first, I would lean toward the faster select time.

Can MySQL use multiple indexes for a single query?

Yes, MySQL can use multiple indexes for a single query. The optimizer will determine which indexes will benefit the query. You can use EXPLAIN to obtain information about how MySQL executes a statement. You can add or ignore indexes using hints like so:

SELECT * FROM t1 USE INDEX (i1) IGNORE INDEX FOR ORDER BY (i2) ORDER BY a;

I would suggest reading up on how MySQL uses indexes.

Just a few excerpts:

If there is a choice between multiple indexes, MySQL normally uses the
index that finds the smallest number of rows.

If a multiple-column index exists on col1 and col2, the appropriate
rows can be fetched directly. If separate single-column indexes exist
on col1 and col2, the optimizer will attempt to use the Index Merge
optimization (see Section 8.2.1.4, “Index Merge Optimization”), or
attempt to find the most restrictive index by deciding which index
finds fewer rows and using that index to fetch the rows.
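The difference between the two cases in the excerpt can be explored with a sketch like the following. The table `t_merge` and the index names are hypothetical. Whether Index Merge is chosen depends on table size and statistics, so on an empty or tiny table the plan may well be a plain scan.

```sql
-- Hypothetical table with two separate single-column indexes.
CREATE TABLE t_merge (
  id   INT PRIMARY KEY,
  col1 INT,
  col2 INT,
  INDEX idx_col1 (col1),
  INDEX idx_col2 (col2)
);

-- With enough data, EXPLAIN may report type = index_merge here,
-- combining idx_col1 and idx_col2.
EXPLAIN SELECT * FROM t_merge WHERE col1 = 1 OR col2 = 2;

-- A multiple-column index lets an AND condition be resolved
-- from a single index instead.
ALTER TABLE t_merge ADD INDEX idx_col1_col2 (col1, col2);
EXPLAIN SELECT * FROM t_merge WHERE col1 = 1 AND col2 = 2;
```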

Too many columns to index - use mySQL Partitions?

I am not a MySQL expert. My focus is Oracle, but I've been working with Partitioning for years and I've come to find that your suggested use is very appropriate but not inside the mainstream understanding of partitions.

Index on low cardinality columns

Putting aside index merging for now, let's say that your active rows are somewhat scattered and are at a 1:20 ratio to the inactive rows. Say your page size is 8 KB and you get about 20 rows per block. If the isactive records are very evenly distributed, you'll have almost 1 per block. Reading EVERY block/page in the table with a full table scan will then be much, much faster than using an index to find those same rows.

So let's say they are concentrated instead of evenly scattered. Even if they are concentrated in 20% of the pages, or even 10% of the pages, a full table scan can outperform an index in those cases.

So now include index merging. Suppose that after scanning the index on isactive you DO NOT visit the table, but instead join those results to the results of ANOTHER index, and the final result set requires reading, say, less than 5% of your blocks. Then yes, an index on isactive combined with index merging could be a solution.

The caveat here is that there are a lot of limitations on the implementation of index joins in MySQL. Make sure that this works in your situation. But you said you have another 20 fields that may be searched. If you don't index all of them, so that there is always a second index available to join to the isactive index, you won't be using index merging/joins.
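Before deciding, it is worth measuring how selective the flag really is. A minimal sketch, assuming a hypothetical table `items` with an `isactive` column:

```sql
-- Hypothetical table; only the flag column matters for this check.
CREATE TABLE items (
  id       INT PRIMARY KEY,
  isactive TINYINT NOT NULL,
  title    VARCHAR(100)
);

-- Row count and share of the table for each flag value.
-- A value matching only a small share of rows is the one an index
-- (or a partition) might help with.
SELECT isactive,
       COUNT(*) AS row_count,
       ROUND(100 * COUNT(*) / (SELECT COUNT(*) FROM items), 1) AS pct_of_table
FROM items
GROUP BY isactive;
```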

Partitioning a low cardinality column

Now, if you partition on that column, you'll have 5% of the blocks with IsActive = True in them, and they will be densely packed. A full partition scan will quickly yield the list of active records and allow every other predicate to be applied as a filter instead of an index seek.

But that flag changes, right?

In Oracle we have a command that allows us to enable Row Migration. That means that when Is_Active changes from True to False, the row is moved to the partition it now falls in. This is pretty expensive, but only a bit more so than the index maintenance that would occur if you indexed that column instead of partitioning by it. In the partitioned case, Oracle first changes the row with an update, then does a delete and then an insert. If you indexed that column instead, you'd do an update of the row, then the index entry for TRUE would be deleted, and then an index entry for FALSE would be created.

If MySQL doesn't have row migration, then you'll have to program your CRUD package to do that. A procedure such as UPDATE_ROW_ISACTIVE(pk IN number), or something like it, would do the delete and insert for you.
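In MySQL, partitioning by the flag might be sketched as follows. The table `items_part` and its columns are hypothetical. Note that MySQL requires the partitioning column to be part of every unique key, hence the composite primary key. As far as I know, MySQL relocates a row itself when an UPDATE changes the partitioning column, but verify that on your version before relying on it.

```sql
-- Hypothetical table partitioned on a low-cardinality flag.
CREATE TABLE items_part (
  id        INT NOT NULL,
  is_active TINYINT NOT NULL,
  title     VARCHAR(100),
  PRIMARY KEY (id, is_active)   -- partition column must be in the key
)
PARTITION BY LIST (is_active) (
  PARTITION p_inactive VALUES IN (0),
  PARTITION p_active   VALUES IN (1)
);

-- The partitions column of EXPLAIN shows whether only p_active is read;
-- the remaining predicate is applied as a filter.
EXPLAIN SELECT * FROM items_part
WHERE is_active = 1 AND title LIKE 'a%';

-- Flipping the flag changes which partition the row belongs to.
UPDATE items_part SET is_active = 0 WHERE id = 42;
```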

Regarding Konerak's Answer

While I agree that parallel access is ONE use of partitioning, it's not the exclusive one. But if you follow the link he provides, the user comment at the very bottom of the page is:

Beware of having low selectivity indexes on your table. A complex AND/OR WHERE clause will surely make your query very very slow if Index_Merge optimization is being used with an intersect() algorithm.

That seems to speak to your situation, so you can take that comment FWIW.


